VocaTalk Free Background album is now available for download

June 26, 2010

I’ve been working on the free background album for quite some time. The purpose is to give VocaTalk users music in a variety of genres to test. You can download free album 1 here.

This album includes the following tracks, all of which were composed by me:

  • Ambient Relaxation – Night is short
  • Ambient Rhythmic – Antrophic principle
  • Breakout
  • Drift
  • Existance the Sweet
  • Exoplanet hunter
  • First Step
  • Interstellar
  • KuiperBelt
  • Miracle
  • Movie Score – Memorable times
  • Transbeatix 1
  • War of the Worlds

Now, I’m not a musician.  So please keep that in mind if you want to criticize.  Again, the point is to give users DRM-free access to some music usable as background for VocaTalk podcasts.

Initially, I was thinking of making only ambient music.  But as I experimented with other genres, I decided that they are also good for listening with speech.  Keep in mind, the more music you have in your library, the richer the experience.

Have fun!


Craig Venter’s synthetic life and its implications

June 25, 2010

If you haven’t watched Craig Venter’s Synthetic Life videos yet, you must!  This is one of those historic moments, like the invention of the transistor, which led to today’s computers.  There’s a lot of controversy and speculation around this concept.  I think some people don’t understand the implications of this technology.

This breakthrough has philosophical as well as practical implications.  Life is treated like any other machine.  It is stored in digital media, and then produced to come alive again.

In the span of about 10-20 years, life has gone from something beyond, something untouchable, to the domain of engineering.  I think this latest invention really is the killer app of all time, and perhaps of human history.  I’m serious about this.  It’s probably even bigger than the computer revolution, because for the first time in the history of Earth, and probably the whole universe, a break in life’s chain has occurred.  A cell without a parent cell has been born.  As Venter puts it, ‘a cell whose parent is a computer’.  This means life was created by just putting the right chemicals in the right order.  There was no magical ‘life force’, no special ‘touch’ from above.  It was a human-designed computer and just chemicals to start with.  Well, the information came from a real cell.  But it came as ‘information’ only, not in any physical form.  So a living being was created based only on the data in the computer.  You can imagine that the data could also be manipulated before it is printed out and injected into a cell.

Venter’s team printed out the full DNA of a bacterium (Mycoplasma mycoides), assembled the pieces inside yeast, and transplanted the result into a living cell of a related species, replacing that cell’s own DNA.  The cellular machinery booted up and started functioning just like a normal cell, but it was using the manufactured artificial DNA.  Although the information contained in the DNA was mostly the same as the original bacterium’s, Venter’s team also encoded a signature into it so that it can be identified and used to verify that the technology worked.  They even put in their names, and an email address to contact if someone decodes the special signature sequence.  The technique they used prevents the extra encoded data from being treated as a protein formula, so the cell just ignores it.
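To make the ‘life as information’ idea a bit more concrete, here is a minimal Python sketch of how plain text could be written as a sequence of DNA bases and read back.  The two-bits-per-base mapping and the function names `text_to_dna` and `dna_to_text` are my own assumptions for illustration.  This is not the code Venter’s team used for their signature.

```python
# Hypothetical mapping: each base stands for two bits (A=00, C=01, G=10, T=11).
# This is an illustration only, NOT the encoding used by Venter's team.
BASES = "ACGT"


def text_to_dna(text: str) -> str:
    """Encode UTF-8 text as a string of DNA bases, four bases per byte."""
    bases = []
    for byte in text.encode("utf-8"):
        # Walk the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def dna_to_text(dna: str) -> str:
    """Decode a base string produced by text_to_dna back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            # BASES.index raises ValueError for anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


if __name__ == "__main__":
    encoded = text_to_dna("Hi")
    print(encoded)               # the base sequence for "Hi"
    print(dna_to_text(encoded))  # the original text again
```

A real watermark would also have to avoid being read as a gene, which this sketch does not attempt.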

Now, I’m an engineer, not a biologist, but I think I can see some interesting uses other than the usual stuff that everyone talks about.  Since I’m not in a position to actually implement any of these, I’ll use my freedom to improvise without current technical limitations, assuming they will sooner or later be overcome.

1. Just like computer publishing, this will open up an accelerated new era of biological engineering.  You can easily modify genes in the computer and probably even simulate what they’ll produce once the necessary cellular machinery is fully understood.  You can basically design living things, and once you like the design, you can just print out the DNA and produce the living being.  There’s no limit to what you could design, but the first things will probably be medicine and custom-made bacteria that do environmental cleanup, reverse pollution, or produce cheap fuel.  Then the bigger projects will slowly start appearing:  custom-made plants designed by accelerated evolutionary algorithms on supercomputers.  I’m not sure how long it will take until they can design a custom-made human being.  But they can start by simulating crossbreeding in a computer environment to produce custom-made animals.  I think we’ll need really big supercomputers to run evolutionary algorithms for complex animals.

2. You could scan the DNA of each different tissue in your body as it is now, and store it on digital media.  When you have an illness, or when you get old, just print the previously stored DNA to regenerate any tissue in your body.  This could even create a whole new industry of life extension.

3. Use this technology for interstellar travel.  Instead of sending mature humans, just send digitally encoded DNA sets for humans and all the animals, plants and bacteria to create a biosphere from just ‘information’ once a small ship reaches its destination.  Or just send machines to produce cells, and beam the information up.  Until the ship gets to its destination, we’ll have enough time to design life that’s adapted to the final destination.

4. Send such ships to many star systems to search for life, or to produce life.  Hey, maybe that’s how it all started on Earth!  But that’s a separate discussion.

5. Beam living things over long distances. But it’s not like the teleportation in Star Trek or ‘The Fly’ because the original copy is not lost.

6. You can pretty much replicate whole living things if you can scan all cells, or at least the necessary minimum.  Only, you have to figure out how to assemble the produced tissues.

7. Design DNA-based cellular machines that do not try to survive, but instead compute.  You could produce pretty much any cellular machinery that will do some computation and run many such computations in parallel.  Real massive parallelism as taught by mother nature.

Some say this technology can wipe out humanity. OK, but wait, an asteroid could wipe out humanity as well.  Unless we can somehow go to space and unchain ourselves from earthbound life, we’ll be waiting for the killer asteroid with our name on it.  How about other human-made technologies?  How about environmental disasters or climate change, which may also have a dramatic impact on the human future?  All technology has good and bad uses.  But mother nature is the most graceful friend and sometimes the most cruel enemy of man.  This type of technological advancement can and should help humanity survive both its own failures and mother nature’s wrath.  There will come a day when almost all people benefit from it, but still continue to discuss whether it’s good or bad.

What I did today while assembling my Ikea furniture at home

June 24, 2010

I’m on vacation and decided to stay home this time.  Well, not all vacations are created for rushing to get some rest.  The only thing that woke me up was the new Ikea furniture I had to assemble.  But wait, I’ve got stuff to read, and I want some fun personal time listening to music.  How about some sunbathing by the pool?  Apparently, I have to do the assembly today, and my wife says that in fact I have to do it now!

But wait, I have my iPod.  And not only an iPod, I have VocaTalk Personal Podcast!  So I just turned the stuff that I was going to read into a podcast episode with some music in the background.  That way, I could work, listen to music, and listen to the text at the same time.  It was so much fun, I don’t even know how much time it took to assemble the whole thing.  It was quite a big one with lots of pieces to put together.  So I had to take some rest and drink my beer while our cute little kitten ‘Cumin’ was jumping around to get some attention.  While going through all that, I finished my reading and switched to another podcast that I had generated previously.  It was a public domain novel.  The atmospheric background music made me feel like I was in a movie theater.

When the assembly was finished, I was still listening to my podcast.  Well, I have an infinite number of them.  Because I can generate them myself, based on any content I like: articles, blogs, ebooks, PDF files, Word documents, or any other text.  VocaTalk generates the speech, then adds background music and some other sound effects that make listening even more fun. It’s like no other text-to-speech app.  It’s so much fun to listen to these podcasts, I can’t stop listening the whole day!

Maybe tomorrow, I can get back to my lazy mode and go to the pool while still listening to my podcats… I mean podcasts…

Giving Text-to-speech a soul

June 21, 2010

According to Wikipedia, the first complete computer-based speech synthesis system was created in 1968. After more than 40 years of evolution, text-to-speech is still not as natural as it should sound. Why do we still feel the lack of a human soul behind the monotonous voice? I guess it’s because all speech systems today do a direct translation from text to voice. There’s almost no intermediary level of abstraction that analyzes the meaning and the context. Speech synthesizers use a database of phonemes and common speech patterns. They may also have a pronunciation dictionary for words that have exceptional reading rules. Imagine a human being who can read the text pretty well, but has no idea what the text is talking about. Ask your 11-year-old to read an article from Scientific American or some other text with heavy technical language. That’s pretty much what you get from today’s speech synthesizers. They literally do not know what they’re talking about!

It may still be possible to achieve better quality by just increasing the database sizes. This approach, although practical, has a fundamental problem: such systems are still rule-based and rigid, and for some reason, the human brain is adept at distinguishing things that are organic and alive from things that are fixed and monotonous.

How can speech synthesis be made organic? I don’t know the exact answer to this question. Maybe something like a neural network layer controlling the synthesis parameters, plus some random aspect, could make the output imperfect and hide the monotony of the underlying rigid rule-based computation. Or a simulation of the human vocal tract with partly neural-network-based control may make the voice sound more natural.
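The ‘random aspect’ part of that idea is easy to sketch.  The Python snippet below is only an assumption about how it might look: the `Prosody` fields and the `humanize` function are made-up names, it is not tied to any real speech engine, and it uses plain random jitter rather than a neural network.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prosody:
    """Hypothetical synthesis parameters for one sentence."""
    pitch_hz: float   # base pitch of the voice
    rate_wpm: float   # speaking rate in words per minute
    volume: float     # 0.0 (silent) to 1.0 (full)


def humanize(base: Prosody, amount: float = 0.05,
             rng: Optional[random.Random] = None) -> Prosody:
    """Return a copy of `base` with each parameter nudged by up to +/- `amount`."""
    rng = rng or random.Random()

    def jitter(value: float) -> float:
        # Scale the value by a random factor in [1 - amount, 1 + amount].
        return value * (1.0 + rng.uniform(-amount, amount))

    return Prosody(
        pitch_hz=jitter(base.pitch_hz),
        rate_wpm=jitter(base.rate_wpm),
        # Keep volume inside its valid range after jittering.
        volume=min(1.0, max(0.0, jitter(base.volume))),
    )


if __name__ == "__main__":
    base = Prosody(pitch_hz=120.0, rate_wpm=170.0, volume=0.8)
    for _ in range(3):
        # Each sentence would get slightly different settings.
        print(humanize(base))
```

A learned model could replace the uniform jitter later.  The point here is only that small per-sentence variation is cheap to add on top of a rigid synthesizer.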

But still, the artificial speaker will have no idea of the context of the speech. For that, a model of what is meant in the text has to be created and used as a minimal version of simulated consciousness. OK, that’s a bit too ambitious a way to put it. But at least, such a system could know who is talking in a novel, the gender of the speaker, the relationships between entities, their connections to human concerns, or even emotional states. This kind of information cannot be directly found at the phoneme, word, phrase, or even the sentence level. It requires a deeper semantic analysis.

I don’t mean the system has to really understand everything in the text. All I’m saying is that it has to have some simulation of understanding: an internal model of what the meaning may be, so that it can use that information to drive the speech synthesis parameters.  This could also be based on a neural network trained on pre-analyzed text.

Speech technology is an active field that is advancing rapidly. So we can expect to see more and more voices, newer technologies to generate them, and much more intelligible reading.  Speech synthesis has almost reached a maturity level where we’re seeing diminishing returns from classical approaches.  Something new is needed to give it the human soul so people can use it more to replace reading and unchain themselves from the computer desk.

My theory is, no matter how good speech synthesis gets, we’ll still need something to make it more fun for a longer listening experience. You can use a very advanced speech synthesis technology to answer phones, give short messages on some devices, or read your emails.  But to read a whole article or a book without getting bored, you still need more than mere intelligibility.  Even audiobooks read by professional speakers require some fancy extras to be bearable. A sci-fi book read in the proper mood over atmospheric background music and some sound effects is far preferable to a solo human reader.

The good news is, we can make that sort of improvement to any text-to-speech output even today.  Here’s a sample text-to-speech reading of ‘The War of the Worlds’ with background music and some sound effects that you don’t normally hear in a usual text-to-speech application.  It was generated by VocaTalk Personal Podcast. So why doesn’t everybody do this?  Partly because the tools didn’t exist for the end user to do it, partly because some think it makes speech unintelligible.  Well, if that’s the case, how can we listen to documentary soundtracks or musical audiobooks?  How can we hear multiple people speaking on a movie soundtrack with additional loud sound effects and background music?  How can we hear a singer over so many instruments playing?  Some can even distinguish individual instruments.  We can, and we prefer to listen with music because it takes the experience to a whole new level.  Plus you don’t have to set aside separate time for music and text listening.

VocaTalk includes other improvements to make the speech more bearable for longer listening.  For instance, it puts longer periods of silence between paragraphs to give you time to think about what you just heard.  It’s much more comfortable to digest spoken text with extended periods of silence.  This also gives you time to enjoy the music.  Other features are: changing the voice at every paragraph, changing the voice position, changing the voice pitch, and dynamically changing echo effects.  The ultimate goal is to create a richer and more fun experience for today’s text-to-speech users.  One step beyond the usual.
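As a rough illustration of the paragraph-level ideas above (a pause after each paragraph and a different voice for each one), here is a small Python sketch.  It is not VocaTalk’s actual code.  The `Segment` type, the `plan_reading` function, the voice names and the 2-second default pause are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List, Sequence


@dataclass
class Segment:
    """One paragraph to speak, who speaks it, and the silence that follows."""
    voice: str
    text: str
    pause_after_s: float


def plan_reading(text: str, voices: Sequence[str],
                 pause_s: float = 2.0) -> List[Segment]:
    """Split text on blank lines and rotate through the voices, one per paragraph."""
    if not voices:
        raise ValueError("at least one voice is required")
    # Paragraphs are separated by blank lines; drop empty ones.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    voice_cycle = cycle(voices)
    return [Segment(next(voice_cycle), p, pause_s) for p in paragraphs]


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for seg in plan_reading(sample, ["Anna", "Ben"]):
        print(f"{seg.voice}: {seg.text} [pause {seg.pause_after_s}s]")
```

A plan like this could then be handed to whatever speech engine and audio mixer you use.  The pause is where the background music gets to breathe.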

Once you experience it, it may change the way you keep up with information. It’s really information delivered to you wherever you are and whenever you want it.  It’s more alive than just a bunch of characters on the screen or just a bunch of monotonously spoken words.  No matter how good speech synthesis gets, we’ll still need these little improvements, because if words talk to your reason, music talks to your soul.

Ultimate tool for the ultimate geek

June 20, 2010

So you’re a geek, huh? Well, nice to meet you then!

Do you read sci-fi books, Sci-Am, electronics design or programming magazines, and think about and discuss crazy-sounding but OK ideas that may change humanity’s future? You must have heard about the latest discovery at the LHC! Well, I haven’t yet 🙂 So tell me if they found the Higgs boson, dude.

Being a geek is actually OK, and I know you know that! That’s why I’m asking you to stay that way. But you could also be a cool geek. A geek that does outdoor activities like biking, exercising, jogging. But for that, you must first start looking as if you don’t read that much and have enough time to do all those, right? It’s not that hard really. All you need to do is start listening instead of reading. That’s it. You already know about, and have probably used, text-to-speech tools to convert some text into MP3 and listen on your iPod or Zune, right? If not, I doubt you’re geek enough to be called a geek. But if you did, stay tuned! Because I’m just about to change your whole experience and all you know about text-to-speech.
What if you could rip your CD music and play it in the background of speech, use multiple voices in the generated speech, split huge files into multiple parts, and generate local podcasts so you can easily manage an endless stream of episodes? There’s a tool designed just for that. It is specially designed to multiply your reading and learning, and to turn hours and hours of reading and learning into a fun and enjoyable experience. Hours lost commuting, at the gym, shopping, or driving can be reclaimed just by using this unique tool.

That tool is VocaTalk Personal Podcast. It generates local podcasts based on the content you provide. What makes it unique is its specially designed features, like putting background music behind speech, changing the speaker’s voice at each paragraph, moving the speaker’s voice position, and modulating voice pitch. It’ll generate an endless stream of podcast episodes for you. Turn ebooks, web content, PDF files, blogs, Word documents and more into an audio-documentary-like recording, while also enjoying the music in the background.
Well, it’s hard to explain the whole experience. You’ve got to see it, or rather hear it for yourself. You can listen to sample episodes of it, or just download and request a free beta license.

Here’s how it looks on my iPod Touch:

VocaTalk podcasts on iPod Touch

It also embeds album art and the original text content into the media file, so you can view it on your player.

As the author of the tool, I’ve created a personal learning experience to bring the information to me wherever I am and whenever I want.  I listen to almost anything on my iPod anywhere, anytime.  I’ve multiplied my reading probably tenfold.  I can read things that I never thought I would have the time for.  It changed my life.  It will change yours too!

Make it fun!

Categories: VocaTalk

Welcome To VocaTalk Personal Podcast Blog

June 18, 2010


I’m the author of VocaTalk Personal Podcast, an application that turns any given content into a high quality podcast or audiobook using text-to-speech, background music of your choice, and other fun audio fx. I’ve been using text-to-speech since SAM on the Commodore 64. To give you an idea of what it is, I just did a quick search on YouTube. And sure enough, it was there 🙂

Thanks redrumloa!

And then, Sound Blaster’s Monologue seemed to me like a quantum leap back then. I used to listen to all the help files and whatever I could find online using Monologue. It was a powerful tool for those who could bear the monotony. I guess the name should have been Monotonous Monologue or something. However boring it sounded, it was definitely way beyond anything else you could find on PCs.

Nowadays, speech technology has advanced so much that it is hardly noticeable that the voice is synthesized by a computer. OK, if you’re new to this, you’d probably notice it quicker 😉 Anyway, you get the point: to me it’s really good compared to 10 years ago. And it is still advancing at an ever greater speed.

However, you’ll still feel the monotony after listening to the speech for more than 5 minutes. Today’s text-to-speech is still not advanced enough to sound as natural as human speech.

Starting from this fact, I decided to create something that’ll reduce that monotony and improve the experience. I started with putting background music to the speech. And guess what, it was a whole new experience now! Yes, just by putting some background music, the overall experience becomes so much more fun, you can listen to generated speech for hours and hours.

VocaTalk started as a simple Windows application that just put music in the background. Then I rewrote the whole thing in WPF, Microsoft’s new UI technology, and added other features that elevated the experience to a whole new level. To see what I mean, you really have to test it for yourself. Just to give you some idea, here are some key features that VocaTalk supports and that no other application does as of today:

– background music,
– binaural beats, brainwave entrainment,
– digital signal processing, audio fx (echo, positional audio, voice modulator, and more),
– local podcasting,
– direct integration with iTunes and Zune,
– saving original documents in rich format and queue them for later generation,
– ability to rip CD music to put to background,
– ability to put album art,
– ability to split large files into episodes (you can even turn a whole book into multiple podcast episodes or audiobooks),
– background generation.
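
To give a feel for the ‘split large files into episodes’ item in the list above, here is a minimal Python sketch.  It is an assumption about one simple way to do it, not a description of how VocaTalk works internally.  The `split_into_episodes` name and the 1500-word default limit are made up for the example.

```python
from typing import List


def split_into_episodes(text: str, max_words: int = 1500) -> List[str]:
    """Group paragraphs into episodes of at most `max_words` words each.

    Paragraphs are never cut in the middle, so a single paragraph longer
    than `max_words` becomes an episode of its own.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    # Paragraphs are separated by blank lines; drop empty ones.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    episodes: List[str] = []
    current: List[str] = []
    count = 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        # Close the current episode if this paragraph would overflow it.
        if current and count + words > max_words:
            episodes.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph)
        count += words
    if current:
        episodes.append("\n\n".join(current))
    return episodes


if __name__ == "__main__":
    book = "a b c\n\nd e\n\nf g h i"
    for number, episode in enumerate(split_into_episodes(book, max_words=5), 1):
        print(f"Episode {number}: {episode!r}")
```

Splitting on paragraph boundaries keeps each episode from starting or ending mid-thought, at the cost of episodes that vary a little in length.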

With all these features and upcoming new ones, I’m trying to create a unique experience that is specifically designed for people who read and learn a lot. Using VocaTalk, you’ll be able to catch up with information and even find time to read things that you didn’t imagine you could find the time for. Since I am the first and the primary user of my application, I am designing a personal experience and want to share it with everyone who loves learning.

To give you an idea about what it sounds like, watch the ‘War of the Worlds synthesized by VocaTalk’:

There are some other samples too, and you can even download a sample episode to listen on your iPod, Zune or other mp3 player. You can also download the app and use the current beta for free by simply requesting a beta license.

In this blog, I’ll keep posting interesting articles, tips and tricks, and other useful information that I think people like me would like to know about.

To learn more about VocaTalk, watch the videos on VocaTalk’s YouTube channel: http://www.youtube.com/user/ibenian
and check out the homepage: http://www.vocamedia.com

Categories: VocaTalk