Speech Technology Magazine SpeechTEK Conference
 
Adam B.   —   March 4, 2010 @ 11:24 am

Hey Speech-Heads,

Check out this great use of TTS:

Movie critic Roger Ebert–who is recovering from a serious bout with thyroid cancer that rendered him speechless–has been using TTS to communicate. Recently, however, Ebert’s TTS got a new voice–his own. The new voice, built by CereProc, was assembled from audio clips collected from the movie critic’s many television appearances and DVD commentaries.

Check out this Esquire article for more information.

And check out Ebert’s appearance on Oprah for his 2010 Oscar Picks.

Eric B.   —   May 8, 2009 @ 7:35 pm

Kindle, didn't you study for this exam at all?

Today, The New York Times reported that the much-vaunted text-to-speech (TTS) capabilities of Amazon’s Kindle 2, provided by Nuance Communications, came up short when trying to pronounce President Barack Obama’s name. The device uttered something like Baah-raah-k O-baah-maah (closer to the sounds in “black” and “Alabama,” the Times said). The paper adds that the problem has since been corrected: Obama’s name has been added to the Kindle’s TTS dictionary, and the fix will be included in the next wireless update.

The Kindle TTS misfire came to prominence as many news organizations began openly speculating on whether subsequent versions of the Kindle could create a viable non-paper-based means of distribution. Wired, for instance, was running the headline “How the Next Kindle Could Save the Newspaper Business” in stories about partnerships The New York Times and The Washington Post were looking to hatch, while mediabistro.com pondered, “Can The Kindle Save Newspapers?” Whether or not any of that’s true, the failure of Kindle’s TTS to pronounce things like the President’s name correctly may put at least a temporary crimp in any role speech might play in a Kindle paper-saving venture.

When it comes to that though, don’t blame Nuance. (more…)
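For the curious Speech Heads: the dictionary fix described above follows a common pattern in TTS front ends, where an exception lexicon is consulted before the general letter-to-sound rules. Here is a minimal sketch of that idea. The lexicon entries, the respelling notation, and the function names are all made up for illustration, and nothing here reflects how Nuance or Amazon actually implemented it.

```python
import re

# Hypothetical exception lexicon: words the default rules get wrong,
# mapped to a hand-written respelling (made-up notation).
EXCEPTIONS = {
    "barack": "buh-RAHK",
    "obama": "oh-BAH-muh",
}


def naive_rules(word: str) -> str:
    """Stand-in for a real letter-to-sound module: just lowercases the word."""
    return word.lower()


def pronounce(text: str, lexicon: dict[str, str] = EXCEPTIONS) -> list[str]:
    """Return one pronunciation per word, preferring lexicon entries."""
    words = re.findall(r"[A-Za-z']+", text)
    return [lexicon.get(w.lower(), naive_rules(w)) for w in words]


print(pronounce("President Barack Obama"))
# ['president', 'buh-RAHK', 'oh-BAH-muh']
```

Adding a new name is then just a matter of shipping a new lexicon entry, which is presumably why a fix like this can ride along in a wireless update.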

Adam B.   —   April 30, 2009 @ 9:36 am

Do you know what time it is, Speech-Heads?

Time for the latest installment of Talking Gadget Theater.  This time, Kindle 2 TTS and iPod shuffle TTS are performing a scene from The Empire Strikes Back.

As my Speech Brother Eric B. said: “This may be their best work to date!  Also: I love sandwiches.”

Adam B.   —   April 28, 2009 @ 9:42 am

Hi Speech-Heads,

If you enjoyed my last post, then you will definitely want to check out the next installment of Talking Gadget Theater.

This time, Kindle 2 TTS and iPod shuffle TTS perform a scene from Star Trek II: The Wrath of Khan.

When I told my Speech Brother Eric B. about this one, he tore off his necktie, popped some corn and shouted: “Now, let’s Head For The Mountains!”

Eric B.   —   March 25, 2009 @ 10:42 am

Que pasa, baby?

In our continuing examination of speech technology in popular culture, I bring to you one of the weirder things we’ve stumbled across.

You’re all doubtlessly familiar with the Loquendo TTS work my brother, Adam B., does every week for our news items on the mother site, but apparently its popular use runs far deeper. Our investigations show that there is an entire community of Spanish-speaking YouTube users who are using Loquendo’s TTS to make video críticas or “criticisms.” The críticas, delivered by Loquendo TTS products, are rants chock-full of curses and insults leveled at subjects ranging from Dragonball Z to emo kids (a much-maligned subculture of droopy-haired teens who patronize a genre of existentially sentimental rock music also known as “emo”).

The videos are pretty similar to one another and vary only in target. All, as best we can tell, use the Castilian Spanish male voice font, “Jorge,” and many seem to take advantage of the free demo that Loquendo offers on its website. You can tell because they have the creepy music that Loquendo puts in the background of its demo files–a chorus of synthetic TTS sirens singing Loquendo! over and over again. You’re going to want to check that out for yourself.

For the most part, críticas level their attacks at popular television institutions like House, Pokemon, and the Disney Channel (there seems to be an entire subgenre of just Disney Channel críticas–I found heaps of them on YouTube), but there are also some vaguely offensive works like “How to Seduce a Woman” (in which the narrator explains the importance of body language) and other such lessons.

Here’s a typical specimen:

[youtube]http://www.youtube.com/watch?v=ERueevnv-YU&feature=related[/youtube]

While these things are easy to write off as just a bunch of immature adolescents or, at worst, adults slagging around, críticas still offer some insights about the future of speech.

We mostly think of Loquendo’s TTS offerings as being business oriented, allowing companies to generate a spoken interface on the fly in IVR phone systems or whatever. In these instances, the software essentially allows a big company, an abstract collective, to give a single voice, separate from any real living entity in the world, to itself. This, however, cuts right back in the other direction, and allows individuals to deliver spoken messages anonymously, assuming a synthetic and collectively created voice. Individuals can hide their gender, their age, their nationality–any number of things which might be inadvertently revealed in the expression of their biological voice.

The anonymity in críticas pretty much authorizes users to curse left and right and traffic in the most backward kind of homophobia. That is, it lets them spout all the words they wouldn’t normally say in public–like a vocalized internet flame–so you get a lot of puta madre this and that. In point of fact, just about every other word in these videos is puta, a Spanish insult for a female sex-worker. The TTS writers get pretty creative with it, transmogrifying the word into just about every noun, adjective, and verb form imaginable. Putanizada, putilla, and putón are just some of the flourishes that they luxuriate in.

Also interesting, the otherwise coarse language is couched in fairly complex grammatical structures. The work of one 2Alfredo2, in particular, makes heavy use of interjected clauses. When combined with colorful grammatical plays on common Spanish insult words like jilipolla, the overall effect is both formal and pruriently vulgar. It kind of sounds like a high school English essay gone wrong.

They say Montana is the "Cyrus State."

Granted, this work is pretty limited in scope to say the least. There’s only so many kicks you can get out of listening to a machine tell off Hannah Montana with every curse word out of the Real Academia Diccionario de Palabrotas. But TTS is an artistic medium in its infancy. There is real potential for the anonymity afforded to users to do good.

TTS might allow human rights activists under repressive regimes, and other marginalized voices, to express their deepest feelings without compromising themselves. It, moreover, gives them access to auditory-dependent media infrastructures like podcasting. Likewise, the cadences in TTS are, for all Loquendo’s immense advancements in recent years, still sometimes jerky, and, in their strangeness, embody a certain set of aesthetic values that can probably be capitalized on by artists willing to engage with them.

There are also potentially harmful effects. The technology could be used to anonymize all sorts of ill intent and to disseminate any hateful message one might care to pass along. TTS is just a tool, like any other, but as speech starts making its way out of the rarified corridors of business, it is likely to begin to be plied for all sorts of artistic and political ends.

If you have your own TTS art you’d like to share, please leave us a comment!

Eric B.   —   February 12, 2009 @ 12:32 pm

A little birdie told me...

In our continuing efforts to ferret out speech solutions that help you express l’amour to that special Speech Head in your life, we stumbled on this.

EasyJet, a British low-cost airline, for reasons that are scarcely apparent to my brother, Adam B., or even me, is offering a cute, little, and free text-to-speech message service in French, Message d’Amour. When you get one of these cards, you see a bucolic field. A little rotund orange bird flies in, center screen, and speaks your message in childlike French.

Message is part of ad firm 1000mercis’s campaign to ingratiate EasyJet with the ranks of French flyers. I guess they’re hoping that if you send a message to your far-off, forever crush–that girl you loved in your Marseilles school days, but who went to London to work as some kind of banker–she will get that tender TTS message and write back (in her own TTS, of course) saying, yes, she harbors reciprocal feelings, yes, she’s loved you for your awkward unathletic stumblings on the soccer field since those halcyon école secondaire days, and that you, smitten with the heavy clobber of love’s promise, will remember the good turn EasyJet did you and hop their first flight to London, regardless of cost, and form a long-distance relationship that requires an endless booking of trans-European flights.

A beautiful dream, no?

Under the hood, Message is powered by Acapela’s TTS engine, which is available in 25 languages and 50 voices, both male and female. EasyJet, despite having websites in English, German, and Italian, in addition to French, and despite Acapela being able to service those languages, hasn’t created an analogous service for any of them. Only French, which I suppose is wholly appropriate for Valentine’s Day, it being the “language of love.”

Popular rumor even has it that the Parisian municipal government pays attractive young couples to neck in public so as to retain the city’s image as the “World Capital of Love.” I haven’t been able to corroborate that rumor with any evidence, by the way, but even if it’s not true, it goes to a kind of truism about French PDA. If you’ve ever been to Paris and seen these ubiquitous couples on benches, you know what I’m talking about.

In any case, even if you don’t speak French, Message d’Amour can still provide you with hours of fun because the engine is so good that if you cheat the spelling, you can send English messages in an outrageous French accent. Just remember to separate the phonemes with commas and spaces when needed and to spell things like dear as dee air.

Try your own! And let us know about it.

Bon appétit, Speech Heads.

Eric B.   —   January 30, 2009 @ 4:16 pm

Two thumbs up! Way up!

Speech Heads, I know we don’t usually blog here on Fridays, but I just caught wind of a text-to-speech flash greeting card generator even weirder than that vaguely horrifying thing that Nuance put together for Christmas:

The CareerBuilder Anonymous Tip Builder

The builder is Careerbuilder.com’s idea of a sick joke. It lets users “anonymously” send TTS tips to coworkers. You can say things like, “Please do something about the stench that emanates from the billowy folds in your shirt,” or “Be sure to refill the coffee pot when you finish it off, you sower of human misery,” and they will never be the wiser about who sent it.

Your personalized message is delivered by one of four terror-inspiring avatars framed in office tableaus. Choose from this nipple-shirt guy you see featured here, or an alligator in a suit, or a floating zombie business woman, or a strange salesman playing the piano with his feet. From that selection, you’ll be whisked away to another menu where you can input whatever message you like. You can pick from three male voices or three female voices.

Go on. Try it HERE!

I’ve already sent one to my brother, Adam B., advising him that if he continues to read from my journal, unknown paid assailants may very well rend him limb from limb.

As for the app itself, we aren’t sure what speech provider powers the underlying capabilities of this little TTS dandy, but you better bet we’re working overtime to figure it out.

In the meanwhile, enjoy THIS little tip we’ve put together for you!

Adam B.   —   January 27, 2009 @ 12:33 pm

You may have read the recent news brief online at Speech Technology about Loquendo adding yet another voice–this time that of Mikko–to their already vast Text-To-Speech Family.

And while any Speech-Head worth her salt is already well aware of Loquendo’s TTS, many of us–my Speech Brother Eric B. included–are unaware of the Loquendo TTS Family Tree.

All told, Loquendo offers sixty-two different voices from an array of different countries in what amounts to a United Nations of Speech Technology.

Among those voices are:

And while we’re talking TTS, don’t forget to check out the TTS version of our daily News Features starring the aforementioned Allison–a feature that we recently expanded to let us deliver even more Speechified News.

Eric B.   —   January 15, 2009 @ 12:28 pm

Deals—that’s what, Speech Heads! I love the smell of PayPal in the morning.

Today, we took a break from our well-worn beat of covering the major speech vendors to explore the smaller, individual speech market. We hit the streets of eBay lookin’ for a big score of little speech, and let me tell you: we were not disappointed by our findings.

On the higher end of things, we found this Dynavox augmented speech device that turns a Palm 3 into a portable TTS solution. Particularly handy for the impaired, the little guy is able to speak a number of useful pre-loaded phrases for common day-to-day scenarios. Add your own for that personal touch!

In a similar vein, a watch that really tells the time to the blind. This little beaut’ talks all the times from AM to PM.

For the kids, check out this Donald Trump doll speech solution, or this highly educational Winston Churchill solution!

Finally and most excitingly: A TALKING WASHINGTON SILVER DOLLAR.

O’ Speech Heads, never in all my years did I think I would see speech-enabled currency. But seller, directmktg, has ushered one mostly-kinda silver dollar right into our hands.

He/she trumpets out: “Hear the actual words spoken by the father of our country, President George Washington, 200 years ago. Just press the shield on the reverse and listen.”

The best part though, the very best part: the dollar is actual LEGAL TENDER (albeit in Liberia). You could go to the store, use the coin to play actual words spoken by George Washington, and then use that same coin to buy a bag of Doritos Extreme Kickin’ Chili chips (albeit in Liberia).

Speech Heads, these auctions end soon, so don’t miss out! I know my brother Adam B. and I won’t!

Eric B.   —   January 7, 2009 @ 5:08 pm

Volume 2: Synthesize it Loud, Synthesize it Proud

Speak up!

This installment of our ongoing series in the history of speech is sure to bring nostalgic remembrances to all you Speech Heads born in the late 70s to early 80s. Just a little more than thirty years ago, Texas Instruments brought us an important development that would change many a childhood. No. I’m not talking about the TI-89 calculator with your copy of “Drugwars” surreptitiously installed so you could slack off in the back of pre-calculus. I’m talking about the Speak & Spell.

Speak & Spell

I can see some speech-eyes rolling. “Really, Eric?” you’re asking, but hear me out. Despite its humble size, the Speak & Spell played an important role in Speech History. It was one of the first highly accurate and widely available text-to-speech products, really one of the first practical applications of speech synthesis for a consumer market.

The toy was a direct outgrowth of Texas Instruments’ bizarre 1970s experiments in speech synthesis. The world had just seen man create the tech required to reproduce human speech with tuned voices stored on ROMs. Seeing the potential of those speech fruits, Paul Breedlove, a TI engineer, began development of the Speak & Spell in 1976 with a paltry $25,000 budget. Yes, even then it seems that the world callously and stupidly turned a cold shoulder to speech. Breedlove, however, would be vindicated. Within two short years, the Speak & Spell was flying off the 1978 shelves.

Breedlove’s completed proof incorporated TI’s trademarked Solid State Speech technology, which stored full words in solid state the way calculators of those halcyon 1970s days stored numbers. The Speak & Spell even had a slot for “expansion module” cartridges, which could be inserted to beef up the onboard vocabulary. O’ the foresight of those Texas men! You can see the very same principles at work in today’s speech solutions, like Nuance and its specialized expansion vocabularies for radiology, or orthopedics, or (hopefully in the future) trucking. Nuance, if you’re reading this, I know that there’s at least one boy who’d like to see a CB trucker vocabulary for his Dragon NaturallySpeaking rig next Christmas.
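For the tinkerers among you, the expansion-cartridge idea described above boils down to layering an add-on vocabulary over a built-in one. Here is a rough sketch under stated assumptions: the word lists and function names are hypothetical, and plain strings stand in for stored speech data. This is not TI’s actual ROM format.

```python
from collections import ChainMap

# Hypothetical built-in vocabulary: word -> stand-in for stored speech data.
BASE_ROM = {"above": "<speech:above>", "ocean": "<speech:ocean>"}

# Hypothetical expansion module carrying extra words.
EXPANSION = {"trucker": "<speech:trucker>", "convoy": "<speech:convoy>"}


def build_vocabulary(base, *modules):
    """Layer expansion modules over the base ROM; later modules are searched first."""
    return ChainMap(*reversed(modules), base)


def speak(word, vocabulary):
    """Return the stored speech data for a word, or None if no ROM holds it."""
    return vocabulary.get(word.lower())


vocab = build_vocabulary(BASE_ROM, EXPANSION)
print(speak("convoy", vocab))       # found in the expansion module
print(speak("scuzzbucket", vocab))  # not stored anywhere, so None
```

The lookup never modifies the base vocabulary, so pulling the cartridge (dropping the module from the chain) leaves the onboard words exactly as they were.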

The Speak & Spell had its limitations though; limitations that in many ways highlighted some of the persistent problems of building vocabularies that have dogged us in speech.

Love the stache.

In my own bucolic childhood, my friends and I would use the old S&S to try and spell dirty words we had found in the dictionary. I’m sure some of you Speech Heads out there did the same, only to find, with the same disappointing results my brother Adam B. and I saw, that words like “wiener” and “scuzzbucket” were not included in the machine’s rather limited vocabulary. Come to think of it, you couldn’t even find the latter term in the limited vocabulary of a late 80s dictionary, either.

Still, the Speak & Spell had great staying power. The machine was produced for nearly fifteen years and saw many improvements over its 1978-1992 run. Its vacuum fluorescent display was replaced with liquid crystal, it was given a membrane keyboard (which in turn was changed from ABC to standard QWERTY layout), and it saw several releases in different languages.

The Germans, no kidding, called theirs “Das Büddy”; the French, “La Super Dictée Magique”; the Spanish, “El Loro Parlanchín”; and the Italians, “Il Grillo Parlante Più,” which translates to “The Talking Cricket Plus.”

Special fun fact about the different languages: there’s no regional lockout on the expansions, so you can plug a German cartridge into your English Speak & Spell and confuse the b’jeepers out of your friends.

Hey fellas, don't hog that Speak & Spell!

More important than its technological significance, though, is its impact on our cultural memory. The Speak & Spell, perhaps more than any other speech solution, has made its way into popular discourse. Various works of art make reference to it. Kraftwerk sampled it in their seminal work Computer World, E.T. famously used one to phone home, there’s one in Toy Story, Chucky played with one in Bride of Chucky, and Dane Cook (who isn’t funny at all) apparently has a shtick about it on his album Harmful if Swallowed. And these are just a few. A lot of musicians use modified Speak & Spells with bent circuits as instruments.

All this talk is probably getting you Speech Heads worked up into a heat. You’re probably just itching to visit mom and dad, and spend six hours trying to fish your old Speak & Spell from your childhood closet. You don’t have to, though. There are a bunch of emulators on the Internet for you to play with without having to suffer one of your father’s fishing stories or your mother’s constant criticism about your hygiene.

Just click here for a taste!

Also, click here to see the insides of the machine!

Anyhow, that’s all for this installment. So, to Texas Instruments’ Speak & Spell: Speech Tech Blog salutes you!

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc.