Speech Technology Magazine SpeechTEK Conference
 
Adam B.   —   April 30, 2009 @ 9:36 am

Do you know what time it is, Speech-Heads?

Time for the latest installment of Talking Gadget Theater.  This time, Kindle 2 TTS and iPod shuffle TTS are performing a scene from The Empire Strikes Back.

As my Speech Brother Eric B. said: “This may be their best work to date!  Also: I love sandwiches.”

Eric B.   —   April 29, 2009 @ 1:16 pm

APPLY DIRECTLY TO THE FOREHEAD!

So I was talking to Bill Scholz recently, and we got on the topic of how speech is portrayed in the “mainstream media.” Bill feels that part of the resistance to speech is the negative press that the modality has gotten. He said that for a while, whenever you heard about speech on the news it was usually with reference to some disastrous failure. He pointed to that infamous Windows Vista incident where some hapless Microsoft worker was demoing the speech capabilities in Word and couldn’t get the system to recognize a single thing.

Incidents like those tend to tarnish the whole of speech technology. When coupled with the nightmare IVR experiences that we’ve all had, it leads some to nihilism and conclusions like, “the only future in speech is silence.” Plus, these things all have great stickiness on the internet, because who doesn’t want to watch something crash and burn? It’s not like we can go down to the Coliseum and watch enemies of the Empire get eaten by lions or watch grown men whale on each other with a trident and a net. Even bullfighting has become immoral. Speech disasters are the only such kicks the modern world can comfortably afford us.

So, because it’s a slow news day, in that spirit of mainstream media gawkery, I present you the following disaster from 1992, that’s at least new to me: the robotic demise of Abraham Lincoln.

Apparently, back then, the audioanimatronic Lincoln at Disney World’s Hall of Presidents slowly crumpled before a horrified audience’s eyes. My brother Adam B. was reduced to tears, actually.

The robot in question is a storied machine. Walt Disney had the robotic Lincoln built long before he conceived of an entire Hall of Presidents. The machine was built for the 1964 World’s Fair and was the first (and here’s how it thinly relates to speech) audioanimatronic. The production was a kind of trainwreck, according to Disney biographer Neal Gabler. It leaked oil, it sparked horrible fire from its mouth, and setbacks prevented it from opening on time. In the end, though, it premiered at the Illinois pavilion to great success and was one of Disney’s major lifetime achievements.

Watch it go to pieces here:

[youtube]http://www.youtube.com/watch?v=YF0j69pAM7g[/youtube]

Adam B.   —   April 28, 2009 @ 9:42 am

Hi Speech-Heads,

If you enjoyed my last post, then you will definitely want to check out the next installment of Talking Gadget Theater.

This time, Kindle 2 TTS and iPod shuffle TTS perform a scene from Star Trek II: The Wrath of Khan.

When I told my Speech Brother Eric B. about this one, he tore off his necktie, popped some corn and shouted: “Now, let’s Head For The Mountains!”

Eric B.   —   April 27, 2009 @ 4:15 pm

"No, man. I said, 'Extra large WITH pepperoni.'"Mint, the Wall Street Journal’s media arm in India, aimed at the nation’s growing ultra-rich class, reported yesterday that annoying outbound political IVR calls have finally made their way to the budding world power.

This is by no means a new development. In 2004, then-Prime Minister Atal Bihari Vajpayee asked voters, in a pre-recorded message, to vote for his Bharatiya Janata Party and its ruling coalition, the National Democratic Alliance. Both the BJP and the NDA were expected to sweep the polls and retain the lead they’d held firmly since 1999 after a brief stumble in ’98. The NDA, however, lost to the Indian National Congress led by Sonia Gandhi, widow of former PM Rajiv Gandhi. Some attributed the loss to increasing unrest among Muslim Indians or the administration’s failure to make meaningful bread-and-butter changes for average citizens, compounded by the INC’s appeals for the “common man,” while others still (and you can count me in this contingent) believe voters may have turned on the NDA because they were annoyed by constant automated calls.

In my case, Speech Heads, annoying IVR has certainly turned me away from many a political candidate. In my hometown, for instance, this obsequiously voiced woman was running for City Parks Comptroller. Every day I’d get dinner time calls about how she was going to bring the “winds of reform” to the way swing sets are funded across the city. Her opponent was swindling taxpayers with his boondoggle closed slide projects, nature trail, and dog park. Well, after three of those I was ready to vote for the incumbent and let him fleece the municipal property owners of all they had to build whatever Shangri-La caught his idle fancy.

Despite the conventional wisdom against such IVR calls, however, India seems dead set on proceeding, Mint reports.

“Parties are embracing telecom technologies with greater enthusiasm to connect with the electorate in the 15th Lok Sabha polls,” an unnamed staff reporter writes.

“Five years since the 2004 elections, India’s phone base, including mobile phones and phones of the fixed-line variety, has jumped nearly six times to nearly 430 million, up from some 75 million at the end of March 2004,” he adds.

The article goes on to quote a Mr. Vineet Kaul, vice-president of One97 Communications, as saying, “Some parties [declining to name them] have decided to create a so-called IVR and toll-free IVR numbers that a voter can dial and get more information about the local candidates.”

Toll-free numbers? This won’t end well. When you start making automated calls, Speech Heads, you just invite the opposition to build $600,000 playgrounds. As my brother Adam B. says, punching a black leather gloved fist into an open black leather gloved hand, “You wanna move some votes, you need a personal touch.”

Full article here.

Adam B.   —   April 24, 2009 @ 12:07 pm

Hey Speech-Heads,

Last night, my Speech Brother Eric B. and I were enjoying an evening at the local watering hole, talking about the good old days, when we came upon a True Speech Gem:

The Kindle 2 TTS and the iPod shuffle TTS performing a scene from Blade Runner.

Let me just say this: It is Speech Technology At Its Finest!

Eric B.   —   April 22, 2009 @ 4:03 pm

Is the Internet really just a vapor cloud labeled "Inernet"?

Lunacy! Sheer lunacy some will say! Specifically, my brother Adam B. will say.

TringMe, a Bangalore-based company that takes advantage of many carriers’ unlimited data plans (or at least a neighbor’s passwordless wireless access) and lets users use GoogleTalk or other VoIP services directly from a mobile phone for free, is looking to stick it to VoiceXML.

It’s announced the birth of VoicePHP: 12 pounds, healthy, no notable defects, and looks just like its daddy.

VoicePHP is a PHP-based voice protocol that seeks to replace the XML-based VoiceXML format. The company describes the language as “the same old PHP which now enables you to create voice applications,” but also cautions that “It’s not an extension to PHP; in fact [sic] it’s the same PHP which now outputs voice instead of text and also takes input as voice instead of text.”

I know what you’re thinking, Speech Heads: Why in blazes would anyone want to abandon VoiceXML just as we seemed to have reached an industry consensus about using it?

Because, if you believe TringMe, XML is misapplied as a programming language for voice. While conceding that XML is a good’un for data storage and transmission, the company argues that programming complex logic in it is not intuitive and appears “forced” or “hacked” when applied to voice programming. It fails, the company says, to achieve the power of a “real” programming language like C or PHP. Moreover, the development tools/environments for VoiceXML-based applications are limited, reminiscent of proprietary IVR development tools of yesteryear. Old-time. So why not PHP? Most programmers know PHP, right?
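
For the Speech Heads who haven’t stared down VoiceXML lately, here’s a rough sketch of what that complaint looks like in practice. It uses Python’s standard library to build a tiny VoiceXML document in which one measly if/else has to be spelled out as nested tags. The elements (vxml, form, field, option, filled, if, else) are standard VoiceXML 2.x; the pizza dialog and the function name build_pizza_dialog are made up purely for illustration. This is not TringMe’s code, and it says nothing about what VoicePHP syntax actually looks like.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.x documents.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_pizza_dialog() -> str:
    """Return a small VoiceXML document (hypothetical example dialog).

    The dialog asks for a pizza size and then branches on the answer.
    The branch is the point: the logic lives in <if>/<else/> markup.
    """
    vxml = ET.Element("vxml", {"version": "2.1", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "order"})

    # A field collects one piece of spoken input from the caller.
    field = ET.SubElement(form, "field", {"name": "size"})
    ET.SubElement(field, "prompt").text = "What size pizza would you like?"
    for choice in ("large", "extra large"):
        ET.SubElement(field, "option").text = choice

    # <filled> runs once the field has a value; the conditional is XML too.
    filled = ET.SubElement(field, "filled")
    branch = ET.SubElement(filled, "if", {"cond": "size == 'extra large'"})
    ET.SubElement(branch, "prompt").text = "Extra large it is."
    ET.SubElement(branch, "else")  # empty marker element, as in VoiceXML
    ET.SubElement(branch, "prompt").text = "One large, coming up."

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_pizza_dialog())
```

In an ordinary programming language that branch is a two-line if/else, which is more or less the argument TringMe is making. Whether that is worth walking away from an industry standard is another matter.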

But wait a minute, what’s in it for TringMe? Not all that much, if you believe what they say in their “How much does it cost me?” FAQ. While the company says they aren’t looking to necessarily “cash out” on VoicePHP, the thing is powered by TringMe’s platform, Voice 2.0, so, at least ostensibly, if everyone started using VoicePHP overnight the company could be looking at beaucoup bucks. Furthermore, even if VoicePHP never gets off the ground, the Indian startup, unknown to at least me before this little stunt, will have generated a lot more hits and harvested some mindshare.

Methinks the company is probably looking to cash in on VoicePHP, if indirectly, after all. Even Tring seems to recognize that publicly. In its cost FAQ, they make rare press release use of a winking emoticon. Coy, coy TringMe.

A lesson, Speech Heads: Be wary of any man who calls himself an agnostic technologist.

Eric B.   —   April 21, 2009 @ 11:59 am

"Lean on me."O’ fabulous day, Speech Heads! The good folks down at ArsTechnica have uncovered evidence that the newest iteration of the iPhone OS, 3.0, is going to come with new voice control features. The project, codenamed “Jibbler,” (sounds like a discontinued candy from the 80s) is said to have NOT ONLY voice synthesis-whoa!-but voice recognition!

According to Ars’ sources, Jibbler seems to be a SpringBoard application enhancement. SpringBoard is similar to Apple’s OS X Finder app. It acts as a launcher and will support the newly announced 3.0 Spotlight search. Ars has a bunch more details that you shouldn’t miss out on (plus this terrifying picture), but we thought we’d just give our impressions here.

It seems like every little damn thing is getting speech-enabled these days. You may remember our groundbreaking reports on the GirlTech Password Journal, or the Moshi clock, talking toilet rolls, and creepy robots galore, all perfect examples of my point. All these signs seem to be pointing to greater mass acceptance of speech recognition. Companies are looking, harder now than ever before, at how to make some fast bucks off getting speech into our daily lives. We’re still, of course, in the early stages where the technology is so new that we get all manner of strange things. The boundaries of what speech can and should do haven’t been entirely defined, at least not outside of more traditional applications like IVRs, so people are looking to try and hype anything.

You don’t have to look any further than my brother Adam B.’s recent posts about speech-to-Twitter apps to see what I’m talking about. His two posts on our humble site have yielded a torrential flood of other speech-to-Twitter firms emailing him and commenting on his posts, trying to get him to look at their offerings. This even as Twitter, though popular and finding some legitimate CRM uses, hasn’t found a way to monetize itself. Talk about building a house on a shaky foundation.

If you’ll allow me to wax incoherent through a string of vague journalistic tritery and mixed metaphors, it’s like California 1849 out there and everybody is jumping head first in the pool, just looking for that pot of gelt at the end of the rainbow that stretches over gold-paved streets and promising rags-to-riches fortunes to all comers and investors. But you can’t put a baby in the oven and make it biscuits, unless of course you’ve ground the bones to make devil’s cake. No, no. The business of business is business and the first rule of business, business, business is location, location, location. I guess, Speech Heads, what I’m really trying to say here is that a rolling stone gathers no moss, and golly! Them stones is rollin’!

Adam B.   —   April 14, 2009 @ 2:47 pm

More speech for you and for me!!!!

Hey Speech-Heads,

After my post about TweetCall, I got an email from Dial2Do, another company that offers a voice integration for Twitter AND FOR 50 OTHER SERVICES!!!!

When I told my Speech Brother Eric B. about this, he started dancing about the Home Office to Big Band hits of the 1920s and 30s.  For those of you who don’t know Eric B., nothing gets him that worked up except for his Book Collection.

So, in the coming days, check back in with us at Speech Tech Blog for more about Dial2Do.

Eric B.   —   April 14, 2009 @ 2:18 pm

SpinVox on the compassionate samurai's mind.

Speech Heads, it’s a beautiful thing when speech providers compete.

After our review of Nuance’s VM2T, SpinVox (perhaps a little jealous) wrote to ask me to the dance, the speech dance that is. In a matter of days, I’ll begin trying out their service. We’re going to be giving it the full treatment, subjecting it to the rigors of my brother Adam B.’s near insane rambling messages, putting it through the wringer of Shakespearean English, pushing it through the Danger Room of the busy New York City streets, and springing a couple of unexpected tests on it.

Full results to come, Speech Heads! Stay tuned! Same speech-time, same speech-channel!

Eric B.   —   April 13, 2009 @ 10:47 am

NLS in da house.

In the last issue of Speech Tech Magazine I had an article about natural language systems (NLS) that served as an overview of the technology. Shortly after the issue went live on the web, we got a blog response from Philip Hunter, the vice president of the Voice Interaction Group at SpeechCycle. I interviewed and quoted him throughout the piece. While praising the story overall, he took exception to a few points, making clarifications, etc.

He, for instance, felt I mischaracterized his views when I paraphrased him as having said, “that callers shouldn’t be exposed to a hierarchy of more than five categories.”

Hunter writes, “I didn’t actually assert “that callers shouldn’t be exposed to a hierarchy of more than five categories.” I do think menus structured like that can be problematic and are frequently done poorly, but research (Hura & McKienzie) and deployments (McKienzie, Levine) have shown that the right combination of wording and delivery can allow menus to be fairly lengthy and still be effective. I agree with those findings.”

For the record, the direct quote from our interview was, “The maximum I’d be comfortable with is maybe two menus of four or five things. So that really is going to cut down on the number of things that you can expose to callers.”

In the interest of keeping discussion dialectic, I thought I’d post his response and give our readers a chance to look it over, especially given Philip’s expertise. He has a lot of salient things to say about natural language design in his post, as he did in our interview.

Really, getting a conversation started about natural language was the whole point. The article’s final thoughts, that good natural language design is made harder by pervasive, cheaply and poorly designed speech-enabled IVRs, drive home the need for a wider discussion, I think. IVR domains are tied to each other by the quality of overall user experience. When a caller enters a system, they have no idea of what’s driving the underlying technology. It all looks the same. Given the nature of that beast, anything that can be done to improve IVR quality overall will go a long way to winning over caller confidence, and that’s gotta begin with dialogue (no pun intended).

So, in case you missed it upstairs, here’s Mr. Hunter’s post again!

We’d, of course, love to get your comments here and get a conversation going. Really, me and my brother Adam B. appreciate any response we get. Sometimes it feels like we’re slaving in a sensory deprivation tank. So if there’s anything any of you Speech Heads out in Speechlandia would like to add, feel free to drop us a line any time.

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc. About/Contacts | PRIVACY POLICY