Speech Technology Magazine SpeechTEK Conference
 
Eric B.   —   April 13, 2009 @ 10:47 am

NLS in da house.

In the last issue of Speech Tech Magazine I had an article about natural language systems (NLS) that served as an overview of the technology. Shortly after the issue went live on the web, we got a blog response from Philip Hunter, the vice president of the Voice Interaction Group at SpeechCycle. I interviewed and quoted him throughout the piece. While praising the story overall, he took exception to a few points, making clarifications, etc.

He, for instance, felt I mischaracterized his views when I paraphrased him as having said, “that callers shouldn’t be exposed to a hierarchy of more than five categories.”

Hunter writes, “I didn’t actually assert ‘that callers shouldn’t be exposed to a hierarchy of more than five categories.’ I do think menus structured like that can be problematic and are frequently done poorly, but research (Hura & McKienzie) and deployments (McKienzie, Levine) have shown that the right combination of wording and delivery can allow menus to be fairly lengthy and still be effective. I agree with those findings.”

For the record, the direct quote from our interview was, “The maximum I’d be comfortable with is maybe two menus of four or five things. So that really is going to cut down on the number of things that you can expose to callers.”

In the interest of keeping discussion dialectic, I thought I’d post his response and give our readers a chance to look it over, especially given Philip’s expertise. He has a lot of salient things to say about natural language design in his post, as he did in our interview.

Really, getting a conversation started about natural language was the whole point. The article’s final thought, that good natural language design is made harder by the pervasiveness of cheaper, poorly designed speech-enabled IVRs, drives home the need for a wider discussion, I think. IVR domains are tied to each other by the quality of overall user experience. When a caller enters a system, they have no idea what’s driving the underlying technology. It all looks the same. Given the nature of that beast, anything that can be done to improve IVR quality overall will go a long way toward winning over caller confidence, and that’s gotta begin with dialogue (no pun intended).

So, in case you missed it upstairs, here’s Mr. Hunter’s post again!

We’d, of course, love to get your comments here and get a conversation going. Really, my brother Adam B. and I appreciate any response we get. Sometimes it feels like we’re slaving in a sensory deprivation tank. So if there’s anything any of you Speech Heads out in Speechlandia would like to add, feel free to drop us a line any time.

Eric B.   —   March 9, 2009 @ 5:52 pm

Speechin' on the bay

Now back in the loving embrace of our New York offices, I thought I’d take a look back at Voice Search and give you Speech Heads out there some final views.

Like all trade shows, there was of course a fair amount of wheeling and dealing: companies cozying up to each other, seeing if they could cobble together some kind of symbiotic relationship that would produce some killer solution capable of reaping mega profits. Sort of like a Power Rangers Megazord, one of those giant fighting robots the Rangers had that were made up of various smaller robots.

In all that hubbub, it was pretty clear that there were three companies that everyone was looking to try and integrate their offerings into: Google, Yahoo, and Microsoft.

There was hardly a minute between sessions when Michael Cohen from Google or the gaggle of Microsoft folks weren’t surrounded by eager speech impresarios. Marc Davis from Yahoo, who was only in town for a couple of hours to boost oneSearch at his keynote, was literally deluged by a crush of people wanting to exchange business cards (full disclosure: me too!) before he had to jet back to San Francisco.

The prevailing feeling at the conference, as I described in my last dispatch, was that mobile voice search is where it’s at, and that it’s where we would see real and massive growth for speech in the coming years. All heads were turned to giants like Google and Microsoft to lead the way, too. They, many feel, could provide the shake-up that speech has really needed.

The field has been kind of limited in scope for the last several years. Until lately, it hadn’t really expanded too far beyond the places it’s traditionally been found: call centers, command-and-control functionality, and dictation. Without new territory, speech has plugged along without ever seeing explosive growth. With the entrance of Google, Microsoft, and Yahoo into voice search, the speech community seems to be excited by the possibilities and, though its members might be reluctant to say so on the record, by some of the potential changes in players.

It’s no state secret that Nuance has been dominating speech, acquiring technologies like IBM’s patents, Philips’ speech business, and a slew of others. As you might find in any aggressive climb to the top, it has stepped on quite a few toes along the way and has no shortage of discontents. You don’t have to push too hard to get people griping about Nuance in San Diego.

“In a market where there hasn’t been a big brother, [Nuance] rolled up into one,” Joseph Bentzel, chief marketing officer for SpeechCycle and, it should be noted, a competitor, told me. “But in a market where there are bigger brothers doing it for free and virally…” he added before trailing off with half a smile and letting his pause sketch out the possibilities.

While Nuance has cast a large shadow over speech, acquiring its way to the top and building a strong speech provider out of ScanSoft, a company that originally just handled OCR scanner software, Mr. Bentzel thinks it has reached the end of the line as far as being the undisputed king of speech. By his account, voice search will grow the market and create a space outside of Nuance’s purview.

“Nuance will not exist as a leader in 24 months unless Paul Ricci [Nuance's CEO] reads this article and hires me,” Mr. Bentzel jokes.

Part of Nuance’s problem, as he sees it, is that they’ve tried to become the one-stop solution for all speech needs. They’ve tried to control the process from the ground up, acquiring technologies and integrating them under their own banner. This has had the effect of freezing other companies out and, in some cases, making them hostile.

May the Speech be with you.

“This is the Rebel Alliance,” Mr. Bentzel says of Voice Search. “This is the Luke Skywalker Show. We’re on the ice planet and they’ve ignored us.”

While he seems totally at ease comparing Nuance to the Empire from Star Wars, Mr. Bentzel is also quick to say that everyone in speech ought to “thank Paul Ricci for putting speech on the map.”

“I’m not one of these Nuance haters,” he insists. He says he’s more or less agnostic and only sees problems where market growth is impeded, so forget about thinking he views Ricci as some kind of Darth Vader force-choking everyone at the table.

In fact, he suggests that there wouldn’t be much speech out there without Nuance’s drive to make it a big business.

Mr. Bentzel’s position (and others like his) represents an attitudinal shift in how the field has come to view itself. If I, or anyone else for that matter, made the mistake of saying “speech industry,” there was a group of people on hand, just ready to pounce, saying, “Speech isn’t an industry, it’s a tool.” Speech is starting to see itself as a subordinate modality to larger functionality, not an end in and of itself the way it has been viewed in its more academic roots.

If you don’t believe me, just try saying “speech industry” for yourself at SpeechTEK in August. When you walk into that trap, they’ll whip out that little tool mantra like it was a brand new gun they’d just been itching to use and you were the hapless mugger who made the mistake of trying something today.

It’s a crazy mixed up world out there, Speech Heads. Even without the recession, everything is in flux and it seems like everyone is trying something today. Carry a speech-gun and watch your back is my advice.

***SPECIAL NOTE: Due to an oversight entirely on my part, we had erroneously reported that Nuance didn’t have much of a presence at Voice Search. In fact, they did. Brad Bargan, Nuance’s VP of product development, participated in several events. My most humble apologies to them and to our readers.***

Eric B.   —   January 28, 2009 @ 4:21 pm

The Dream lives on...

It seems like the whole speech industry is just atitter with acquisitions and buyouts these days. The big are getting too big to fail, and the small are getting sucked up like plankton through baleen. Heck, it’s not even just the small these days. In the last two weeks alone, we saw Nuance gobbling up patents and licenses from IBM like a fat king on a turkey leg and SVOX gorging itself on Siemens’ speech unit.

Back in December, Roberto Pieraccini from SpeechCycle told me that mergers and acquisitions were happening so fast that even he couldn’t keep track of them.

So far, a lot of the action has stayed within the confines of the speech world, but all these acquisitions got my brother Adam B. and me thinking about East Asia, where some of the unlikeliest businesses end up under one roof: Korea’s Lucky toothpaste company and Goldstar electronics firm joining to form LG; Japan’s Yamaha musical instrument company branching out into motorcycles to form the perplexing giant we all know today; or even good ol’ Nintendo, which sold card games in the 60s, but branched out to run a chain of “love hotels” and a cab company.

We were wondering, what if the same happened in speech? What if speech just trampled all over sensible vertical market expansion? What would be some unlikely mergers we’d like to see? Dare we imagine? Yes. We dare.

Behold, Speech Heads! The 2009 Speech Technology Dream Team-Ups:


1.)    Nuance merges with the New Balance shoe corporation to form Nuance Balance.

Nuance will be looking to expand their reach, and this one just makes the most horse sense in the world: sympathetic corporate cultures, practically rhyming names, the growing need for a top-rate, speech-enabled shoe corporation. Have you ever been running a cross-country meet and just felt an overwhelming compulsion to dictate your memoirs? Wish away fruitlessly no more. Nuance Balance has a solution for that.

2.)    Avaya acquires Dairy Queen in a hostile takeover.

All hail the Queen!

Avaya more or less lets Dairy Queen continue as its own separate brand, but begins incorporating free ice cream into its IVR call-routing systems.

Imagine this: a caller becomes frustrated with a system, unable to get the service he desires, but rather than being transferred to a domain’s underpaid operator who will likely hand him off to someone else in the domain who can’t help him, he is instead routed to a free and delicious DQ Blizzard (vanilla blended with Oreos)! Talk about tasty CRM; that caller has probably just forgotten the outstanding payment he was calling about in the first place. Banking error in this domain’s favor…

3.)    Nexidia buys up the controlling shares of the WWE wrestling corporation.

Give him the left, Jimmy!

You know those long-winded monologues wrestlers deliver before a big fight? The ones where they swear to break this, and smash that, and clothesline a fella so hard his ancestors will feel it in organs they weren’t aware they even had? Those little pep talks are all very theatrical and great fun (we know that), but there’s not a whole lot of accountability in them, is there? Who knows if the promised pile-driver Macho Man Randy Savage menaced on a Monday Night Raw is delivered to Nature Boy Ric Flair on a Tuesday Night Titans? Well, prepare for a new era of accountability.

In this dream match-up, Nexidia applies its video search tech on the fly to WWE events, tagging and tying the pain a wrestler guaranteed outside the ring to his actions in the ring. A ticker at the bottom of your screen lets you know in real time if wrestler Jimmy “The Mouth of the South” Hart is delivering on that smackdown he promised last week.

4.)    SpeechCycle acquires the patent to the Foreman Grill.

Not satisfied with the mixed results of traditional consumer grilling, SpeechCycle decides that it’s time to provide world consumers with the grilled food they’ve longed for, at least statistically speaking. Using the company’s data-driven approach, the SpeechCycle Foreman Grill uses aggregated data to provide us with the median steak of our collective dreams.

Just put your dinner in and the grill does the rest. In order to ensure the best results, the device is constantly acquiring data based on a number of metrics. The grill is speech-enabled to recognize utterances like “mmm,” “tasty,” “delicious,” or “ugh,” “putrid,” and “This is the most foul meal I’ve known in all my years.” If you aren’t satisfied with your meal, don’t blame SpeechCycle. Blame the sum total of human desire.

5.)    PerSay partners with the Cornell University Department of Animal Husbandry.

You've got to be yoking.

Looking to patch a number of glaring security problems (the recent theft of several head of cattle; the spate of sabotage that has hit a number of Cornell’s beasts of burden, including a prize ox; and the vandalism of two dozen carrier pigeons), Cornell gives the Israeli biometric giant PerSay administrative control of its department.

PerSay overhauls Cornell’s stable of animals, limiting access to authorized users who can pass its 96 percent effective voice verification process. After implementation, the department sees a drastic cut in its farm-crime rates; however, some problems do persist. University investigators find that the acts of sabotage were actually being carried out by Animal Husbandry faculty. An inside job! Arrests are made, and the ringleader, Professor Newman Von Heidleborg, is prosecuted to the fullest extent of the law.

STM Blog   —   April 10, 2008 @ 1:18 pm

Since I enjoy writing “Crushes & Hexes” so much, in the coming weeks, the blog will continue to feature breaking news updates from Ryan, while I focus only on regular features and product reviews. The newest addition to our features is “Round Up & Release,” a compilation of the biggest stories and developments from the speech tech world. While “Crushes & Hexes” focuses on the tech community as a whole, RR&R is just about speech. I hope you like it – it will appear every Thursday on the blog. As always, keep the comments coming, and send us feedback! Seriously, Ryan and I get all giddy when our readers comment. Sad but true — it’s the small things. Full post after the jump!

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc.