Speech Technology Magazine SpeechTEK Conference
 
Eric B.   —   August 13, 2009 @ 12:32 pm

Apparently, we don't speak it very well.

Nuance has announced the results of its “I Speak Dragon” contest, Speech Heads, which asked Dragon users to tell the company how the software has changed their lives for the better. The awards were given in five categories: educational, personal, social, professional, and legal.

The winners (Mike Fejes, a public school teacher; Bob Bieber, a sufferer of rheumatoid arthritis; Ronald W. Banks, a psychologist; Shirley Bowman, an alternate media specialist; and attorney Judy Chorlog) don’t, as you can plainly see, include me or my brother Adam B.

Adam has taken it particularly hard. He is totally unresponsive to our pleas. He’s just repeatedly slamming his head into the desk.

A Nuance spokesperson has tried to soothe us by telling us that there were nearly 1,000 entries this year from across the company’s customer base, but Adam is inconsolable.

“How could I have made better, more life-changing, more positive use of Dragon? Just tell me! How?” he rages at no one in particular. Our editor has been all choked up himself. He can’t even meet Adam’s eyes when he walks past our desks on his way to the water cooler.

To find out how Adam could have used Dragon better himself, check out the winners from Nuance.

Eric B.   —   July 15, 2009 @ 5:49 pm

"Holy mergers and acquisitions, Batman!"

Speech Heads, if you caught my brother Adam B.’s article today, you already know that Nuance has acquired Jott for an undisclosed amount.

The deal is apparently a month old, and was only announced after a web page from Ackerly Partners, one of Jott’s investors, noted that the deal had been made in June. From there, according to Brier Dudley’s Seattle Times blog, the news burbled up on TechFlash.

The acquisition, however, signals that Nuance is serious about its place in the mobile space. As we’ve reported before in our review of Nuance’s voicemail-to-text offering, VM2T, and subsequent articles, the company’s entire business proposition is OEM and carrier-facing. Nuance has not made direct-to-consumer plays, letting its partners—many of them already big household names—face the public with their offerings. Jott, by contrast, makes direct-to-consumer bids.

Given all the fanfare about carrier deals from some of Nuance’s competitors, the acquisition of Jott has gotten some thinking that this might be a shift in direction for Nuance. Not so, though, says Mike Thompson, senior vice president for Nuance Mobile.

“Our primary customers are operators, OEMs, and enterprise organizations. That’s who Nuance sells software applications and services to, and that will continue to be the highest priority,” he says.

He adds, however, that Nuance does “do consumer work for a variety of reasons in certain parts of our business. Being very close to consumers allows for rapid innovation and lots of interesting things that you can learn.”

He also asserts that the purchase of Jott is not a reactive gesture to happenings in the mobile market at all, praising its new property as being strong and innovative. Nuance has no plans to scrap Jott’s direct consumer customers, nor to have its strategy do an about-face. Rather, it plans to build on Jott’s strengths with its own.

“As a small start up, Jott’s strategy was selling direct to consumers,” writes Datamonitor associate analyst Aphrodite Brinsmead in an email to Speech Technology. “Nuance will continue to support and target customers directly but its key focus will be in gaining carrier relationships. Carriers have a large, diverse user base and the ability to bring speech-to-text to many new customers.”

She points to Jott’s offerings like Jott Assistant, which handles voice reminders, texts, emails, etc., as value that Jott brings to Nuance.

“Nuance will gain a stronger position against growing competitors, such as SpinVox and Google Voice, by adding extra features like these to its service,” Brinsmead says. “Nuance is ramping up its mobile portfolio and aims to automate all mobile interactions with speech.”

Eric B.   —   May 8, 2009 @ 7:35 pm

Kindle, didn't you study for this exam at all?

Today, The New York Times reported that the much-vaunted text-to-speech (TTS) capabilities of Amazon’s Kindle 2, provided by Nuance Communications, came up short when trying to pronounce President Barack Obama’s name. The device uttered something like Baah-raah-k O-baah-maah (closer to the sounds in “black” and “Alabama,” the Times said). The paper adds that the problem has since been corrected: Obama’s name has been added to the Kindle’s TTS dictionary and will be included in the next wireless update.

The Kindle TTS misfire came to prominence as many news organizations began openly speculating on whether subsequent versions of the Kindle could create a viable non-paper-based means of distribution. Wired, for instance, was running the headline “How the Next Kindle Could Save the Newspaper Business” in stories about partnerships that The New York Times and The Washington Post were looking to hatch, while mediabistro.com pondered, “Can The Kindle Save Newspapers?” Whether or not any of that’s true, the failure of Kindle’s TTS to pronounce things like the President’s name correctly may put at least a temporary crimp in any role speech might play in any Kindle paper-saving venture.

When it comes to that though, don’t blame Nuance. (more…)

Eric B.   —   May 7, 2009 @ 9:05 am

Yeah that's a golden gun. Nick Cage is good like that.

Speech Heads, I don’t know how many of you are also readers of our sister site, DestinationCRM, but if you aren’t, you might have missed this little tag-team approach my colleague Chris Musico and I had going on over there. We both covered Microsoft/TellMe’s recent launch of some speech-enabled functionality for their enterprise cloud-based offering.

Chris chatted the distance with the venerable Elizabeth Herrell, vice president at Forrester Research, while I yakked it up longtime with the honorable Daniel Hong, lead analyst at Datamonitor, and the results couldn’t have been any more different. While both agreed that IVR was underutilized in the enterprise space, they had divergent views on what Microsoft’s more aggressive pursuit of speech meant for speech big dog Nuance.

While Ms. Herrell seems to think that Nuance had better watch its eggs, Mr. Hong sees the release as less significant and doesn’t think it will make a spit’s worth of difference to Nuance’s nest. Watch the sparks fly HERE and HERE.

“You won’t want to miss this clash of the titans,” says my brother Adam B.

Eric B.   —   April 8, 2009 @ 1:01 pm

Voicemail-to-text powered by me! DRAGON!

Speech Heads, after many a voicemail message and perilously rigorous scientific testing, we’re finally ready to give you STBlog’s assessment of Nuance’s VM2T (voicemail-to-text) client.

How the review breaks down

A couple weeks ago, I had a briefing with Nuance Communications about setting the service up, where they explained the lay of the land. They explained that the version I would be testing is a little bit different from the one you’ll find out in the wilds of the market. As we’ve mentioned before, Nuance’s marketing strategy with VM2T is to distribute through its partners, in this case carriers. Nuance provides the underlying technology to its partners, but each iteration is likely to look a little different according to those partners’ needs. The version I was using was hosted directly by Nuance, so interface specifics probably wouldn’t bear any relation to what most end-users will see.

For one, I had to set up a forwarding service to use it, which an end-user would never have to do. For two, all of the messages were emailed to me rather than sent as text messages. In real deployments, Sean Brown, product manager for mobile applications at Nuance, assured me, the messages will be sent as SMS texts under most carriers. Also varying from provider to provider are settings for live agent intervention. Depending on what a provider wants to pay for and provide, it may bring in real people to clean up the texts if a message scores low-confidence.

All that said, the recognition engine (Dragon 10) is identical to the one that carriers will be using, so we focussed on that for the purposes of this review.

The process began when I set up my account, dialing a number that would, from that brave moment on, forward all my voicemails past my provider’s system to Nuance VM2T HQ. There, they’d be subjected to the pinch-and-pull of Nuance’s automated recognition and possible human oversight, depending on the strength or weakness of confidence scores, and then spat back out to my email as a text with a .wav of the message attached for review. If the system was unable to recognize what was said, it would indicate this with [...]. Likewise, if it didn’t have high confidence and guessed a word, it would write [?] after it.
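
For the curious, here is a minimal sketch of how a markup convention like that could work. It is not Nuance’s implementation: the `Word` type, the 0-to-1 confidence scale, the function names, and all three cutoffs are assumptions made up for illustration. Only the [...] and [?] markers and the idea of human review for low-confidence messages come from the service as described above.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # hypothetical 0.0-1.0 score from the recognizer


# Assumed cutoffs, chosen only for illustration.
UNRECOGNIZED_BELOW = 0.30   # below this, treat the word as not recognized
UNCERTAIN_BELOW = 0.70      # below this, keep the guess but flag it
REVIEW_BELOW = 0.60         # mean score below this goes to a human


def render_transcript(words):
    """Render words as text, marking gaps with [...] and guesses with [?]."""
    parts = []
    for word in words:
        if word.confidence < UNRECOGNIZED_BELOW:
            # Collapse a run of unrecognized words into a single marker.
            if not parts or parts[-1] != "[...]":
                parts.append("[...]")
        elif word.confidence < UNCERTAIN_BELOW:
            parts.append(f"{word.text} [?]")
        else:
            parts.append(word.text)
    return " ".join(parts)


def needs_human_review(words):
    """Flag a message whose average confidence falls under the cutoff."""
    if not words:
        return False
    mean = sum(word.confidence for word in words) / len(words)
    return mean < REVIEW_BELOW
```

Under these assumed cutoffs, a message recognized as "call" (0.9), "me" (0.95), "Bob" (0.5), two unintelligible words, and "tomorrow" (0.8) would come back as `call me Bob [?] [...] tomorrow`.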

The results

(more…)

Eric B.   —   March 31, 2009 @ 6:18 pm

Jenga!

Yesterday, the Gerson Lehrman Group (GLG) provided analysis of a joint study between Harvard University and Warwick University. The results, they suggest, put a damper on the unspoken implications of a 2008 Nuance study that found using speech recognition was safer than using tactile controls.

The Harvard/Warwick study, which had a quick rundown in Wired magazine last December, found that “The worst results came from the subjects tasked with listening to a list of words and then speaking new words that began with the same letters as each word on the list. Those ‘drivers’ had a 480 millisecond delay, which at 60 miles per hour would mean 42.3 additional feet traveled before applying the brakes.”
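
The quoted distance is easy to check. A minimal sketch, assuming the car holds a constant speed for the whole delay (the function name is ours, not the study’s): 60 miles per hour is 88 feet per second, so 480 milliseconds works out to a little over 42 feet, in line with the figure quoted.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def extra_distance_feet(delay_ms, speed_mph):
    """Distance covered at constant speed during a reaction delay."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * (delay_ms / 1000)


# 60 mph is 88 ft/s, so a 480 ms delay adds about 42.24 feet.
print(round(extra_distance_feet(480, 60), 2))
```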

GLG extrapolates from this finding that voice command-and-control will have similar results.

“This task is similar to using an in-vehicle system for command and control purposes.  The driver is speaking to the system and then waiting for [its] response and possibly speaking again,” it writes.

It’s quick to add, however, that speech interactive systems often offer shortcuts and reduce the amount of time required to engage with them, possibly mitigating some of the risk.

It should also be noted that these results seem to agree with a AAA study we reported on last month on the main site, which concluded that the danger to drivers in using wireless devices was not primarily the use of their hands, but the use of their cognitive attention. Where strict safety is concerned, drivers really shouldn’t even be listening to music, much less doing anything more complicated.

The conclusion that GLG comes to is that voice command-and-control systems, while safer, are not safe. It suggests that Nuance’s report has some limitations. This isn’t the first time it’s questioned the 2008 report. In July of 2008, GLG questioned the significance of the sample size, thirty participants, and how accurate a study in an artificial simulated scenario would be in the real world.

Perhaps somewhat derisively, it writes, “Nuance recently released the results of a study that claims to ‘prove’ that speech recognition used in-vehicle while driving increases driving safety. I’m sure that the results of the study are right, to the extent that Nuance is releasing any data and conclusions.”

Responding to the concerns raised by GLG in yesterday’s analysis, Michael Thompson, senior vice president and general manager of Nuance Mobile, says, “The results of last year’s study demonstrated that speech-powered systems in vehicles help reduce driver distractions posed by manually entering information into navigation systems, entering music selections via mp3 players, making and receiving phone calls, and so on.  Clearly, the safest option is for drivers to simply refrain from using these devices and applications, but for those who insist on using them, the study showed that a hands-free, eyes-free option provided by speech is the next best alternative.”

Perhaps Thompson is right. Who, for instance, is going to forgo listening to music in the car? On the other hand, one might argue that it isn’t enough for any manufacturer, developer, or even person to take morally neutral stands, reconciling ourselves to saying people oughtn’t do it, but we may as well make it safer. That’s perhaps too easy an answer. But then, what can you do? If Nuance doesn’t do it, some might say, someone else will, and then they will have ceded important business ground, really the existential foundation of their entire venture into automotive work. If there is a demand, are companies responsible first to some arguably tentative moral stand (after all, who is authorized to make decisions for people unilaterally?) or to the market?

And there is a market. My brother Adam B., for instance, will never stop using speech in the car. He moonlights as a NYC cabdriver (one of the 5% of cabbies in the City without a driver’s license, I may add). His cab is so speech-enabled that it won’t even start unless he politely says “Good morning, Mackie.” Mackie’s the cab’s name.

For dangerous speech-enabled drivers like him, there’s just no reformin’.

Eric B.   —   March 23, 2009 @ 5:01 pm

Enter the Dragon!

First of all, I know what you’re probably thinking out there in Speechlandia. Where’s this much talked, much huffed about Dragon 10 review that the Brothers B. have been promising? Well, we’ve had to keep mum about this because of an embargo, but we’re finally unfettered. Shortly after we began our review process, we got a phone call from Nuance HQ.

Hold the presses!

They told us that they were going to be releasing 10.1 and did we want to review it? Did we want to review it? Did we want to review it? Sheeyeah we wanted to review it. We were promised some copies and are now just patiently waiting to begin the process anew with the latest version.

Apparently, the biggest update is that 10.1 is compatible with the 64-bit version of Windows Vista. Likewise, it has a couple of fixes.

But, wait! That’s not all!

(more…)

Eric B.   —   March 9, 2009 @ 5:52 pm

Speechin' on the bay

Now back in the loving embrace of our New York offices, I thought I’d take a look back at Voice Search and give you Speech Heads out there some final views.

Like all trade shows, there was of course a fair amount of wheeling and dealing: companies ponying up to each other, seeing if they could hew together some kind of symbiotic relationship that would produce some killer solution capable of reaping mega profits. Sort of like the Power Rangers’ Megazord, those giant fighting robots the Rangers had that were made up of various other smaller robots.

In all that hubbub, it was pretty clear that there were three companies that everyone was looking to try and integrate their offerings into: Google, Yahoo, and Microsoft.

There was hardly a minute between sessions when I didn’t see Michael Cohen from Google or the gaggle of Microsoft folks surrounded by eager speech impresarios. Marc Davis from Yahoo, who was only in town for a couple of hours to boost oneSearch at his keynote, was literally deluged by a crush of people wanting to exchange business cards (full disclosure: me too!) before he had to jet back to San Francisco.

The prevailing feeling at the conference, as I described in my last dispatch, was that mobile voice search is where it’s at, and that it is where we will see real and massive growth for speech in the coming years. All heads were turned to giants like Google and Microsoft to lead the way, too. They, many feel, could provide the shake-up that speech has really needed.

The field has been kind of limited in scope for the last pack of years. Until of late, it hadn’t really expanded too far beyond the places it’s traditionally been found: call centers, command-and-control functionality, and dictation. Without new territory, speech has plugged along without ever seeing explosive growth. With the entrance of Google, Microsoft, and Yahoo into voice search, the speech community seems to be excited by the possibilities, and, though they might be reluctant to say it on the record, by some of the potential changes in players.

It’s no state secret that Nuance has been dominating speech, acquiring technologies like IBM’s patents, or Philips’ speech, and a slew of others. In the process, as you might find in any aggressive climb to the top, it’s stepped on quite a few toes getting there and has no shortage of discontents. You don’t have to push too hard to get people griping about Nuance in San Diego.

“In a market where there hasn’t been a big brother, [Nuance] rolled up into one,” Joseph Bentzel, chief marketing officer for SpeechCycle and, it should be noted, a competitor, told me. “But in a market where there are bigger brothers doing it for free and virally…” he added before trailing off with half a smile and letting his pause sketch out the possibilities.

While Nuance has cast a large shadow over speech, acquiring its way to the top, building a strong speech provider out of a company that originally just handled OCR scanner software, ScanSoft, Mr. Bentzel thinks it’s reached the end of the line as far as being the undisputed king of speech. By his account, voice search will grow the market and create a space outside of Nuance’s purview.

“Nuance will not exist as a leader in 24 months unless Paul Ricci [Nuance's CEO] reads this article and hires me,” Mr. Bentzel jokes.

Part of Nuance’s problem, as he sees it, is that they’ve tried to become the one-stop solution for all speech needs. They’ve tried to control the process from the ground up, acquiring and integrating technologies into their own banner. This has had the effect of freezing other companies out, and, in some cases, making them hostile.

May the Speech be with you.

“This is the Rebel Alliance,” Mr. Bentzel says of Voice Search. “This is the Luke Skywalker Show. We’re on the ice planet and they’ve ignored us.”

While he seems totally at ease comparing Nuance to the Empire from Star Wars, Mr. Bentzel is also quick to say that everyone in speech ought to “thank Paul Ricci for putting speech on the map.”

“I’m not one of these Nuance haters,” he insists. He says he’s more or less agnostic and only sees problems where market growth is impeded, so forget about thinking he views Ricci as some kind of Darth Vader force-choking everyone at the table.

In fact, he suggests that there wouldn’t be much speech out there without Nuance’s drive to make it a big business.

Mr. Bentzel’s position (and others like his) represents an attitudinal shift in how the field has come to view itself. If I, or anyone else for that matter, made the mistake of saying “speech industry,” there was a group of people on hand, just ready to pounce, saying, “Speech isn’t an industry, it’s a tool.” Speech is starting to see itself as a subordinate modality to larger functionality, not an end in and of itself the way it has been viewed in its more academic roots.

If you don’t believe me, just try saying “speech industry” for yourself at SpeechTEK in August. When you walk into that trap, they’ll whip out that little tool mantra like it was a brand new gun they’d just been itching to use and you were the hapless mugger who made the mistake of trying something today.

It’s a crazy mixed up world out there, Speech Heads. Even without the recession, everything is in flux and it seems like everyone is trying something today. Carry a speech-gun and watch your back is my advice.

***SPECIAL NOTE: Due to an oversight entirely on my part, we had erroneously reported that Nuance didn’t have much of a presence at Voice Search. In fact, they did. Brad Bargan, Nuance’s VP of product development, participated in several events. My most humble apologies to them and to our readers.***

Eric B.   —   March 9, 2009 @ 4:43 pm

Dragon says, "Roar! Roar! Race car!"

Speech Heads, as promised we are beginning our scientifically rigorous examination of Dragon NaturallySpeaking 10. My brother Adam B. and I have it installed on our respective computers and have been monkeying with it for much of the day. In the interests of fairness and science, we won’t be sharing any judgments just yet, but as a demonstration the rest of today’s post will be transcribed by their engine, UNEDITED, for your edification.

In addition to our review of Dragon NaturallySpeaking 10, my brother Adam B. and I also be taking a look at IBM’s via voice for Windows version 10, which Nuance was good enough to provide us with a copy of as well. We’re hoping that these two reviews will really get the ball rolling on her speech recognition engine review series, says he got injured out there a you’d like us to take a look, at leave a comment and we’d be happy to arrange for it.

Every yours,

Eric B. and Adam B.

STM Blog   —   May 13, 2008 @ 12:15 pm

So it turns out I could have cashed in my government bonds and bought EDS, guys. Because I own almost $14 billion in bonds, right? Sure. Anyway, HP announced today that it’s buying Texas-based Electronic Data Systems. You might recognize EDS from our own little magazine–EDS’ Alex Halikias writes a column for us called “Inside Outsourcing.”

The announcement means HP will be able to stake a claim in the technology outsourcing space, and directly compete with IBM. As more ginormous companies and government agencies turn to outsourcing tech projects, the market is expected to grow, according to analysts quoted in the news article. The acquisition means HP will now have 210,000 employees in 80 countries. Whoa — we have three editors at Speech Tech.

HP makes the big bucks in selling printers, PCs, and servers, but also made $16.4 billion in revenue in business consulting. The EDS acquisition will only further strengthen the company’s grip on business and technology consulting. We’ll keep you updated with news when we hear more. Also, EDS’ CEO, Ron Rittenmeyer, will stay on board with the same title. [MercuryNews.com]

In other news: Though Nuance ended its second fiscal quarter with revenue above expectations, the stock has been sliding. Goldman Sachs analyst Derek Bingham is quoted in the article as saying:

“Nuance’s March report showed that the company’s Network Speech business is not immune from macro slowing, consistent with slowdowns we’ve seen in other large-deal areas of software.”

The company’s stock is down almost 7 percent today. [Barron's]

[Image: LearnMergers.com]

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc.