Speech Technology Magazine SpeechTEK Conference
 
Eric B.   —   March 31, 2009 @ 6:18 pm

Yesterday, the Gerson Lehrman Group (GLG) provided analysis of a joint study by Harvard University and Warwick University. The results, it suggests, put a damper on the unspoken implications of a 2008 Nuance study that found using speech recognition was safer than using tactile controls.

The Harvard/Warwick study, which had a quick rundown in Wired magazine last December, found that “The worst results came from the subjects tasked with listening to a list of words and then speaking new words that began with the same letters as each word on the list. Those ‘drivers’ had a 480 millisecond delay, which at 60 miles per hour would mean 42.3 additional feet traveled before applying the brakes.”
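For readers who want to check the arithmetic in that quote, here is a minimal sketch in Python. It assumes the car holds a constant speed for the whole delay, and the function name `extra_distance_feet` is made up for illustration; it does not come from either study.

```python
# Distance covered during a reaction delay, assuming constant speed.
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def extra_distance_feet(speed_mph: float, delay_ms: float) -> float:
    """Return the feet traveled at speed_mph during a delay of delay_ms."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * delay_ms / 1000


if __name__ == "__main__":
    # 60 mph is 88 feet per second, so a 480 ms delay adds about 42.2 feet.
    print(round(extra_distance_feet(60, 480), 2))
```

At 60 miles per hour and 480 milliseconds this works out to roughly 42.2 feet, which is in line with the quoted 42.3; the small gap may simply come from rounding in the original write-up.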

GLG extrapolates from this finding that voice command-and-control will have similar results.

“This task is similar to using an in-vehicle system for command and control purposes.  The driver is speaking to the system and then waiting for [its] response and possibly speaking again,” it writes.

It’s quick to add, however, that speech interactive systems often offer shortcuts and reduce the amount of time required to engage with them, possibly mitigating some of the risk.

It should also be noted that these results seem to agree with a AAA study we reported on last month on the main site, which concluded that the danger to drivers in using wireless devices was not primarily the use of their hands, but the use of their cognitive attention. Where strict safety is concerned, drivers really shouldn’t even be listening to music, much less doing anything more complicated.

The conclusion that GLG comes to is that voice command-and-control systems, while safer, are not safe. It suggests that Nuance’s report has some limitations. This isn’t the first time it’s questioned the 2008 report. In July of 2008, GLG questioned the significance of the sample size, thirty participants, and how accurately a study in an artificial, simulated scenario would reflect the real world.

Perhaps somewhat derisively, it writes, “Nuance recently released the results of a study that claims to ‘prove’ that speech recognition used in-vehicle while driving increases driving safety. I’m sure that the results of the study are right, to the extent that Nuance is releasing any data and conclusions.”

Responding to the concerns raised by GLG in yesterday’s analysis, Michael Thompson, senior vice president and general manager of Nuance Mobile, says, “The results of last year’s study demonstrated that speech-powered systems in vehicles help reduce driver distractions posed by manually entering information into navigation systems, entering music selections via mp3 players, making and receiving phone calls, and so on.  Clearly, the safest option is for drivers to simply refrain from using these devices and applications, but for those who insist on using them, the study showed that a hands-free, eyes-free option provided by speech is the next best alternative.”

Perhaps Thompson is right. Who, for instance, is going to forgo listening to music in the car? On the other hand, one might argue that it isn’t enough for any manufacturer, developer, or even person to take a morally neutral stand, resigning ourselves to saying that people oughtn’t do it, but we may as well make it safer. That’s perhaps too easy an answer. But then, what can you do? If Nuance doesn’t do it, some might say, someone else will, and then it will have ceded important business ground, really the existential foundation of its entire venture into automotive work. If there is a demand, are companies responsible first to some arguably tentative moral stand (after all, who is authorized to make decisions for people unilaterally?) or to the market?

And there is a market. My brother Adam B., for instance, will never stop using speech in the car. He moonlights as a NYC cabdriver–one of the 5% of cabbies in the City without a driver’s license, I might add. His cab is so speech-enabled that it won’t even start unless he politely says “Good morning, Mackie”–Mackie’s the cab’s name.

For dangerous speech-enabled drivers like him, there’s just no reformin’.

4 Comments

  1. The question is not if voice apps are distracting but WHAT about using voice in the car is distracting. Counterintuitively, simpler limited command systems can impose heavier cognitive loads. (#@&*!??? what was that command?)

    Why is talking to a person next to you in the car so different? Partly because you are not forced to use an arbitrary and small vocabulary. Years later, I still say to my cell’s voice dialer “Call Chris at the office” when it can only understand “Call Chris at work”. I make the mistake most dialing “hands free” while driving.

    Another, and very BIG issue is that today’s voice apps are not really “there with you”. (see http://ejtalk.com/wordpress/?p=10)

    As an afterthought, let’s not forget that there are a lot of voice interfaces that are just … well … afterthoughts.

    Comment by Emmett Coin — April 2, 2009 @ 9:25 am

  2. I wrote those analyses of the studies because the preponderance of articles that came out when the study results were released proclaimed things like “Nuance says that speech rec in the car is safe!”

    It’s not.

    We’re getting to a point where embedded speech rec can accommodate larger grammars, but there’s still a pretty long lag between when new technology is developed and when it’s deployed in hardware. For an automobile, it can be up to 5 years.

    I agree with Emmett’s analysis, BTW.

    I prefer to remain anonymous because I used to work at Nuance.

    Comment by anonymous — April 14, 2009 @ 2:14 pm

  3. It’s an interesting discussion we’ve got going on here.

    I am by no means a psychologist, nor have I done any of my own research into the subject. I should note though that the conversations I’ve had with experts, particularly from AAA, seem to suggest that the danger in using a phone in the car, or the radio, or any other device for that matter, stems from the cognitive load they impose and the fact that those devices do not respond to road conditions. I’m not sure if a limited vocabulary would contribute to that, but intuitively I feel like that makes sense.

    I wanted to add, though, that when I spoke to Fairley Mahlum, the director of communications for the AAA Foundation for Traffic Safety, she suggested that the difference between talking to someone on the phone and talking to someone in the car next to you is that the person next to you has some idea of what’s going on around the vehicle. They know when an accident is about to happen and can react. Someone on the phone has a very limited idea, and our technology, for that matter, has little to no idea as well.

    I imagine that the more studies we see conducted around these issues, the more precisely we’ll understand the nature of the problem.

    Comment by Eric B. — April 14, 2009 @ 2:37 pm

  4. Hi Eric,

    A limited command set could impose a larger cognitive load on the user, as they will have to remember the exact syntax / command structure that they would need to use for the application. If it’s for features that they don’t use that often, I would expect that the cognitive load would be higher.

    Here’s a Wired article discussing your point about conversations w/ a person in-car vs on the phone.

    http://blog.wired.com/cars/2008/12/new-study-confi.html

    I would expect that a speech app would have more characteristics of a phone conversation vs an in-car one. That is, unless the automation gets hooked up to the sensors in the car to be able to detect sudden braking or something like that to shut it up in an emergency. Of course, that brings up other issues.

    Comment by anonymous — April 19, 2009 @ 12:06 pm

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc.