Speech Technology Magazine SpeechTEK Conference
 

Exciting New Things

STM Blog @ 12:30 pm

I originally interviewed Mike Wehrs, Nuance’s vice president of evangelism and industry affairs, for an FYI about vSearch in the July/Aug issue of Speech Technology Magazine.

Unfortunately, the time crunch was such that we weren’t able to slot the quotes into the story.

(Editor: You really need to meet your deadlines.

Ryan: I’ll work on that later.)

Mike gave some substantive quotes, however, and since they can’t get into the magazine proper, the blog is a good forum to host the interview transcript. Read it after the jump.

(more…)

VoiceSearch 08 : Final-Final Thoughts From Jim Larson

STM Blog @ 12:19 pm

James Larson, Ph.D., is co-program chair for the SpeechTEK 2008 Conference, co-chair of the World Wide Web Consortium’s Voice Browser Working Group, and author of the home-study guide The VoiceXML Guide. He can be reached at jim@larson-tech.com. He was kind enough to submit some thoughts on the recent Voice Search Conference in San Diego.

1. Voice search can be defined as (a) using voice to search text information, and (b) using voice to search voice information. There was little discussion of the second type of voice search. There were many talks about the first type, especially for directory assistance, customer info lookup, and music “jukebox” applications.

2. While much of the conference dealt with voice search, several sessions addressed other speech technology topics. For example, the folks from Spoken talked about Secret Agents. A secret agent is a human who monitors several ongoing IVR dialogs. The agent is notified when the speech recognition engine fails to understand what the user said. The user’s utterance is replayed to the agent, who selects the appropriate word from the grammar, or causes the dialog to transfer to a regular human agent. The overall effect for the user is that the dialog works better.
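For readers who think in code, here is a minimal sketch of the secret agent flow as described above. It is not Spoken's implementation. The names `RecognitionResult`, `resolve_utterance`, `ask_secret_agent`, and the confidence threshold are all hypothetical, and the talk did not say how a recognition failure is actually detected.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple

# Assumed cutoff for "the engine failed to understand"; not from the talk.
CONFIDENCE_THRESHOLD = 0.5


@dataclass
class RecognitionResult:
    word: Optional[str]  # best grammar match, or None if nothing matched
    confidence: float    # recognizer score between 0 and 1


def resolve_utterance(
    result: RecognitionResult,
    audio: bytes,
    grammar: Set[str],
    ask_secret_agent: Callable[[bytes, Set[str]], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Return ("continue", word) to stay in the IVR, or ("transfer", None)."""
    # Normal case: the recognizer understood the caller, no human involved.
    if result.word in grammar and result.confidence >= CONFIDENCE_THRESHOLD:
        return ("continue", result.word)

    # Recognition failed: replay the utterance to the secret agent, who
    # either picks a word from the grammar or declines by returning None.
    choice = ask_secret_agent(audio, grammar)
    if choice in grammar:
        return ("continue", choice)

    # The agent could not resolve it: hand the call to a regular human agent.
    return ("transfer", None)
```

The caller never hears the secret agent in this sketch. The agent only supplies the word the recognizer missed, which is what keeps the call inside the automated system.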

I note that AT&T did this some time ago for directory assistance calls.

The goal of secret agents is to contain the user within the automated IVR system. As we saw from Paul English and the gethuman.com web site, users hate containment, especially if they have a difficult request that they feel can only be handled by a live agent. I wonder how these users would feel if they knew that a secret agent is listening to them but is not allowed to speak directly with them.

3. Mike Phillips of Vlingo has a nice demonstration for accessing textual data by voice. Vlingo has done a lot of usability testing, and it shows when you use the UI, which I think is very good. Check out the UI by going to http://www.vlingo.com/ and clicking “watch the demo.”

4. Three hot topics of discussion were:

(1) multimodal user interfaces

(2) analytics

(3) video and voice dialog.

Most conversations dealt with how cool these new technologies were and how to make money using them.

5. I had a chat with David Thomson, who gave a talk about how phones can be used in social web sites. We see opportunities for speech technology in social web sites:

(1) Provide simple authoring tools so teens can create speech dialogs for their portrayed personas.

(2) Viewers could call a phone number and leave messages, which could be converted to text by general purpose dictation recognition.

(3) The virtual equivalent of an answering machine that could accept VoIP calls, filter them, and route them according to instructions by the social web site owner.

I think there are many opportunities for speech technologies in social web sites.

© 2008 Speech Technology Media, a division of Information Today, Inc.