I originally interviewed Mike Wehrs, Nuance’s vice president of evangelism and industry affairs, for an FYI about vSearch in the July/Aug issue of Speech Technology Magazine.
Unfortunately, the time crunch was such that we weren’t able to slot the quotes into the story.
(Editor: You really need to meet your deadlines.
Ryan: I’ll work on that later.)
Mike gave some substantive quotes, however, and since they can’t get into the magazine proper, the blog is a good forum to host the interview transcript. Read it after the jump.
Speech Tech: Will vSearch on iPhone be available to consumers?
Mike Wehrs: Yes, we absolutely have plans to make it available. The method of how it will be available is something we have not completely locked on yet. Options available to us are: downloadable through the Apple store if we wanted to go through all those agreements. We could make it available via a Nuance download web site, or make it available through partners. But yes, we are on path, we absolutely are going to make it available.
Regarding delivery methods, are you leaning in one direction over another?
There’s also a lot of pressure that’s on some of the partners as well as on Apple to answer one of the key questions: What kind of speech support if any will be in the next iPhone? And I’d say that until some of that is clarified, it’s a little premature to talk about how we’d choose to distribute in that kind of environment.
Will you need Apple to endorse vSearch?
They have a process you go through on that. You fill out the applications, you test it, you’re following the APIs documented in their SDK and you’ve done a certain level of testing for compatibility and robustness and so forth. It’s a relatively straightforward procedure to get onto the Apple Store based on what they’ve disclosed at their developer conference.
Does being in the Apple Store essentially indicate an endorsement from Apple?
The store isn’t up yet. Until Apple’s ready to announce what they want to do in that regard, none of us are completely clear if they’re going to exercise any significant judgment or editorial comment on it.
When would you project its availability?
I am anticipating, and we’re on a development plan, that the application should be available in the early Fall of this year.
To what extent do you need AT&T’s support?
There are aspects of things you might want to ask for in a search that AT&T may have relationships with or may have information on their portal that they would prefer that results resolve that way. That’s kind of a secondary component of it. Right now, what apps are on the iPhone that ship from Apple is a discussion Apple has with AT&T and would not necessarily involve us. We’re not involved with that specifically.
However, why I brought up the first point is that there’s a secondary component of attractive value that we could bring to AT&T, and that would be a discussion we would have with AT&T if we ended up distributing the app not through Apple directly.
So like marketing and targeted ads?
Marketing, targeted ads, services they offer.
If someone wants to hear a song, maybe it comes up on AT&T’s portal site where they have their music available rather than going to another location. Because it’s an AT&T subscriber, would they want it to go that way instead? I think those are negotiations that are standard in the normal course of business. We do have an ongoing, good relationship with AT&T on a number of fronts with Nuance. I’m not pre-announcing anything, just saying there’s clearly opportunity where AT&T would be excited about seeing this application available, and helping on their commercial side as well.
Nuance doesn’t do a lot of consumer marketing that I know of. To what extent will that be a problem?
I’d go through the analysis on how to bring out that application, whether it’s going to be partner-launched or available as part of a storefront that has marketing that goes along with it and it’s us contributing dollars to a marketing program. Or whether we pick a vehicle that distributes it directly. Each of those decisions has associated with it: Do we have the necessary infrastructure? Are we putting in the necessary resources to be successful with that strategy? If we decided to distribute on our own, then yes, one of the things that’d be commensurate with that decision would be to put in the necessary marketing dollars to make people aware of what it is, how it works, have it downloaded and, if they have problems, to provide the necessary customer support.
Those things are factors of the decision, but we have not made the decision that we’ll go direct.
So we might see Apple vSearch sort of like how we see Ford Sync.
If Apple came in and wanted to do it that way, we’d be very open to that kind of discussion, but I can’t pre-announce something like that.
How does vSearch differ from vlingo’s application?
The fundamental difference you’re getting on vSearch is it’s a hybrid client. There’s a component of it that runs and does the processing on the hardware of the iPhone. There’s also a component that runs on the backend. That allows for more predictable and reliable performance; it also balances the difference between what you want to execute on local resources and when you need more robust and capable speech recognition that’s better provided by a server infrastructure.
We’ve had that server infrastructure for years. It’s part of our IVR, our DA stuff, it’s part of what we’ve been shipping for Nuance Voice Control as the backend support. The server capability is in there. The embedded speech support came when we bought VoiceSignal. We had some embedded capability ourselves, but the demonstration that we did with the iPhone is the first time we’ve seen a full hybrid client, where there are certain elements of the speech recognition that, depending on the condition you’re in, will actually take place on the device. There are other things you’ll ask it to do that, based on certain conditions, will do the speech recognition on the server. You’re not getting the absolute requirement that you have data connectivity at a high level of reliability at a high bandwidth, which are some of the constraints that other implementations would place on the end user.
Could you go into specific advantages with embedded speech rec?
This is where you have to have the balance. Most smartphones have a significant amount of processing power and memory. So doing things like recognizing a person’s name and dialing the phone, that doesn’t require any network resource. What that allows for is I don’t need the processing delay of bringing up the network connection, sending the request to the server to look up a number that’s already on the phone, and then sending the commands back to the device to dial it. We can execute all of that locally, so the response time could be as much as four or five seconds faster.
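(Ryan: For readers who think in code, here is a rough sketch of the device-versus-server split Wehrs describes. It is my own illustration, not Nuance's implementation. The names `route_utterance`, `Result`, and `server_search` are hypothetical, and it assumes the utterance has already been turned into text, which skips the actual speech recognition step entirely.)

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Result:
    handled_by: str  # "device" or "server"
    action: str      # what the client would do next


def route_utterance(
    text: str,
    contacts: Dict[str, str],
    network_up: bool,
    server_search: Callable[[str], str],
) -> Result:
    """Decide where a request is handled (hypothetical routing rule).

    Name dialing is resolved from the on-device contact list, so it
    never waits on a network round trip. Anything else is treated as
    an open-ended search and sent to the server when a connection exists.
    """
    words = text.lower().split()

    # Local path: "call <name>" or "dial <name>" for a known contact.
    if len(words) >= 2 and words[0] in ("call", "dial"):
        name = " ".join(words[1:])
        if name in contacts:
            return Result("device", f"dial {contacts[name]}")

    # Server path: broader requests that need the backend.
    if network_up:
        return Result("server", server_search(text))

    # No connection and nothing the device can resolve on its own.
    return Result("device", "unavailable: needs network")
```

The point of the sketch is only the ordering: try what the phone can answer by itself first, and reach for the network when the request needs it.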
Is vSearch on the Blackberry?
We’ve been shipping Nuance Voice Control 1.0. That will morph into what Steve (Chambers, president of Mobile and Consumer Services Division) has been referring to as vSearch.
So it’s two separate products you want to consolidate into one?
I’d say that what we did first with Nuance Voice Control with both the Blackberry and Palm platform is we provided a network-based capability to access a set of services, some of which are things like dictating an SMS or dictating an email. Some of which are more search-related. What’s the news? What’s the traffic? What’s the weather?
We’ve always had plans to do the next release of that, which everybody has been calling Nuance Voice Control 2.0. That is something that is going to be available in the Fall.
(When we) demonstrated on the Apple iPhone platform, we stressed the search components. VSearch is the way you can colloquially refer to that. Since we haven’t announced a product specific to the iPhone at this point, I can’t go into the naming architecture. But it’s safe for you to assume that any of the smartphone platforms will have a robust, embedded-plus-supported-by-network-capabilities for search and for access to features.
To a large degree, we’re trying to start with two very, very important things that people spend a lot of time doing. So a set of features around messaging, a set of features around search. You can extend from that. Certainly the architecture we have allows for accessing many more features of the phone, should we voice-enable them. The first thing you should look for from us is this general category of search and this general category of messaging that will be very robust.
So in the Fall, we’ll see Voice Control 2.0 on Blackberry and Palm. And we should also see vSearch on the iPhone.
That’s accurate. And you could say the backend infrastructure associated with supporting both of those products will allow for a very flexible kind of searching experience, and will be common between the vSearch application on the iPhone and what comes out as our NVC or vSuite client that supports those features on other platforms.
~
Of course, we all know that Nuance has filed a patent infringement lawsuit against vlingo. In the interests of fairness, here’s an interview that technology and business periodical Xconomy conducted with vlingo CEO David Grannan. He also addresses how vlingo’s technology differs from Nuance’s.
~
Finally, I mentioned in the previous entry a New Yorker speech tech article by John Seabrook. I got this message in an email a few days ago. I thought it was funny, so I’m posting it:
I would very much like to smack Seabrook—he brought a puppy to the (New Yorker) spring books party….who does that?! You know it’s just an affectation, so he can hit on female interns. At least that’s my professional diagnosis.

STM Blog —
June 27, 2008 @ 12:30 pm