Speech Technology Magazine SpeechTEK Conference
 
Eric B.   —   July 29, 2009 @ 11:37 am

What I want to know is where you can buy just jacket sleeves.

Strike up the ominous-sounding string section, Speech Heads. Invoking its ban against apps that “duplicate features of the iPhone,” Apple has rejected Google Voice’s application to be offered through the App Store. Dum dah!

The app, which has already launched on Android and BlackBerry handsets, had been “under review” for iPhone release for the last six weeks and was roundly rejected yesterday. Despite Apple’s officially stated reasons, prevailing wisdom on Wired, Gizmodo, and PCWorld maintains that Apple made its decision to shield its partner AT&T from having to compete with Google.

Among the features that Google Voice offers its users are:

  • The ability to allow users to share a single number across different phones (the Grand Central functionality);
  • SMS;
  • Voicemail;
  • Voicemail-to-text; and
  • Cheap-o international calls

The kicker here is that Google provides all this for free (except the international calls), in contrast to, say, AT&T’s pay services. Most worrying for the provider, some speculate, is the Grand Central functionality, because it has the capacity to make phone numbers (one of the main holds telcos have over customers) irrelevant. The move might have been expected considering how disruptive Google can be, and given that Apple has total control over what apps make it onto an iPhone, and doubly given that Apple profits best when AT&T stands to make gains.

The move, however, has opened the monarchically run App Store to increased criticism, including from powerful voices (no pun intended) like Google. This is by no means new. A chorus has been growing. Back in April, Skype and a number of other parties asked federal authorities to enforce their broadband (read: net neutrality) rules in the wireless space and prevent telcos and OEMs from blocking or restricting access to features and services. The feds, however, haven’t made any moves, so it’s still unclear which way this will swing.

As my brother Adam B. has said while polishing his Chicago typewriter, “The only thing that can be certain about the future is that men will die. Awful and bloody deaths.”

Eric B.   —   July 27, 2009 @ 1:01 pm

"Hey! That robot is wearing a skirt!"

Speech Heads, it seems that even researchers and technologists from the Association for the Advancement of Artificial Intelligence (AAAI) have been struck with this summer’s creepy talking robot fever.

According to a New York Times article today, a group of A-list scientists that included Eric Horvitz, a researcher from Microsoft, and William Joy, the co-founder of Sun Microsystems, convened at the Asilomar Conference Grounds on Monterey Bay in California to do nothing but talk about robots. At the center of their conversation: whether or not robots were advancing to a point at which they might encroach on human liberty and build smarter-than-man versions of themselves to exterminate the scattered fleshbags that populate Earth in twisted Karel Čapek-like scenarios. Among such scenarios: “What could a criminal do with a speech synthesis system that could masquerade as a human being?”

They pondered heavily over everything from Predator drones to robots that can find plugs to recharge themselves, trying to outline recommendations that could prevent a Terminator-like post-apocalyptic existence.

Click here for full details.

This week’s New York Times robot piece follows a brilliant exposé the paper did last week about Japanese men who are in love with body pillows.

Incidentally, my brother Adam B.’s first love was actually an old futon mattress, which raised my suspicions. A robot story? A man-loves-pillow story? Was he secretly moonlighting for the Times, pushing his agenda on its readership? Alas, no, Speech Heads. Turns out his subscription to the paper is going to be up at the end of the month and the paper is just making a desperate bid to get him to renew by targeting articles at him. The newspaper industry must be in a real bad way…

Eric B.   —   July 24, 2009 @ 11:17 am

What's all this then? We'll have no trouble here. This is a local speech vendor for local people!

A series of recent postings from the BBC have questioned the validity of SpinVox as a “speech technology company.” The posts (one news item and a blog post) raise questions around whether there’s an actual speech engine powering the SpinVox service or just an elaborate archipelago of call centers flung around the world, transcribing messages by hand, as some have accused.

The BBC cites photos from a Facebook group created by staff at an Egyptian call center, RAYA, which used to work for SpinVox, “containing what appears to be sensitive commercial information” as an indication that human transcription may play a bigger role than SpinVox would like to let on. The blog especially seems to hint that almost all of SpinVox’s transcriptions are done by hand. The company, for its part, has fired back, calling the BBC’s reporting “incorrect” and “inaccurate” in its own blog response.

Christina Domecq, SpinVox CEO, goes one step further in her response to the Guardian. Domecq says, “The majority of calls are fully automated.”

I was actually able to find the Facebook group that the BBC mentions, “Sp!nVox R@Y@,” pretty easily. It hasn’t been updated in about a year, which may mitigate its importance since SpinVox claims to be upping the proportion of automation behind its service. That said, among the claims made in the group’s description are: “we work behind the scenes…we are the invisible heroes[sic]… we are Tenzing… we are unknown and they deny our existence[sic] !!”

The photo that the BBC alleges might contain sensitive commercial information is pictured below.

SpinVox dummy message

“The photo is from a training session – that is a dummy message,” claims Rachael Lyons, director of North American communications for SpinVox, in an email to Speech Tech. In other words, it is a fake message that isn’t sensitive at all.

“[The RAYA agents] were using training data – a model system that SpinVox uses to evaluate the quality of call centre support before contracting with the supplier to handle real user data,” writes SpinVox on its blog. “The training system will require individuals to convert full messages in order to establish their speed and accuracy.”

“This would not be the case on a live customer system where should the VMCS system need assistance in learning, operators would only be presented with portions of any message for assisted learning,” the blog adds.

To prove the company’s point, Ms. Lyons cites a number of items from the group’s photos:

  • Pictures show training screens you can tell that because they have other apps running – doesn’t happen in live systems
  • Screens have whole messages on them – doesn’t happen in Live systems
  • Pictures show training presentations and materials
  • Audio is known training audio we use all the time
  • Dates on the site were when SpinVox was testing RAYA
  • SpinVox hasn’t required for the site to be taken down because it doesn’t breach security

SpinVox says that, subsequent to the trial, the RAYA call center was not retained to handle live data. It says the same for Kencall in Kenya, which failed to meet criteria and, incidentally, also has a Facebook group whose members complain about working for SpinVox. That group hasn’t seen an update for almost a year either.

The BBC’s only named source in the article is Kareem Lucilius, who said he worked for six months at the call center alongside as many as 150 others. He said that after initial training, he went on to transcribe live messages.

“It was done 100% by people,” he claims, adding, “We heard the message from the very beginning to the very end. Love messages, secret messages, messages with sexual content, even people threatening to kill each other.”

Mr. Lucilius, however, seems to have worked for SpinVox under the auspices of RAYA. A Facebook account by that name has commented on the group page (this may be how the BBC reached him) and a “Kareem” is pictured sleeping or pretending to sleep in a photo. If he did work at RAYA, this raises some questions, because SpinVox claims that the company never saw any live messages.

On the other hand, a post on the Facebook page from April 2008 reads, “Mabrook for those who made it to the Live Session..You guys kick ass!” A second post in June 2008, from the same person, who clearly identifies himself as being from RAYA, asks the UK workers if their servers are down, implying that RAYA may have been processing live messages.

Whatever the situation, it is not helped by the fact that SpinVox’s new and rapidly expanding technology has some secrecy surrounding its center, leaving room for doubts, speculation, and innuendo.

The company won’t say, for instance, how much human intervention is involved in transcription, describing the actual proportion of messages automatically converted as “highly confidential and sensitive.” Rather, it says that it requires only a few hundred agents per market to convert messages without learning assistance. In Argentina, where SpinVox has 10 million customers, it has fewer than 70 call center staff, which, the Guardian suggests, means it’s not feasible for SpinVox to operate without some level of automated speech recognition.

Ms. Lyons adds that SpinVox has “reduced human assisted learning to just 2% of what it was when we started.”

Without a clearer view of how many messages SpinVox processes with human assistance, however, that figure leaves much to be answered. As my brother Adam B. would say, “This is a real brain buster!”

Adam B.   —   July 22, 2009 @ 10:42 am

Hey Speech-Heads,

As you are by now doubtlessly aware, my Speech Brother Eric B. and I are gearing up in a BIG way for SpeechTEK 2009.

This year’s SpeechTEK is being held in NYC at the Marriott Marquis in Times Square.  And just like everyone else here at The Home Office, we want SpeechTEK 2009 to be THE BEST EVER.  To that end, Eric B. and I have been meeting with our Editorial Team, tossing around ideas, pitching concepts, suggesting changes, talking shop, taking the 30,000 foot view, mind showering, and participating in brain dumps.

Some of our ideas have turned out great.

Some of our ideas…not so much.

So, in the interests of Full Disclosure, we present you with the following:

TOP TEN REJECTED IDEAS FOR SPEECHTEK 2009

10. VUI Designer Kissing Booth

9. “Pants Optional” Breakout Sessions

8. Welcome to SpeechTEK Key Party Keynote

7. Mandatory “Do A Shot Every Time Someone Uses an Acronym” Policy

6. CEO Dunk Tank

5. The First Annual IVRave

4. Free Tiger Cubs To The First 100 Registered Attendees

3. Conference Track Dedicated to Hearing Aid Technology

2. Economic Downturn Pancake Breakfast Buffet

1. Mergers and Acquisitions Petting Zoo

Adam B.   —   July 16, 2009 @ 9:43 am

Hey Speech-Heads,

If you’re anything like my Speech Brother Eric B., you’re already looking ahead to SpeechTEK 2009.  And if you’re already looking ahead to SpeechTEK 2009, that means you’ve probably got a case of Keynote Fever.

This year’s Conference is going to be better than ever, with a host of speakers and some really great Keynotes:

On Monday, August 24th, Paul Greenberg, president of The 56 Group, will deliver the keynote “Voice of the Customer.”

On Tuesday, August 25th, Jeffrey F. Rayport, founder and chairman of Marketspace, LLC, will deliver the keynote “Best Voice Forward.”

And, on Wednesday, August 26th, a special Keynote Panel will address the topic of SaaS in Speech: “The low-cost, high-value of the Software-as-a-Service (SaaS) model, which has already revolutionized the CRM market, is now making inroads into the speech industry. Hear what some of the leading SaaS speech technology vendors have to say about SaaS in the speech technology industry.”

Eric B.   —   July 15, 2009 @ 5:49 pm

"Holy mergers and acquisitions, Batman!"

Speech Heads, as you know if you caught my brother Adam B.’s article today, Nuance has acquired Jott for an undisclosed amount.

The deal is apparently a month old, and was only announced after a web page from Ackerly Partners, one of Jott’s investors, noted that the deal had been made in June. From there, according to Brier Dudley’s Seattle Times blog, the news burbled up on TechFlash.

The acquisition, however, signals that Nuance is serious about its place in the mobile space. As we’ve reported before in our review of Nuance’s voicemail-to-text offering, VM2T, and subsequent articles, the company’s entire business proposition is OEM and carrier-facing. Nuance has not made direct-to-consumer plays, letting its partners—many of them already big household names—face the public with their offerings. Jott, by contrast, makes direct-to-consumer bids.

Given all the fanfare about carrier deals from some of Nuance’s competitors, the acquisition of Jott has gotten some people thinking that this might be a shift in direction for Nuance. Not so, though, says Mike Thompson, senior vice president for Nuance Mobile.

“Our primary customers are operators, OEMs, and enterprise organizations. That’s who Nuance sells software applications and services to, and that will continue to be the highest priority,” he says.

He adds, however, that Nuance does “do consumer work for a variety of reasons in certain parts of our business. Being very close to consumers allows for rapid innovation and lots of interesting things that you can learn.”

He also asserts that the purchase of Jott is not a reactive gesture to happenings in the mobile market at all, praising its new property as being strong and innovative. Nuance has no plans to scrap Jott’s direct consumer customers, nor to have its strategy do an about-face. Rather, it plans to build on Jott’s strengths with its own.

“As a small start up, Jott’s strategy was selling direct to consumers,” writes Datamonitor associate analyst Aphrodite Brinsmead in an email to Speech Technology. “Nuance will continue to support and target customers directly but its key focus will be in gaining carrier relationships. Carriers have a large, diverse user base and the ability to bring speech-to-text to many new customers.”

She points to Jott’s offerings like Jott Assistant, which handles voice reminders, texts, emails, etc., as value that Jott brings to Nuance.

“Nuance will gain a stronger position against growing competitors, such as SpinVox and Google Voice, by adding extra features like these to its service,” Brinsmead says. “Nuance is ramping up its mobile portfolio and aims to automate all mobile interactions with speech.”

Adam B.   —   July 14, 2009 @ 10:20 am

Speech-Heads: The hour is nigh: SpeechTEK 2009 is fast approaching.  Now is The Quickening.

Here at Speech Tech HQ, my Speech Brother Eric B. and I are literally going mad with excitement: speakers, demos, STT, exhibitors, ASR, keynote addresses, TTS and much, much more.

Check out the Advance Program for all the speechy details.  I think I speak for all of us here when I say: MISS SPEECHTEK 2009 AND RISK TOTAL OSTRACISM FROM THE SPEECH WORLD.

Still not convinced?

Here is a Video to help get you amped up for SpeechTEK 2009:

Eric B.   —   July 1, 2009 @ 11:21 am

"I sentence this blog to a new post for failure to fully explore the matter!"

Hey, Speech Heads. If you caught yesterday’s post, an omitted sidebar from my article, you likely saw a response from Walt Tetschner. He felt that I had taken what he said out of context.

I wrote that he said he felt that technologists were the best judges of whether their art is infringing or something truly novel. After all, they are, at least ostensibly, masters of their craft. They should be in a position to know.

Walt, however, made a second point that we hadn’t included in the sidebar. He reproduced it in his response, but I’m going to quote it more visibly up here in the interest of fairness.

He wrote:

I really believe that using lawyers to do the search for prior art is a huge part of the reason that we have so many patents issued that are bogus. If the inventors did the search themselves they would be less likely to submit a fraudulent patent application. As I recommended in the article that I sent you, the way to reduce the number of fraudulent patents that are issued is to make knowingly submitting a fraudulent patent application equivalent to lying under oath.

What Walt’s arguments seem to suggest (and I’m sure he’ll correct me if I’m reading this wrong) is that the patent system should be driven by the community of inventors that make up a given art, and that to ensure it is, practitioners skilled in their art need to take more personal responsibility in pursuing their own patents.

When he talks about egregious examples of “fraudulent patents” or advocates for a more inventor-driven process, the sum total of his arguments suggests that bringing outside forces (read: lawyers) into the patent process has perverted it. The process has been taken away from a community of practitioners skilled in the art and turned into something else, perhaps a money mill for lawyers.

In the full interest of fairness to attorneys, I think many of them would argue that they are not outside forces; that a patent is a legal protection with legal jeopardy attached to it. They might say that a patent is, at best, an uncomfortable marriage of technology and law, and that they are inherently a part of the process, not an extraneous force. At least that’s what I read in Gregory A. Nelson’s comments. He suggests that if a patent is worth pursuing, it’s worth pursuing with professional guidance.

I certainly won’t weigh in on this personally. I am neither a technologist nor an attorney, but I think this is a really interesting debate. It gets to the core of not only what a patent is, but what it ought to be and the real-world constraints that impinge on the ideal.

When I wrote this sidebar, my hope was to engage readers in a debate about the fundamental core of patents. I think there is a very vital and interesting debate at the center of what Walt and Gregory are saying. In the original sidebar posted yesterday, because of space constraints for print, much of what I wanted to get into was cut. The discussion suffered, I’m sure.

In retrospect, I should have overhauled it for the blog. This is the perfect venue for a longer, more nuanced discussion. My apologies to both Walt and Gregory on that count. I am sincerely glad that Walt wrote back, though. I think this has become the perfect opportunity to open the floor for a wider discussion. I’d like to throw this back to Walt, Gregory, other technologists in speech, and the legal community. If you have anything you’d like to post, feel free to comment or write me directly. I’d like to make the blog a space to talk about this.

© 2008 - 2010 Speech Technology Media, a division of Information Today, Inc.