Speech Heads, I know we don’t usually blog here on Fridays, but I just caught wind of a text-to-speech flash greeting card generator even weirder than that vaguely horrifying thing that Nuance put together for Christmas:
The CareerBuilder Anonymous Tip Builder
The builder is CareerBuilder.com’s idea of a sick joke. It lets users “anonymously” send TTS tips to coworkers without them knowing who sent them. You can say things like, “Please do something about the stench that emanates from the billowy folds in your shirt,” or “Be sure to refill the coffee pot when you finish it off, you sower of human misery,” and they will never be the wiser about who sent it.
Your personalized message is delivered by one of four terror-inspiring avatars framed in office tableaus. Choose from this nipple-shirt guy you see featured here, or an alligator in a suit, or a floating zombie business woman, or a strange salesman playing the piano with his feet. From that selection, you’ll be whisked away to another menu where you can input whatever message you like. You can pick from three male voices or three female voices.
I’ve already sent one to my brother, Adam B., advising him that if he continues to read from my journal, unknown paid assailants may very well rend him limb from limb.
As for the app itself, we aren’t sure what speech provider powers the underlying capabilities of this little TTS dandy, but you better bet we’re working overtime to figure it out.
In the meantime, enjoy THIS little tip we’ve put together for you!
Well, the Moshi IVR Alarm Clock is still making Speech Waves.
The Wonder Foundation–founded by Stevie Wonder–and the Sendero Group this month awarded Moshi with the Vision Free Award at the Consumer Electronics Show.
The Vision Free Awards program recognizes companies that create user-friendly products and services designed with the visually impaired in mind.
The IVR Clock features technology that utilizes a unique hybrid model of Neural Net and HMM Phonetic Speaker Independent (SI) Recognition software. It allows for large vocabulary recognition that is highly accurate, noise robust and capable of accommodating a wide variety of accents. And, without any pre-programming, each Moshi IVR Alarm Clock can recognize every user’s voice right out of the box.
So go out and get yourself one today. I have one. And so does my Speech Brother Eric B.
It seems like the whole speech industry is just atitter with acquisitions and buyouts these days. The big are getting too-big-to-fail, and the small are getting sucked up like plankton through baleen. Heck, it’s not even the small these days. Just in the last two weeks, we saw Nuance gobbling up patents and licenses from IBM like a fat king on a turkey leg and SVOX gorging itself on Siemens’ speech unit.
Back in December, Roberto Pieraccini from SpeechCycle told me that mergers and acquisitions were happening so fast that even he couldn’t keep track of them.
So far, a lot of the action has been within the confines of the speech world, but all these acquisitions got me and my brother Adam B. thinking about East Asia. So often, Japan and Korea see unions of the unlikeliest businesses: Korea’s Lucky toothpaste company and Goldstar electronics firm merging to form LG; the Yamaha musical instrument company branching out into motorcycles to form the perplexing giant we all know today; or even good ol’ Nintendo, which sold card games in the 60s but branched out to run a chain of “love hotels” and a cab company.
We were wondering, what if the same happened in speech? What if speech just trounced all over sensible vertical market expansion? What would be some unlikely mergers we’d like to see? Dare we imagine? Yes. We dare.
Behold, Speech Heads! The 2009 Speech Technology Dream Team-Ups:
1.) Nuance merges with the New Balance shoe corporation to form Nuance Balance.
Nuance will be looking to expand its reach, and this one just makes the most horse sense in the world: sympathetic corporate cultures, practically rhyming names, the growing need for a top-rate, speech-enabled shoe corporation. Have you ever been running a cross-country meet and just felt an overwhelming compulsion to dictate your memoirs? Wish away fruitlessly no more. Nuance Balance has a solution for that.
2.) Avaya acquires Dairy Queen in a hostile takeover.
Avaya more or less lets Dairy Queen continue as its own separate brand, but begins incorporating free ice cream into its IVR call-routing systems.
Imagine this: a caller becomes frustrated with a system, unable to get the service he desires, but rather than being transferred to a domain’s underpaid operator who will likely hand him off to someone else in the domain who can’t help him, he is instead routed to a free and delicious DQ Blizzard (vanilla blended with Oreos)! Talk about tasty CRM; that caller has probably just forgotten the outstanding payment he was calling about in the first place. Banking error in this domain’s favor…
3.) Nexidia buys up the controlling shares of the WWE wrestling corporation.
You know those long-winded monologues wrestlers deliver before a big fight? The ones where they swear to break this, and smash that, and clothesline a fella so hard his ancestors will feel it in organs they weren’t aware they even had? Those little pep talks are all very theatrical and great fun (we know that), but there’s not a whole lot of accountability in them, is there? Who knows if the promised pile-driver Macho Man Randy Savage menaced on a Monday Night Raw is delivered to Nature Boy Ric Flair on a Tuesday Night Titans? Well, prepare for a new era of accountability.
In this dream match-up, Nexidia applies its video search tech on the fly to WWE events, tagging and tying the pain a wrestler guaranteed outside the ring to his actions in the ring. A ticker at the bottom of your screen lets you know in real time if wrestler Jimmy “The Mouth of the South” Hart is delivering on that smackdown he promised last week.
4.) SpeechCycle acquires the patent to the Foreman Grill.
Not satisfied with the mixed results of traditional consumer grilling, SpeechCycle decided that it was time to provide world consumers with the grilled food they’ve longed for, at least statistically speaking. Using their data-driven approach, the SpeechCycle Foreman Grill uses aggregated data to provide us with the median steak of our collective dreams.
Just put your dinner in and the grill does the rest. In order to ensure the best results, the device is constantly acquiring data based on a number of metrics. The grill is speech-enabled to recognize utterances like “mmm,” “tasty,” “delicious,” or “ugh,” “putrid,” and “This is the most foul meal I’ve known in all my years.” If you aren’t satisfied with your meal, don’t blame SpeechCycle. Blame the sum total of human desire.
5.) PerSay partners with the Cornell University Department of Animal Husbandry.
Looking to patch a number of glaring security problems (the recent theft of several heads of cattle; the spate of sabotage that has hit a number of Cornell’s beasts of burden, including a prize ox; and the vandalism of two dozen carrier pigeons), Cornell gives the Israeli biometric giant, PerSay, administrative control of its department.
PerSay overhauls Cornell’s stable of animals, limiting access to only authorized users who can pass their 96 percent effective voice verification process. After implementation, the department sees a drastic cut in its farm-crime rates; however, some problems do persist. University investigators find that the acts of sabotage were actually being carried out by Animal Husbandry faculty. An inside job! Arrests are made and the ringleader, Professor Newman Von Heidleborg, is prosecuted to the fullest extent of the law.
You may have read the recent news brief online at Speech Technology about Loquendo adding yet another voice–this time that of Mikko–to their already vast Text-To-Speech Family.
And while any Speech-Head worth her salt is already well aware of Loquendo’s TTS, many of us–my Speech Brother Eric B. included–are unaware of the Loquendo TTS Family Tree.
All told, Loquendo offers sixty-two different voices from an array of different countries in what amounts to a United Nations of Speech Technology.
And while we’re talking TTS, don’t forget to check out the TTS version of our daily News Features starring the aforementioned Allison–a feature that we recently expanded to let us deliver even more Speechified News.
We recently got a comment about the blog, asking us why we don’t cover big headline items in this space, like SVOX buying out the Siemens speech patents or Nuance buying IBM’s speech patents last week. We also got a comment asking why we insist on using the term “Speech Head,” and its author seemed to take some umbrage at the general tone here.
For the most part, the feedback we’ve gotten has been positive. In fact, this is the first time we’ve heard back from someone in the community taking issue with what we do on the blog.
With regard to the content of the blog, we see it as just one component in a much bigger enterprise. The main part of what we do is, of course, the print magazine itself. It’s the mothership from which all other work we do flows. In the monthly print version of Speech Technology, we take on bigger picture industry items. For our January/December issue, which should be hitting the streets and our main page any day now, I, for instance, wrote a feature on trends to expect in the industry in 2009. I spoke with several experts to get some perspective about where things are heading, with particular regard to how the recession is going to be affecting speech over the next 12 months: where there’s still room for growth, and where things are expected to move more slowly. I know my colleague, Adam, worked on a feature for that same issue concerning how to assess the need for speech.
We also have some in-depth articles about natural language (both in theory and practice) that you can expect to see in March.
In addition to our print mothership, we have the Speech Technology website where Adam, Lenny, and I all post daily about important news in the industry as it evolves. Both of the news items that our commenter noted, for instance, can be found there. Adam covered the Nuance/IBM story last week and actually worked on the Siemens story for today’s news. We’re constantly talking to vendors, analysts, and users for those stories and we work hard to bring you perspectives and content that go beyond what you might find elsewhere.
We try to keep the tone light, fun, and informal. Our use of the term “speech head” and even our half-baked catch phrases like “Get speechy with it” have been part of our earnest (if misguided; actually, definitely misguided, but true-hearted nevertheless) attempt at humor. We say “Speech Head” with all the love and affection in the world. Let it be known that Adam and I would be the first to self-apply the term. We’ve actually even been talking about making t-shirts for ourselves.
Really, the idea behind this blog is just to serve you. To get you, the reader, talking, moving, interacting. The blog is a much newer feature of the Speech Tech Media Empire and we’re still trying to figure out what the best thing to do with it is. If there’s something you want to see us doing, please comment. Let us know. We’ll be glad to take anything under consideration. We’re doing this for you.
So please, if there’s something you want or you just want to tell us how much you love the blog, post us some comments. Or email me at ebarkin@infotoday.com or Adam at aboretz@infotoday.com. Let’s get some discussion going, Speech Heads.
Speech Heads, I’m sure some of you out there are well familiar with today’s highlighted technology: The Talking Bottle Opener. Any one of you with a speechy sports fan in your life who loves knocking back a cold one while they watch the big game has doubtlessly seen one of these.
Personally, I can say that my father has a drawer full of these things in his kitchen. There’s the one that plays the University of Miami fight song; another that features Curly Howard (of Three Stooges fame) saying, “How about a beer? Nyuk, nyuk, nyuk”; one that has Homer Simpson saying, “Ooo, beer!”; and a number of other ones that I’ve more successfully suppressed from my memory.
My father, it should be noted, doesn’t even really drink. He’s one of these types that constantly tries to foist a bottle on his visitors without ever actually imbibing himself. Oftentimes, I suspect he’s just trying to create opportunities to use his varied speech-enabled openers and only stocks his refrigerator with boxes of Costco-bought “Beers of the World” to use them.
My dad is pretty much just a big speech head with a real flair for novelty. In fact, he is probably where I inherited my own insatiable love of speech tech from.
Back to today’s tech, though. The talking bottle opener is a pretty straightforward device. It takes advantage of the natural conductive properties of most bottle caps. The opener’s metal teeth are essentially a broken electrical circuit. When you press the opener to the cap, the circuit between the teeth is completed by the metal cap, thus activating the speech and making the handheld device issue forth its pre-programmed utterance.
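For the Speech Heads who think better in code, here is a toy Python model of that trigger. It is only a sketch of the idea, not any manufacturer's actual design: the `TalkingBottleOpener` class, its `update` method, and the `cap_is_conductive` flag are all names we made up, and we are assuming the phrase fires once per contact (when the circuit goes from open to closed) rather than looping for as long as the teeth touch the cap.

```python
class TalkingBottleOpener:
    """Toy model: the opener's two metal teeth form an open circuit."""

    def __init__(self, phrase):
        self.phrase = phrase          # the pre-programmed utterance
        self.circuit_closed = False   # open until a metal cap bridges the teeth
        self.spoken = []              # log of utterances, standing in for a speaker

    def update(self, touching_cap, cap_is_conductive=True):
        """Feed one contact reading; return the phrase if playback fires."""
        # The circuit only closes when the teeth touch a conductive cap.
        closed = touching_cap and cap_is_conductive
        # Assumption: speak only on the open -> closed transition.
        fired = closed and not self.circuit_closed
        self.circuit_closed = closed
        if fired:
            self.spoken.append(self.phrase)
            return self.phrase
        return None
```

Call `update(True)` on a fresh opener and you get the phrase back; call it again while the teeth are still on the cap and you get `None`, since the circuit was already closed.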
To the best of our knowledge at STB, none of the bottle openers have an onboard dynamic speech interface, though. Now, this may just be my pint of view, but it sure seems like some enterprising Speech Head could rig a beer opener with a TTS engine to make one, if she were so inclined.
Confession time, Speech Heads!
The reason I’m really writing this post is that I would like to solicit the speechmunity to produce a Speech Technology Blog talking bottle opener, one that uses my brother Adam B.’s own voice to utter some of his favorite catch phrases like “Get speechy with it!” or “That’s speechlarious!” or “This IVR is bustin’ my brains!”
Please, please, oh please. My birthday is in just a couple of weeks. Make my birthday wish come true, speech heads.
Delve Networks has just released a little app that uses their video search tech to navigate Barack Obama’s inauguration speech.
The engine is pretty keen. It pulls up all sorts of key terms like America, blood, dirt, and freedom: in short, everything we love in the U.S.A., my brother Adam B. notes.
Still, the engine has some limitations. It can’t pull up any word in the speech, just some key words that are, to the best of our knowledge, defined by Delve itself. You can’t, for instance, pull up every iteration of the word “and.” Aw shucks, right?
There is also at least one very glaring omission from the keywords. A commentator from Blog Le Monde complains that while you can search for terms like “Muslims” and “nation,” when he searched for “nonbelievers,” a word spoken for the first time on Tuesday in an inaugural speech, “the machine was [notably] silent.” I feel like there’s some played-out joke about French atheism in here, but Le Monde’s man has a point.
When I tried to replicate his results, I was unable to find the word “nonbeliever,” “non-believer,” the term “non believer,” or even just “believer.” I eventually found the passage in question by searching “non” by itself. The French paper’s blog seems to suggest there’s some foul play at hand, but at the very least it is a curious oversight.
See if you can find any other glaring lapses and let us know, Speech Heads.
As you know, with our Speech Technology Deadlines approaching, my Speech Brother Eric B. and I can’t bring you as many Speechy Updates as we’d like.
But, when I came across a remote control toy helicopter that obeys voice commands in both English and Japanese, I knew I was duty bound to report it to the Speechmunity.
Check out this link and this link for more details.
As you may know, the Higher Ups here at Speech Technology are still accepting speaking proposals for SpeechTEK 2009. The deadline was January 19th, but we’re extending it until next week.
I have been told that customer case studies are preferred. And I have been told that you should check out the link to our Call For Participation Page!
As my Speech Brother Eric B. would say: “Speech is power: speech is to persuade, to convert, to compel.”
Speech Heads, if you haven’t caught Channel 4’s Fonejacker, you are missing out on some speech hilarity. Fonejacker is the brainchild of British-Iranian comedian Kayvan Novak. Building on the rich cultural tradition of acts like The Jerky Boys, the show revolves around a slew of characters that Novak voices and uses to prank call unsuspecting Britons.
Of special note to us Speechies, Novak has a routine where he calls live operators and harangues them as if they were IVRs, reminding us of the pitfalls of the increasingly complicated natural language systems we love so well.
Talk about comedy gold! My brother, Adam B., says this stuff is so speech-larious that it makes milk come out his nose–and he can’t even drink milk. He’s heinously lactose intolerant!