Does Apple’s Siri Signal a Turning Point for Speech Recognition Services?
The release of Apple’s iPhone 4S, which comes loaded with the Siri ‘Assistant’ iOS app, is a boost for voice and text-to-speech-based operations, which until now were playing catch-up with messaging and text inputs. Voice Service Communication is definitely on the comeback trail.
A few weeks ago I read a status update on Facebook that went something like this: “…at the rate at which texting and chatting are taking over our lives, human voice will soon be extinct”. It was meant as a joke, a self-deprecating commentary on how obsessed we are all becoming with texting. But up until a few years ago, this status message would actually have seemed prophetic.
Consider this 2008 report by market research firm Nielsen: the number of voice calls being made had remained steady over the previous two years, but text messages sent and received had increased by a staggering 450 percent (Wired.com report). The trend only continued upward, with text messaging reaching a jaw-dropping average of 3,000 messages per month (Nielsen 2010 report). So, do we bid the vocal cords goodbye yet? Not so fast…
2011 came with its own set of startling revelations. First, the world watched with wonder as Watson defeated the human contestants on the American quiz show ‘Jeopardy!’. The supercomputer engineered by scientists at IBM was miles ahead of similar artificial intelligence programs, in that it could make sense of questions posed in everyday human language, a capability known as ‘natural language processing’.
We all know what a funny language English can be. People across the world speak it; however, each of us lends it our own expressions, style and phrasing. The phrase “I am feeling blue”, for instance, means “I am feeling sad / depressed / without energy”. Imagine, though, if a computer could understand the same phrase without relating “feeling blue” to the colour blue! That’s the breakthrough that IBM has made with Watson. It applies advanced natural language processing, information retrieval, knowledge representation and reasoning, and machine learning technologies to answering questions. It signals a landmark turning point in how we can interact and engage with computers in the coming decade. It’s like turning on your computer and telling it to start Word, type a note, open an email client, message someone on Facebook, plot a graph in MS Excel … and also giving it problem-solving scenarios by feeding it the right data. You get the picture?
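To make the “feeling blue” point concrete, here is a toy sketch in Python. It is not how Watson works (the source does not describe its internals); it only illustrates the general idea of checking for a known expression before reading the words one by one. The tables and function names are made up for this illustration.

```python
# Hypothetical, hand-written tables for illustration only. A real system
# would learn such associations from large amounts of text.
IDIOMS = {
    "feeling blue": "feeling sad",
    "piece of cake": "very easy",
}
COLOURS = {"blue", "red", "green"}


def literal_reading(sentence):
    """Naive word-by-word reading: report every colour word found."""
    words = sentence.lower().replace(".", "").split()
    return [word for word in words if word in COLOURS]


def idiom_aware_reading(sentence):
    """Swap known idioms for their plain meaning before reading the words."""
    text = sentence.lower()
    for phrase, meaning in IDIOMS.items():
        text = text.replace(phrase, meaning)
    return text


phrase = "I am feeling blue"
print(literal_reading(phrase))                       # the naive reading sees a colour
print(idiom_aware_reading(phrase))                   # the idiom is resolved first
print(literal_reading(idiom_aware_reading(phrase)))  # no colour left to misread
```

The naive reading latches onto the colour, while the idiom-aware reading resolves the expression first, so nothing is left to misinterpret.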
It is the perfect time to pay tribute not just to Steve Jobs, whom Steven Spielberg called the “greatest inventor since Thomas Edison”, but also to the father of modern communication, Alexander Graham Bell, the inventor behind the telephone, which actually made it possible for two people to talk to each other across a distance.
Voice recognition will soon come to be used in the aviation industry and in the defense and military sectors, where pilots can simply use voice commands to control a flight’s trajectory. If you want to give MTN a “piece of your mind” for bad service, a customer care “robot” will be more than happy to listen to you without feeling hurt or angry (Read: Will you treat your robot kindly?). In two decades or more, voice-based operations will also be used in the entertainment, hospitality and leisure industries, as well as in advertising, marketing and media. Governance, healthcare, education, and developmental fields won’t be far behind in employing this technology as the ‘service face’ of the organisation.
The Android Market also has a speech recognition app, VLingo Virtual Assistant, that seems similar to Siri, but at the moment the iPhone 4S has the edge. Why? Simply because no one can beat Apple at brand promotion and marketing. It can push sales of a product simply by promoting the ‘cool’ factor of a feature (no matter if the feature was allegedly inspired by a rival company).
Voice recognition also caters to two other market segments: first, persons with disabilities, such as those who are deaf or hearing impaired or who have partial to total vision loss; and second, senior citizens. People who have poor typing skills or poor written English skills can also take full advantage of ‘talking’ to the phone. It will be worth waiting to see whether Apple can bring non-English language recognition into Siri’s repertoire. The possibilities for access and inclusion are endless.
The term “voice-based services” doesn’t refer to “customer care” or voice calls alone. It refers to the specific Artificial Intelligence that makes it possible for a machine to interact with humans in an organic and natural way. No more “Select Option 1, 2 or 3”.
When Voice-Based A.I. becomes mainstream, it will allow corporations to spend less on hiring customer care executives and more on Research and Development, Engineering and Production, and Marketing.
Text will not replace talk or voice chat; rather, voice will augment messaging and other text-based services. I also envision future mobile communication devices incorporating an understanding of hand gestures (just as Microsoft’s Xbox does now), eye movements, thought-control, and other inputs beyond text and clicks. Do you think computers in the future will respond to all our instructions with “Your Wish is my Command!”? We wouldn’t be able to tell the difference between machine and genie, would we?