
The Pros and Cons of Speech Recognition and Virtual Assistants

William Goddard
September 18, 2018

Pocket-sized digital devices with speech recognition and talking virtual assistants are becoming as commonplace as keyrings or Swiss Army knives used to be.


Speech recognition technology has come a long way since its mass-market introduction in the 1980s. For example, you no longer need to shout into the microphone, or spend two hours waiting for the software to compile a database of your voice patterns, before you can tell it to “Open Microsoft Word” – ten times, without effect.

Cloud-hosted databases now allow for on-the-fly recognition of commands and comments in conversational language. And the virtual assistants which respond to these commands have natural-sounding voices, with nuance and personality.

But the technology is still developing, and while there are clear advantages to using speech recognition apps and devices with virtual assistants, there are some downsides too.

We’ll begin by presenting the case for the defense.

Accessibility for People with Limited Mobility or Visual Impairments

For people with mobility restrictions or visual impairments, speech recognition has always been a big help, extending the accessibility of computer-based technologies to a much wider audience.

The proliferation of voice-responsive virtual assistants, capable of engaging with users in conversational language, has been a significant step forward. So, too, has the continuing integration of hardware, tools, and consumer goods with Internet of Things (IoT) technology and cloud resources.

The still-evolving technology of Artificial Intelligence (AI) has yet to achieve its promise in this area, but the potential for using adaptive learning and prescriptive or predictive analytics modules in virtual assistant packages for accessibility is staggering.

Co-ordination of IoT Devices

For the consumer market, the expansion of IoT has created entire new generations of (to varying degrees) smart objects, appliances, and accessories. Some are too tiny to even feature buttons, much less screens or keyboards. Others are designed to do jobs that don’t require manual activation or guidance.

Speech recognition is a practical and logical option which can enable users to control such objects from a distance, often via a connected virtual assistant device.

Commercial virtual assistants like Google Home and Amazon Echo already use standardized communication protocols, enabling them to interact with a range of IoT devices – in some cases simultaneously. Coupled with speech recognition, this makes them the ideal remote partners in managing our increasingly sophisticated smart homes.
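As a rough illustration of the idea above, the sketch below routes an already-recognized phrase to one or more connected devices at once. It is a toy example under stated assumptions, not how Google Home or Amazon Echo actually work: the SmartDevice and VoiceHub classes, the device names, and the simple "turn on/turn off" matching are all hypothetical, and the speech-to-text step is assumed to have happened already.

```python
from dataclasses import dataclass, field


@dataclass
class SmartDevice:
    """Hypothetical stand-in for a connected appliance."""
    name: str
    is_on: bool = False


@dataclass
class VoiceHub:
    """Hypothetical hub that maps recognized text to device actions."""
    devices: dict = field(default_factory=dict)

    def register(self, device):
        # Store devices under a lowercase key so matching is case-insensitive.
        self.devices[device.name.lower()] = device

    def handle(self, transcript):
        """Apply a 'turn on' or 'turn off' command to every device named in the text."""
        text = transcript.lower()
        if "turn on" in text:
            target_state = True
        elif "turn off" in text:
            target_state = False
        else:
            return []  # Not a command this sketch understands.

        changed = []
        for name, device in self.devices.items():
            if name in text:
                device.is_on = target_state
                changed.append(device.name)
        return changed


hub = VoiceHub()
hub.register(SmartDevice("Lamp"))
hub.register(SmartDevice("Heater"))
hub.register(SmartDevice("Kettle"))

# One spoken sentence can address several devices at the same time.
changed = hub.handle("Please turn on the lamp and the heater")
print(changed)
```

A real assistant would replace the substring matching with proper intent parsing, but the shape is the same: one spoken request, several devices updated.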

Adding Personality to Virtual Shopping

Convenience is undoubtedly the primary driving factor behind the rise of ecommerce. However, the sector has yet to figure out a way to make up for the lack of personal interaction consumers experience when selecting goods, asking questions, or making purchases. People naturally like to engage with a flesh and blood assistant or customer service agent at some point in the buying process.

Once again, AI is coming to the rescue – this time, in the form of virtual shopping assistants. The AI-powered shopping assistant learns about your tastes and interests while you shop online, and then uses this data to suggest products you might like to purchase, creating an online experience that’s more like shopping in a physical store.
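To make the "learns about your tastes" step a little more concrete, here is a deliberately simple sketch. It assumes each product carries a list of descriptive tags, builds a profile by counting the tags on items a shopper has viewed, and then ranks unseen items by how well their tags match. The catalog, tag names, and the build_profile and recommend functions are invented for illustration; commercial assistants are likely to use far richer models.

```python
from collections import Counter


def build_profile(viewed_items):
    """Count how often each tag appears among the items a shopper has viewed."""
    profile = Counter()
    for item in viewed_items:
        profile.update(item["tags"])
    return profile


def recommend(profile, catalog, already_seen, top_n=2):
    """Rank unseen items by how strongly their tags overlap with the profile."""
    scored = []
    for item in catalog:
        if item["name"] in already_seen:
            continue
        # A Counter returns 0 for tags the shopper has never shown interest in.
        score = sum(profile[tag] for tag in item["tags"])
        if score > 0:
            scored.append((score, item["name"]))
    # Highest score first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Hypothetical catalog, purely for illustration.
catalog = [
    {"name": "trail shoes", "tags": ["outdoor", "running"]},
    {"name": "rain jacket", "tags": ["outdoor", "hiking"]},
    {"name": "desk lamp", "tags": ["office"]},
    {"name": "running socks", "tags": ["running"]},
    {"name": "hiking poles", "tags": ["hiking", "outdoor"]},
]

viewed = [catalog[0], catalog[1]]
profile = build_profile(viewed)
suggestions = recommend(profile, catalog, {item["name"] for item in viewed})
print(suggestions)
```

Counting tags keeps the example easy to follow; the point is only that past browsing behavior becomes data that shapes what the assistant suggests next.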

Just as speech recognition has come a long way, natural voice technologies (which are typically modelled on recordings of the vocal patterns of real people) have made harsh, robotic-sounding voices the exception rather than the norm. As a result, the virtual assistants fielding queries on high-end commercial websites are becoming increasingly difficult to distinguish from human beings.

Reducing Our Dependence on Screens

When it was first introduced at consumer level, speech recognition was said to herald the death of the keyboard – but the technology at the time just wasn’t good enough.

However, today’s more advanced recognition algorithms do make the technology a viable alternative to hardware-based input devices like keyboards or touchscreens.

An interesting side effect of this is that speech recognition with interactive voice assistants can help reduce our dependence on screens – an obsession that’s become potentially unhealthy and of concern to any parent of a teenager with a tablet or mobile phone.

The defense rests. Now, sounding off for the prosecution, let’s look at some of the potential issues with virtual assistants and speech recognition.

A Reluctance to Comprehend

There’s a line from an old Sci-Fi series on television, where one of the main characters says, “She understands. She doesn’t comprehend.”

This fine distinction speaks to one of the biggest limitations of current speech recognition technology and the responses coded into the personalities of virtual assistants.

On the recognition side, there’s a delay when first using a system as it accustoms itself to your unique speech patterns. This might be anywhere from a few seconds to a few hours (or even days), depending on the sophistication of the underlying software, the capabilities of your hardware, and the strength and speed of your internet connection.

Recognition performance should improve over time. However, this is still dependent on those underlying factors.

An Inability to Hear

Reports suggest that there will be somewhere in the region of 24 billion IoT connected devices around the globe by 2020 – that’s about three devices for every person on the planet. The concern is that all the wearable gadgets, smart consumer appliances, intelligent vehicles and connected infrastructure will cause localized pools of interference, and a continuous struggle for bandwidth.

Factors like this may have a negative impact on the ability of speech recognition systems to access their databases and perform their primary function.

Another impediment which can be demonstrated in the here and now is the effect of extraneous noise on voice-activated systems. With the current state of technology, it’s not uncommon for the speech recognition system powering a smart home to pay equal attention to the people on the TV or radio as it does to the owner of the house.

Dumbing Down Offline

Smartphone users who’ve become accustomed to the speech recognition functionality of “OK Google” on the Android platform, or Apple’s Siri, will know that a fast and stable internet connection is essential to the operation of these virtual assistants.

Offline recognition generally isn’t considered a necessity in the design of these systems. Any support for it is usually bolted on as an afterthought and involves the downloading of an immense recognition file to the user’s device.

Given that fast, stable internet isn’t as common around the world as many developers might assume, this effectively excludes speech recognition and virtual assistants as viable options in many parts of the globe.

A Lack of Initiative

Finally, there are the issues of intelligence and initiative. Though much has been made of the potential of AI, adaptive learning, and prescriptive/predictive analytics in powering virtual assistants, the practical applications have yet to truly manifest.

For the moment, virtual assistants are largely reactive – sitting silent (or occasionally piping up with a pre-programmed conversational gambit) until the user issues a fresh command or query.

Though there’s a fine line to be drawn between making a system intrusive and having it behave in an advisory capacity based on past observations and incoming data, there’s certainly room for the next generation of virtual assistants to display more initiative if the technology is to truly become an integral part of domestic and business life.


William Goddard

William Goddard is the founder and Chief Motivator at IT Chronicles. His passion for anything remotely associated with IT and the value it delivers to the business through people and technology is almost like a sickness. He gets it! And wants the world to understand the value of being a technology focused business in a technological world.