Earlier this year at WWDC, Apple announced that it will allow third-party developers to integrate with Siri – an intelligent assistant that interacts with users through voice. While Siri is not the only voice-activated assistant currently available, it is one of the most likely to be quickly integrated with enterprise software, given the high prevalence of Apple devices and custom apps in the enterprise.
With phones, tablets, and other portable voice assistants like Amazon Echo on the market, the presence of voice interactions in the business realm is no longer a futuristic ideal, but a natural evolution. Leaders in voice recognition predict that this emerging technology is the new touchscreen.
Current usage data support such claims. One company reports a threefold increase in voice commands over the last year, and a fourfold increase over the last two. Google voice search queries have grown 35-fold since 2008 and 7-fold since 2010. Mobile voice assistant usage in the US has also increased from 30 percent in 2013 to 65 percent in 2015.
Advances in voice recognition technology are undoubtedly behind the increased usage. As Andrew Ng puts it, “Ninety-five percent voice recognition accuracy is good, 99 percent accuracy is a game changer.” While technology is improving, users are also becoming more aware of its capabilities and transformative effects on daily tasks. Before we know it, voice interactions will be integral to business operations.
There are two major challenges to overcome before voice interactions can take hold in the enterprise.
The first is form. Developers don’t speak the same language as users. Production managers communicating with line production staff, oil rig workers sending photos of malfunctioning rigs, consumers messaging customer service – they all use domain-specific language and slang that developers are typically unaware of. Going out into the field and identifying how and what your users say when performing these activities is therefore critical to successfully integrating voice technology into the app.
The second challenge is function, specifically for SiriKit. SiriKit only works within a fixed set of “domains”: messaging, VoIP calling, payments, workouts, ride booking, and photo search. How do you fit enterprise applications that require more functionality, such as taking notes and submitting documents, into these domains? This is a big question for developers looking to add voice to their existing applications.
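One way to reason about this challenge is to plan which of your app’s actions can be expressed through an existing SiriKit domain, and which must stay in the conventional UI. The sketch below models that mapping in plain Swift; the action names and routing choices are hypothetical examples, not a prescribed design.

```swift
// Hypothetical enterprise actions mapped onto the closest SiriKit
// domain; actions with no reasonable fit fall back to the in-app UI.
enum SiriDomain {
    case messaging, voipCalling, payments, workouts, rideBooking, photoSearch
}

enum EnterpriseAction {
    case submitFieldNote, approveInvoice, callSupervisor, submitDocument
}

func siriDomain(for action: EnterpriseAction) -> SiriDomain? {
    switch action {
    case .submitFieldNote: return .messaging    // the note is "sent" as a message to a bot
    case .approveInvoice:  return .payments     // modeled as a payment confirmation
    case .callSupervisor:  return .voipCalling  // a direct fit
    case .submitDocument:  return nil           // no matching domain; keep it in the app UI
    }
}
```

The interesting cases are the indirect fits: note-taking has no domain of its own, but it can piggyback on messaging if the note is framed as a message to a bot – the same trick used in the meter-reading example below.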
The potential is there. For example, an industrial worker takes meter readings from a piece of equipment to determine whether it needs replacement. By speaking those readings to a chat bot through a SiriKit-integrated messaging feature of an application, the worker would get back an analysis of whether the equipment should be repaired or replaced.
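The bot’s decision step behind such an exchange can be very simple: compare the spoken reading against equipment tolerances. A minimal sketch, where the threshold values are illustrative only, not real equipment specifications:

```swift
// Sketch of a chat bot's tolerance check on a spoken pressure reading.
// The psi thresholds below are invented for illustration.
enum Recommendation: String {
    case noAction = "No action needed"
    case repair   = "Schedule a repair"
    case replace  = "Replace the equipment"
}

func recommendation(forPressure psi: Double) -> Recommendation {
    switch psi {
    case ..<80:    return .noAction  // assumed normal operating range
    case 80..<120: return .repair    // elevated; worth a maintenance visit
    default:       return .replace   // out of tolerance
    }
}
```

A reading of 95 psi, for instance, would come back as “Schedule a repair” – an answer the bot can return in the same messaging thread the worker spoke into.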
The potential for voice to have a serious impact makes these hurdles worth the investment to overcome. Here are two examples of how voice recognition technology will transform the enterprise:
- Voice interactions reduce time to perform tasks. Speaking a sentence is faster than typing it. Looking up information in a massive directory or finding and launching files are obvious examples of saved time. Imagine telling your computer to “find the Statement of Work for Client X from last June” and seeing the correct document on your screen a second later. This interaction finds the document the user intends without requiring a specific filename, and eliminates the clicks and open windows of navigating a directory tree. While the time saved on browser windows and clicks may seem negligible in an office setting, seconds and minutes shaved off pulling up a customer account at a service call center can convert to millions of dollars in annual savings.
- Voice interactions are the new on-the-job training. Consider an industrial worker performing maintenance on an oil well, hands occupied with gloves and tools. A smartphone synced with headphones and a microphone can read the worker step-by-step instructions for the task. The worker can also instruct the device to record actions taken, send notes to supervisors, fill out safety checklists, and file another work order on the equipment. This scenario takes advantage of voice interactions that completely free up the user’s hands and eyes for the primary task at hand – performing maintenance. A voice assistant that can walk through procedures step by step at the user’s pace, repeat instructions, or pull up additional information about an issue on the fly is the new on-the-job training for junior employees. Filling out checklists, scheduling follow-up maintenance, requesting additional help, ordering materials and supplies – all from the work site – will provide faster, much-improved transparency into what is happening on the ground in real time.
The scenarios presented here may have sounded like science fiction as little as 10 years ago. Yet recent advances in voice recognition technology are inspiring giants like Rolls Royce to invest in this new tool and harness its potential to improve how people work and drive business forward.
Nonetheless, as the technology advances, we must not lose sight of the users of these tools. A device that listens and speaks to the user, as technologically sophisticated as it is, can still fail its users. Like all technology, a product developed without identifying the users’ needs and context will see low adoption. View your users as a jury, and the performance and usability of the app – especially with a new feature like voice interactions – as the main argument. It needs to be convincing enough for users to overcome their initial reservations about new technology. Make sure that your investment in voice recognition includes user researchers and designers.