Why Voice Computers Always Fail: Unveiling the Limitations and Future
Voice-based computing has struggled to gain traction despite substantial investment and efforts by major players. Emerging companies are promising a transformative user experience, but practical concerns about convenience, privacy, and feasibility challenge the claim of voice AI as the main interface for general computing.
Voice-based Computing: A New Frontier
- Voice-based computing has faced significant challenges in recent years, with major assistants like Microsoft's Cortana and Amazon's Alexa struggling to gain traction.
- Amazon's Alexa reportedly incurred a staggering loss of $10 billion per year, exceeding even the annual revenue of a tech company like Airbnb.
- Despite substantial investment and a decade-long effort, consumer interest in voice-based platforms has been limited to basic tasks such as setting timers and controlling smart home devices.
- Google and Apple have also faced obstacles with their respective voice assistants, with Google experiencing a decline in momentum and Apple's Siri receiving few meaningful updates.
- However, companies like Humane and Meta are introducing a new generation of voice-based computers, equipped with advanced AI and computer vision capabilities, promising a transformative user experience.
- These next-generation devices aim to redefine the interface of the future, offering smarter, more flexible AI and enhanced interactions through advanced computer vision.
Voice AI and the Future of Personal Computing
- The development of voice-powered computing and always-on wearable devices has the potential to make a major difference, especially for older adults and people with impaired vision.
- Companies like Humane have positioned voice AI as the future of personal computing, touting their AI Pin device as a replacement for smartphones and the next form factor for general computing.
- However, the practical implementation and usability of such voice-based devices raise concerns about convenience, privacy, and the feasibility of using voice as the main interface for general computing.
- Key issues challenge the claim that voice AI will be the main interface for the next generation of computing: the AI Pin relies on a computer with a real screen for setup, entering sensitive information by voice or through a basic projector is inconvenient, and the device raises privacy concerns.
The Limitations of Voice-Only Device Usage
- Many essential smartphone apps, such as those for remote camera control, social media, and video streaming, rely heavily on visual interfaces and cannot be used effectively through voice commands alone.
- Tasks involving photography, banking, finance, productivity, map-based navigation, and online shopping are also extremely challenging, if not impossible, to perform solely through voice interactions.
- Complex activities like managing emails, messaging apps, and conducting video calls are impractical and unenjoyable without the visual component, making voice-only use highly limiting for communication purposes.
- The only areas where voice commands may work well are for listening to music, podcasts, and controlling smart home devices, as these activities are relatively straightforward and don't heavily rely on visual input.
Limitations of Voice Assistants as Interfaces for Smartphones
- Voice assistants have limitations in performing complex tasks such as managing Spotify playlists, troubleshooting devices without Wi-Fi, or handling various computing needs on smartphones.
- Speech as an interface is not suitable for the majority of computing needs due to fundamental shortcomings.
- Voice and audio communication are slow and one-way, limiting the delivery of information compared to the multi-lane, multi-directional interface of a screen.
- Most humans are not precise or coherent enough with their speech for it to serve as an efficient input method for tasks such as writing code or image editing.
- Meta's Ray-Ban smart glasses and Microsoft's recent work, both of which add voice capabilities alongside the smartphone rather than replacing it, could be potential answers to the limitations of voice assistants.
The Future of Voice-Controlled Devices
- Microsoft's HoloLens is a voice-controlled headset that incorporates Copilot AI to analyze objects in its surroundings, which is particularly useful in scenarios where hands-free operation is necessary, such as maintenance work.
- While voice-controlled devices like the HoloLens offer additional capabilities beyond smartphones, there are concerns about the need for a separate screen and input methods, which ultimately leads to a reinvention of the smartphone.
- The concept of natural interaction with generative AI through voice commands is intriguing, but claiming that it will replace smartphones entirely is impractical.
- Instead of investing in new voice-controlled devices, the recommendation for the upcoming holiday season is iFixit, which provides repair kits for existing gadgets, promotes sustainability, and empowers users to repair their devices with confidence.
- iFixit offers a range of toolkits for various repair needs, including replacement components for broken electronics, along with free manuals and repair guides, emphasizing the value of taking back control over our devices and supporting the right-to-repair movement.
Conclusion:
Voice-based computing has made significant strides, but its limitations in handling complex tasks and the practical problems of voice-based devices raise concerns. While there is potential for voice AI in personal computing, using it as a full replacement for screen-based interfaces is impractical. The future may hold solutions like Meta's Ray-Ban smart glasses and Microsoft's innovations, while repairing existing devices with iFixit, in keeping with the right-to-repair movement, presents a sustainable alternative for the upcoming holiday season.