voice assistant – Alan AI Blog

Productivity and ROI with in-app Assistants

asdivyansh — Mon, 21 Nov 2022 20:32:23 +0000

The world economy is clearly headed for “stormy waters”, and companies are bracing for a recession. Downturns always bring change and a great deal of uncertainty. How serious will the pending recession be – mild and short-lived or severe and prolonged? How can the business prepare and adapt?

When getting through hard times, some market players choose to be more cash-conservative and halt all new investment decisions. Others, on the contrary, believe the crisis is the best time to turn to new technology and opportunities.

What’s the right move?

A recession can be tough for a lot of things, but not for the customer experience (CX). Whether the moment is good or bad, CX teams have to keep the focus on internal and external SLAs, satisfaction scores, and churn reduction. In an economic slowdown, delighting customers and delivering an exceptional experience is even more crucial.

When in cost-cutting mode, CX departments find themselves under increasing pressure to do more with less. As before, existing systems and products require high-level support and training, and new solutions brought in-house add to the complexity, but scaling the team and hiring new resources is out of the question.

And this is where technology comes to the fore. To maintain flexibility and remain recession-proof, businesses have started looking towards AI-powered conversational assistants that can digitize and modernize the CX service.

Re-assessing investments in AI and ML

Over the last few years, investments in business automation, AI, and ML have been at the top of priority lists. Successful AI adoption brought significant benefits, high returns, and increased customer satisfaction. This worked during financially sound times – but now investments in AI/ML projects need to be reassessed.

There are several important things to consider:

Speed of adoption: for many companies, the main AI adoption challenge lies in the long timelines involved in project development and launch, which affect ROI. The longer the life cycle is, the more time it will take to start reaping the benefits from AI solutions – if they ever come through.
Ease of integration: an AI solution needs to be easily laid on top of existing IT systems so that the business can move forward, without suffering operational disruptions.
High accuracy level: in mission-critical industries where knowledge and data are highly nuanced, the terminology is complex, and the requirements for the dialog are stringent, accuracy is paramount. AI-powered assistants must be able to support contextual conversations and learn fast.
Personalized CX: to exceed customer expectations, the virtual assistant should provide human-like personalized conversations based on the user’s data.

Increasing productivity with voice and text in-app assistants

Alan AI enables enterprises to easily address business bottlenecks in productivity and knowledge sharing. In-app (IA) assistants built with the Alan AI Platform can be designed and implemented fast – in a matter of days – with no disruption to existing business systems and infrastructure.

Alan’s IA assistants are built on top of the existing applications, empowering customers to interact through voice, text, or both. IA assistants continuously learn from the organization’s data and its domain to become extremely accurate over time and leverage the application context to provide highly contextual, personalized conversations.

With both web and mobile deployment options, Alan AI assistants help businesses and customers with:

Always-on customer service: provide automated, first-class support with virtual agents available 24/7/365 and a self-help knowledge base; empower users to find answers to questions and learn from IA.
Resolving common issues without escalation: let IA resolve common issues immediately, without involving live agents from CX or support teams.
Onboarding and training: show the users how to complete tasks and find answers, guiding them through the application and updating visuals as the dialog is being held.
Personalized customer experience: build engaging customer experiences in a friendly conversational tone that becomes an integral part of the company’s brand.

Although it may seem counterintuitive, a recession can be a good time to increase customer satisfaction, reduce overhead, and achieve a robust ROI. So, consider investing in true AI and intelligence with voice and text IA assistants by Alan AI.

Restaurant labor shortage? A Voice Assistant can fill the gap.

Alan Team — Mon, 20 Jun 2022 00:17:33 +0000

At first it was the deadly buzz of the pandemic, and now it’s the thunder of markets crashing and lightning flashing an impending recession. Will restaurant owners have no relief? Restaurants were just opening up and attending to the happy crowds as COVID ebbed, when out of the blue the stock market nosedived. Complicating the state of affairs is a labor shortage in restaurants, and “Help Wanted” signs are way too common. Given the precarious climate, owners cannot pay their employees high wages, as profit margins have been squeezed for many restaurants. What do they do?

Voice Technology Helps Ease the Shortage

Voice automation is a cost-effective solution with long-term benefits. It creates back-end operational efficiency and lends a hand in front-end ordering, easing any lack of available labor. You can easily give back an hour to each employee with an intelligent voice assistant.

Backend Operational Efficiencies

A. Tasks: The voice assistant is a hands-free buddy who reminds an employee to log in, enter work schedules, complete tasks, adhere to special instructions, etc. When employees finish a task, they can just inform the voice assistant and the app will automatically check off the task. No need to take off gloves to touch and type task fulfillment in apps. And nothing beats a friendly voice prompt to remind employees to complete tasks in a timely manner. Operational efficiencies = cost savings. It is estimated that approximately 5 seconds are shaved off per task by using an intelligent voice assistant.

B. Order Fulfillment

Modern kitchens have Kitchen Display Systems with screens that can bring up orders according to priority, highlight special dietary requests, flag ad hoc changes, and showcase item inventory. With voice technology, employees no longer need to take time out to read the screen, or to re-engage with it after stepping away to get something, because the assistant audibly prompts them for order fulfillment. Employees can ask questions and get intelligent and accurate responses if they did not understand the prompt. The hands-free voice assistant enables every restaurant employee to save approximately 10 seconds per order fulfillment.
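To put the two back-end estimates (about 5 seconds saved per task and about 10 seconds per order) next to the earlier claim of giving back an hour to each employee, here is a minimal arithmetic sketch. The function names and the shift volumes (200 tasks, 150 orders) are hypothetical and chosen only for illustration; they do not come from any measured data.

```python
SECONDS_PER_HOUR = 3600

# Per-unit savings quoted in the text (approximate estimates).
SECONDS_SAVED_PER_TASK = 5
SECONDS_SAVED_PER_ORDER = 10


def units_to_save_an_hour(seconds_saved_per_unit: int) -> int:
    """Return how many tasks (or orders) it takes to save one full hour."""
    # Ceiling division using integers only.
    return -(-SECONDS_PER_HOUR // seconds_saved_per_unit)


def seconds_saved(tasks: int, orders: int) -> int:
    """Total seconds saved for a given number of tasks and orders."""
    return tasks * SECONDS_SAVED_PER_TASK + orders * SECONDS_SAVED_PER_ORDER


if __name__ == "__main__":
    print(units_to_save_an_hour(SECONDS_SAVED_PER_TASK))   # tasks per saved hour
    print(units_to_save_an_hour(SECONDS_SAVED_PER_ORDER))  # orders per saved hour

    # Hypothetical shift volumes, assumed purely for illustration.
    total = seconds_saved(tasks=200, orders=150)
    print(f"{total / 60:.1f} minutes saved")
```

Under these assumptions, an hour corresponds to 720 tasks or 360 orders, so the time recovered depends heavily on how many voice-assisted tasks and orders an employee actually handles.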

Front-end Operational Efficiencies

Adding voice tech to restaurant food ordering mobile or web apps saves precious employee time in taking the order, ensuring accuracy of the order, getting payment, etc. The customer orders on an app and can either get the food delivered to their home or office or pick it up from the restaurant premises. Lengthy menus and frequent changes to the restaurant food items are common, making voice user interfaces faster and more desirable than using a touch screen. Instead of touching and typing, swiping, and going through menu items, users can simply ask the app for their menu choice, exactly the way they want it, and get their items ordered in a few seconds, thus increasing their satisfaction.

Touchless restaurant kiosks facilitate a self-service, unhurried experience and reduce the potential health risks of touch screens. Industry calculations indicate that each drive-thru order costs $1.56, versus one penny for a voice-activated order. Kiosks have the potential to significantly increase each ticket size by prompting upsells and cross-sells, and making personalized menu recommendations based on historical buyer behavior.
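The per-order figures quoted above ($1.56 per drive-thru order versus one penny per voice-activated order) can be turned into a rough savings estimate. The sketch below works in whole cents to avoid floating-point rounding; the order volume of 1,000 and the helper names are assumptions made for illustration only.

```python
# Per-order costs quoted in the text, expressed in cents.
DRIVE_THRU_COST_CENTS = 156
VOICE_ORDER_COST_CENTS = 1


def savings_cents(orders: int) -> int:
    """Cost difference, in cents, if `orders` move from drive-thru to voice."""
    return orders * (DRIVE_THRU_COST_CENTS - VOICE_ORDER_COST_CENTS)


def format_dollars(cents: int) -> str:
    """Format a non-negative amount in cents as a dollar string."""
    return f"${cents // 100:,}.{cents % 100:02d}"


if __name__ == "__main__":
    # Hypothetical volume, assumed purely for illustration.
    print(format_dollars(savings_cents(1000)))
```

This only compares the two quoted per-order costs; it ignores setup, subscription, and hardware costs, which the source does not specify.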

Additional Benefits

Besides easing the labor shortage and reducing operational costs, voice technology can reduce food wastage and make for quicker, superior customer service.

If you are looking for a voice-based solution for your restaurant app, the team at Alan AI will be able to deliver exactly that. Email us at sales@alan.app

Alan AI has patent protections for its unique contextual Spoken Language Understanding (SLU) technology to accurately recognize and understand human voice within a given context. Alan’s SLU transcoder leverages the context to convert voice directly to meaning by using raw input from speech recognition services, imparting the accuracy required for mission-critical enterprise deployments and enabling human-like conversations, rather than robotic ones. Voice-based interactions, coupled with the ability to let users verify the entered details without having the system reiterate inputs, provide an unmatched end-user experience.

Ramco Systems partners with Alan AI to deploy Intelligent Voice Interfaces to 1K Enterprises

Alan Team — Fri, 08 Apr 2022 16:24:39 +0000



Bolstering its enterprise applications with intelligent voice interfaces

SUNNYVALE, CA 94582, USA, March 29, 2022 /EINPresswire.com/ —

Alan AI, a Silicon Valley company enabling the next generation of voice interfaces for Enterprise Apps, has announced a strategic partnership with Ramco Systems¹, a leading global cloud enterprise software provider, to embed intelligent voice interfaces for its enterprise offerings. In its initial stages of partnership, the organizations will primarily focus on building business use cases for the Aviation, Aerospace & Defense sector, followed by use cases for other industry verticals.

Ramco Systems offers an integrated and smart platform engineered to develop robust and scalable solutions, thereby offering a competitive edge to its end users. By embedding Alan AI’s voice interface, Ramco’s customers will be able to interact with their applications in natural human language and receive intelligent responses for daily workflows. Features such as accurate understanding of spoken language, synchronization of voice with existing graphical interfaces, and a hands-free app experience will truly delight the user from the very first interaction. The voice assistant will drive smoother app onboarding and higher user engagement, and scale adoption and loyalty.

Commenting on the partnership, Ramesh Sivasubramanian, Vice-President – Technology & Innovation, Ramco Systems, said, “Voice recognition is a maturing technology and has been witnessing huge adoption socially, in our day-to-day personal lives. However, its importance in enterprise software has been a real breakthrough and a result of multitudinous innovations. We are excited to enable clients with this voice user interface along with Alan AI, thereby ensuring a futuristic digital enterprise”.

“We are so excited to be able to help support Ramco’s applications and empower their customers with intelligent voice interfaces. Our advanced Voice AI Platform enables enterprises to deploy and manage intelligent and contextual voice interfaces for their Applications in days, not months/years,” said Blake Wheale, Chief Revenue Officer, Alan AI. “This partnership is a great testament of how voice can support a vision of a hands free, productive and safe environment for humans”.
Learn more about Alan AI

#VoiceAssistant #VoiceAI #RamcoSystems #AlanAI

Women Making History in Tech

Alan Team — Tue, 08 Mar 2022 05:01:00 +0000

At Alan AI, we care about making all technology easily accessible to everyone and aim to bridge this gap using voice AI. Accessibility is a value we hold in high regard and it starts within our company. We believe building a diverse team is critical in crafting our platform and covering AI ethics blindspots.

Internally, we want to do our part in changing the statistic that today, women hold 25% of jobs in the tech industry while making up half of the entire workforce. Externally, we want to praise those that have been the first to break barriers and encourage future generations that anything is possible.

Here are some of the many incredible women who have shaped or are currently shaping technology.

Ada Lovelace (1815 – 1852)

Widely regarded as the world’s first computer programmer, Ada Lovelace was a key contributor to the technological revolution. In 1842, Lovelace joined Charles Babbage’s work on the Analytical Engine by translating the lecture notes of an Italian engineer. During these nine months of intense analysis, she found many errors in the notes and expanded on them, leading to what is now considered the first algorithm intended for a computing machine.

Lovelace didn’t receive the recognition she deserved until a century later when her notes were republished in the 1950s. Following this, the U.S. Department of Defense named a programming language “Ada” in her honor.

Image: Wikipedia

Grace Hopper (1906 – 1992)

Grace Hopper was an American mathematician, teacher, U.S. Navy rear admiral and pioneer in developing computer technology. Her significant work consists of helping in the WWII efforts with the Harvard Mark I computer, inventing the first compiler to translate a programmer’s instructions into computer codes and paving the way for one of the first high-level programming languages, COBOL.

When she was awarded the National Medal of Technology in 1991, she said “If you ask me what accomplishment I’m most proud of, the answer would be all the young people I’ve trained over the years; that’s more important than writing the first compiler.” Hopper is remembered at the annual Grace Hopper Celebration, the world’s largest gathering of women technologists.

Image: Computer History Museum

Mary G. Ross (1908 – 2008)

Mary Golda Ross was the first known Native American female engineer. In 1942, she joined the Lockheed Corporation, an American aerospace company, as its first female engineer. She was one of the 40 founding members of the renowned and highly secretive Skunk Works project. Much of her research and writing remains classified, even today.

Mary G. Ross is featured on the US 2019 one dollar coin.

Image: transportationhistory.org

Evelyn Boyd Granville (1924 – )

Evelyn Boyd Granville is one of the first African American women to earn a Ph.D. in mathematics. After graduating from Yale and struggling to find a job due to racial discrimination, she accepted a teaching position at Fisk University in Nashville, Tennessee, where she taught two African American women who would go on to earn doctorates in mathematics. In 1956, she started working at IBM’s Aviation Space and Information Systems division on various projects for NASA’s Apollo space program, studying rocket trajectories and orbit computations.

After her years in government work, Granville returned to teaching mathematics at all levels. Today she is retired but continues to advocate for women’s education in technology.

Image: undark.org

Annie Easley (1933 – 2011)

Annie J. Easley was an American computer scientist, mathematician, and rocket scientist. After reading an article about twin sisters working as “human computers” at the National Advisory Committee for Aeronautics (NACA), she applied for a job the next day. In 1955, she started her 34-year career at NASA (previously known as NACA), doing computations for researchers by hand and then computer programming for important projects like the Centaur high-energy booster rocket and alternative systems to solve energy problems.

Easley also served as an Equal Employment Opportunity officer and was the founder and first President of the NASA Ski Club.

Image: Salon.com

Radia Perlman (1951 – )

The title of “Mother of the Internet” has been rightfully given to Radia Perlman, an MIT math graduate, computer programmer and network engineer. Her invention of the Spanning Tree Protocol (STP) was a major contributor to making today’s internet possible. Her most recent work has been on the TRILL protocol to correct some of the shortcomings of spanning trees.

She has given keynote speeches across the world and is currently employed at Dell EMC. When asked about diversity in STEM, Perlman replied, “The kind of diversity that I think really matters isn’t skin shade and body shape, but different ways of thinking.”

Image: eniac.hu

Marissa Mayer (1975 – )

Former Yahoo! CEO and early Google employee, Marissa Mayer is now a co-founder of Sunshine, focusing on artificial intelligence and consumer media.

After completing her studies at Stanford University, Mayer joined Google as its first female engineer at age 24. Her contributions during her time there include the design of the Google homepage, Gmail, Chrome, and Google Earth, and she was one of the three people who developed Google AdWords. From 2012 to 2017, she served as Yahoo!’s president and CEO.

Image: Martin Klimek — ZUMA Press/Alamy

Building the future

The future of technology is at the fingertips of today’s students, but the road to their success isn’t always an easy one. Here are three women-founded organizations set up to lift developers of all backgrounds, demographics, and skill levels.

Girls Who Code

Girls Who Code Founder and CEO Reshma Saujani is photographed with participants at the IAC building in New York, NY on July 25, 2018.



Working to close the gender gap in technology, Girls Who Code “envisions a world where women are proportionally represented as technical leaders, executives, founders, VCs, board members, and software engineers.” This non-profit organization was founded in 2012 by Reshma Saujani, an American lawyer and politician, who noticed a lack of girls in computer science classrooms while campaigning for the US Congress. Helped by bestsellers like “Girls Who Code: Learn to Code and Change the World” and the TED Talk “Teach girls bravery, not perfection”, viewed by thousands and sparking a worldwide conversation, Girls Who Code has today reached 500 million people and 300,000 girls in the USA, Canada, India, and the United Kingdom.

Image: Carey Wagner

Black Girls Code

“The great economic equalizer of our generation, the great revolution of this generation, is indeed technology. And by embedding these skills and abilities in our youth today, we can change the nation — one girl, one woman and one generation at a time.”
~ Kimberly Bryant, founder of Black Girls Code

After her daughter’s disappointing experience at a male-dominated computer camp, Kimberly Bryant decided to build an environment that encourages girls, especially from underrepresented communities, to pursue careers in STEM. This led to the creation of Black Girls Code, a non-profit organization that provides African American youth with programming skills through community outreach programs such as workshops and after-school programs. Since 2011, BGC has served over 200,000 students and has the ultimate goal of teaching 1 million girls how to code by 2040.

Image: Black Girls Code

CodeNewbie

Started by fellow coder Saron Yitbarek as a weekly Twitter chat to connect people who are learning to code, CodeNewbie has since grown into a supportive, international online community of people learning to code and supporting one another’s coding journeys, with Twitter chats every Wednesday at 9 PM EST.

Saron Yitbarek also hosts several podcasts, such as CodeNewbie, featuring stories and interviews about new developers transitioning into tech careers and joining developer communities.

Image: freecodecamp.org

Happy International Women’s Day!

Inspired by the women you see here? Get started with Alan AI and build your own AI powered voice interface today.

Voice Apps for Covid-19 Contactless Paradigm

Alan Team — Mon, 24 Jan 2022 23:04:51 +0000

Technologies make our society resilient in the face of a Black Swan event like the Covid-19 pandemic. Some of these technologies might even have a long-lasting impact beyond Covid-19. One technology that has been hugely beneficial during this time of chaos and uncertainty is voice-enabled apps. According to the Adobe Voice Survey, a majority of users found that voice interfaces made their lives faster, easier, and more convenient. 77% of the respondents in the survey said that they were planning to increase their usage of voice technology in the next 12 months.

Thanks to voice AI’s ability to understand natural language, its adoption will keep increasing and there will be more use cases added to its repertoire. Voice user interfaces are great for companies in certain industries: finance, education, healthcare, technology and more.

Voice Technology Creates Safer Alternatives in Times of Covid

The Covid-19 pandemic has forced us to be wary of touching anything in a public place for the fear of contracting (and spreading) the virus, as new variants like Omicron emerge. AI-powered voice technology has enabled a contactless ecosystem by providing safer alternatives to get things done. Booking a medical appointment or checking one’s bank balance might not have been activities that consumers would have used voice technology for earlier, but that is fast changing.

Intelligent voice interfaces provide frictionless, contactless, responsive and predictive interactions. They change the way we access information and how we navigate between the physical and digital worlds. According to a study by Juniper Research, 52% of voice interface users said in 2021 that they used them several times a day or nearly every day. You can imagine that the numbers will only keep increasing.

In mobile applications, voice AI reduces the complexity of navigation, increases conversion, offers greater convenience, and boosts engagement.

Voice AI has Enabled a Contactless Paradigm

Voice-enabled apps have come to the forefront during the pandemic in many areas, for example:

A significant drop in cash payments. Voice and touch-and-type apps are leading the way in paying for goods and services.
In Quick Service Restaurants (QSR), voice-enabled kiosks offer hands-free ordering.
The hospitality industry uses voice-enabled kiosks not only to check customers in and out, but also to control in-room amenities for guests.
Voice shopping makes advanced filtering easy, as customers don’t have to navigate through complicated menus. The voice shopping market is expected to reach $40 billion this year.
From virtual health guides to real-time medication reminders, voice AI has multiple use cases in healthcare.
In the education sector, voice AI can be used to conduct online viva exams, authenticate access to learning materials, act as a smart campus assistant, and so on.

The pandemic has influenced a shift in the mindset and habits of consumers. They have been forced to adopt technology that promotes contactless experiences — to mitigate chances of getting infected from touching surfaces. To survive and thrive beyond the pandemic, brands should enable voice-based technologies to latch on to new consumer behavior and provide a superior user experience.

Wrapping Up

Consumers are using voice technology more than ever in the pandemic, and expect it to be a standard for all digital experiences. Voice is far easier to use, and advancements in voice technology give quick, accurate results. With the massive societal and economic shifts that have occurred because of the pandemic, our lives, both personal and professional, will continue on the path of tectonic changes – and voice AI is going to be a huge part of it.

If you are looking to add AI-powered voice apps to your business, get in touch with the team at Alan AI. The Alan Platform can help you create voice-enabled apps in a matter of days.

Write to them at sales@alan.app.

How Voice Assistants Increase Revenue And Usability of eCommerce Apps

Alan Team — Thu, 29 Oct 2020 11:36:24 +0000

If you’ve been trying hard to boost your revenue and enhance the usability of ecommerce apps, voice assistants are here to help you out.

Before we understand how voice assistants shape the ecommerce industry — what is voice commerce and how is it related to voice assistants?

What is Voice Commerce?

Voice commerce is the use of voice recognition technology to enable users to interact with ecommerce websites and applications to search, get support, and purchase products just by using their voice. Voice commerce is growing fast: according to Juniper Research, it is expected to reach 8 billion devices by 2023, up from 1.5 billion devices now. So if you’ve been thinking about adding a voice assistant to your ecommerce store, now is the perfect time. You’re not too late!

What’s more, the general market awareness related to voice technology is particularly high. According to a report by PwC, only 10% of surveyed respondents were unaware of voice-enabled devices and products. On the other hand, 90% of the aware respondents had used a voice assistant. Widespread adoption of voice assistants is being driven by younger consumers and households.

That said, businesses are reaping the benefits from the mainstream adoption of voice assistants in various ways.

How Voice Assistants Drive Business Outcomes

Business Cost Savings

If you believe that implementing voice assistants in your ecommerce store is going to be a hefty expenditure, you might need to reconsider your views. Yes, you may need to invest a bigger amount upfront, but considering the gains it brings a few years down the line – the amount you are investing is almost nothing.

As a matter of fact, the return on investment for voice assistants in apps is considerable. First, there are low maintenance costs. The easiest route is to go for a third-party stand-alone voice assistant. You pay the provider on a subscription basis, and all the maintenance is their headache.

Secondly, as voice assistants are going mainstream, they attract better leads and close more sales. Users get to shop even when they are out somewhere – driving or meeting someone. They only need to instruct the voice assistant to place an order for XYZ, and that’s all – no scrolling, browsing, and tapping required.

This pretty much explains why consumer spending via voice assistants will reach 18% by 2022.

Higher Customer Satisfaction

Believe us when we say that voice assistants make way for better customer satisfaction. Consumers get personalized attention and real-time responses, just the same way they would if they were to shop in a brick-and-mortar store. All this in the comfort of their home.

A voice assistant reduces the time to buy considerably. According to Bing, searching with your voice is about 3.7 times faster than typing. Google has the same views as well. It revealed that 70% of searches that happen on Google Assistant are in natural language.

What’s more, a voice assistant not only helps you serve a better user experience, but you also get your hands on critical data points that can be further used to enhance your services. Considering that 40% of adults use voice search once daily, it’s easy to see what kind of data you can gather by adding a voice assistant to your ecommerce store.

Savings On Support Costs

Having a voice assistant means having a customer service team 24/7. A voice assistant provides automated customer support to your users at a lower cost. It can take care of most of their queries, delivering faster responses and speeding up resolution time.

This is why 93% of consumers are satisfied with the services provided by their voice assistants. Further, around 50% of consumers feel organized, 45% feel informed, and 37% feel happy with the help of these voice assistants.

Conclusion

Voice assistants are a necessity to stay ahead of the competition and deliver a best in class user experience. A voice assistant not only helps you bring down costs, but it also enhances customer satisfaction and improves the performance of your customer support team.

Go ahead, and add a voice assistant to your app. How, you ask? Alan AI is here to help. Alan is a conversational voice AI platform that simplifies the entire process of adding a voice assistant to your application. Contact us to learn more about our services and how we could help you realize your goals.

Voice AI Hackathon: World’s first Hackathon for Voice-enabled Applications

Alan Team — Thu, 02 Jul 2020 15:36:55 +0000

Have you ever wanted to develop voice-enabled mobile or web applications?
Well, you’re in the right place!

Alan AI is hosting its first virtual hackathon about Voice AI and inviting developers worldwide to take part! With the Voice AI Hackathon, we are challenging developers (as individuals or teams of 3) to integrate a voice assistant into their new or existing applications or other open-source apps through our conversational voice platform. All you need is basic JavaScript knowledge and the determination to win the TOP PRIZE of $500!

Participants can choose to voice-embed any application — from gaming, social networking, food delivery, or any other industry — the opportunities are endless.

Project submissions are due on July 15, 2020 at 11:59 PM (PST) and can be submitted as a website link, an App Store/Play Store link, or photo proof that the app is in the process of being published to the App Store/Play Store. Alan developers and mentors will be ready to assist participants with any questions or concerns throughout their progress. The top three submissions will win cash prizes and be featured on our Alan platforms.

Ready to sign up? Fill out our sign up form

Learn more about the hackathon on our official Hackathon Website

We are excited to see what you build with the Alan Platform!

Alan presenting at VOICE Global 2020: Multimodal Voice Assistants

Alan Team — Fri, 12 Jun 2020 17:05:00 +0000

Alan AI is a Startup Sponsor for VOICE Global 2020, where James Shelburne, Senior Product Manager at Alan, will give a presentation on Multimodal Voice Assistants.

Learn how Multimodal Voice Assistants are transforming industries with an integrated visual and voice experience — an experience where users can switch between touch and voice. You’ll see how these revolutionary experiences provide enterprise and consumer value and strategic differentiation, and gain insights into the ROI of voice from our partner use cases.

Register for free at voicesummit.ai/global and join us on June 17 at 11:00 AM PST on the VG6 channel.

Search for “Multimodal Voice Assistants: a revolutionary experience transforming industries” in the agenda search bar or click here after registering.

Top 10 Hands-Free Apps for Android 2020

Alan Team — Mon, 27 Apr 2020 13:59:57 +0000

Forward-looking businesses are starting to explore the possibilities of introducing voice control into their applications. Therefore, we are seeing a noticeable increase in Android apps with voice-operated software that provide a hands-free experience.

We’ve gathered some of the most popular and useful hands-free apps for Android to see what they can offer and why other businesses should be heading in that direction as well.

The term “hands-free” refers to equipment or software that requires limited or no use of hands. One of the most popular ways to access controls for hands-free apps is through voice. The main goal is to make sure all users can use features within the app – regardless of their ability to physically operate the device.

Voice is being integrated into all kinds of devices, and it’s reshaping the usual state of things. Here are a few reasons why making your application hands-free is a good idea, business-wise and in general:

Convenience – Hands-free apps can be used anywhere: while driving, doing chores around the house, carrying things, or when you’re simply far away from the device.
Accessibility – These apps can be operated by people with limited hand mobility, those who are visually impaired, and other groups in need of assistive technology.
Time efficiency – In many situations, making a quick call takes less time than typing a lengthy message and waiting for a response. The same principle applies to voice control; it requires no clicks, no typing, or any other time-consuming actions.
Simplicity – Users don’t have to be familiar with the interface to handle it. Unlike traditional apps, you hardly need any computer literacy or technical skills.
Multi-use – Voice control isn’t strictly tied to one function. This kind of software is incredibly versatile in terms of potential applications.

Hands-free technology is particularly useful in countries where it’s illegal to use a handheld mobile phone when you drive. These laws have been adopted in many jurisdictions around the world, which gave developers another incentive to develop the technology.

The market of hands-free applications is an interesting space right now. Let’s look at the best offerings available in the Play Store for Android users.

1. Google Assistant

Google Assistant is considered an undisputed champion of personal assistant apps developed for Android. Although it may not work on every device, the coverage is extensive. In addition to running the app on your phone, you can also integrate with smart devices such as Philips Hue lights.

The assistant can run basic functions like making calls, sending texts and emails, setting alarms and reminders, etc. On top of that, you can look up weather reports and news updates, run web searches, and play music. The range of features is constantly getting updated and expanded.

The company states the app was originally designed for people with disabilities and conditions like Parkinson’s and multiple sclerosis. However, it should come in useful for anyone who’s multitasking or has their hands full. To activate Google Assistant, users need to say “OK Google,” and it will be all ears.

2. Amazon Alexa

Amazon Alexa has pushed the trend of endless integration with many emerging smart home devices to the forefront. Contrary to popular belief, this service runs not only on Amazon Echo but also on mobile devices.

Alexa for Android is mostly used to control integrated devices. But the functionality also supports web searches, playing music, and even ordering deliveries. If you want to launch the hands-free app, say “Alexa” and it will be ready to hear commands whether the screen is on or off.

The device restrictions are by far the biggest downside of Amazon Alexa. So far, there is a limited number of mobile phones supporting this system. However, in terms of its abilities and intelligence, it rightly occupies the top of the list.

3. Bixby

Bixby is a relatively new addition, but it is already among the best. It’s important to mention that it’s only compatible with Samsung devices. The company may be looking into other platforms, but at this point, it only runs on devices and appliances connected to Samsung’s proprietary hub.

The app can accomplish a variety of tasks – from sending text messages and responding to basic questions to activating other applications in the device (dialer, settings menus, camera app, contacts list, and gallery).

One of the greatest benefits of Bixby is that it adapts to the user’s voice and manner of speaking. From the get-go, it can understand different request variations like “Show me today’s weather,” “What’s the weather like?” or “What’s the forecast for today?” and it only gets smarter with time.

4. Dragon

Powered by Nuance, which is the technology behind Siri, Dragon Mobile has been in operation for many years. Essential functionality includes dictating emails, checking traffic and weather, sharing your location, and a lot more.

There are also many customizable features aimed at simplifying how you live, work, and spend leisure time – all while minimizing touch-based interactions. Users can add their unique and personalized Nuance Voiceprint. Then, voice biometrics will only let a designated user talk and ask questions.

You can also set your own wake-up word. Unlike other services, this one gives you options to launch it with “Hi, Dragon”, “What’s up,” or anything else you like. The company is working on adding languages other than English, as well as support for the international market.

5. Hound

While the apps described above cover the most widely used basic functionalities, Hound goes a step further. Along with doing simple searches, it can accomplish advanced tasks such as hotel booking, a sing/hum music search, looking up stocks, or even calculating a mortgage. On the lighter side, you can play interactive games like Hangman.

The company launched partnerships with Yelp and Uber to make features like getting restaurant information and hailing a ride more precise. Another interesting feature is that it can translate whole sentences practically in real-time.

This speech-based app is only available for United States residents. However, the process of getting the app out of beta and ready for public consumption was pretty quick, so we may see some international development. Also, there are still occasional bugs within the app.

6. Robin

Robin has been around for a while as one of the original “Siri alternatives”. Like its counterparts, the app supports calling, sending messages, and providing the latest information on the weather, news, and more. However, the functionality still needs some work.

Intentionally or not, a lot of features available on Robin are related to car use. For example, it offers GPS navigation, gives live traffic updates, and shows the prices for gas directly on the map. You can even specify what kind of gas you need, and it will guide you towards the closest station.

To call the app into action, you can tap on the microphone button, say “Robin,” or just wave hello twice in front of your phone (which is quite a unique innovation).

7. AIVC

AIVC stands for Artificial Intelligent Voice Control. It comes in two versions: free, which contains a number of ads, and Pro. The former option covers basic functionality, whereas the Pro one provides some appealing features like TV-Receiver control, wake up mode, and others. You can control devices that are accessible over a web interface with your own preset commands.

As far as voice commands go, the app gives you the option to define specific phrases to invoke a certain action. This is done to minimize the risk of the app not understanding what you want.

AIVC performs actions on other websites and services so you can compose emails, make Facebook posts, or move over to a navigation app.

8. DataBot

DataBot is one of the simpler Android personal assistants. You can play around with it, ask for jokes and riddles, or do other goofy stuff, but it can actually be pretty useful for various tasks. You can ask the bot to make searches online, schedule events, and make calls by just using your voice.

It is a cross-platform application so you can sync it across all your devices: smartphones, tablets, and laptops. That way, you get a coherent, all-around hands-free experience. Also, DataBot gains experience while you’re using it.

A slight inconvenience that DataBot has is that it comes with ads and in-app purchases. If you aren’t bothered by that, it should be a good addition to your daily routine.

9. Car Dashdroid

Car Dashdroid includes everything you could possibly need while driving – navigation, music, contacts, messages, voice commands, and more. It is also integrated with popular messaging apps like WhatsApp, Telegram, and Facebook Messenger.

What makes this app stand out as a specifically car-oriented solution is that it comes with a compass, speedometer, and plenty of other features.

There are also customization blocks that help you arrange all tasks based on their priority. For example, if you mostly use the app for navigation, you can put it at the top. Then, you can place music control below navigation, and the list of frequently contacted people at the bottom.

10. Drivemode

Drivemode is a simple app meant to assist users while they’re driving. Users can select their preferred navigation app (for example, Google Maps, Waze, or HERE Maps). You can also input favorite destinations (such as home, work, and so on), play music from multiple supported apps, and access messages in a low-distraction “driving mode” overlay with audio prompts.

Even though it’s not entirely hands-free, there is a function that presents shortcuts that you can access through tapping or swiping. Drivemode can also be integrated with Google Assistant, so the functionality can potentially be extended way beyond driving assistance.

Integrating a Hands-Free Experience with Alan

Voice AI offers immense benefits for businesses – from completing tasks more quickly to offering better user experience with verbal communication. You can add unique voice conversations, no matter the industry you’re in. The Alan platform allows you to implement hands-free, interactive functionality in your existing application with ease.

Alan Studio Walkthrough: Part 1

Alan Team — Fri, 08 Nov 2019 01:12:34 +0000

Part 1

This is the first in a three-part series on how to get started with the Alan Platform.

If you would like to follow along with this tutorial yourself, all the files necessary will be available on our GitHub, and you can also follow along using this video tutorial!

To begin, visit https://studio.alan.app/register to create your Alan Studio account. Once you create your account and verify your email, it will direct you to the main project page, so let’s take a look!

Project Page

Once you log in, Alan will direct you to: https://studio.alan.app/projects

From here, there are many important things to note.

Tutorial: In our Menu Bar up top, you will see a button labeled “Tutorial”. This will take you to https://alan.app/blog/docs/intro.html, where you can start with our documentation and learn how to integrate your script on any platform.
Create New Project: Click this button to start a new project quickly and easily.
Billing: On the top right of our menu bar, you will also see your monthly charge as well as how many free interactions you have left.
Menu Dropdown: This dropdown has quick shortcuts to our documentation, billing, and settings page.
Current Projects: The majority of this page will be taken up with cards that display your current project as well as quick analytics.

Creating our first project

Now that we are familiar with our project page, let’s create our first sample project!

Go ahead and click “Create New Project”. For this tutorial, we are going to name our project “Food Ordering”.

Scripting UI

Our Scripting page is the main page where you will do all of your scripting and project work. It is divided into five main sections:

The menu bar at the top
Our Scripts Navigation pane on the left
Our Script Development Window in the middle
The Debugging Pane on the right
The Logs bar featuring all input/output phrases and unrecognized phrases.

Script Basics

For this tutorial, we are going to be focusing on creating a fully voice enabled Food Ordering application. You will notice that the Script Development window is prompting us to create a new script, so let’s go ahead and add one now.

Click the “Create New Script” button and we will add a predefined script template called “Food_Ordering”.

Quick Tip: Go through our predefined scripts to learn more about the features of Alan and generate new script ideas!

Once you add your new script, you will see it open in our main window. The Source Code for this application is also available in our GitHub so you can download and follow along.

Let’s try out this script by clicking on the Alan Button and saying, “Order two pepperoni pizzas”.

From here, we can see how Alan associates our keywords with:

An intent on line 296:

intent(`(add|I want|order|get|and|) $(NUMBER) $(ITEM ${ITEMS_INTENT})`,

And a response on line 351:

p.play(answer);

A sample with more details on the Answer function is found on line 320.

If you look in the debugging chat, you can see the actual instructions that are being sent to the application in order to achieve commands.
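To make the pairing of an intent and a response more concrete, here is a minimal stand-alone sketch in plain JavaScript. It does not use the Alan runtime or the real Food_Ordering template: the names ITEMS, NUMBER_WORDS, matchOrder and answerFor, the item list, and the reply wording are all hypothetical, and the regular expression is only a simplified approximation of the alternatives and slots in the intent pattern shown above.

```javascript
// Hypothetical item list (an assumption, not the template's real data).
const ITEMS = ["pizza", "burger", "soda"];

// Number words this sketch is willing to recognize.
const NUMBER_WORDS = { one: 1, two: 2, three: 3, four: 4, five: 5 };

// Optional lead-in phrase, then a number slot, then free text that should contain an item.
const ORDER_PATTERN =
  /^(?:(?:add|i want|order|get|and)\s+)?(\d+|one|two|three|four|five)\s+(.+)$/i;

// Returns { quantity, item } when the utterance matches, otherwise null.
function matchOrder(utterance) {
  const match = ORDER_PATTERN.exec(utterance.trim());
  if (!match) return null;
  const rawNumber = match[1].toLowerCase();
  const quantity = NUMBER_WORDS[rawNumber] ?? Number(rawNumber);
  const phrase = match[2].toLowerCase();
  // Look for a known item anywhere in the rest of the phrase.
  const item = ITEMS.find((name) => phrase.includes(name));
  return item ? { quantity, item } : null;
}

// Builds the spoken reply; in an Alan script this is roughly where p.play(answer) would be used.
function answerFor(order) {
  return order
    ? `Adding ${order.quantity} ${order.item} to your order`
    : "Sorry, I did not catch that";
}

console.log(answerFor(matchOrder("Order two pepperoni pizzas")));
```

The point of the sketch is the split between recognizing what was said (the pattern and its slots) and deciding what to say back, which is the same split you see between the intent and the response in the script.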

Now that we have created our first project and understand the basics of Voice Scripts, we’ll give you some time to play around with your project and adjust the scripts as you wish. We’ll see you in the next blog post where we will discuss more about customizing scripts, version control, development stages, and logs.

What is a voice assistant?

Alan Team — Fri, 25 Oct 2019 16:58:00 +0000

A voice assistant is a digital assistant that uses voice recognition, language processing algorithms, and voice synthesis to listen to specific voice commands and return relevant information or perform specific functions as requested by the user.

Based on specific commands, sometimes called intents, spoken by the user, voice assistants can return relevant information by listening for specific keywords and filtering out the ambient noise.

While voice assistants can be completely software based and able to integrate into most devices, some assistants are designed specifically for single device applications, such as the Amazon Alexa Wall Clock.

Today, voice assistants are integrated into many of the devices we use on a daily basis, such as cell phones, computers, and smart speakers. Because of their wide array of integrations, several voice assistants offer a very specific feature set, while others are open-ended to help with almost any situation at hand.

History of voice assistants

Voice assistants have a very long history that actually goes back almost 100 years, which might seem surprising as apps such as Siri have only been released within the past ten years.

The very first voice-activated product, Radio Rex, was released in 1922. The toy was very simple: a toy dog stayed inside a dog house until the user exclaimed its name, “Rex”, at which point it jumped out of the house. This was all done by an electromagnet tuned to a frequency similar to that of the vowel in the word “Rex”, and it predated modern computers by over 20 years.

In 1952, Bell Labs announced Audrey, the Automatic Digit Recognizer. It was not a small, simple device, however: its casing stood six feet tall just to house all the materials required to recognize ten digits!

IBM began its long history of voice assistants in 1962 at the World’s Fair in Seattle, when IBM Shoebox was announced. This device was able to recognize the digits 0-9 and six simple commands such as “plus” and “minus”, so it could be used as a simple calculator. Named for its size, similar to that of the average shoebox, it contained a microphone connected to three audio filters, which matched the electrical frequencies of what was being said against values already assigned to each digit.

DARPA then funded five years of speech recognition R&D in 1971, known as the Speech Understanding Research (SUR) Program. One of the biggest innovations to come out of this was Carnegie Mellon’s Harpy, which was capable of understanding over 1,000 words.

The next decade led to amazing progress and research in the speech recognition field, taking most voice recognition devices from understanding a few hundred words to understanding thousands, and slowly bringing them into consumers’ homes.

Then, in 1990, Dragon Dictate was introduced to consumers’ homes for the shocking price of $9,000! This was the first consumer-oriented speech recognition program designed for home PCs. The user could dictate to the computer one word at a time, pausing after each word and waiting for the computer to process it before moving on. Seven years later, Dragon NaturallySpeaking was released. It brought more natural conversation, understanding continuous speech at a maximum of 100 words per minute, and a much lower price tag of $695.

In 1994, Simon by IBM was the first smart voice assistant. Simon was a PDA, and really, the first smartphone in history, considering it predates HTC’s Droid by practically 15 years!

In 2008, when Android was first released, Google slowly started rolling out voice search for its Google mobile apps on various platforms, and a dedicated Google Voice Search application followed in 2011. This led to more and more advanced features, and eventually to Google Now and Google Assistant.

Siri followed in 2010. Developed by SRI International with speech recognition provided by Nuance Communications, the original app was released in 2010 on the iOS App Store and was acquired two months later by Apple. Then, with the release of the iPhone 4S, Siri was officially released as an integrated voice assistant within iOS. Since then, Siri has made its way to every Apple device available and has linked all the devices together in a single ecosystem.

Shortly after Siri was first developed, IBM Watson was announced publicly in 2011. Watson was named after the founder of IBM, and was originally conceived in 2006 to beat humans at a game of Jeopardy. Now, Watson is one of the most intelligent, naturally speaking computer systems available.

Amazon Alexa was then announced in 2014. Its name was inspired by the Library of Alexandria and chosen partly for the hard consonant “X”, which helps with more accurate voice recognition. Alongside Alexa, the Echo line of smart devices was announced to bring smart integration to consumers’ homes inexpensively.

Alan was finally publicly announced in 2017 to take the Enterprise Application world by storm. First born as “Synqq”, Alan was created by the minds behind “Qik”, the very first video messaging and conferencing mobile app. Alan is the first voice AI platform aimed at enterprise applications, so while it can be found in many consumer applications, it is designed for enterprises to be able to develop and integrate quickly and efficiently!

At the bottom of the post we’ve included a Timeline to summarize the history of voice assistants!

Technology behind Voice Assistants

Voice assistants use Artificial Intelligence and Voice recognition to accurately and efficiently deliver the result that the user is looking for. While it may seem simple to ask a computer to set a timer, the technology behind it is fascinating.

Voice Recognition

Voice recognition works by taking an analog signal from a user’s voice and turning it into a digital signal. The computer then takes the digital signal and attempts to match it to words and phrases in order to recognize the user’s intent. To do this, the computer requires a database of pre-existing words and syllables in a given language against which it can closely match the digital signal. Checking the input signal against this database is known as pattern recognition, and it is the primary force behind voice recognition.
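As a rough illustration of that matching step, here is a toy sketch in plain JavaScript. It is not how production recognizers work (they rely on trained acoustic and language models); it only shows the idea of digitizing a signal and picking the closest stored pattern. The TEMPLATES values, the three-number “features”, and the function names are all made up for this example.

```javascript
// Toy "database" of reference patterns, one per known word (made-up numbers).
const TEMPLATES = {
  yes: [0.9, 0.1, 0.4],
  no: [0.2, 0.8, 0.5],
  stop: [0.5, 0.5, 0.9],
};

// Turn a continuous value in [0, 1] into one of `levels` evenly spaced discrete steps.
function quantize(value, levels = 16) {
  const clamped = Math.min(1, Math.max(0, value));
  return Math.round(clamped * (levels - 1)) / (levels - 1);
}

// Squared Euclidean distance between two equally long vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Digitize the input, then return the word whose template is closest to it.
function recognize(analogFeatures) {
  const digital = analogFeatures.map((value) => quantize(value));
  let best = null;
  for (const [word, template] of Object.entries(TEMPLATES)) {
    const distance = squaredDistance(digital, template);
    if (best === null || distance < best.distance) best = { word, distance };
  }
  return best.word;
}

console.log(recognize([0.85, 0.15, 0.35]));
```

Picking the nearest template is the simplest possible form of pattern recognition; it is used here only because it fits in a few lines.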

Artificial Intelligence

Artificial intelligence is using machines to simulate and replicate human intelligence.

In 1950, Alan Turing (the namesake of our company) published his paper “Computing Machinery and Intelligence”, which first asked the question: can machines think? Alan Turing then went on to develop the Turing Test, a method of evaluating a computer to test its capability of thinking like a human. Four approaches were later developed to define AI: thinking humanly, thinking rationally, acting humanly, and acting rationally. While the first two deal with reasoning, the second two deal with actual behavior. Modern AI is typically seen as a computer system designed to accomplish tasks that typically require human interaction. These systems can improve upon themselves using a process known as machine learning.

Machine Learning

Machine learning refers to the subset of Artificial Intelligence in which programs are created without human coders writing them out by hand. Instead of writing out the complete program on their own, programmers give the AI “patterns” to recognize and learn from, and then give it large amounts of data to sift through and study. So instead of having specific rules to abide by, the AI searches for patterns within this data and uses them to improve its existing functions. One way machine learning can be helpful for Voice AI is by feeding the algorithm hours of speech in various accents and dialects, so that it learns to recognize them.

While traditional programs require an input and rules to develop an output, machine learning tools are given an input and an output and use them to create the program itself. There are two approaches to machine learning: supervised learning and unsupervised learning. In supervised learning, the model is given data that is already partly labeled, which means some of the data will already be tagged with the correct answer. This helps guide the model in categorizing the rest of the data and developing a correct algorithm. In unsupervised learning, none of the data is labeled, so it is up to the model to find the pattern correctly. This is very useful because it allows the model to find patterns that the creators might never have found on their own, but the results are much more unpredictable.
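The idea of handing over inputs and outputs and letting the tool derive the “program” can be sketched with a very small supervised learner. The example below is a nearest-centroid classifier in plain JavaScript, chosen only because it is short; it is not a claim about how any particular voice assistant is trained. The feature values, the “short” and “long” labels, and the function names are invented for illustration.

```javascript
// Learn one average point (centroid) per label from labeled examples.
function train(examples) {
  const sums = {};
  for (const { features, label } of examples) {
    if (!sums[label]) sums[label] = { total: features.map(() => 0), count: 0 };
    features.forEach((value, i) => {
      sums[label].total[i] += value;
    });
    sums[label].count += 1;
  }
  const centroids = {};
  for (const [label, { total, count }] of Object.entries(sums)) {
    centroids[label] = total.map((value) => value / count);
  }
  return centroids;
}

// Predict the label whose centroid is closest to the new input.
function predict(centroids, features) {
  let bestLabel = null;
  let bestDistance = Infinity;
  for (const [label, centroid] of Object.entries(centroids)) {
    const distance = centroid.reduce((sum, c, i) => sum + (c - features[i]) ** 2, 0);
    if (distance < bestDistance) {
      bestDistance = distance;
      bestLabel = label;
    }
  }
  return bestLabel;
}

// Made-up labeled data: each example already carries the "correct answer".
const labeled = [
  { features: [1.0, 1.2], label: "short" },
  { features: [0.8, 1.0], label: "short" },
  { features: [3.0, 3.2], label: "long" },
  { features: [3.4, 2.8], label: "long" },
];

const model = train(labeled);
console.log(predict(model, [0.9, 1.1]));
```

Nobody wrote a rule such as “values below 2 mean short”; the boundary comes out of the labeled examples. An unsupervised method would receive the same points without the label field and would have to discover the two groups on its own.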

Different Voice Assistant approaches

Many conversational assistants today combine both a task-oriented and a knowledge-oriented workflow to carry out almost any task that a user can throw at them. A task-oriented workflow might include filling out a form, while a knowledge-oriented workflow includes answering what the capital of a state might be or giving the technical specifications of a product.

Task-oriented approach

A task-oriented approach uses goals and tasks to achieve what the user needs. This approach often integrates itself with other apps to help complete tasks. For example, if you were to ask a voice assistant to set an alarm for 3PM, it would understand this to be a task request and communicate with your default Clock application to open and set an alarm for 3PM. It would then communicate with the app to see if anything else was necessary, such as a name for the alarm, and then communicate this need back to you. This approach does not require an extensive online database, as it mainly uses the knowledge and existing skills of other installed applications.
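The alarm example can be sketched as a small task handler in plain JavaScript. Everything here is hypothetical: clockApp stands in for whatever clock application the device provides, and the phrases the parser accepts are deliberately narrow. The sketch only shows the flow described above: recognize the task, check what is still missing, and either ask the user or hand the task to the app.

```javascript
// Hypothetical stand-in for the device's clock app (an assumption, not a real API).
const clockApp = {
  alarms: [],
  setAlarm(hour24, label) {
    this.alarms.push({ hour24, label });
    return `Alarm "${label}" set for ${hour24}:00`;
  },
};

// Extract the hour and an optional name from phrases like
// "Set an alarm for 3PM called nap". Returns null for anything else.
function parseAlarmRequest(utterance) {
  const match = /set an alarm for (\d{1,2})\s*(am|pm)/i.exec(utterance);
  if (!match) return null;
  const hour12 = Number(match[1]) % 12;
  const hour24 = match[2].toLowerCase() === "pm" ? hour12 + 12 : hour12;
  const labelMatch = /called (.+)$/i.exec(utterance);
  return { hour24, label: labelMatch ? labelMatch[1].trim() : null };
}

// Decide whether to ask a follow-up question or pass the task to the app.
function handle(utterance) {
  const task = parseAlarmRequest(utterance);
  if (!task) return "Sorry, I can only set alarms in this sketch";
  if (!task.label) return "What should I call this alarm?";
  return clockApp.setAlarm(task.hour24, task.label);
}

console.log(handle("Set an alarm for 3PM"));
console.log(handle("Set an alarm for 3PM called nap"));
```

Note that no online lookup is involved: the handler only needs to understand the request and know which installed app can carry it out.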

Knowledge-oriented approach

A knowledge-oriented approach uses analytical data to help users with their tasks. This approach focuses on using online databases and already recorded knowledge to help complete tasks. For example, any time a user asks for an internet search, the assistant uses the online databases available to return relevant results and recommends the highest-ranked one. Looking up the answer to a trivia question is also a knowledge-oriented task, as the assistant is searching for data instead of working with other apps to complete it.
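A knowledge-oriented lookup can be sketched in the same spirit. In the plain JavaScript below, KNOWLEDGE_BASE is a tiny in-memory stand-in for an online database, and ranking by the number of shared words is a deliberately naive assumption; real assistants use far more sophisticated retrieval. The function names are hypothetical.

```javascript
// Tiny in-memory stand-in for an online knowledge source.
const KNOWLEDGE_BASE = [
  { question: "capital of california", answer: "Sacramento" },
  { question: "capital of texas", answer: "Austin" },
  { question: "capital of new york", answer: "Albany" },
];

// Lowercase the text and split it into alphabetic words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) ?? [];
}

// Score each entry by how many of its words appear in the query, best first.
function search(query) {
  const queryWords = new Set(tokenize(query));
  return KNOWLEDGE_BASE
    .map((entry) => ({
      entry,
      score: tokenize(entry.question).filter((word) => queryWords.has(word)).length,
    }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score);
}

// Recommend the highest-ranked result, or admit that nothing was found.
function answer(query) {
  const [top] = search(query);
  return top ? top.entry.answer : "I could not find that";
}

console.log(answer("What is the capital of Texas?"));
```

In contrast to the task-oriented case, nothing is handed to another app here: the whole job is finding and ranking recorded knowledge.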

Benefits of Voice Assistants

Some examples of what a Voice Assistant can do include:

Check the weather
Turn on/off connected smart devices
Search databases

One of the main reasons for the growing popularity of Voice User Interfaces (VUI) is the growing complexity of mobile software without an increase in screen size, which puts the GUI (Graphical User Interface) at a huge disadvantage. As more iterations of phones come out, screen sizes stay relatively the same, leading to very cramped interfaces and frustrating user experiences, which is why more and more developers are switching to Voice User Interfaces.

Efficiency and Safety

While typing has become much faster as people have gotten used to standard keyboards, using your voice is usually quicker and much more natural, and it leads to fewer spelling errors. This leads to a much more efficient and natural workflow.

Quick learning curve

One of the greatest benefits of voice assistants is a quick learning curve. Instead of having to learn how to use devices like mice and touch screens and get used to using specific physical devices, you can just use your natural conversation tendencies and use your voice.

Wider Device Integration

Since a screen or keyboard isn’t necessary, it’s easy to place voice integration into a much wider array of devices. In the future, smart glasses, furniture, and appliances will all come with voice assistants already integrated into the device.

Why and When to use Voice Assistants

There are many use cases for a voice assistant in today’s world, for example when your hands are full and you are unable to use a touch screen or keyboard, or when you are driving. Let’s say you are driving and you need to change your music: you could just ask a voice assistant, “Play my driving playlist”. This leads to a safer driving experience and helps avoid the risk of distracted driving.

User Interfaces

To further understand voice assistants, it is important to look at the overall user experience: what a user interface is, and how a VUI differs from the more traditional graphical user interface that modern apps currently use.

Graphical User Interface (GUI)

A Graphical User Interface is what is most commonly used today. For example, the internet browser you’re using to read this article is a graphical user interface. Using graphical icons and visual indicators, the user is able to interact with machines quicker and easier than before.

A Graphical User Interface can be used in something like a chatbot, where the user communicates with the device over text and the machine responds with natural conversational text. The big downside is that, since everything is done in text, it can seem cumbersome and inefficient, and can take longer than voice in certain situations.

Voice User Interface (VUI)

An example of a VUI is something like Siri, where there is an audio cue that the device is listening, followed by a verbal response.

Most apps today combine elements of both Graphical and Voice User Interfaces. For example, when using a maps application, you can use voice to search for destinations and the application will show you the most relevant results, placing the most important information at the top of the screen.

Some examples of popular smart assistants today are Alan, Amazon Alexa, Siri by Apple, and Google Assistant.

Popular Voice Assistants

Voice Assistant adoption by platform, from Voicebot.ai

Siri

Siri is the most popular voice assistant today. Created by SRI International and purchased by Apple in 2010, Siri has quickly become an integral part of the Apple ecosystem, bringing all the Apple devices and applications together to use in tandem with one another.

Alexa

Created by Amazon in 2014, Alexa was named after the Library of Alexandria. Alexa was originally inspired by the conversational voice system found on board the U.S.S. Enterprise in Star Trek. Alexa was released alongside the Amazon Echo, a smart speaker intended to let consumers dive into the world of home automation. The Echo uses the Alexa platform to let users interact with the Amazon ecosystem and connect a plethora of smart devices.

Google Assistant

Originally unveiled in 2016, Google Assistant was the spiritual successor of Google Now, with the main improvement being the addition of two-way conversations. Where Google Now would return answers in the form of a search results page on Google, Google Assistant gives answers in the form of natural sentences and returns recommendations in the form of Feature cards.

Cortana

Under development since 2009, Cortana by Microsoft represents one of the longest-running visions of giving people access to voice assistants in their daily lives. Microsoft began shipping Cortana with all Windows 10 and Xbox devices, leading to a huge increase in the number of registered Cortana users. In 2018 it was reported that Cortana had over 800 million users.

Alan

In 2017 Alan set out to take voice assistants to the next level, by enabling voice AI for all applications. Using domain specific language models and contextual understanding, Alan is focused on creating a new generation of Enterprise Voice AI applications. By using the Alan Platform, developers are able to take control of voice, and create an effective workflow that best fits their users with the help of vocal commands.

Future of Voice Assistants

As AI becomes more advanced and voice technology becomes more accepted, voice-controlled digital assistants will not only become more natural, they will also become integrated into more everyday devices. Conversations will also become much more natural, emulating human conversations, which will begin to introduce more complex task flows. More and more people are using voice assistants too: it was estimated in early 2019 that 111.8 million people in the US would use a voice assistant at least monthly, up 9.5% from the previous year.

Further Integration

In the future, devices will be more integrated with voice, and it will become easier and easier to search using voice. For example, Amazon has already released a wall clock that comes enabled with Amazon Alexa, so you can ask it to set a timer or tell you the time. While these devices aren’t full blown voice activated personal assistants, they still show a lot of promise in the coming years. Using vocal commands, we will be able to work with our devices just by talking.

Natural Conversations

Currently, as users are getting more used to using voice to communicate with their digital devices, conversations can seem very broken and awkward. But in the future, as digital processing becomes quicker and people become more accustomed to using voice assistants in their everyday devices, we will see a shift where users won’t have to pause and wait for the voice assistant to catch up, and instead we will be able to have natural conversations with our voice assistants, creating a more soothing and natural experience.

More complex task flows

As conversations with voice assistants become more natural and voice recognition and digital processing become quicker, it won’t be uncommon to see users begin to adopt more advanced tasks in their daily routines with voice assistants. For example, instead of asking a voice assistant how long a commute is, and then asking about different options, you might be more inclined to say, “If Uber is quicker than taking the bus to work, can you reserve an Uber ride from home to work, and how long will it take?”

How to make your own voice assistant

As the number of publicly available voice assistants grows, tools for creating your own are beginning to appear, making it as easy as possible to get a voice assistant that fits your needs!

For example, if you just want to create a specific skill, or command, for a voice assistant, it might be more efficient to add that skill to an existing voice assistant, such as Alexa.

Amazon has made it simple to add your own command to the fast-growing set of publicly available Alexa Skills. You can log in to the Alexa developer console with the same Amazon account your Echo is linked to and use the tools there to create a free Alexa Skill!

Alternatively, using Alan Studio, the completely browser-based Voice AI IDE, you can develop, test, and push voice integration straight from your browser.

Why Alan?

Alan is a highly customizable Voice AI platform designed to work with any pre-existing application. It is built with enterprise use in mind, so security and business functionality are top priorities. You can leverage visual and voice context to support any workflow and improve efficiency today, and since Alan is a completely browser-based IDE, you can edit your scripts on the go whenever the need arises. Long gone are the days of creating multiple versions of scripts to run on each platform. With Alan, you can use a single script version and embed it into any app: iOS, Android, or Web. You can sign up today for Alan Studio and see how you can create an AI voice assistant solution to improve your quality of life!

Voice Assistant Timeline

1922 – First voice-activated consumer product hits store shelves as “Radio Rex”
1952 – Audrey, or the Automatic Digit Recognition Machine, is announced
1962 – IBM Shoebox is shown for the first time at the State Fair
1971 – DARPA funds five years of speech recognition research and development
1976 – Harpy is shown at Carnegie Mellon
1984 – IBM releases “Tangora”, the first voice-activated typewriter
1990 – Dragon Dictate is released
1994 – Simon by IBM is the first modern voice assistant released
2010 – Siri is released as an app on the iOS app store
2011 – IBM Watson is released
2012 – Google Now is released
2014 – Amazon Alexa and Echo are released
2015 – Microsoft Cortana is released
2017 – Alan is developed and released with the Alan Platform

From Voicebot.ai

This article was reposted at dev.to here:
https://dev.to/alanvoiceai/what-is-a-voice-assistant-492p

What is a Voice User Interface (VUI)?

Alan Team — Wed, 25 Sep 2019 16:56:00 +0000

A Voice User Interface (VUI) enables users to interact with a device or application using spoken voice commands. VUIs give users complete control of technology hands-free, often without even having to look at the device. A combination of Artificial Intelligence (AI) technologies is used to build VUIs, including Automatic Speech Recognition, Named Entity Recognition, and Speech Synthesis, among others. VUIs can live either in devices or inside applications. The backend infrastructure, including the AI technologies used to create the VUI’s speech components, is often hosted in a public or private cloud where the user’s speech is processed. In the cloud, AI components determine the intent of the user and return a response to the device or application where the user is interacting with the VUI.

Well-known VUIs include Amazon Alexa, Apple Siri, Google Assistant, Samsung Bixby, Yandex Alisa, and Microsoft Cortana. For the best user experience, VUIs are accompanied by visuals from a Graphical User Interface and by additional sound effects. Each VUI today has its own sound effects that let users know when the VUI is active, listening, processing speech, or responding. The benefits of VUIs include hands-free accessibility, productivity, and a better customer experience, all of which will change how the world interacts with artificial intelligence.

The Creation of VUI

Audrey

The first traces of VUI date back to 1952 and the first speech recognition system, a device called Audrey. Invented by K.H. Davis, R. Biddulph and S. Balashek, Audrey was known as the “automatic digit recognizer” because it could recognize the digits 0 through 9. Although Audrey’s skill was limited to digits, it was seen as a technological breakthrough. Audrey was also not a small device like those seen today: it stood 6 feet tall and relied on a large and rather complicated analog circuit system.

Audrey already followed an input and output procedure similar to that of modern VUI devices. First, a speaker recited one or more digits into a telephone, pausing for about 350 milliseconds between each word. Next, Audrey listened to the input and sorted the speech sounds and patterns to work out which digits were spoken. Audrey then responded visibly by flashing a light, much as modern VUI devices do.

Although Audrey could distinguish the digits, it could not understand every voice or speaking style and responded reliably only to a familiar speaker. Unlike modern VUI devices, Audrey needed a familiar speaker to maintain 97 percent digit recognition accuracy. With a few other designated speakers, accuracy was 70-80 percent, and it was far lower with speakers it was unfamiliar with. Why was Audrey created in the first place if manual push-button dialling was cheaper and easier to work with? Recognized speech requires less bandwidth (fewer frequencies for transmitting a signal) than the original sound waves in a telephone, so recognition promised to reduce the amount of data traveling through the wires and to benefit future technology.

Tangora

After Audrey, the next major advance in voice technology came in 1971, when the U.S. Department of Defense’s research agency funded five years of a Speech Understanding Research program. Its goal was to reach a minimum vocabulary of 1,000 words with the help of companies such as IBM. In the 1980s, IBM built a voice-activated typewriter called Tangora, which was capable of understanding and handling a 20,000-word vocabulary. Today, voice-activated typing has evolved to the point where it is used on smartphones to send a text or write a research paper in a matter of moments.

Over time, computer technology advanced to the point where VUI, Graphical User Interface (GUI), and User Experience (UX) design all fit into a small device that sits in the palm of a hand. Even GUI and UX are becoming old news due to the quick adoption of voice-only devices that no longer use these features. Speech recognition technology went from understanding 10 digits to millions of phrases and words from any voice. This advancement was made possible by speech technologies such as Automatic Speech Recognition, Named Entity Recognition, and Speech Synthesis.

Technology used to create a VUI

A range of Artificial Intelligence technologies is used to create VUIs, including Automatic Speech Recognition, Named Entity Recognition, and Speech Synthesis.

Automatic Speech Recognition

Automatic Speech Recognition (ASR) is a technology used to analyze human speech and convert it into text. For a given audio input, ASR has to filter out distracting acoustic noise and pick out the human speech. Distortions in the audio and unstable streaming connectivity can make this a challenge. Several underlying technologies have been tested and used to build ASR, including Gaussian mixture models (a probabilistic model) and deep learning with neural networks. Often, the words recognized by ASR are not an exact match for the entities within a user intent. In these cases, augmented entity matching is used, which takes similar or similar-sounding words and matches them to a predefined entity in the VUI.
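
The matching step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular VUI implements it: the entity list and the `match_entity` helper are hypothetical, and spelling similarity from the standard library's `difflib` stands in for real phonetic matching.

```python
from difflib import get_close_matches

# Hypothetical entity values a VUI might define for a {city} slot.
KNOWN_CITIES = ["New York City", "Newark", "Boston", "Austin"]


def match_entity(heard, candidates, cutoff=0.6):
    """Return the predefined entity closest to the ASR output, or None."""
    # Compare case-insensitively, then map back to the canonical spelling.
    lowered = {c.lower(): c for c in candidates}
    hits = get_close_matches(heard.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None
```

With this sketch, a slightly misrecognized transcript such as "new york sity" still resolves to "New York City", while input that resembles nothing in the list returns None so the VUI can ask the user to repeat.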

Named Entity Recognition

Named Entity Recognition (NER) is used to classify words as their underlying entity. For example, in the command “Get directions to New York City”, ‘New York City’ is recognized as a location. In addition to locations, NER locates entities or semi-structured text that can be a person, a subject, or something as specific as a scientific term. NER often uses the surrounding text or words to determine the value of the entity. In the “Get directions to New York City” example, pre-trained probabilistic models assume that whatever word(s) come after “Get directions to” can be safely classified as a location. Examples like “Get directions to the nearest gas station” work for the same reason, with ‘the nearest’ being a defined qualifier that precedes a location.

NER assists ASR in resolving words as their entities. On the basis of voice input alone, “New York City” is recognized as “new” “York” “city”. NER then identifies this as a unique location and adjusts to “New York City”. NER is highly contextual and needs additional input to confidently determine entities. Sometimes, NER is reliant on previous training and will not be able to confidently determine an input’s entity.
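
To make the two ideas above concrete, here is a toy Python sketch. It is an assumption-laden simplification: a fixed cue phrase and a small lookup table (`GAZETTEER`, hypothetical) replace the pre-trained probabilistic models that real NER relies on, and the function name `tag_location` is invented for illustration.

```python
import re

# Hypothetical gazetteer: lowercase text -> canonical entity name.
GAZETTEER = {"new york city": "New York City", "san francisco": "San Francisco"}

# Cue phrase that, in this toy example, signals that a location follows.
LOCATION_CUE = re.compile(r"^get directions to\s+(?P<rest>.+)$", re.IGNORECASE)


def tag_location(transcript):
    """Return (entity_type, value) for the text after the cue, or None."""
    match = LOCATION_CUE.match(transcript.strip())
    if not match:
        return None
    rest = match.group("rest").strip()
    # Normalize known multi-word names to their canonical form.
    canonical = GAZETTEER.get(rest.lower())
    return ("location", canonical if canonical else rest)
```

Here the raw words "new York city" are adjusted to the single entity "New York City", and "the nearest gas station" is still tagged as a location because of the words that precede it.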

Speech Synthesis

Speech Synthesis produces artificial human voice and speech from input text. The VUI does the job in three stages: input, processing, and output. Speech Synthesis is simply a text-to-speech (TTS) output where a device reads the input text out loud with a simulated voice through a loudspeaker.

These AI technologies analyze, learn, and mimic human speech patterns and can also adjust the speech intonation, pitch, and cadence. Intonation is the way a person’s voice rises or falls as they speak. Factors that affect intonation are emotion, accent, and diction. Pitch is the tone of voice, but it is not affected by emotion. Pitch is high or low and can be best described as a squeaky or deep voice. Cadence is the flow of voice that fluctuates in pitch as someone is speaking or reading. For example, a public speaker will change their cadence by lowering their voice during a declarative sentence to make an impact on their audience.

Once all of this information is stored and analyzed, these technologies use it to improve themselves and the VUI through what is called machine learning. The cloud-hosted technologies then determine the intent of the user and return a response through the application or device.
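
One common way to request such adjustments from a text-to-speech engine is SSML, the W3C Speech Synthesis Markup Language, whose prosody element carries pitch and rate attributes. The Python sketch below only builds the markup; the `to_ssml` helper is hypothetical, and whether a given engine honors every value is an assumption to check against its documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text, pitch="medium", rate="medium"):
    """Wrap text in an SSML <prosody> element with the given pitch and rate."""
    return (
        "<speak>"
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)}>"
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )
```

Calling `to_ssml("Turn left & stop", pitch="low", rate="slow")` yields markup asking for a deeper, slower reading, with the ampersand escaped so the result stays valid XML.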

Intents & Entities

Voice commands consist of intents and entities. The intent is the objective of the voice interaction, and there are two kinds: local intents and global intents. A local intent is when the user is asked a question to which they respond “Yes” or “No”. A global intent is when a user has a more complex answer. When designing VUIs, the different ways a command can be phrased need to be taken into consideration in order to recognize the intent and respond correctly. Here is an example of getting directions to a location: “Get directions to 1600 Pennsylvania Avenue”, “Take me to 1600 Pennsylvania Avenue”. Entities are variables within intents. Think of them as the blanks to fill in a Mad Libs booklet, such as “Book a hotel in {location} on {date}” or “Play {song}.”
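
The fill-in-the-blank idea can be sketched in Python by turning each phrasing into a pattern with named slots. The intent names, the `TEMPLATES` table, and the helper functions are all hypothetical, and production VUIs typically use trained language models rather than exact templates like these.

```python
import re

# Hypothetical intent templates; {name} marks an entity slot.
TEMPLATES = {
    "get_directions": ["Get directions to {location}", "Take me to {location}"],
    "book_hotel": ["Book a hotel in {location} on {date}"],
    "play_song": ["Play {song}"],
}


def compile_template(template):
    """Turn 'Play {song}' into a regex with a named group per slot."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        # re.split with a capture group alternates literal text and slot names.
        pattern += f"(?P<{part}>.+?)" if i % 2 else re.escape(part)
    return re.compile(f"^{pattern}$", re.IGNORECASE)


COMPILED = [
    (intent, compile_template(t)) for intent, ts in TEMPLATES.items() for t in ts
]


def parse(utterance):
    """Return (intent, entities) for the first matching template, else None."""
    for intent, regex in COMPILED:
        match = regex.match(utterance.strip())
        if match:
            return intent, match.groupdict()
    return None
```

Both “Get directions to 1600 Pennsylvania Avenue” and “Take me to 1600 Pennsylvania Avenue” map to the same intent with the same location entity, which is the point of listing several phrasings per intent.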

VUI vs GUI

User Experience (UX) is the overall experience of using an interface product such as a website or application, in terms of how aesthetically pleasing it is and how easy it is for users to navigate. Together, VUI and GUI play a large role in UX design because they make up the product that consumers actually interact with.

Voice User Interface

As explained earlier, a Voice User Interface (VUI) enables users to interact with a device or application using spoken voice commands. VUIs give users complete control of technology hands-free, often without even having to look at the device.

Graphical User Interface (GUI)

A Graphical User Interface (GUI) is the graphical layout and design of a device. For example, the screen display and apps on a smartphone or computer are a graphical user interface. A GUI can be used to display visuals for a VUI, such as a graphic of sound waves when a voice assistant on a smartphone responds to its user. A real-life example is how Google Assistant and Apple Siri use VUI and GUI together.

Apple Siri VUI & GUI

Apple Siri is activated either by voice, when it hears “Hey Siri”, or by pressing down on the home button of the Apple device. Users know that Siri is active when it says “What can I help you with?” through the speaker or shows it on the screen using the GUI. While a user speaks to Siri, colorful waveforms move to the sound of speech. This also shows users that Siri is actively listening and processing their question. When a user is quiet, Siri will prompt “Go ahead, I’m listening…” If the user still does not respond, it will display “Some things you can ask me:” on the screen with a few examples of what it can do, such as calling, making FaceTime calls, emailing, and more.

This GUI feature is specifically catered to people who are new to Siri and are unsure what to do. The Apple device will also display what the user has asked and Siri’s response on the screen to show what is being understood from the interaction. Siri also lets users customize its gender, accent, and language.

Google Assistant VUI & GUI

Google Assistant responds to users when it hears “OK Google” or “Hey Google.” At the bottom of the screen, colorful dots appear to let the user know that Google Assistant has been activated and is ready to listen. While it waits for the user to ask a question, the dots move in a wave formation until it detects speech. Once a user starts speaking, the dots transform into bars that move to the sound of speech to let users know it is processing information. Another GUI feature of Google Assistant is that it displays what the user has asked along with its responses. As with Apple Siri, this display is another way of showing users what is being understood from the interaction. Google Assistant is also customizable in language and accent.

VUI vs Voice AI

The term Voice Artificial Intelligence (Voice AI) is very commonly used alongside VUI, and because the two are closely connected, they are often confused. VUI is all about the voice user experience on a device. Voice AI is the umbrella term for the underlying speech technologies: Automatic Speech Recognition, Named Entity Recognition, and Speech Synthesis.

Different VUI approaches

Voice command devices, also known as voice assistants, use VUI, and their feedback can be auditory, tactile, or visual. Devices range from a small speaker to a blue light that blinks in a car’s stereo when it hears a command. More common examples of voice command devices are Siri on the iPhone, Alexa, and Google Home. These voice assistants are made to help people with daily tasks. Devices also fall into genres according to what the VUI is used for, which influences how the interaction between the user and device is set up.

VUI Device Genres

Smartphones
Wearables
- Smart wrist watches
Stationary Connected Devices
- Desktop computers
- Sound System
- Smart TV
Non-Stationary Computing Devices
- Laptops
- Speakers
Internet of Things (IoT)
- Thermostats
- Locks
- Lights

Each voice-enabled device has different functionality. A smart TV will respond to a request to change the channel, but not to one to send a text message like a smartphone would. Users can ask for information from the news and weather channel or simply send a voice text with the power of VUI. Beyond devices, there are also voice-controlled apps with an integrated VUI that serve the same purpose. The VUI will interact with an app in a task-oriented workflow and/or a knowledge-oriented workflow. Task-oriented workflows carry out actions the user asks for, such as setting an alarm or making a phone call. Knowledge-oriented workflows respond to the user by drawing on secondary sources like the internet, such as when answering a question about Mt. Everest’s height.
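
As a rough sketch of how an app might separate the two workflow types, the Python below routes a recognized intent either to an action or to a lookup. Every name here (`set_alarm`, `lookup`, `TASK_HANDLERS`, `route`) is hypothetical, and the in-memory dictionary merely stands in for a real secondary source such as a web search.

```python
# Hypothetical handlers standing in for real device or search integrations.
def set_alarm(time):
    return f"Alarm set for {time}."


def lookup(question):
    # Placeholder knowledge source; a real app might query a search service.
    facts = {"how tall is mt. everest": "Mt. Everest is about 8,849 meters tall."}
    return facts.get(question.lower().rstrip("?"), "Sorry, I don't know that yet.")


TASK_HANDLERS = {"set_alarm": set_alarm}


def route(intent, entities):
    """Send task intents to an action and everything else to a knowledge lookup."""
    handler = TASK_HANDLERS.get(intent)
    if handler:
        return handler(**entities)
    return lookup(entities.get("question", ""))
```

A task intent such as `route("set_alarm", {"time": "7 am"})` performs the action directly, while any other intent falls through to the knowledge lookup.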

The Benefits of VUIs

The primary benefit of VUIs is that they allow a hands-free experience that users can interact with while focusing on something else. They can save time in daily routines and improve people’s lives through small conveniences such as checking the weather or setting an alarm the night before work.

VUI in Workflows & Lifestyles

VUI helps people multitask and stay productive in workspaces that range from offices to outdoor job sites. A Voice User Interface can actively contribute to worker safety by assisting users in hazardous workflows, such as those on construction sites, in oil refineries, while driving, and more. Traditional devices like phones and computers aren’t the only devices connected to the internet or to VUI. Smart light fixtures, thermostats, smart locks, and other Internet of Things (IoT) devices are connected as well. These VUI devices are useful in households with travelers and/or busy families, who can control them from home or from a smartphone.

Improving Lives

With individualized experiences, VUI can lead society to a more accessible world and help give a better quality of life. VUI benefits users with disabilities, such as the visually impaired or others who cannot adapt to visual UI or keyboards. VUI is also becoming popular with seniors who are new to technology. Aging affects abilities such as senses, movement, and memory, which makes VUI an alternative to hands-on assistance. With the assistance of VUI, older adults can communicate with loved ones and use devices without confusion and frustration.

VUI in Education

Educational strategies are constantly being updated in educational systems for all ages. VUI can be a learning tool where classrooms interact with a voice assistant to create a new experience and cater to all learning styles. Since VUI is very accessible, no training is required to use it, which makes it easy to use with any audience.

Technology Innovation

As VUI grows, it will change the way that products are designed and create new job demand. VUI design will become a key skill for designers due to the evolving user experience. User Experience (UX) designers are trained in providing experiences for physical input and graphical output. VUI design differs from traditional UX design because its guidelines and principles are different, which will encourage designers to focus more on VUI design. In 2019, it was estimated that 111.8 million people in the US would use a voice assistant at least monthly, up 9.5% from the previous year. Since people are using voice assistants more than ever, doing so will eventually become a habit, and voice will become the device feature that everyone owns.

It will be easier for users to speak to a device than to physically use a device after the habit has been formed. This will create a high demand for VUI knowledgeable designers and contribute to the change of how devices are designed.

Lastly, another benefit of voice command devices is that they aren’t limited to what they were originally programmed to do. Over time, the interaction between the user and the voice user interface improves through machine learning, as discussed earlier. The user learns how to better utilize the voice command device, and the device in return learns how to work with its user.

Solutions With Alan

With the Alan Platform, it is very simple to create your own voice interface designed for natural communication and conversation. Signing up for an account with Alan Studio gives you access to the complete Alan IDE to create a VUI you can integrate with any pre-existing app. The Alan Platform allows you to create a Voice User Interface completely within your browser and allows you to embed the code into any app, so you only have to write it once and not worry about compatibility issues.

Final Thoughts

Voice User Interface went from only recognizing numbers 0-9 to more than a million vocabulary words in different styles of speaking. VUI has never stopped progressing and is creating a new job demand and an important focus in User Experience design. As VUI progresses, more voice assistants and solutions are being created to benefit society. Companies and consumers are switching to the new and practical trend of VUI or combining Graphical User Interface with VUI.

Voice assistants come in many shapes, forms, and genres. Each device has its own purpose using VUI, such as assisting in the productivity of workflows, lifestyles, and education. What they all have in common is that their purpose is to help users in their everyday lives with a hands-free user experience. This is done with a range of Artificial Intelligence technologies, including Automatic Speech Recognition, Named Entity Recognition, and Speech Synthesis.

Another reason why VUI never stops growing and improving is that it isn’t limited to what it was originally programmed to do. Over time, the interaction between the user and the voice user interface improves through machine learning. The user learns how to better utilize the voice command device, and the device in return learns how to work with its user. Together they are working towards a more advanced artificial intelligence and voice user interface.

This article was reposted at dev.to here:
https://dev.to/alanvoiceai/what-is-voice-ui-2ga7