Let’s build a voice recognition chatbot to grow your business in 15 minutes.

Nov 3, 2020

Why voice recognition for business

According to Small Biz Trends, the market for voice assistants, voice commerce, and voice recognition chatbots is poised to grow from $2 billion in 2018 to $80 billion per year by 2023.

Less Intrusive Marketing Channel

For businesses considering a voice solution as a sales and marketing channel, it can also be less intrusive and less salesy than other promotional methods, such as television commercials, ad banners, product placements, and YouTube ads. In addition, voice assistants can reach broader demographics, including older and less tech-savvy groups who would rather speak than type.

Studies by OC&C Strategy Consultants show that users are open to information about deals, sales, and promotions from voice assistants.

Good user experience for wealthy and broad demographics

A study from Statista shows that affluent families rate their experience as mostly good to excellent when using voice-activated devices.

Bearing these business insights in mind, let’s create a web-based chat assistant that can be opened in a Chrome browser. When spoken to, the assistant will recognize the user’s voice, convert it into text, and automatically look that text up on Google. A demo can be found here.

Front-End

First things first, let’s create a simple UI that contains a button to start the voice assistant. We will also display the speech-to-text results and add a few scripted replies to the bot.

const startBtn = document.createElement("button");
startBtn.innerHTML = "Start listening";
const result = document.createElement("div");
const processing = document.createElement("p");
document.write("<body><h1>My Siri</h1><p>Give it a try with 'hello', 'how are you', 'what's your name', 'what time is it', 'stop', ... </p></body>");
document.body.append(startBtn);
document.body.append(result);
document.body.append(processing);

Back-End

We will be using the Web Speech API. The API has two distinct features: speech recognition and speech synthesis (text to speech). In this article, we will primarily focus on its speech recognition capabilities.

The Web Speech API will capture the user’s voice and convert it to text, which is then used to trigger the next action. Please note that the API is only available in select browsers, such as Chrome.

const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;

if (typeof SpeechRecognition === "undefined") {
  startBtn.remove();
  result.innerHTML = "<b>Browser does not support Speech API. Please download latest Chrome.</b>";
}

To display the transcribed text in real time, we will configure the SpeechRecognition object with two properties. Setting continuous to true keeps it listening after each result, and setting interimResults to true delivers partial transcripts while the user is still speaking.
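The handler and button code in this article rely on a recognition object. One way it might be created and configured is sketched below. The helper name createRecognition and the variable RecognitionCtor are hypothetical names chosen for illustration; only the continuous and interimResults properties come from the text above.

```javascript
// Sketch: build a recognizer with the two properties described above.
// `createRecognition` is a hypothetical helper, not part of the Web Speech API.
function createRecognition(RecognitionCtor) {
  const recognition = new RecognitionCtor();
  recognition.continuous = true;     // keep listening after each final result
  recognition.interimResults = true; // emit partial transcripts while the user speaks
  return recognition;
}

// Pick whichever constructor the browser exposes; stays undefined outside a browser.
const RecognitionCtor =
  typeof window !== "undefined"
    ? window.SpeechRecognition || window.webkitSpeechRecognition
    : undefined;

// `recognition` is null when the API is unavailable, so callers should check it.
const recognition = RecognitionCtor ? createRecognition(RecognitionCtor) : null;
```

Passing the constructor in as an argument keeps the helper independent of the browser prefix, which also makes it easy to try out with a stand-in class.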

A handler is added to process the onresult event from the API. In this handler, we call the process function and display what the user said along with the assistant’s reply.

// placeholder; the full version is defined further down
function process(speech_text) {
    return "....";
}

recognition.onresult = event => {
   // only the most recent result is of interest
   const last = event.results.length - 1;
   const res = event.results[last];
   const text = res[0].transcript;
   if (res.isFinal) {
      processing.innerHTML = "processing ....";
      const response = process(text);

      const p = document.createElement("p");
      p.innerHTML = `You said: ${text} <br>Siri said: ${response}`;
      processing.innerHTML = "";
      result.appendChild(p);
      // add text to speech later
   } else {
      processing.innerHTML = `listening: ${text}`;
   }
};

To initialize or stop the speech recognition, we will link the UI button with the recognition object.

let listening = false;
const toggleBtn = () => {
   if (listening) {
      recognition.stop();
      startBtn.textContent = "Start listening";
   } else {
      recognition.start();
      startBtn.textContent = "Stop listening";
   }
   listening = !listening;
};
startBtn.addEventListener("click", toggleBtn);

We will then build out the conversational logic and handle some basic actions. The voice assistant can reply to “hello”, “what’s your name”, and “how are you”, tell you the current time, and stop listening. For anything it cannot answer, it opens a new tab and searches for the question. Better yet, you can extend this process function with AI libraries to make the voice assistant more powerful.

function process(rawText) {
   // remove whitespace and lowercase the text so it can be matched exactly
   const text = rawText.replace(/\s/g, "").toLowerCase();
   let response = null;
   switch (text) {
      case "hello":
         response = "hi, how are you doing?"; break;
      case "what'syourname":
         response = "My name's Siri."; break;
      case "howareyou":
         response = "I'm good."; break;
      case "whattimeisit":
         response = new Date().toLocaleTimeString(); break;
      case "stop":
         response = "Bye!!";
         toggleBtn(); // stop listening
         break;
   }
   if (!response) {
      // anything unrecognized becomes a Google search; encode the query for the URL
      const query = rawText.replace("search", "").trim();
      window.open(`https://www.google.com/search?q=${encodeURIComponent(query)}`, "_blank");
      return "I found some information for " + rawText;
   }
   return response;
}
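As the list of phrases grows, the switch statement could be swapped for a lookup table. The sketch below is one possible shape, not the article’s implementation: normalize, replies, and lookupReply are hypothetical names, and stripping punctuation is an assumption made so that transcripts such as “What’s your name?” still match.

```javascript
// Sketch: the same scripted replies expressed as a lookup table.
// All three names below are hypothetical and chosen for illustration.

// Lowercase the transcript and drop everything except letters and digits.
const normalize = (rawText) => rawText.toLowerCase().replace(/[^a-z0-9]/g, "");

// Each entry maps a normalized phrase to a function that produces the reply.
const replies = new Map([
  ["hello", () => "hi, how are you doing?"],
  ["whatsyourname", () => "My name's Siri."],
  ["howareyou", () => "I'm good."],
  ["whattimeisit", () => new Date().toLocaleTimeString()],
]);

// Returns the scripted reply, or null when the phrase is not in the table.
function lookupReply(rawText) {
  const handler = replies.get(normalize(rawText));
  return handler ? handler() : null;
}
```

A null result would then fall through to the search branch, just as an unmatched phrase does in the switch version.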

Last stretch! We will use the speechSynthesis controller of the Web Speech API to give our assistant a voice.
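A minimal sketch of that step, assuming the browser exposes speechSynthesis and SpeechSynthesisUtterance on window. The helper name speak and its two optional parameters are hypothetical; the parameters exist only so the function can be tried outside a browser.

```javascript
// Sketch: read a reply aloud using the synthesis side of the Web Speech API.
// `speak` is a hypothetical helper. The defaults are only evaluated when the
// arguments are omitted, so `window` is not touched if stand-ins are passed in.
function speak(
  text,
  synth = window.speechSynthesis,
  Utterance = window.SpeechSynthesisUtterance
) {
  const utterance = new Utterance(text); // wraps the text to be spoken
  synth.speak(utterance);                // queues the utterance for playback
  return utterance;
}
```

In the onresult handler shown earlier, the “add text to speech later” comment could then be replaced with a call such as speak(response).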

Congratulations! You’ve just built a voice bot with fewer than 77 lines of code.

Demo can be found here.

Written by Growthbotics