Artificial Intelligence Voice Technology in Healthcare

Michael Ferro, Jr and Robin Farmanfarmaian

Mar 9, 2021

From AI voice bots to vocal biomarkers

Installment 4 of the AI in Healthcare Series with Michael Ferro
Written by Robin Farmanfarmaian

In the past couple of years, the error rate of the most cutting-edge AI voice programs at understanding spoken English has dropped to under 3%. If you are an American-born English speaker, your own error rate at understanding spoken English is about 4%, which means cutting-edge AI software is now better than Americans at understanding spoken English.
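
The article does not say which metric is behind these percentages. Speech recognition accuracy is commonly reported as word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the transcript into what was actually said, divided by the number of words actually spoken. Assuming that is the metric meant here, a minimal sketch looks like this (the function name and sample sentences are made up for illustration):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Illustrative only: one wrong word out of ten spoken words.
spoken = "please call my doctor about my last glucose reading today"
heard = "please call my doctor about my last glucose meeting today"
print(word_error_rate(spoken, heard))
```

Under that assumption, an error rate under 3% would mean fewer than 3 word errors per 100 words spoken.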

AI voice technology is intersecting with healthcare in many ways, and these are still the early days of the technology.

AI Voice Technology in Healthcare with Michael Ferro

Amazon Alexa, the leading giant in the world of home-based smart speakers, is now HIPAA compliant. That means all patient communication and data are secured to the same standards and rules as patient hospital data.

Of note, a smart speaker “Skill” is the equivalent of a smartphone “App”: an individual application built for use on a software platform. A voice bot is an AI-based system that usually uses NLP / NLU (natural language processing / understanding) to hold a natural-sounding spoken conversation with a user.

As mentioned in the installment on Digital Therapeutics, Livongo has an Alexa Skill that connects to a patient’s CGM (continuous glucose monitor) so that the patient can say things like “Hey Alexa, ask Livongo what my last glucose reading was today” or “what are my sugar level trends”. Digital therapeutic companies like Headspace Health for meditation also have corresponding Amazon Alexa Skills. These smart speaker skills are just a preview of more to come in the health coaching and remote patient monitoring space.

Besides Amazon Alexa, there are other new smart speaker and tablet combinations that use voice bots for patients in the home. ElliQ is a smart speaker, video camera and tablet combination designed specifically for aging in place. It allows grandma or grandpa to make a video call just by saying “ElliQ, please call my doctor”. ElliQ also gives spoken and written medication reminders, and learns grandma’s typical schedule through voice and video in order to determine if there might be a problem. For example, it notes that she passes by every day around 9am on her way to the kitchen for a cup of tea. If one day grandma doesn’t go to the kitchen, ElliQ gets worried. ElliQ checks in on grandma and, if she doesn’t respond, alerts the caregiver, doctor, or emergency services to come check on her.
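
To make the routine check concrete, here is a minimal sketch of the logic described above: learn a typical time from past days, wait a while if the person is late, check in, and escalate if there is no response. This is not ElliQ’s actual implementation. The function names, the 45-minute grace period, and the sample times are all assumptions for illustration.

```python
from datetime import time
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def minutes(t: time) -> int:
    """Minutes since midnight."""
    return t.hour * 60 + t.minute


def learn_routine(past_times: Sequence[time]) -> Tuple[float, float]:
    """Return (typical minute of day, spread in minutes) from past sightings."""
    mins = [minutes(t) for t in past_times]
    return mean(mins), pstdev(mins)


def decide(past_times: Sequence[time], now: time, seen_today: bool,
           responded: Optional[bool] = None, grace_minutes: int = 45) -> str:
    """Pick the next step: 'ok', 'wait', 'check_in', or 'alert_caregiver'."""
    if seen_today:
        return "ok"
    typical, spread = learn_routine(past_times)
    # Hypothetical rule: tolerate lateness of the grace period or two
    # spreads past the typical time, whichever is larger.
    deadline = typical + max(grace_minutes, 2 * spread)
    if minutes(now) <= deadline:
        return "wait"
    if responded is None:       # nobody has asked yet
        return "check_in"
    return "ok" if responded else "alert_caregiver"


# Made-up history: kitchen visits around 9am on five previous days.
history = [time(8, 55), time(9, 0), time(9, 5), time(9, 10), time(8, 50)]
print(decide(history, time(9, 30), seen_today=False))
print(decide(history, time(10, 0), seen_today=False))
print(decide(history, time(10, 0), seen_today=False, responded=False))
```

A real product would draw on far richer signals than a single daily timestamp, but the wait, check in, then escalate ordering is the part the article describes.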

ElliQ Smart Speaker and Tablet for aging in place: AI in Healthcare with Michael Ferro

As AI voice bots and smart speakers become more prevalent in our homes, these bots will have the ability to connect to our household and health IoT devices. With those additional data streams, expect to see more and more smart speaker skills that help personalize a patient’s experience and act as a voice presence in the patient’s daily life. That means businesses like hospitals, clinics, insurers, and even employer wellness programs can be a part of a patient’s day with coaching, reminders, information, or even just motivation. From a patient’s point of view, hearing from their doctor or healthcare team in the form of a voice bot can help change behavior in a positive way, which could improve health outcomes.

Over the next few years, any patient-facing business that would normally have an iOS and/or Android app will also need a smart speaker skill.

Smart speaker skills will become the norm, not the exception.

Many hospitals have already created Amazon Alexa Skills to interact with patients in their homes. The Mayo Clinic was one of the first hospitals with an Alexa Skill, and currently has a few, including a skill for information on COVID-19 as well as a first aid voice assistant. Boston Children’s Hospital (BCH) has partnered with Seattle Children’s Hospital, and together they have a skill specifically around the flu. BCH has another skill called “My Children’s Enhanced Recovery After Surgery” (ERAS), which lets parents and caregivers give their doctors and healthcare teams updates on the child at home after surgery.

Hospitals that have Smart Speaker Skills to Interact with Patients at Home: AI in Healthcare with Michael Ferro

Many hospitals, such as Atrium Health and Providence St. Joseph, have skills for tasks like finding the nearest urgent care, making an appointment with a provider, and checking wait times at the nearest emergency room. There are many other use cases as well: Beth Israel Lahey is using the technology for communication among staff, and ChristianaCare in Delaware has a skill for its home health care initiatives. Other hospitals are using smart speakers in patient rooms so patients can interact with their EMR and their care team, and even control aspects of the hospital room like the curtains or the TV. We’re also seeing smart speakers being used in operating rooms so the surgical team can check the EMR or other data using their voice, which is especially beneficial during surgery.

Vocal Biomarkers

Vocal biomarkers are a relatively new area of healthcare. Vocal biomarkers translate a patient’s voice into measurable data points, which can then be used as one input to help lead to an overall diagnosis, the same way that vital signs like pulse-ox, or imaging and blood tests, can. The patient’s vocal biomarkers include things like the speed of their speech, the way they string words together over a period of time, and their tone, pitch, enunciation, and fluency, all of which are used to help detect whether any aspect of their voice has changed from day to day. Human beings can’t detect the small day-to-day changes that an AI software program can, because in conversation they have to rely on subjective impressions. That makes vocal biomarkers a great tool: they quantifiably measure tiny changes and objectively compare them to earlier recorded data.
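
As a rough illustration of what “objectively compare to earlier recorded data” can mean, the sketch below scores today’s voice measurements against a patient’s own baseline using a z-score (how many standard deviations today’s value sits from that patient’s usual value). Everything here is an assumption for illustration: the feature names, the numbers, and the threshold of 2 are made up, and real vocal biomarker products use far more sophisticated models.

```python
from statistics import mean, stdev
from typing import Dict, List


def flag_changes(baseline: List[Dict[str, float]],
                 today: Dict[str, float],
                 threshold: float = 2.0) -> Dict[str, float]:
    """Return z-scores for features that moved more than `threshold`
    standard deviations away from the patient's own baseline.

    Needs at least two baseline recordings to estimate a spread.
    """
    flagged = {}
    for name, value in today.items():
        history = [day[name] for day in baseline]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # no variation in the baseline, so no scale to compare on
        z = (value - mu) / sigma
        if abs(z) > threshold:
            flagged[name] = round(z, 2)
    return flagged


# Made-up measurements from five earlier daily recordings.
baseline = [
    {"speech_rate_wpm": 150, "pitch_hz": 200},
    {"speech_rate_wpm": 152, "pitch_hz": 202},
    {"speech_rate_wpm": 148, "pitch_hz": 198},
    {"speech_rate_wpm": 151, "pitch_hz": 201},
    {"speech_rate_wpm": 149, "pitch_hz": 199},
]
today = {"speech_rate_wpm": 130, "pitch_hz": 200.5}
print(flag_changes(baseline, today))
```

With this sample data only the speech rate is flagged: pitch is within its normal day-to-day wobble, while speech is much slower than this patient’s usual pace. A flag like this is a prompt for follow-up, not a diagnosis.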

Vocal biomarkers are a great tool because they objectively measure what was previously not easily or inexpensively quantifiable.

Sonde Health, a spin-out from Boston-based PureTech, is one of the main startups in the world of AI voice technology in healthcare. Sonde has amassed a giant, clinical-grade database of vocal biomarkers. They have also rolled out a COVID-19 test specifically for employers. The test combines the standard employee COVID-19 questionnaire with other standard checks like the employee’s temperature. What Sonde Health does differently is also take a voice sample to analyze every day. Sonde is able to detect the smallest changes in someone’s voice, which could indicate that the employee is about to start, or has already started, to experience some type of symptom.

Sonde Health isn’t claiming to be a diagnostic. What the AI-enabled software does is measure the risk of disease by identifying symptoms through vocal analytics. Sonde Health also has applications for behavioral health and respiratory health, and one that measures the risk of CHF (congestive heart failure). One great feature is that Sonde Health can be easily integrated into other apps through a self-serve API.

For the non-techies, API is short for Application Programming Interface, and acts as the intermediary between two software applications.

Germany is attracting digital health startups with incentives in order to become one of the main digital health hubs in the world. There’s a German startup working on categorizing and analyzing coughs, even in a small crowd such as a busy family living room. The way it works is that two microphones are placed in the room, and the AI-enabled software takes a vocal fingerprint of the patient whose coughs are being tracked. The software also records other details such as the patient’s height, weight, sex and age, as those all contribute to the way a voice or cough sounds. Then, when the patient coughs, the AI-enabled software is able to identify that the cough came from the patient rather than anyone else in the room, and classifies the cough into categories like “productive”, “wheezing”, and “whooping”. This company hopes to be in the US sometime this year or next, and will be one of many companies with this type of offering.
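
The article doesn’t say how the startup’s software works internally. Purely as an illustration of the two steps described above (first decide whether a cough came from the enrolled patient, then put it in a category), here is a toy sketch. It assumes each detected cough has already been turned into two small numeric vectors, one describing who it sounds like and one describing what it sounds like, and it swaps in simple cosine similarity for whatever model a real product would use. All names, vectors, and the 0.9 cut-off are made up.

```python
import math
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_cough(speaker_vec: Sequence[float],
                cough_vec: Sequence[float],
                fingerprint: Sequence[float],
                prototypes: Dict[str, Sequence[float]],
                min_similarity: float = 0.9) -> Optional[str]:
    """Return a cough category, or None if the cough is not the patient's."""
    # Step 1: attribution. Ignore coughs that don't match the enrolled patient.
    if cosine(speaker_vec, fingerprint) < min_similarity:
        return None
    # Step 2: classification. Pick the closest category prototype.
    return max(prototypes, key=lambda name: cosine(cough_vec, prototypes[name]))


# Made-up enrolled fingerprint and category prototypes.
fingerprint = [0.9, 0.1, 0.4]
prototypes = {
    "productive": [1.0, 0.0, 0.0],
    "wheezing": [0.0, 1.0, 0.0],
    "whooping": [0.0, 0.0, 1.0],
}

print(label_cough([0.88, 0.12, 0.41], [0.2, 0.9, 0.1], fingerprint, prototypes))
print(label_cough([0.1, 0.9, 0.2], [0.2, 0.9, 0.1], fingerprint, prototypes))
```

The first cough matches the patient and lands in “wheezing”; the second comes from someone else in the room and is ignored.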

Over the next 5 years, healthcare will see many more companies and use cases for vocal fingerprints and voice-based remote patient monitoring.

AI voice bots will become the norm, not the exception.

Next week’s installment will be a dive into Remote Patient Monitoring.

Check out the previous 3 installments on the AI in Healthcare Series with Michael Ferro:

AI in Healthcare Introduction
Digital Therapeutics: DTx
Who Pays for AI in Healthcare?
