A study of voice-based health status monitoring using a smartphone

Professor Shinichi Tokuno discusses research into voice-based health monitoring using smartphone technology in diagnosis scenarios

Professor Tokuno explains: We are developing a system to detect various diseases from the human voice. In particular, we are focusing on neuropsychiatric disorders, which are diagnosed mainly by subjective judgement in clinical practice. We believe that providing new objective indicators for these diseases will contribute to the development of medicine.

First, I focused on stress-related diseases such as depression.

Recently, stress in places such as the home and the workplace, and the mental health disorders that accompany it, have become a major problem. Stress can make a person ill, and this is then reflected in their voice.

Everyone has had the experience of listening to a close friend or family member’s voice and noticing, “Hey, you don’t sound too good.” Additionally, skilled doctors can listen to a patient’s voice and guess their physical condition. However, people would probably not notice changes in their own voice.

Therefore, we have considered whether it is possible to automatically analyse a person’s voice and monitor their health status while they talk on a regular smartphone. To do this, we use a privately developed application, MIMOSYS, which identifies the emotional components contained in the voice and then measures health status (mood modulation) from patterns in how those components change.

With this application, users can keep track of their daily health status without having to consciously operate the smartphone. By becoming aware of changes in their condition that they would not notice on their own, they may be able to address them before becoming ill.

MIMOSYS for voice-based health monitoring

As mentioned above, MIMOSYS measures the prosodic parameters of the voice and detects emotional components contained in the voice. Then, the degree of stress or depressive symptoms is estimated from changes in calculated emotional components.
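
As a rough illustration only, and not the MIMOSYS algorithm itself, the idea of estimating a change in condition from changes in a voice-derived score can be sketched in Python. The function name flag_changes, the daily scores, the seven-day baseline and the threshold of 2.0 are all assumptions made for this sketch; the article does not describe how MIMOSYS computes or compares its emotional components.

```python
from statistics import mean, stdev


def flag_changes(daily_scores, baseline_days=7, threshold=2.0):
    """Return indices of days whose score departs from a personal baseline.

    daily_scores is a hypothetical list of one emotion-derived score per day.
    The first `baseline_days` values define the baseline; later days are
    flagged when they lie at least `threshold` standard deviations away.
    """
    if len(daily_scores) <= baseline_days:
        return []  # not enough data beyond the baseline period
    baseline = daily_scores[:baseline_days]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return []  # a flat baseline gives no scale for comparison
    flagged = []
    for day in range(baseline_days, len(daily_scores)):
        z = (daily_scores[day] - mu) / sigma
        if abs(z) >= threshold:
            flagged.append(day)
    return flagged


# Hypothetical scores: a stable week, one ordinary day, then a sharp drop.
scores = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.50, 0.51, 0.30]
print(flag_changes(scores))
```

Comparing each day against the speaker's own earlier values, rather than against a fixed cut-off, reflects the point above that the estimate is drawn from changes in the components rather than from a single reading.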

We have already verified it with more than 10,000 Japanese voices and confirmed that its results are consistent with widely used self-administered questionnaires and correlate sufficiently with doctors’ diagnoses. We also know that it can measure the effects of various interventions. In addition, during these verification activities we encountered cases in which it led to appropriate treatment and counselling at an early stage.

In Japan, MIMOSYS or its underlying technology has already been commercialised as a system for occupational health and is pre-installed on several smartphones to measure daily health. More than 1 million people in Japan now have access to MIMOSYS.

MIMOSYS uses prosodic voice parameters, so it is theoretically independent of language. In fact, in our small-scale verification, we found no difference between Japanese and several other languages. However, we are repeatedly asked whether it can also be used in the questioner’s own language. We need more extensive validation to convince those who ask, and we are preparing for that.

Health monitoring by voice analysis is the ultimate telemedicine. Therefore, research on applications in space stations or deep sea exploration has also begun.

Voice-based health monitoring for other diseases

Besides differences between languages, we are also repeatedly asked about applications to other diseases. We are working on the development of new algorithms for dementia and Parkinson’s disease for a global ageing society. In older adults, senile depression, dementia, and Parkinson’s disease are very common diseases that have similar initial symptoms and frequently occur together. Early detection and differentiation of these diseases by voice would lead to early treatment and prevention, and we believe that healthy life expectancy can be extended as a result.

The differential diagnosis algorithm does not use emotion recognition technology. Emotion recognition produces more human-like judgements, and this would obscure the differences in voice that arise from different diseases. Our algorithm has worked well in small clinical studies in Japan. In the future, it will be necessary to improve accuracy and conduct large-scale verification and multilingual verification in the same way as for MIMOSYS.

Applications of voice analysis beyond healthcare

The scope of application of voice analysis is not limited to the healthcare field. The condition of car drivers can also be monitored. In Japan especially, traffic accidents caused by elderly drivers are increasing, and we believe this technology will be useful in preventing them.

Athlete monitoring is also within the scope of this technology. Athlete management and conditioning currently rely on the experience and subjective judgement of coaches and trainers. Visualising these empirical indicators will enable more effective training.

To download MIMOSYS

Although the social demonstration of MIMOSYS has been completed, the joint developer PST has made it possible for you to download MIMOSYS for free and try it out.
