Who is this? It's a question we rarely have to ask of callers these days with caller ID, but I remember back in the days of landlines when a call or two came through from a person whose voice I didn't recognize.
Parents warned their children not to talk to people they didn't know and to never tell a stranger they were home alone. Children complied because even at a young age, we are able to recognize each other through voice.
Not only do we identify people by their voices, but we also make assumptions about people based on their voices, said Rita Singh, a research scientist at Carnegie Mellon University.
Singh said her work focuses on the core algorithmic aspects of computer speech recognition, and the understanding of and learning from speech signals. "The goal of my work is to enable computing machines to recognize speech better in general, especially in high noise and complex environments," Singh said.
In her research, Singh has found that the data researchers collect and analyze helps them identify personal characteristics beyond gender, age, or even race. "We can use the predictions about weight, height, or facial structure to determine all the physiological characteristics: medical, physical health, mental health, absence of trauma," said Singh.
There is also demographic information they can deduce from voice, like a person's level of education, sociological parameters, and even socioeconomic status. "We are individually relating these things to voice, but profiling also includes looking at the environment, like where is a person calling from?" Singh said.
By analyzing the data, researchers may be able to determine not only whether a person is in a room, but also what kind of room it is, how big it is, what kind of materials the voice is reflecting off of, and even what the walls are made of. "We can listen to the sound objects in the background that are all coming together to make these determinations," Singh said.
The technology could be useful for law enforcement in identifying hoax callers. "Not whether a person's life is really in danger. They don’t have time to think about whether you are lying. They have to respond," said Singh. Once responders learn that a call was a hoax, after repeatedly spending money and time only to find nobody, the technology serves them well.
"Once they find out the hoax caller, they are going to find the person," said Singh, "but this is so useful for any kind of voice-based crime, whether voice is one piece or the only evidence in a crime."
When the only thing you have is the voice, any information about the person's surroundings becomes extremely useful. "Every little piece of information narrows down the pool of people law enforcement has to consider. With no information, they are looking at the entire population," said Singh.
Whatever can be gleaned is useful as supporting evidence, even though it’s not exact. "The technology can be useful in voice-based crimes ranging from bank fraud to social engineering impersonations. There are four billion videos viewed each day on YouTube where they use propaganda and the perpetrator might not be visible," Singh said.
If the person says anything, even one word, that sound can be used to make several predictions.
Whether you sound tired or elated, nervous or underwhelmed, your voice reveals much about who you are and where you are. "The things you don’t know are the judgments of the competence of the person. Is this person under stress or not. We are making judgments all the time through only hearing the signal a person produces," Singh said.
That sound resonates and you hear the signal, but voice researchers know that there must be something more in the signal that lets you make greater determinations, and they are working to build these advanced technologies through the marriage of statistics, machine learning, and signal processing.
This article is published as part of the IDG Contributor Network.