Lips Pouted or Not? How Improved Speaker Recognition Can Help Investigations




Police fairly regularly use wiretapped phone recordings as investigative material. But how do they know that the voice on the recording actually belongs to the suspect? Universiteit Leiden Ph.D. student Laura Smorenburg is trying to answer that question.

Everyone has a different voice, so we often know immediately whether we are hearing a friend or someone we would cross the street to avoid. But when the police intercept a phone call, it is harder to say with 100 percent certainty who is speaking on the tape. In addition, the audio quality of an intercepted call is worse than that of, say, a microphone recording. Forensic analysts are therefore trying to improve speaker recognition.

One way to do that is to zoom in on the different sounds in a language. Smorenburg looked specifically at the role of two groups of consonants: nasal and fricative consonants.

“For example, how does someone pronounce the G in Dutch? Is it a soft or a hard G? If you compare different speech characteristics of one person from different types of recordings, you can examine whether it’s likely to be the same speaker,” she said.

Additionally, some sounds contain more information about the speaker than others, according to Smorenburg. In fact, the phonetic context in which a sound occurs can influence how it is pronounced.

When you speak, you don't pronounce sounds separately but joined together, and that affects the sounds themselves. The G in Dutch is a good example: in the word “geen” (meaning “none”), it sounds different than it does in the word “goed” (meaning “good”) because your lips are in a different position.

“People have a different degree and timing of lip pouting when they pronounce rounded sounds like the vowels in good, so I wanted to know if those differences between different speakers matter for forensic analysis,” said Smorenburg.

Smorenburg’s study found that analyzing nasal and fricative consonants in specific sound contexts does indeed lead to slightly more speaker information, although there is a caveat.

“In practice, you have to make do with what you have because there is often very little data available for analysis. Examining only nasal and fricative consonants from specific contexts gives you only minimal gains in evidence. In principle, this is also good news, because it means that forensic investigators do not have to consider phonetic context in their analyses,” she said.

Overall, if sounds in certain sound contexts contain more speaker information, forensic analysts could work more effectively.

