Myths, promises, and realities

If only it were true, as the televised CSI seems to promise, that any audio recording could be made intelligible with a little bing from a computer. The realities of forensic audio may surprise you—amazing things are possible, but not all things.

As forensic audio engineers, sometimes we can make out words in what sounds like a totally unintelligible digital hash. We can often filter the sound of the airplane out enough to hear the drug deal going down. We can make out the whisper recorded in the squad car and hear a suspect ask his friend, “Where’d you hide the gun?” We might even be able to unearth information on an 18-minute gap in a recording, or uncover what Rodney King was really saying. We can have an opinion on whether or not a voice belongs to Osama Bin Laden. We can test to see if a recording is edited, even if it is a digital file. And very soon, we may be able to determine, after the fact, the date, time, and place a recording was made from almost any recording, even ones from the past.

What a forensic audio examiner cannot do is alter the laws of physics. The covert recording made with a cheap digital recorder, set on the longest record time, and hidden way underneath the bed may very well yield an enigma that will be impossible to decipher. Numerous times, I have been paid good money to try to enhance an important recording after the FBI or another very competent forensic audio lab has already had a whack. Detectives can’t believe that key words cannot be deciphered. No matter how much you complain that you saw it on TV, if the information isn’t there, it can’t be unearthed.

Forensic Audio Investigation
In my forensic audio practice in Los Angeles, I perform several types of investigation. The most common is audio enhancement. I take a recording in whatever format it arrives and enhance the audio to improve intelligibility. There is a big difference between “listenability” and “intelligibility.” Listenability refers to how “nice” or “good” a recording sounds. I am capable of improving music, for example, by removing pops, distortion, etc., but my goal is usually to be able to understand the words, even if the resulting sound is brittle and hard on the ears. For this I use some rather expensive computer equipment and specialized forensic audio programs.

In particular, covert recordings present a number of challenges to the forensic audio engineer since a detective frequently faces problems that he cannot control. Suspects whisper, end up far from the microphone, and often speak with an accent or in “code.” Recording situations are rarely optimal. For this reason, it is best to at least have a high-quality, uncompressed recording made using a good recorder with good microphones, even if they are very tiny ones.

Another major kind of examination is one of authenticity. The question is whether a recording is authentic, typically meaning continuous and not altered or edited. There is a specialized form of “tape authentication” in which an examiner draws on the characteristics of the recorder used and examines the audio tape physically under a microscope using a magnetic developing solution. This is a very time-consuming process and only applies to tape recordings.

The authenticity of a digital audio file can also be examined now using a new software product that checks for phasing anomalies in stable technical signals and discontinuities in background noise parameters. These technical results need to be correlated with examinations of the spectrogram, critical listening using very high quality equipment, and examination of syntax and vocal context looking for non-sequitur transitions and illogical juxtapositions of words. If all of these measures concur, there is a high degree of probability that editing can be detected in a manner acceptable in many state courtrooms.

The primary directive in enhancement is to never do anything to the audio file that alters the core meaning of what was recorded. Frequently, the job of a forensic audio specialist is merely to handle evidence, copy, and present it in a way that is easy for the court to understand and interpret. Thus, no hidden edits are made in a file. When an edit is made—for example to redact something that is inadmissible, or to shorten an overly long group of evidence—the edit must be obvious. Typically I place a short beep in between any edits so that they will be blatant. Sometimes I gather a selection of shortened clips from a long recording, but often the judge or opposing counsel may object to this, so I typically provide this on a separate CD. I always provide a copy of the unprocessed file for reference. If the recording spans more than one CD, I provide a one-minute overlap to verify continuity.
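
The practice of marking every edit with a beep can be illustrated with a minimal Python sketch. It assumes each clip is already a list of mono samples between -1.0 and 1.0; the function names, the 1 kHz tone, and the quarter-second length are illustrative choices, not a description of any particular forensic tool.

```python
import math

def make_beep(sample_rate=44100, freq_hz=1000.0, duration_s=0.25, amplitude=0.5):
    """Generate a short sine-wave beep as a list of float samples."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

def join_with_beeps(clips, sample_rate=44100):
    """Concatenate clips, placing one beep at every edit point."""
    beep = make_beep(sample_rate)
    out = []
    for index, clip in enumerate(clips):
        if index > 0:
            out.extend(beep)  # audible marker so the edit is never hidden
        out.extend(clip)
    return out
```

Joining three clips this way yields two beeps, one at each edit point, so a listener can always tell where material was removed.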

Some engineers prefer to provide CD-Data discs instead of Redbook CDs. These data discs can also contain a digital “hash” signature to verify the integrity of the file. There are programs that create and read digitally signed, secured, integrity-checked, data-reliable, and password-protected CDs and DVDs. Data CDs are not playable in a normal CD player, however, so detectives may object to not having a Redbook CD. CDs or DVDs are labeled with all pertinent case information, an indication of whether the CD is first-generation “evidence,” and a box where the engineer’s initials can be marked.
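
As a rough illustration of how a hash can verify file integrity, here is a minimal Python sketch using the standard hashlib module. SHA-256 is an assumption made for the example; the programs mentioned above may use other algorithms, and the function names are hypothetical.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_file(path, expected_hex):
    """True if the file's current hash matches the recorded one."""
    return sha256_of_file(path) == expected_hex.lower()
```

The digest is recorded when the evidence copy is made. If even one byte of the file changes later, the recomputed digest will no longer match.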

Another area is audio puzzle solving, or forensic audio examinations. For example, this includes trying to determine the absolute loudness of a recording in decibels, or solving some kind of a puzzle in the audio realm. Recently, I was given a recording from an answering service to try to determine if it was a threat. Not one word was really understandable, although an attitude of theatrical reading was discernible. Basically, the recording consisted of a digital-sounding, incoherent buzz. After several hours of enhancement I was able to determine one three-word sequence and one two-word sequence. Following the hunch that this was in some way theatrical, I started doing Boolean searches on the Internet. I ended up on a site mostly in Russian, but it had a French translation button. From the French I got an author and a book title. Searching literary databases again for the author, the book, and these word sequences, I eventually found the entire quote. Amazingly, when I listened to the recording while reading the actual text from the book, I could confirm that this was exactly what was being said. This kind of sleuthing result is the glory that an examiner lives for: the ultimate puzzle, perfectly solved. Eat your heart out, Mr. Holmes.

Audio Tools
Forensic audio enhancement software comes equipped with a number of tools. My favorites are the spectral inverse filter—where you “train” the program to identify a sample of sound that is “noise” and then it digitally deducts this sound—and adaptive filters that can be tuned to minimize certain types of noise in a dynamic fashion. They also contain an array of de-buzzers, de-clickers, de-hissers, advanced dynamics, and equalization. These tools now use advanced spectrum analyzers that visualize the sound graphically. I work using an array of computer screens with an array of colorful graphics—come to think of it, just like in CSI.
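
The idea of “training” on a noise sample and then deducting it can be sketched in Python. What follows is plain spectral subtraction on a single short frame, a much simpler relative of the commercial filters described above, and not their actual algorithm. The slow textbook DFT keeps the example free of outside libraries, and all function names are illustrative.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, keeping only the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def noise_profile(noise_frames):
    """'Train' on noise-only frames: average their magnitude spectra."""
    spectra = [[abs(c) for c in dft(frame)] for frame in noise_frames]
    return [sum(s[k] for s in spectra) / len(spectra)
            for k in range(len(spectra[0]))]

def subtract_noise(frame, profile):
    """Deduct the noise magnitude in each frequency bin, keeping the phase."""
    cleaned = []
    for c, noise_mag in zip(dft(frame), profile):
        mag = max(abs(c) - noise_mag, 0.0)  # never go below zero
        cleaned.append(cmath.rect(mag, cmath.phase(c)))
    return idft(cleaned)
```

In practice the recording would be processed in many overlapping, windowed frames with a fast FFT; this single-frame version only shows the principle.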

The audio spectrogram or “voice print” is sometimes compared to the fingerprint as a unique identifier for a voice. This is how an examiner can identify a voice in order to give odds that a voice is Osama Bin Laden’s. Notice that I say “give odds,” not “prove.” Voice print analysis is a tricky art as well as a science and is not accepted in many states as evidence. To be more certain, the examiner should have a sample of the suspect saying the exact words (an exemplar) of the recordings to be examined. This is very often not possible. Some skeptical examiners have called voice print “voodoo science,” and it is typically quite expensive.

Transcription and Chain of Custody
Working with audio evidence presents many challenges. Often a problem arises with flat rate transcriptions done by an outside company. If a company bids a flat rate per hour of recorded material to transcribe, they are not likely to anguish over deciphering a word or two. When confronted with a difficult passage, the typist will often indicate (inaudible) and quickly move on. When I do a certified transcript, I may play a section 30 times trying to figure out what is being said, and on a very good playback system. I also transcribe words that I am only 95% certain of, and underline these words. Often these underlined words provide the most valuable clues. It helps to give your transcriptionist detailed information on the recording—names of speakers, situation, proper names, etc. To save money, a client can give me a transcribed file and then I modify it listening to the recording, and certify it. E-mailed, digitally signed PDF files—a feature available in Adobe Acrobat—also save me from having to messenger or mail hard copies of certified transcripts.

Chain of custody is always a concern when handling evidence. In addition to standard record keeping, I make hard copies of e-mails that contain digital files to try to establish a chain when files are not delivered by a delivery service. Shipping receipts, dates, etc. are also filed with the project papers, or kept in a bound journal.

The Future
One alluring new wrinkle in forensic audio is Electric Network Frequency (ENF). When a recorder records a signal, it also picks up a faint trace of the 60-cycle power, or mains hum, present at that moment (50 Hz in Europe). I call this the “trace.” This trace may also be recorded on some battery-operated systems, depending on the microphone used. This trace recording is defined as the ENF. The trace is there, though VERY slight, and always has been; it is not some new added element of recordings. The key is that the ENF is very closely controlled by the power companies, yet it still drifts by minute amounts from moment to moment. If one has a detailed database of the minute ENF fluctuations over time in a particular location, it is possible to compare the ENF trace in a recording against that database and identify the exact date and time a recording was made. Some think that as we learn more, we may be able to isolate aspects of the trace and point to specific locations (a dialect of the trace, to coin a term). Also, any disturbance of the trace could be used to detect an edit in the recording.
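
The comparison against a database can be pictured as sliding the short trace from a recording along the long reference trace and keeping the position where the two agree best. The Python sketch below assumes both traces have already been extracted as lists of frequency readings taken at the same rate (for instance, one reading per second); the function name and the mean-squared-error measure are illustrative assumptions, not a description of any deployed system.

```python
def best_enf_match(recording_trace, reference_trace):
    """Return (offset, mean_squared_error) of the best alignment.

    offset is the index in reference_trace where recording_trace fits best.
    """
    m = len(recording_trace)
    if m == 0 or m > len(reference_trace):
        raise ValueError("recording trace must be non-empty and no longer "
                         "than the reference trace")
    best_offset, best_error = 0, float("inf")
    for offset in range(len(reference_trace) - m + 1):
        # Average squared difference between the two traces at this offset.
        error = sum((reference_trace[offset + i] - recording_trace[i]) ** 2
                    for i in range(m)) / m
        if error < best_error:
            best_offset, best_error = offset, error
    return best_offset, best_error
```

If the reference database starts at a known date and time, the winning offset converts directly into a timestamp for the recording. A match whose error is not clearly lower than every other offset should be treated as inconclusive.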

The forensic audio use of ENF is already a reality in some European countries, where ENF fluctuations are already recorded in searchable databases. It is hoped that American power companies may be able to provide ENF data so that a similar U.S. database can be built. It is even possible that if data exists from the past, we may be able to retroactively identify the date and time of a recording from years ago. This is the equivalent of a secret invisible code embedded in almost every recording ever made to identify its date and time. We do not yet know whether these data still exist and can be compiled, but forensic audio engineers and law enforcement agencies are licking their lips with anticipation. At the minimum, we can expect ENF data from the several separate electricity grids in the U.S. to be compiled in the future to give us a historical reference database. The subsequent ability of a forensic audio scientist to authenticate the original, complete, and continuous nature of a recording would revolutionize the field.

In the future, there may be nowhere to hide.

Kent Gibson has a BA from Yale and an MA from Stanford. He is the owner of Forensic Audio in Los Angeles and is a primary contractor to the LA County Sheriff for forensic work. Forensic Audio’s clients include the FBI, U.S. Secret Service, Pasadena Police Department, LA Public Defender’s Office, County of San Bernardino, County of Santa Clara, the Santa Monica and Los Angeles Courts, and private attorneys nationwide. Forensic Audio, 3251 Oakley Drive, Los Angeles, CA 90068, (323) 851-9900.