Forensic Phonetics
273 Whitaker Avenue, South
Powell, OH 43065
25 January 2011
In the early part of July, 2010, I was contacted by Sergeant Chris Kelley in the Patrol
Division of the Columbia, MO Police Department regarding analysis of a covert digital recording
made by an undercover police officer just before a subject was arrested. The recording was
relatively noisy so Sergeant Kelley was particularly interested in whether I could reduce the
noise in a particular section of the recording and determine just what the suspect had said. This
report provides a description of my analytic procedures and, on the accompanying CD, a copy of
the unprocessed recording as well as the processed recordings. In this report I also provide my
expert opinion on what the suspect actually said.
On or around 7/8/2010, Sergeant Kelley had a CD which contained a copy of the original
recording sent to me via Fedex. I have copied this original digital recording to a CD
accompanying this report and named it Complete_Original_Recording.wav (the original name of
the recording on the CD sent to me via Fedex was WS400014.wav). Subsequently, I created a
shorter wavefile which contained the first 1:15 minutes of the original recording. This I have
included as a wavefile named initial_v1.wav; it serves as the source of the utterance in question.
Unless otherwise noted, the waveform manipulations (and editing) were done using Adobe
Audition (1.0 and 3.0). Noise reduction processing was done using procedures available both in
the Adobe Audition and Adobe Soundbooth CS5 programs. The waveform contained in this
wavefile is shown in Figure 1.
Figure 1. Graphic display (time by amplitude) of the waveform of the portion of the recording
named initial_v1.wav.
Next, I copied a 2.057 sec portion of the recording which had the utterance in question into a
separate soundfile. This file I have copied and named short_utterance_v1.wav. If you listen
carefully, you can hear the suspect’s speech in the background seeming to say (in the utterance
from time position 0.341 to 1.621 sec) “give me (your) fucking wallet.” However, the noise
in the recording makes it somewhat difficult to hear. Figure 2 shows a graphic display of the
waveform in short_utterance_v1.wav. The highlighted portion contains the utterance of interest.
What is evident in Figure 2 is the “noisiness” of the recording, although you can hear the
utterance, albeit at a low amplitude level, in the noise when this wavefile is played. The goal of
my analysis was to reduce the noise level in this signal while preserving the speech signal.
Figure 2. Graphic display (time by amplitude) of the waveform of the portion of the recording
named short_utterance_v1.wav. The portion highlighted contains the utterance of interest.
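The time positions quoted above map onto sample positions in the wavefile. The following is a minimal sketch of that arithmetic, not a description of how Adobe Audition performs the selection. The sample rate of 8000 Hz is an assumption (the report does not state the recorder's rate), the function names are hypothetical, and a list of zeros stands in for the audio in short_utterance_v1.wav.

```python
def time_to_sample(t_sec: float, sample_rate: int) -> int:
    """Convert a time position in seconds to the nearest sample index."""
    return int(round(t_sec * sample_rate))


def extract_segment(samples, sample_rate, start_sec, end_sec):
    """Return the samples lying between two time positions (in seconds)."""
    start = time_to_sample(start_sec, sample_rate)
    end = time_to_sample(end_sec, sample_rate)
    return samples[start:end]


# ASSUMPTION: 8000 Hz is a placeholder; the report does not give the rate.
SAMPLE_RATE = 8000

# Stand-in for the 2.057 sec clip (silence instead of the real audio).
clip = [0.0] * time_to_sample(2.057, SAMPLE_RATE)

# The utterance of interest, 0.341 to 1.621 sec into the clip.
utterance = extract_segment(clip, SAMPLE_RATE, 0.341, 1.621)
duration_sec = len(utterance) / SAMPLE_RATE
```

Under these assumptions the extracted utterance spans about 1.28 sec of the 2.057 sec clip.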
The first step in the noise reduction process is to eliminate persistent background noise that is
found throughout the recording. This is done using a three-step process. First, the
phonetician/acoustician searches the wavefile for a stretch of the recording during which no
one is talking and there is no other identifiable noise source (such as a barking dog, or
banging on a table). Going back to the initial_v1.wav file, one can find such a stretch of
recording from 26.689 to 28.220 sec (a length of 1.53 seconds). Second, using FFT (Fast
Fourier Transform) analysis in the Adobe Audition noise reduction option, one can calculate a
frequency profile of the sound in this section of the waveform (which represents background
noise). This was done, and the frequency profile saved (this information is provided in the file
named Profile_26.689.fft). Third, the profile was loaded into the Adobe Audition program and
this background noise was removed from short_utterance_v1.wav. The modified waveform
(saved in a file named short_utterance_v2.wav) is shown in Figure 3 following this noise
reduction step. Although there is some improvement in the signal that can be heard, there was
not a dramatic change in the waveform itself.
Figure 3. Waveform display of short_utterance_v2.wav.
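The general idea behind profile-based noise reduction can be illustrated with a generic magnitude spectral subtraction. This is only a sketch under stated assumptions: Adobe Audition's actual algorithm is proprietary and is not reproduced here, and the frame length, spectral floor, sample rate, and function names are all hypothetical. A synthetic tone in white noise stands in for the recording.

```python
import numpy as np

FRAME = 512  # ASSUMPTION: frame length in samples, chosen for illustration


def noise_profile(noise, frame_len=FRAME):
    """Average magnitude spectrum over a noise-only stretch (the 'profile')."""
    window = np.hanning(frame_len)
    hop = frame_len // 2
    mags = [np.abs(np.fft.rfft(noise[i:i + frame_len] * window))
            for i in range(0, len(noise) - frame_len + 1, hop)]
    return np.mean(mags, axis=0)


def spectral_subtract(signal, profile, frame_len=FRAME, floor=0.05):
    """Subtract the profile from each frame's magnitude, keeping the phase."""
    window = np.hanning(frame_len)
    hop = frame_len // 2
    out = np.zeros(len(signal))
    for i in range(0, len(signal) - frame_len + 1, hop):
        spec = np.fft.rfft(signal[i:i + frame_len] * window)
        mag = np.abs(spec)
        # Never let a bin drop below a small fraction of its original level.
        cleaned_mag = np.maximum(mag - profile, floor * mag)
        frame = np.fft.irfft(cleaned_mag * np.exp(1j * np.angle(spec)),
                             n=frame_len)
        out[i:i + frame_len] += frame  # overlap-add at 50% overlap
    return out


# Synthetic demonstration (not the case recording).
rng = np.random.default_rng(0)
sr = 8000  # ASSUMPTION: placeholder sample rate
t = np.arange(sr) / sr
tone = 0.1 * np.sin(2 * np.pi * 300 * t)   # stand-in for speech
noise = 0.1 * rng.standard_normal(2 * sr)  # stand-in for background noise

profile = noise_profile(noise[:sr])        # profile from a noise-only stretch
noisy = tone + noise[sr:]
cleaned = spectral_subtract(noisy, profile)

# Compare error against the clean tone, ignoring the tapered edges.
inner = slice(FRAME, -FRAME)
err_before = np.mean((noisy[inner] - tone[inner]) ** 2)
err_after = np.mean((cleaned[inner] - tone[inner]) ** 2)
```

The profile plays the same role as the saved frequency profile described above: it is estimated once from a stretch with no speech and then applied to the section containing the utterance.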
Next, the section of interest was extracted and saved to a new soundfile, and the amplitudes of
the obvious noisy “clicks” (seen as sharp vertical lines in the waveform) were reduced by 8 dB,
yielding the waveform shown in Figure 8. What is left, after noise reduction, bandpass filtering
and amplification, is, to the trained (and likely to the untrained) ear, a production of the utterance
“Give me (your) fucking wallet.”
Figure 8. Graphical display of short_utterance_v5.wav.
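For reference, an 8 dB reduction corresponds to a fixed scaling of the sample amplitudes. The sketch below shows that conversion, assuming the usual amplitude convention (20 times the base-10 logarithm of the amplitude ratio). The function names and the toy sample values are hypothetical, and the code is not the editing procedure used in Adobe Audition.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def attenuate_region(samples, start, end, db):
    """Return a copy of samples with samples[start:end] scaled by db."""
    gain = db_to_gain(db)
    scaled = [s * gain for s in samples[start:end]]
    return samples[:start] + scaled + samples[end:]


# An 8 dB cut scales amplitudes to roughly 40% of their original value.
click_gain = db_to_gain(-8.0)

# Toy example: a 'click' occupying indices 1 and 2 of a four-sample signal.
toy = [0.1, 1.0, -1.0, 0.1]
softened = attenuate_region(toy, 1, 3, -8.0)
```

Only the selected click samples are scaled; everything outside the selected region is left unchanged.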
While completing the analysis of this signal, I created several other noise reduced/speech
enhanced versions of this same section of the recording using Adobe’s Soundbooth program.
The end result was very similar, but short_utterance_v5.wav is the clearest, in my opinion.
In summary, after the noise reduction process described above was completed, it is my opinion
(as a phonetician and speech scientist well-versed in acoustic analysis of speech) that the
utterance produced by the suspect and recorded was “Give me (your) fucking wallet.” The “your”
is in parentheses as this word was likely reduced in the speaker’s production to a transitional vocoid
(a vowel-like sound) between the “me” and “fucking” that is not readily discernible (the same
way that the “a” in “One small step for (a) man” produced by Neil Armstrong when he first
stepped on the moon is not discernible).