This monograph describes an experiment in Forensic Speaker Identification,
showing how speech samples from the same speaker can be discriminated from
speech from different speakers with acoustic features commonly used in
forensics. It also explains what is now considered the legally and
logically correct approach to Forensic Speaker Identification, and presents
data that can be used both in real casework and in further testing.
Forensic Speaker Identification is typically concerned with addressing the
question of whether two or more speech samples have been produced by the
same, or different, speakers. It is clear from recent research that the
legally and logically correct way of doing this is by using a Bayesian
Likelihood Ratio. The monograph explains what a Likelihood Ratio is; why
its use is now considered correct; and how it can be used to successfully
discriminate same-speaker pairs from different-speaker pairs. The Likelihood
Ratio is the ratio of the probability of the
evidence given a hypothesis (e.g. that the two samples are from the same
speaker) to the probability of the evidence given a competing hypothesis
(e.g. that the speech samples are from different speakers). The numerator can
be seen as expressing the similarity of the samples, and the denominator
their typicality (i.e. how common these similarities are in the rest of the
population). Since same-speaker pairs are predicted by theory to have
Likelihood Ratios greater than unity, and different-speaker pairs are
predicted to have Likelihood Ratios smaller than unity, the Likelihood Ratio
lends itself to use as a discriminant function for separating same-speaker
from different-speaker speech samples. The extent to which this is possible is
vital knowledge, given the legal evidentiary standards now accepted in the
wake of the well-known Daubert rulings.
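The similarity-over-typicality reading of the Likelihood Ratio, and its use
as a discriminant function with a threshold of unity, can be illustrated with
a minimal sketch. It assumes a single acoustic measurement modelled with
normal distributions; the function names, parameter names and all numerical
values are hypothetical and chosen only for illustration. They are not taken
from the monograph or from the Bernard data, and the monograph's own
procedure may differ.

```python
from statistics import NormalDist


def likelihood_ratio(x, suspect_mean, within_sd, pop_mean, pop_sd):
    """Return p(x | same speaker) / p(x | different speaker).

    Assumption: one measurement x (e.g. a formant value in Hz), with both
    hypotheses modelled as normal distributions. This is a simplification
    for illustration only.
    """
    # Numerator: how well x fits the suspect's own distribution (similarity).
    similarity = NormalDist(suspect_mean, within_sd).pdf(x)
    # Denominator: how common x is in the background population (typicality).
    typicality = NormalDist(pop_mean, pop_sd).pdf(x)
    return similarity / typicality


def classify(lr):
    """Use the Likelihood Ratio as a discriminant function, threshold 1."""
    return "same-speaker" if lr > 1.0 else "different-speaker"


# Hypothetical values in Hz, invented for this sketch.
lr_close = likelihood_ratio(500.0, suspect_mean=500.0, within_sd=20.0,
                            pop_mean=600.0, pop_sd=80.0)
lr_far = likelihood_ratio(700.0, suspect_mean=500.0, within_sd=20.0,
                          pop_mean=600.0, pop_sd=80.0)
print(classify(lr_close), classify(lr_far))
```

A value that fits the suspect well but is unusual in the population yields a
Likelihood Ratio above unity; a value far from the suspect's distribution
yields one below unity.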
One stumbling block in the implementation of Bayesian Forensic Speaker
Identification is the general lack of adequate background distributions for
the assessment of the typicality of the similarities; that is, while two
forensic speech samples may be similar, how common are the similarities in
the general population?
Vowel formants are among the most important acoustic features used to
compare forensic speech samples. They are the resonant frequencies of the
speaker’s vocal tract during the production of vowels.
Bernard’s early study on the formants of male Australian English vowels,
although now relatively old, provides potential background distribution
data from a large number of speakers. The first goal of the monograph,
therefore, is to describe, in adequate detail for forensic-phonetic
investigation, the distributions of formant values for a subset of the
vowels from the Bernard data set.
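As a rough indication of what describing such a distribution involves, the
sketch below summarises a set of formant values by sample size, mean and
standard deviation. The helper name and the values are hypothetical
placeholders, not measurements from the Bernard data set, and the monograph's
own description may use other or additional statistics.

```python
from statistics import mean, stdev


def summarise(values_hz):
    """Summarise formant values (Hz) by sample size, mean and sample SD."""
    return {
        "n": len(values_hz),
        "mean": mean(values_hz),
        "sd": stdev(values_hz),  # sample standard deviation (n - 1)
    }


# Hypothetical first-formant values for one vowel, one value per speaker.
f1_hz = [310.0, 325.0, 298.0, 340.0, 317.0]
summary = summarise(f1_hz)
print(summary)
```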