Title: Forensic Phonetics
Author(s): Michael Jessen
Journal Title: Language & Linguistics Compass
Volume: 2
Issue: 4
Page Range: 671-711
Publication Date: May-2008
Abstract: An overview of forensic phonetics is presented, focusing on speaker identification as its core task. Speaker profiling/speaker classification is applied when the offender has been recorded, but no suspect has been found. Auditory speaker identification by victims and witnesses becomes relevant when no speech recording of the offender is available. It can take the form of familiar-speaker identification or unfamiliar-speaker identification, and in the latter case a voice line-up/voice parade can be carried out. When recordings of both the offender and a suspect are available, a voice comparison is done by an expert in forensic speech analysis. Current issues and domains in voice comparison analysis include the Bayesian approach to forensic reasoning and the Likelihood Ratio, the use of formant frequency measurements, non-analytic perception and Exemplar Theory, forensic automatic speaker identification, and the interaction between different methods.

forensic voice comparison by Geoffrey Stewart Morrison, 6-Jul-11
In his excellent survey of the field of forensic phonetics, Michael Jessen recommends the likelihood-ratio framework for the evaluation of forensic-voice-comparison evidence. He also mentions French & Harrison (2007), "Position statement concerning the use of impressionistic likelihood terms in forensic speaker comparison", International Journal of Speech, Language and the Law, 14, 137-144, which has been endorsed by many of the forensic phoneticians in the United Kingdom. Although, at first glance, the framework for the evaluation of forensic-voice-comparison evidence presented in the UK position statement may appear to be compatible with the likelihood-ratio framework, on deeper inspection it becomes apparent that it is not. I would like to draw interested readers' attention to "A Response to the UK position statement on forensic speaker comparison", written by Philip Rose and myself. It is available in English and Spanish (a Chinese translation is being prepared) at
Focus Questions by Compass Editorial, 6-Jul-11
Dear Readers,

Please find below some focus questions submitted by the author Michael Jessen. These questions are related to issues raised in the article.

Best wishes, Compass Editorial

1. For many years the interest of general phoneticians in speaker variation (or “talker variation”, as it is frequently called) was essentially limited to the issue of speaker normalization. Speaker normalization was assumed to be a perceptual task of limited complexity, in which the listener builds a speaker model based on information such as overall f0 and formant structure. This changed during the 1990s through work by Pisoni, Johnson, Remez and others, which increased interest in speaker variation and showed that the listener has a cognitive representation of speaker variation that is far more complex than previously assumed. This interest in the cognitive representation of speaker variation is part of a more general interest in the cognition of other “indexical features” such as age, sex/gender, regional status and social status (see Pisoni & Remez eds. “The Handbook of Speech Perception,” Blackwell, 2006). Nowadays, the investigation of speaker variation in “fine phonetic detail” has become a regular interest, as witnessed, for example, by contributions on that topic at the 2007 International Congress of Phonetic Sciences (

What are the implications of this general-phonetic speaker variation research for forensic phonetics? Are there speaker characteristics from that research that forensic phoneticians have not properly taken into account? Conversely, are there speaker characteristics commonly used by forensic phoneticians that have not been taken into account in the cognitive modeling of speaker variation?

2. Exemplar Theory is a theoretical framework for the explanation and modeling of cognitive aspects of speaker variation and other indexical features, and it has received much attention. What is the importance of Exemplar Theory for forensic phonetics? Here a distinction should be made between speaker identification by witnesses on the one hand and voice comparisons on the other. As far as the former domain is concerned, previous research has shown that the ability of (naïve) listeners to recognize familiar and unfamiliar speakers depends on many factors, such as the time from first exposure to an unknown voice to recognition, the duration of exposure, whether exposure was passive or a conversation took place, and so forth. Do the results from that research correspond to the predictions made by Exemplar Theory? As far as the second domain is concerned, in what ways can Exemplar Theory become relevant in voice comparisons? Is this entirely a matter of holistic perception, or does Exemplar Theory have something to say about analytical methods?

3. In a paper on sociophonetic variation, Pierrehumbert (in “Journal of Phonetics” 34, 2006) makes a distinction between internal and external aspects of indexical variation. The cognitive aspects mentioned so far are internal aspects, whereas external aspects of variation are patterns that occur in the population but are not necessarily cognitively represented. In forensic phonetics the external aspect of speaker variation is approached in studies where multispeaker corpora are collected and evaluated with respect to forensically relevant parameters such as average f0. What is the forensic-phonetic importance of internal aspects of speaker variation, and what is the importance of external aspects?
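
As a rough illustration of how external, population-level data on a parameter such as average f0 could enter the likelihood-ratio framework mentioned in the abstract, the following Python sketch may be helpful. It is not the method of Jessen's article or of any particular laboratory. It assumes, purely for illustration, a single parameter (mean f0 in Hz), Gaussian within-speaker and between-speaker variation, and invented numbers; the function names `gaussian_pdf` and `f0_likelihood_ratio` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def f0_likelihood_ratio(offender_f0, suspect_mean_f0, within_sd,
                        population_mean_f0, between_sd):
    """Simplified univariate likelihood ratio for mean f0 (illustrative only).

    Numerator: how probable the offender value is if the suspect is the
    speaker (suspect mean, within-speaker spread).
    Denominator: how probable it is if a random member of the reference
    population is the speaker (population mean, between-speaker spread
    combined with within-speaker spread).
    """
    numerator = gaussian_pdf(offender_f0, suspect_mean_f0, within_sd)
    total_sd = math.sqrt(between_sd ** 2 + within_sd ** 2)
    denominator = gaussian_pdf(offender_f0, population_mean_f0, total_sd)
    return numerator / denominator


if __name__ == "__main__":
    # All values below are invented for illustration, not corpus results.
    lr = f0_likelihood_ratio(
        offender_f0=150.0,         # mean f0 of the offender recording
        suspect_mean_f0=150.0,     # mean f0 of the suspect recording
        within_sd=8.0,             # assumed within-speaker SD
        population_mean_f0=115.0,  # assumed reference-population mean
        between_sd=15.0,           # assumed between-speaker SD
    )
    print(f"Likelihood ratio: {lr:.2f}")
```

Under these assumptions, a value that is similar to the suspect's and atypical for the population yields a likelihood ratio above 1, whereas a value that is typical for the population but far from the suspect's yields a ratio below 1. This is one way in which the external aspect, the distribution of a parameter in a multispeaker corpus, affects the strength of evidence.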
