‘Your Voice Speaks Volumes: It’s not what you say, but how you say it’ considers the connection of our voices to our identity and to how we perceive others (and ourselves). The book is divided into seven chapters. In the first chapter, ‘The nuts and bolts; how speech works’, Setter defines her use of ‘voice’ as ‘a cover term for the way that people produce speech and how that sounds’ (p. 4). This chapter previews the main goals of the book and provides a concise introduction to how speech is produced, perceived, and acquired, beginning in the womb. Setter stresses that people can use their voice in a ‘chameleon-like’ manner, and that our voice shows our ‘social allegiances, our tribal memberships’ (p. 13); this idea is developed further in Chapter 2. The chapter also introduces the sounds of English, taking as reference General British (also known as Received Pronunciation (RP) or BBC English), and concludes with a brief overview of how the rest of the book is organized.

Chapter 2, ‘The Watling Street divide: Romans, Anglo-Saxons, Vikings, and accent prejudice’, addresses accentism, both in terms of the discrimination suffered by speakers of certain dialects and of the criticism directed at someone who modifies (or is perceived to modify) the way they speak. Setter frames the discussion of accent prejudice with a concise overview of how UK dialects originated. She shows that the development of distinct accents in the UK resulted mostly from migratory settlements and invasions by various linguistic groups over several centuries. While regional accents of English can be traced to the linguistic influence of the Anglo-Saxons (from the 5th century) and Vikings (from the 8th century), the origin of prestigious RP is related to the influence of Norman French on the English spoken by the ruling classes after the Norman Conquest in the 11th century. In addition, Setter shows that the presence of various accent traits in English dialects is connected to the Danelaw, i.e., the geographical area where the Vikings settled. The chapter concludes by further elaborating the idea that accent is tribal.

In Chapter 3, ‘Men can’t make their voices sound sexy, and other gems’, Setter addresses several topics related to voice and gender. Setter first considers how specific voice traits are connected to physical and personality characteristics; in particular, the use of low pitch tends to be associated (at least in English) with larger builds in men, and also with confidence and authoritativeness for both men and women. A well-known example of this connection is that of British Prime Minister Margaret Thatcher, who took voice lessons to lower her overall pitch and reduce her pitch range in order to be more successful in politics. Setter devotes a large part of this chapter to describing in detail the study in Hughes et al. (2014), which investigates the perception of various attributes (attractiveness, confidence, intelligence, and dominance) in female and male voices. One of the key points of the study is that the perception of attractiveness in male and female voices is very different: female voices modified to sound more attractive are rated accordingly, but this is not the case for male voices, which are actually rated as less attractive if modified to sound more so. In the remainder of the chapter, Setter considers two voice qualities routinely criticized because of their association with young women: ‘uptalk’ (i.e., the use of final rising intonation in statements) and ‘vocal fry’ (i.e., ‘creaky voice’), which Setter defines as very slow vibration of the vocal folds. Although uptalk is used by both men and women and has been documented in Australian English since the 1950s, and in the US and the UK since the 1970s and 1980s, respectively (Warren 2016), it continues to be negatively associated with young women, particularly in the US. Similarly, creaky voice, particularly in the US, is primarily (and negatively) associated with young women, even if male speakers and older women use it as well. Setter concludes her discussion by pointing out that the negative stereotyping of women’s voices is a form of discrimination and ‘even sexual violence’ (p. 85).

In Chapter 4, ‘Gahaad save our Queen! Professional and performance voices and accents’, Setter writes about the intersection between accent, style, identity, and prejudice in voice professionals. The chapter begins by addressing media controversies, mostly concerning British singers singing in American accents. As background to the discussion, this chapter provides a concise account of some differences in consonant and vowel pronunciation in General American and General British. Setter interviews several voice professionals for this chapter, including singers, DJs, accent coaches, newsreaders, and radio presenters, offering a nuanced account of the complexity involved in choosing one accent over another in voice performance. While some singers sing in an accent different from their regular speaking accent, mostly because some musical genres are closely associated with specific dialects, other singers choose not to do so since they feel their accent is closely connected with their identity. Setter also discusses the thorny issue of regional accents in broadcasting in the UK, where RP is preferred, and the progress in hiring and retaining more female newsreaders and radio presenters, which has been slowed by accentism and voice gender prejudice. Finally, Setter addresses the need for actors to learn different accents to enhance their chances of being hired (and/or to avoid being typecast).

In Chapter 5, ‘Your voice is your witness: forensic speaker analysis in criminal investigations’, Setter provides a succinct but illuminating account of how voice analysis can be harnessed for forensic purposes. She discusses how auditory and acoustic analysis is performed to compare recordings and evaluate the likelihood that they belong to the same voice. This analysis, usually conducted by a team of specialists, examines ‘top down’ characteristics, including fillers, discourse markers, and forms of address, as well as intonation, voice quality (such as creaky or breathy voice), and background noise. Forensic acoustic analysis also considers ‘bottom up’ (fine-grained) characteristics, including specific consonant and vowel pronunciations. The chapter includes a brief discussion of the main characteristics of waveforms and spectrograms. Other topics covered in this chapter include voice parades (voice line-ups), speaker profiling, and evidence tampering.

Chapter 6, ‘Making a change: transgender speech and synthesized voices’, begins by considering how voice is tied to gender perception and identity in the case of transgender speakers. While male voices tend to average 100-150 Hz, female voices tend to have a higher pitch, around 200-300 Hz. Biologically, differences in larynx size account for this difference. More subtle differences are involved as well, including the ways in which some sounds are articulated. Transgender people, particularly transfeminine speakers, are commonly misgendered because of the way their voice sounds. While female-to-male transgender people can increase the size of their larynx by taking testosterone to achieve a lower average pitch, it is harder for male-to-female transgender people to decrease the size of their larynx, and thus transfeminine people tend to be misgendered more often. Setter interviews one transmasculine and three transfeminine English speakers; the latter report more frequent misgendering than the former, and also the use of a wider range of strategies beyond pitch lowering or raising to make their voices better represent their non-biological gender, including breathiness and creaky voice. This chapter also brings up why (most) people appear to dislike their voices, and provides some additional discussion of the importance of visual cues and voice recognition, also mentioned in Chapter 3. The second part of this chapter focuses on speech synthesis, in particular its application in providing a voice to people who cannot produce speech, as was the case for Stephen Hawking. Setter briefly covers some of the challenges involved in synthesized speech, including conveying the appropriate intonation in a given context, and also the need for more varied and more affordable synthetic voices that can be a good fit for speakers of diverse genders, ages, and dialects.

In the final chapter, ‘English voices, global voices’, Setter focuses on the English spoken in other parts of the world, addressing the aftermath of colonialism in the development of ‘Old varieties of English’, including North American, Australian, and New Zealand English, and ‘New varieties of English’, including Indian and Singapore English. Setter acknowledges the importance of ethnicity and race in the discussion of voice and accent, and brings to the discussion some recent controversies reflecting accentism, mostly having to do with American English. This short chapter is followed by a brief annotated bibliography for readers interested in learning more about the topics covered in this book. The book also includes a list of commented further reading, a list of figures and QR codes, and an epilogue. The companion site provides a list of the URL links for the QR codes mentioned throughout the book, and sound files and color waveform/spectrograms, mostly from Chapter 5.


Setter’s book provides an enjoyable, informative discussion of the relevance of our voice for our identity and how we perceive others. Many of the topics covered are very timely and are not typically covered in introductory sources on phonetics or linguistics. Setter focuses mostly on British accents, which she acknowledges could be considered a limitation. However, most of the topics covered in this book also apply to other dialects of English, and Setter does address some voice characteristics typically associated with Australian and/or American accents and points the reader to additional sources on other dialects throughout the book.

From the beginning, Setter incorporates media controversies and her own life experience into the discussion, which makes the volume relevant and engaging. Setter’s ample experience as a phonetician, media commentator, forensic expert, and musician enriches the discussion and contributes to making this book highly interdisciplinary. The book is clearly written and Setter comes across as a highly relatable and approachable author (she even provides her Twitter handle and invites readers to contact her to further discuss the book). The volume also includes innovative features, including QR codes that make some of the audio/video examples she refers to available to anybody with access to a cell phone. The links to recent websites, videos, and podcasts make the concepts, people, and examples in this book come alive and provide the reader with additional food for thought.

As a non-technical book intended for a general audience, the content and necessary background covered are more than adequate. As a linguist, however, I would have loved to see further elaboration on a few topics. For example, in Chapter 4, Setter points out that women are language innovators and that they will continue to be criticized for the way they speak, even as their innovative speech traits become widespread. I would have liked to see a bit more background on women as linguistic innovators. In addition, in Chapter 5, Setter introduces the first and second formants (F1 and F2) and mentions that sometimes the third and fourth formants (F3 and F4) are relevant in forensic speech analysis. More discussion of the role of higher formants would have been an excellent addition to the discussion of forensic speech analysis, particularly as F4 and higher formants in vowels are reported to be closely associated with the voice quality of individual speakers (Ladefoged & Johnson 2011:214). I would also like to point out two aspects that could be clarified or expanded upon. In Chapter 4, Setter mentions a phonemic contrast in General American English between the voiced labio-velar approximant /w/ and its voiceless counterpart in words spelled with ‘wh’ (pp. 90-91). This contrast has actually been on the decline for decades, and very few American English speakers maintain it nowadays (Edwards 1992:194-196; Wayland 2019:23). Finally, in Chapter 3 vocal fry is characterized as very slow vibration of the vocal folds, but the characterization of this voice quality is more complex and also tends to involve irregular vibration of the vocal folds (see for example Esling et al. 2020).

This book will be mostly relevant for readers interested in language and/or gender. Because of its focus on practical applications of phonetics and its discussion of voice discrimination, particularly regarding accentism, gender differences, and misgendering in transgender speech, it will also be of interest to instructors and students of introductory courses in general linguistics, phonetics, and sociolinguistics. I would like to end this review by referring the interested reader to Setter’s excellent Google talk on her book, available on YouTube and cited below.


Carolina Gonzalez is an Associate Professor of Spanish and Linguistics and coordinator of the Spanish and Portuguese Program at the Department of Modern Languages and Linguistics at Florida State University. Her research interests encompass phonetics, phonology, L2 phonological acquisition, the syntax-phonology interface, and constructed languages.

