LINGUIST List 35.1765

Thu Jun 13 2024

Review: A model of sonority based on pitch intelligibility: Albert (2023)

Editor for this issue: Justin Fuller <justinlinguistlist.org>

LINGUIST List is hosted by Indiana University College of Arts and Sciences.



Date: 14-Jun-2024
From: Christopher Geissler <cageissler42gmail.com>
Subject: Phonetics: Albert (2023)

Book announced at https://linguistlist.org/issues/34.2041

AUTHOR: Aviad Albert
TITLE: A model of sonority based on pitch intelligibility
SERIES TITLE: Studies in Laboratory Phonology
PUBLISHER: Language Science Press
YEAR: 2023

REVIEWER: Christopher Geissler

SUMMARY

The first three chapters of A model of sonority based on pitch intelligibility provide a general introduction (Ch. 1), as well as context on sonority (Ch. 2) and discrete vs. continuous models (Ch. 3). The review of past literature on sonority offers three main criticisms: that sonority-based concepts, particularly sonority slopes, have been over-generalized (§2.2.1); that sonority-based accounts fail to explain aspects of cross-linguistic typology (§2.2.2); and that sonority has lacked a consistent acoustic correlate (§2.2.3). The empirical generalizations identified as problematic are /s/-stop clusters, which are unexpectedly common, and the difference between two kinds of sonority plateaus: the more common low-sonority plateaus (e.g. /sf/) and the less common high-sonority plateaus (e.g. /nm/). In terms of theoretical architecture, Chapter 3 surveys literature on the interface between symbolic and continuous approaches, urging a renewed focus on perception (rather than production), as well as a principled separation of symbolic and dynamic models.

Chapter 4 outlines the proposed solution for these problems, “Perceptual regimes of repetitive sound (PRiORS)”, a framework for thinking about the linguistic role of sound patterns that recur at different rates. This synthesizes previous research in a novel presentation. Broadly speaking, patterns slower than 20 Hz are perceived as rhythm, while those above 20 Hz are perceived as pitch (including spectral structure). Very slow (<0.5 Hz) patterns are not perceived as rhythmic, while very fast (>5,000 Hz) patterns, where audible, are not clearly distinguished. Albert identifies the syllable duration and structure as a balance between being long enough to include sufficient periods for pitch perception, while still being short enough to be perceived rhythmically. As a consequence, the tension between discrete and continuous models is resolved by assigning phenomena to different perceptual systems according to the rate at which patterns recur.
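The rate boundaries summarized above can be written as a small lookup. This is only a reader's sketch of the figures quoted in this review, not code from the book or from ProPer: the function name and the regime labels are hypothetical, the treatment of exactly 20 Hz as pitch is an assumption, and sharp cutoffs simplify what are presumably gradual perceptual transitions.

```python
def perceptual_regime(rate_hz: float) -> str:
    """Map a repetition rate in Hz to a perceptual regime.

    Thresholds (0.5 Hz, 20 Hz, 5,000 Hz) are the ones quoted in the review.
    The labels and the handling of boundary values are assumptions.
    """
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    if rate_hz < 0.5:
        return "too slow for rhythm"
    if rate_hz < 20:
        return "rhythm"
    if rate_hz <= 5000:
        return "pitch"
    return "too fast to distinguish"


# Illustrative rates only: a slow repetition, a mid-range one, and a very fast one.
for rate in (0.2, 5, 120, 8000):
    print(rate, "Hz:", perceptual_regime(rate))
```

On this reading, a pattern repeating 5 times per second falls in the rhythm regime and one repeating 120 times per second falls in the pitch regime.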

The core proposal of the book, the Nucleus Attraction Principle (NAP), is outlined in Chapter 5 and presented with two implementations in Chapter 6. In this model, syllable nuclei are attracted to sonority peaks, which are defined in terms of periodic energy: the component of acoustic energy that comes from periodic, rather than aperiodic, sources. This redefinition approximates a standard sonority sequence, with vowels having the most periodic energy and voiceless obstruents the least. To implement this in a model of syllable well-formedness, the sonority of the syllable onset is calculated and compared to the sonority of the syllable nucleus. If the sonority of the onset is small enough relative to the sonority of the nucleus, the result is identified as one syllable. However, if the sonority of the onset is too large relative to the sonority of the nucleus, then two syllables are identified.

Two implementations of the NAP are presented: a symbolic “top-down” model and a continuous “bottom-up” model. In the top-down model, natural classes are assigned a numeric value (from 1 to 4) corresponding to abstracted periodic energy. For a C1C2V sequence, the difference in sonority (V - C1) is added to (C2 - C1); this sum measures the ability of the first consonant to compete as a syllable nucleus, and yields subtly different predictions from those of models based on sonority rises, falls, and plateaus. In the bottom-up model, the center of mass of periodic energy is calculated for the onset cluster and for the nucleus. If the distance between these two is too great, the CCV sequence will be parsed as two syllables.
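Both calculations can be sketched in a few lines, under stated assumptions. The review gives only the 1 to 4 range and the endpoints (vowels highest, voiceless obstruents lowest), so the two middle classes in the mapping below are hypothetical. The function names, the sampled-contour input format, and the `boundary_index` parameter are also illustrative and are not taken from Albert's code. No threshold is supplied, because the review does not report one.

```python
from typing import Sequence

# Hypothetical class values: only the endpoints (vowel = 4,
# voiceless obstruent = 1) follow from the review's description.
SONORITY = {
    "voiceless_obstruent": 1,
    "voiced_obstruent": 2,   # assumed
    "sonorant": 3,           # assumed
    "vowel": 4,
}


def nap_top_down(c1: str, c2: str, v: str = "vowel") -> int:
    """Top-down score for C1C2V as described in the review: (V - C1) + (C2 - C1)."""
    s1, s2, sv = SONORITY[c1], SONORITY[c2], SONORITY[v]
    return (sv - s1) + (s2 - s1)


def center_of_mass(times: Sequence[float], energy: Sequence[float]) -> float:
    """Energy-weighted mean time of a sampled periodic-energy contour."""
    total = sum(energy)
    if total <= 0:
        raise ValueError("contour has no periodic energy")
    return sum(t * e for t, e in zip(times, energy)) / total


def com_distance(times: Sequence[float], energy: Sequence[float],
                 boundary_index: int) -> float:
    """Distance between the onset and nucleus centers of mass.

    boundary_index marks the first sample of the nucleus (an assumed
    input; how the boundary is located is outside this sketch).
    """
    onset = center_of_mass(times[:boundary_index], energy[:boundary_index])
    nucleus = center_of_mass(times[boundary_index:], energy[boundary_index:])
    return nucleus - onset


# A low-sonority plateau (e.g. /sf/) and a high-sonority plateau (e.g. /nm/)
low = nap_top_down("voiceless_obstruent", "voiceless_obstruent")
high = nap_top_down("sonorant", "sonorant")
print(low, high)
```

Under this assumed mapping, the two plateaus receive different scores even though C2 - C1 is zero in both, because the V - C1 term differs. A slope-only measure would treat them alike.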

The next third of the book presents two studies that test the predictions of the NAP. The first (Chapter 7) presents the results of three perception experiments, in which German- and Hebrew-speaking participants were presented with nonce words featuring non-homorganic CC onsets of varying sonority. In a forced-choice task, speakers were simply asked to judge whether they heard one or two syllables, and their responses and response times were recorded. Sonority of the stimuli was calculated using the top-down and bottom-up NAP models as well as four traditional models of sonority. Of the six sonority models, the top-down NAP performed best in explaining the results, followed by the bottom-up NAP model. The experiments are complemented by a corpus study of Modern Hebrew Segholate nouns (Chapter 8), which investigates whether word-initial onset clusters in derived plurals are permitted or avoided (by vowel epenthesis). Again, the top-down NAP is found to be the best-performing model. Data from these studies are presented in appendices and are available in OSF repositories.

Finally, Chapter 9 describes ProPer, a toolkit for calculating periodic energy, taking phonetic measurements with it, and creating visualizations of periodic energy and F0. These are freely available online and implemented in popular free software (Praat and R). Chapter 10 offers a general conclusion, with suggestions for future work.

EVALUATION

The book is a valuable contribution to the phonological study of syllable structure. “Syllable” and “sonority” have proven to be useful terms yet difficult to define, and this book offers a refreshing perspective. The core theoretical proposals, PRiORS and the NAP, are simple ideas with subtle and interesting implications. The combination of corpus and experimental studies, along with computational implementations, provides a solid empirical grounding.

The book is also commendable for its adherence to open science practices, and readers are encouraged to examine and consider using materials from the component studies. Code and data for Chapter 7’s NAP study are available at https://osf.io/y477r/, while code for Chapter 8’s corpus study, along with a link to the data, is at https://osf.io/wuf3j/. The ProPer system is available for use by other researchers on GitHub (https://github.com/finkelbert/ProPer_Projekt) and OSF (https://osf.io/28ea5/). The documentation is accessible, and the toolkit uses freely available software that is familiar to many linguists and speech scientists (Praat, R, and RStudio).

For a highly quantitative work, Albert (2023) offers much to phonologists who work with categorical, symbolic representations. The distinction between top-down, symbolic NAP and bottom-up, dynamic NAP presents a subtle tension that speaks to deep divides within linguistics concerning the relationship between discrete and continuous representations. Albert states his own position in §6.1 and §10.3, arguing that the top-down NAP is an outcome of learning the bottom-up dynamics. Interestingly, the top-down model outperforms the bottom-up model in the perceptual experiment (§7.8), lending support to more categorical models. The success of the top-down model makes it an attractive alternative to the Sonority Sequencing Principle (SSP), and I encourage researchers to continue testing the predictions and examining the consequences that differ between the NAP and SSP.

In contrast to the explicitly-implemented NAP models, the PRiORS framework seems somewhat underdeveloped. There is an undeniable elegance to partitioning perceptual systems by rate of change, but it is unclear what this achieves. Perhaps this is a failure on the part of this reviewer’s imagination.

While quantitatively impressive, the actual success of the NAP as an improvement over the SSP is mixed. Among its successes is the NAP’s ability to distinguish between plateaus of high and low sonority, such as /mn/ vs. /fs/. It also succeeds at its goal of explaining why /s/-stop onset clusters are relatively common despite their falling sonority. Unfortunately, the same mechanism cannot account for differences in the behavior of different voiceless fricatives. Why would a language permit /s/-stop clusters but not /f/-stop clusters? Likewise, while the NAP can account for the moderately common /sf/ clusters, it does not predict why /sf/ should be more common than /fs/. Perhaps these questions should be addressed with a different kind of account, one not based on sonority, but shifting the locus of explanation offers a redefinition of the problem rather than a solution.

Redefinition, then, is a key theme of the volume. The NAP is an intuitive and attractive new formulation of sonority, one deserving of attention from phonologists and phoneticians. Foregrounding periodic energy likewise offers a potentially useful new tool in the study of prosody and phonotactics. ProPer produces attractive and thought-provoking visualizations, and along with PRiORS may spark creative developments as well. To the book’s credit, these shifts in perspective open new kinds of questions: What are the possible NAP threshold values, and how do they differ across languages? Might the periodic energy centers of mass relate to other syllable landmarks, such as C-centers (Browman & Goldstein 1988), P-centers (Barbosa et al. 2005), or jaw movement trajectories (e.g. Erickson et al. 1998)? Could the attractor landscape of the NAP be integrated with other dynamic approaches such as Dynamic Field Theory (Schöner & Spencer 2016)? Can the NAP help explain the typology of syllable codas? And of course, if /s/-stop sequences are common, why are /f/-stop sequences rare?

Overall, Albert (2023) is a well-written volume with substantial merits. The organization into relatively short, coherent chapters encourages engagement with specific empirical components as well as the major ideas. Different chapters will appeal to readers with specific interests, be it in symbolic phonology, acoustic phonetics, or a combination. I hope to see the ideas in this volume applied to other data sources and rigorously tested and debated. Individual chapters could also meaningfully contribute to course syllabi, either for high-level theory or as interesting case studies, and the linked software is accessible enough for use by students with some background in Praat and R.

REFERENCES

Barbosa, Plínio A., Pablo Arantes, Alexsandro R. Meireles & Jussara M. Vieira. 2005. Abstractness in speech-metronome synchronisation: P-centres as cyclic attractors. In Interspeech 2005, 1441–1444. ISCA.

Browman, Catherine P. & Louis Goldstein. 1988. Some Notes on Syllable Structure in Articulatory Phonology. Phonetica 45(2–4). 140–155.

Erickson, Donna, Osamu Fujimura & Bryan Pardo. 1998. Articulatory Correlates of Prosodic Control: Emotion and Emphasis. Language and Speech 41(3–4). 399–417.

Schöner, Gregor & John P. Spencer. 2016. Dynamic thinking: A primer on dynamic field theory. Oxford University Press.

ABOUT THE REVIEWER

Christopher Geissler received his Ph.D. in Linguistics from Yale University for his dissertation “Temporal articulatory stability, phonological variation, and lexical contrast preservation in diaspora Tibetan”. He served as a postdoctoral researcher in English Linguistics at Heinrich Heine University Düsseldorf, and is currently a Visiting Assistant Professor of Linguistics at Carleton College. His research focuses on speech timing, phonetic variation, and the relationship between articulatory phonetics and phonological representation. He values collaborating with students and improving pedagogy in linguistics.




Page Updated: 13-Jun-2024

