LINGUIST List 28.3142

Wed Jul 19 2017

Review: Linguistic Theories; Phonology: McCarthy, Pater (2016)

Editor for this issue: Clare Harshey

Date: 07-Mar-2017
From: Joshua Griffiths
Subject: Harmonic Grammar and Harmonic Serialism

EDITOR: John J. McCarthy
EDITOR: Joe Pater
TITLE: Harmonic Grammar and Harmonic Serialism
SERIES TITLE: Advances in Optimality Theory
PUBLISHER: Equinox Publishing Ltd
YEAR: 2016

REVIEWER: Joshua M. Griffiths, University of Texas

Reviews Editor: Robert A. Coté


Harmonic Grammar and Harmonic Serialism (ed. by John McCarthy and Joe Pater) collects research on two alternative theories of constraint-based grammars that have modified and expanded on the notions originally proposed in classic Optimality Theory (OT) (Prince & Smolensky 1993/2004), namely Harmonic Serialism (McCarthy 2000) and Harmonic Grammar (Legendre et al. 1990). While both theories discussed in this volume make key modifications to the core OT architecture, those modifications have resulted in two drastically different theories. What each theory modifies is made explicit in the preface: “[Harmonic Serialism] questions the choice of parallel over serial evaluation, while [Harmonic Grammar] questions the assumption that constraints are ranked rather than weighted” (McCarthy & Pater 2016: viii). This collection is divided into three parts. The first part consists of two chapters authored by the editors, who provide an accessible introduction to both theories at the core of this volume and lay the foundation for the more technical chapters that follow in the other two parts. Part II, entitled “Analysis and Typology,” makes up the majority of the volume and focuses primarily on the typological and theoretical predictions that these theories make, including a chapter analyzing the implications of overlap between the two theoretical frameworks. The final part of the book focuses on computational learning, and while this part may be the least accessible to most linguists, it does provide a basis for investigating the psychological reality of these two theories by relating them back to the connectionist tradition from which classic-OT originated.

Chapter 1, “Universal grammar with weighted constraints,” by Joe Pater, introduces Harmonic Grammar (HG) and is structured primarily as a comparison between classic-OT and its predecessor HG. Through these comparisons, Pater argues for adopting weighted constraints instead of the rankings that have been common in phonological analysis since the introduction of classic-OT, while still leaving open outstanding issues and questions for further research within this particular framework. Pater begins by arguing against Prince & Smolensky’s (1993/2004) initial claim that weighted constraints may predict typologically implausible patterns. Pater ultimately puts forth two main reasons why a theory of weighted constraints is preferable to theories of strict domination. The first is that HG allows cumulative constraint interaction (or “gang effects”) as well as asymmetric trade-offs between constraints, which sets it apart from theories with similar predictions (i.e. OT with locally conjoined constraints, on which see Smolensky 2006). Second, Pater argues that HG allows the use of scalar constraints, which have been known to undergenerate in classic-OT. Pater concludes by extending the pros and cons of both HG and OT to serial versions of constraint-based grammars similar to Harmonic Serialism as well as to probabilistic grammars similar to MaxEnt grammar (Goldwater & Johnson 2003), while discussing the necessity of computational tools in weighted constraint analysis. Overall, this chapter’s introduction to theories of weighted constraints is both easily accessible and very thorough, providing the necessary background for the half of the book dedicated to theories of weighted constraints.
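
For readers who want a concrete picture of a gang effect, the following is a minimal sketch rather than anything taken from Pater's chapter: the constraint names, weights, and violation profiles are all invented for illustration. It assumes that HG harmony is the negative weighted sum of violations and that strict domination can be modeled as a lexicographic comparison of violation vectors.

```python
# Hypothetical constraint set, listed from highest- to lowest-ranked.
CONSTRAINTS = ["FAITH", "MARK_A", "MARK_B"]

# Hypothetical weights: FAITH outweighs each markedness constraint alone,
# but not the two of them together.
WEIGHTS = {"FAITH": 3.0, "MARK_A": 2.0, "MARK_B": 2.0}


def harmony(violations, weights):
    """HG harmony: the negative weighted sum of violation counts."""
    return -sum(weights[c] * violations.get(c, 0) for c in CONSTRAINTS)


def hg_winner(candidates, weights):
    """Weighted evaluation: the candidate with the highest harmony wins."""
    return max(candidates, key=lambda name: harmony(candidates[name], weights))


def ot_winner(candidates, ranking):
    """Strict domination: compare violation vectors lexicographically."""
    return min(
        candidates,
        key=lambda name: tuple(candidates[name].get(c, 0) for c in ranking),
    )


# The faithful candidate violates one markedness constraint ...
single = {"faithful": {"MARK_A": 1}, "repaired": {"FAITH": 1}}
# ... or both of them at once.
double = {"faithful": {"MARK_A": 1, "MARK_B": 1}, "repaired": {"FAITH": 1}}

print(hg_winner(single, WEIGHTS), hg_winner(double, WEIGHTS))
print(ot_winner(single, CONSTRAINTS), ot_winner(double, CONSTRAINTS))
```

With these invented numbers, the weighted grammar tolerates a single markedness violation but repairs the form that incurs both, whereas the ranked grammar keeps the faithful candidate in both cases; that difference is the cumulative interaction the chapter discusses.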

Chapter 2, “The Theory and Practice of Harmonic Serialism,” by John J. McCarthy, serves as an introductory overview of Harmonic Serialism (HS). McCarthy begins by laying out the core differences between the architectures of HS and classic (or parallel) OT, contending that the key difference is the power of the generative component (GEN) in each theory. In parallel-OT, GEN is unrestricted, meaning that any number of changes can be made from the input to a potential candidate. In HS, on the other hand, GEN is restricted and only one change can be made at a time, requiring a loop function. This ultimately leads to a serial grammar that is in some ways reminiscent of rule-based derivational phonology, which was the dominant framework for so long. The chapter continues by outlining the differences and similarities between HS and parallel-OT, even including a section on how to construct an analysis in this particular framework. The final sections of the chapter argue for the use of HS over parallel-OT, focusing primarily on HS’s implications for phonological opacity. This chapter introduces the reader to McCarthy’s newest iteration of constraint-based grammar in a clear and explicit fashion, drawing on a wide range of cross-linguistic data, including classic case studies of epenthesis in Cairene Arabic and the complex interactions observed in Yawelmani.
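
The GEN-EVAL loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not McCarthy's analysis: forms are strings of C and V, GEN may insert one V or delete one segment per step, the ranking *CC >> MAX >> DEP is assumed for the example, faithfulness is assessed relative to the input of the current step, and ties are broken arbitrarily by string order.

```python
def gen(form):
    """Restricted GEN: the faithful candidate plus every form that differs
    from the current input by exactly one insertion or one deletion."""
    candidates = [(form, "none")]
    for i in range(len(form) + 1):
        candidates.append((form[:i] + "V" + form[i:], "insert"))
    for i in range(len(form)):
        candidates.append((form[:i] + form[i + 1:], "delete"))
    return candidates


def violations(candidate):
    """Violation vector under the assumed ranking *CC >> MAX >> DEP."""
    form, operation = candidate
    clusters = sum(1 for a, b in zip(form, form[1:]) if a == b == "C")
    return (clusters, int(operation == "delete"), int(operation == "insert"))


def harmonic_serialism(underlying, max_steps=20):
    """Feed each winner back into GEN until the faithful candidate wins."""
    derivation = [underlying]
    for _ in range(max_steps):
        current = derivation[-1]
        # Ranked evaluation; the string itself is only an arbitrary tie-breaker.
        winner = min(gen(current), key=lambda cand: (violations(cand), cand[0]))
        if winner[1] == "none":  # convergence: no single change improves harmony
            break
        derivation.append(winner[0])
    return derivation


print(harmonic_serialism("CCCV"))
```

In this toy grammar, breaking up a three-consonant cluster takes two passes through the loop, one epenthetic vowel at a time, where a parallel evaluation would have to consider the doubly epenthesized candidate directly.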

Chapter 3, “Cross-level Interactions in Harmonic Serialism,” by John J. McCarthy, Joe Pater, and Kathryn Pruitt, opens Part II, which focuses on the use of HS and/or HG for linguistic and typological analysis. As the chapter’s title suggests, the focus is HS, particularly how HS handles what the authors refer to as cross-level interactions (CLIs) and, more importantly, how CLIs serve as supporting evidence for HS. McCarthy et al. define a CLI as a generalization in which a phonological process must span more than one level of the prosodic hierarchy. The authors highlight that CLIs have been used as evidence against serial theories of grammar (cf. Kager 1999; McCarthy 2002; Pater 2000). McCarthy et al. make it clear, however, that these arguments do not hold for HS for two primary reasons: “violation of the surface-true” and “full availability of structural operations.” The chapter continues by discussing how past theories of phonology such as rule+constraint theories as well as parallel-OT have handled CLIs. Addressing a variety of cross-linguistic issues such as foot construction in Hixkaryana and the interaction of stress and syllable weight in Latin, the authors then flesh out the two aforementioned arguments as a defense of HS. The authors conclude by presenting some of the limits of CLIs in HS as well as HS’s overall strengths in handling CLIs.

Chapter 4, “Parallelism vs. Serialism, or Constraints vs. Rules? Tongan Stress and Syllabification revisited,” by Minta Elsman, expands upon the comparison of parallel grammars versus serial grammars first discussed by Prince & Smolensky (1993/2004). Following the logic of argumentation introduced by McCarthy et al. in Chapter 3, Elsman turns Prince & Smolensky’s original argument for parallel-OT into support for HS. Elsman first discusses previous analyses of Tongan stress and syllabification before proposing an HS analysis of the same data. Elsman’s analysis proposes three steps: the syllabification of V.V sequences, followed by the construction of a stressed foot at the right edge of a word, and finally a “fusion” in which one syllable is deleted and segments are reassigned to other syllables. Importantly, syllabification is a multi-step process, built serially in steps that each increase the harmony of the winning candidate. Elsman concludes by stating that, even though she and others argue for HS over parallel-OT, the key issue is not whether the theory is serial or parallel. Rather, her primary concern is that constraint-based formalisms are superior to rule-based formalisms, and that any argument in support of one theory over another must identify which formal differences between the theories favor one over the other.

Chapter 5, “Serial Restrictions on Feature/Stress Interactions,” by Robert Staubs, provides a clear example of how HS is able to shed light on interactions between segmental phonology and prosody (stress), comparing his proposed HS analysis of sonority-driven stress assignment in some toy data with the predictions made by a parallel-OT analysis of the same phenomena. He argues that the gradualness of HS prevents the overgeneration that can result from a parallel model such as classic-OT. Staubs’ argument centers primarily on the use of positional markedness constraints, which have been argued to overgenerate in parallel-OT (cf. de Lacy 2002). Staubs ultimately argues that the restrictiveness of HS that stems from its gradualness requirement is a strength, not a weakness as has traditionally been argued, since HS cannot overgenerate in cases where parallel-OT might.

Aside from Pater’s introductory chapter, Chapter 6, “Positional Constraints in Optimality Theory and Harmonic Grammar,” by Karen Jesney, is the first chapter in the volume that focuses primarily on Harmonic Grammar. Rich in data, Jesney’s chapter discusses the typological predictions made by positional constraints in both classic-OT and HG. In order to make proper typological predictions, Jesney argues, classic-OT requires the use of both positional markedness constraints (cf. Itô, Mester, and Padgett 1995) and positional faithfulness constraints (cf. Beckman 1997). Prior research, however, has argued that positional faithfulness constraints can generate opaque and non-local patterns that are not typologically sound (Jesney 2011). Jesney ultimately argues that, through the use of cumulative constraint interaction, positional faithfulness constraints are not necessary in a theory of weighted constraints such as HG. Without positional faithfulness constraints, Jesney argues, HG is able to produce a more restrictive typology with a more general constraint set and to avoid the highly specific constraints that have often been a point of criticism of constraint-based theories of grammar.

One key component of parallel-OT is that constraints must be defined negatively, assigning violations as opposed to rewards, in order to avoid the “Infinite Goodness problem.” In Chapter 7, “Positive Constraints and Finite Goodness in Harmonic Serialism,” Wendell Kimper argues that Infinite Goodness is not a problem for HS and that positively defined constraints are feasible within HS, albeit with some limitations (e.g. HS’s weakened GEN function limits the power of these constraints). He also argues that positive constraints are exceptionally useful and desirable in defining autosegmental spreading. Kimper concludes by arguing that a CON consisting only of positively defined constraints would generate various pathologies by not counting segments, and that these constraints are properly implemented within a CON that contains both positively defined and negatively defined constraints.

Chapter 8, “Contexts for Epenthesis in Harmonic Serialism” by Claire Moore-Cantwell, takes up the concern that classic parallel OT’s capacity to produce multiple repairs at once has the tendency to overgenerate, a problem she refers to as the “too-many-solutions problem.” In order to resolve this issue in HS, Moore-Cantwell posits a restriction on epenthesis, namely that it can be used as a repair strategy for syllable-structure and segmental markedness, but not to resolve issues of metrical markedness, including clashes, lapses, and non-binary feet. Furthermore, she proposes that epenthesis must occur in different steps of the derivation from syllable or foot building, and that it must always satisfy Selkirk’s EXHAUSTIVITY constraint, which penalizes any skipping of levels in the prosodic hierarchy. Moore-Cantwell’s argument ultimately describes the typology of the environments in which epenthesis can occur as well as what can be undertaken to avoid particular pathologies in HS.

Continuing the discussion of epenthesis, Chapter 9, “Stress-Epenthesis Interactions in Harmonic Serialism” by Emily Elfner, is concerned with the serial representation and analysis of prosody and epenthesis, primarily how vowel epenthesis disturbs a language’s preferred stress pattern. Elfner argues that HS is particularly adept at handling this complicated issue since stress assignment and vowel epenthesis are two distinct phenomena. Elfner looks at both transparent and opaque instances of this interaction in a variety of languages, including Dakota, Swahili, and Levantine Arabic. Of particular interest here is how Elfner combines aspects of both theoretical frameworks discussed in this volume: she frames her analysis of Levantine Arabic in Serial HG, a serial evaluation of weighted constraints. Also of interest is the comparison she draws between her HS analysis and how other serial constraint-based grammars such as Stratal OT (Kiparsky 2000) and OT with candidate chains (McCarthy 2007) would handle the same phenomena, further strengthening this volume’s focus on highlighting the strengths of HS.

Chapter 10, “Compensatory and Opaque Vowel Lengthening in Harmonic Serialism” by Francesc Torres-Tamarit, begins by highlighting the inadequacies of classic-OT in the study of compensatory lengthening (CL) as well as double flop. Since CL is a mora-preserving process, classic-OT cannot feasibly select the proper output because the deletion of the mora-bearing coda consonant would counterbleed the application of weight-by-position, meaning that CL is inherently opaque. Torres-Tamarit therefore argues that approaching CL through the lens of HS solves the opacity problem, provided that syllabification is built gradually in HS and that the deletion of a coda consonant is thought of as a two-step process. Torres-Tamarit’s analysis works well for instances of classic CL as well as instances of double flop, lending credence to both of these assumptions.

Chapter 11, “Cyclicity and Non-Cyclicity in Maltese: Local Ordering of Phonology and Morphology in OT-CC” by Matthew Wolf, reconsiders the classic example of Maltese morphophonological alternations and syncope that lead to cyclic stress. In Maltese, vowels in unstressed open syllables are typically deleted; however, the syncope process does not always occur in pronominal suffixes. It therefore follows that the vowels that cannot syncopate must be stressed earlier in a phonological cycle, blocking the syncope process. Interestingly, Wolf frames his analysis in neither Harmonic Serialism nor Harmonic Grammar, but rather in Optimality Theory with Candidate Chains (OT-CC), a theory closely related to HS (McCarthy 2007). Like HS, OT-CC posits multi-step derivations from the input to the output, but HS focuses on the competition between the candidates themselves, whereas OT-CC frames the competition as one between the derivations (the ‘chains’). Wolf shows that Maltese stress is much more complicated than had initially been argued, finding that opaque cases of cyclic stress are easily captured in OT-CC. He concludes by arguing for the use of OT-CC over Stratal OT (Kiparsky 2000). Although OT-CC is different from HS, Wolf’s chapter is a welcome addition to this volume, introducing readers to another serial theoretical cousin of classic-OT.

Chapter 12, “Learning Serial Constraint-Based Grammars” by Robert Staubs and Joe Pater, opens the third part of this book, which is focused on questions of learning and learnability regarding these two theories. Staubs and Pater focus on two questions. The first concerns the learnability of hidden structure. In the case of HS and other serial theories of grammar, the steps between the underlying and surface forms are hidden. To address this question, Staubs and Pater consider a toy dataset with an opaque stress-epenthesis interaction, using MaxEnt grammar (Goldwater and Johnson 2003) to define the probability distribution over the candidates. Using these probabilities, the grammar assigns each surface form a probability equal to the sum of the probabilities of the derivations that lead to it. The learner is successful, but the question of how the learner is able to calculate probabilities over infinitely many possible paths is still open. Staubs and Pater ultimately propose that derivations can be thought of as Markov chains. The second question Staubs and Pater address is the learning of variation. In order to do so, they apply the model developed in the first half of the chapter to a dataset of French variable schwa deletion and epenthesis. The learner is also successful in describing this dataset. Staubs and Pater leave open potential areas for future research. They state that their new methodology allows for the comparison of serial and parallel grammars. Furthermore, they state that a gradual learning algorithm in this framework still needs to be developed. Although this chapter is more technical and less accessible than most of the preceding chapters, the methodology introduced is promising, combining the strengths of HG and HS.
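
The idea that a surface form's probability is the sum over the derivations leading to it can be illustrated with a small sketch. Everything in it is assumed for the example: the forms, the two unnamed constraints, their weights, and above all the finite, acyclic candidate graph, which sidesteps the problem of infinitely many paths that the chapter handles with Markov chains.

```python
import math
from collections import defaultdict

# Hypothetical weights for two unnamed constraints.
WEIGHTS = (2.0, 1.0)

# Hypothetical candidate sets: current form -> {candidate: violation vector}.
# A candidate identical to the current form means the derivation converges.
STEPS = {
    "/in/": {"/in/": (2, 0), "a": (1, 1), "b": (1, 0)},
    "a": {"a": (1, 0), "out": (0, 1)},
    "b": {"b": (1, 0), "out": (0, 0)},
    "out": {"out": (0, 0)},
}


def step_probabilities(form):
    """MaxEnt: each candidate's probability is proportional to exp(harmony)."""
    scores = {
        cand: math.exp(-sum(w * v for w, v in zip(WEIGHTS, viol)))
        for cand, viol in STEPS[form].items()
    }
    total = sum(scores.values())
    return {cand: score / total for cand, score in scores.items()}


def surface_probabilities(form, path_prob=1.0, result=None):
    """Sum the probabilities of all derivations that converge on each form."""
    result = defaultdict(float) if result is None else result
    for cand, p in step_probabilities(form).items():
        if cand == form:
            result[form] += path_prob * p  # this derivation stops here
        else:
            surface_probabilities(cand, path_prob * p, result)
    return result


for surface, prob in surface_probabilities("/in/").items():
    print(surface, round(prob, 3))
```

Here the form "out" can be reached through either intermediate form, so its probability pools two derivations, while the intermediate steps themselves remain hidden from anyone who only observes the surface forms.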

Chapter 13, “Convergence properties of a Gradual Learning Algorithm for Harmonic Grammar” by Paul Boersma and Joe Pater, first defines what the authors see as the central research question in the domain of learnability in generative linguistics: “Is a given learning algorithm guaranteed to converge on a grammar that is correct for any language defined by a given theory of grammar?” (p. 389). They immediately differentiate this from the goals defined by language acquisitionists, meaning that although this chapter is concerned with questions of (machine) learnability, it is not concerned with the human ability to acquire language, despite the fact that these two lines of research are closely related. This chapter is primarily concerned with developing and describing an online gradual learning algorithm for HG (HG-GLA). The first section of the chapter introduces HG and the typical learning algorithm, describing in detail the steps the learner takes in modifying the weights of a constraint set. The following section mathematically formalizes the proposed HG-GLA and outlines how the perceptron convergence proof applies to it. Boersma and Pater show that the HG-GLA can also be applied to variants of HG including exponential, probabilistic, and noisy models of HG. They also attempt to apply the HG-GLA to stochastic OT, but the learner would occasionally fail to converge. The chapter concludes by showing how the HG-GLA could be applied to situations in which learners do not have access to the full structure of the learning data (i.e. hidden structures). The results in all of these learning situations further support HG’s effectiveness as a model of generative grammar.
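
A perceptron-style, error-driven weight update of the general kind discussed in this chapter can be sketched as follows. This is a generic illustration, not Boersma and Pater's exact formulation: the training data, the learning rate, the zero initial weights, the decision to count ties as errors, and the absence of noise or of any non-negativity restriction on weights are all assumptions made for the example.

```python
def harmony(violations, weights):
    """HG harmony: the negative weighted sum of violation counts."""
    return -sum(w * v for w, v in zip(weights, violations))


def hg_gla(data, n_constraints, rate=0.5, max_epochs=100):
    """Online, error-driven learner. `data` is a list of (candidates, observed)
    pairs, where `candidates` maps each form to its violation tuple."""
    weights = [0.0] * n_constraints
    for _ in range(max_epochs):
        errors = 0
        for candidates, observed in data:
            rivals = [c for c in candidates if c != observed]
            rival = max(rivals, key=lambda c: harmony(candidates[c], weights))
            # Assumption: a tie with the observed form counts as an error.
            if harmony(candidates[rival], weights) >= harmony(candidates[observed], weights):
                errors += 1
                for i in range(n_constraints):
                    # Raise weights of constraints the rival violates more,
                    # lower those the observed form violates more.
                    weights[i] += rate * (candidates[rival][i] - candidates[observed][i])
        if errors == 0:  # a full pass without errors: the learner has converged
            break
    return weights


# Hypothetical gang-effect language over (FAITH, MARK_A, MARK_B).
DATA = [
    ({"faithful": (0, 1, 0), "repaired": (1, 0, 0)}, "faithful"),
    ({"faithful": (0, 0, 1), "repaired": (1, 0, 0)}, "faithful"),
    ({"faithful": (0, 1, 1), "repaired": (1, 0, 0)}, "repaired"),
]

print(hg_gla(DATA, n_constraints=3))
```

Because a set of weights that separates every observed form from its rival exists for this toy language, the learner settles on weights in which FAITH outweighs each markedness constraint individually but not their sum, which is the kind of outcome a perceptron-style convergence result leads one to expect.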


Although this volume is the latest installment in Equinox’s Advances in Optimality Theory series, it differs greatly from its predecessors since it does not address OT itself, but rather two of its theoretical cousins. It does, however, fit nicely into this series, since it serves as an introduction to the core principles of both HS and HG for those who already have a relatively firm understanding of Optimality Theory and other theories of constraint-based phonology. Many of the discussions in this volume relate HG and HS to OT and draw out the similarities and contrasts among these three theoretical frameworks.

McCarthy and Pater are to be commended for the logical structure of the book as well as the breadth of the information presented in this volume. The first part of the book (“Introductions to Harmonic Grammar and Harmonic Serialism”), which introduces the core tenets of the two theories in question, is effective in its stated goals, laying the groundwork for the reader, who may be unfamiliar with these topics, to understand the chapters in Part II. One minor note, however, is that Chapter 1, which approachably introduces HG as well as other theories and issues in theories of weighted constraints, may well benefit from further discussion comparing the advantages and disadvantages of the theories of weighted constraints addressed in the chapter, as well as when one theory may be more practical than the other. Despite this, both Chapters 1 and 2 would be effective texts in introductory phonology coursework, primarily at the graduate level.

Not all issues are equally represented and discussed in this volume. Part II, which makes up the bulk of the volume, illustrates how both HS and HG can be used to address many issues that have proven difficult for standard Optimality Theoretic analyses. The studies presented in Part II can serve as templates for how to effectively employ HS or HG (or a hybrid of both theories, as presented in Chapter 9). Moving on to Part III, however, there is little discussion of learning and learnability, which have been argued to be among the tenets at the core of HG and other theories of weighted constraints. In addressing the importance of learning as it pertains to theories of weighted constraints, Pater (2009: 1000) has argued that “theories of language learning and processing are being used increasingly in the explanation of typological generalizations.” Despite the focus on learning and learnability in theories of weighted constraints, questions related to them are not well represented in this volume. Although both chapters in Part III are informative, and Chapter 12 effectively ties together both theories of interest in this volume, it seems that this part does not fit cohesively with Part II. Part III is much more complex and technical than the preceding chapters, since it requires knowledge of the basic principles of machine learning.

Each chapter explicitly states avenues of further research relating to these theoretical frameworks, allowing other phonologists to explore questions of interest as they pertain to both HS and HG. Aside from the more technical underpinnings presented in Part III, the volume is very readable for most phonologists, as there is a constant tone of contrast with standard OT in all of the chapters. In the end, this volume is a welcome introduction to key questions related to Harmonic Grammar and Harmonic Serialism. Since both HG and HS are in their incipient stages, this book should be welcomed by both seasoned phonologists and students of phonology.


Beckman, Jill. 1997. Positional faithfulness, positional neutralization, and Shona vowel harmony. Phonology 14 (1): 1-46.

de Lacy, Paul. 2002. The Formal Expression of Markedness, Doctoral dissertation, University of Massachusetts Amherst. Amherst, MA: GLSA Publications.

Goldwater, Sharon and Mark Johnson. 2003. Learning OT constraint rankings using a maximum entropy model. In Jennifer Spenader, Anders Eriksson and Östen Dahl (eds), Proceedings of the Stockholm Workshop on ‘Variation within Optimality Theory’, 111-120. Stockholm: Stockholm University.

Itô, Junko, Armin Mester, and Jaye Padgett. 1995. Licensing and underspecification in Optimality Theory. Linguistic Inquiry 26: 571-614.

Jesney, Karen. 2011. Positional faithfulness, non-locality, and the Harmonic Serialism solution. In Suzi Lima, Kevin Mullin and Brian Smith (eds.), Proceedings of the 39th Meeting of the North East Linguistic Society, 403-416. Amherst, MA: GLSA. [ROA-1018].

Kager, René. 1999. Optimality Theory. Cambridge: Cambridge University Press.

Kiparsky, Paul. 2000. Opacity and cyclicity. The Linguistic Review 17: 351-367.

Legendre, Géraldine, Yoshiro Miyata, and Paul Smolensky. 1990. Can connectionism contribute to syntax? Harmonic Grammar, with an application. In M. Ziolkowski, M. Noske, and K. Deaton (eds), Proceedings of the 26th Regional Meeting of the Chicago Linguistic Society, 237-252. Chicago, IL: Chicago Linguistic Society.

McCarthy, John J. 2002. A Thematic Guide to Optimality Theory. Cambridge: Cambridge University Press.

McCarthy, John J. 2007. Hidden Generalizations: Phonological Opacity in Optimality Theory. London: Equinox.

Pater, Joe. 2000. Nonuniformity in English secondary stress: The role of ranked and lexically specific constraints. Phonology 17: 237-274.

Pater, Joe. 2009. Weighted constraints in generative linguistics. Cognitive Science 33: 999-1035.

Prince, Alan and Paul Smolensky. 2004. Optimality Theory: Constraint interaction in generative grammar. Malden, MA and Oxford: Blackwell. [Revision of 1993 technical report, Rutgers University Center for Cognitive Science.] Available on the Rutgers Optimality Archive, ROA-537.

Smolensky, Paul. 2006. Optimality in phonology II: Harmonic completeness, local constraint conjunction, and feature domain markedness. In Paul Smolensky and Géraldine Legendre (eds.), The Harmonic mind: From neural computation to optimality theoretic grammar, vol. 2: Linguistic and philosophical implications, 27-160. Cambridge, MA: MIT Press.


Joshua M. Griffiths is a Ph.D. student in the Department of French & Italian at the University of Texas at Austin. He is primarily interested in questions pertaining to phonology, cognitive science, and second language acquisition. He is particularly interested in phonological variation and the nature of the phonological representation of variable structures.
