LINGUIST List 12.2550

Fri Oct 12 2001

Review: Dekkers, et al, Optimality Theory

Editor for this issue: Terence Langendoen

What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in.

If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Simin Karimi at or Terry Langendoen at


  • Ash Asudeh, Review of Dekkers et al., Optimality Theory

    Message 1: Review of Dekkers et al., Optimality Theory

    Date: Tue, 9 Oct 2001 14:40:02 -0700 (PDT)
    From: Ash Asudeh <asudehcsli.Stanford.EDU>
    Subject: Review of Dekkers et al., Optimality Theory

    Dekkers, Joost, Frank van der Leeuw, and Jeroen van de Weijer, eds. (2000) Optimality Theory: Phonology, Syntax, and Acquisition. Oxford University Press, paperback ISBN 0-19-823844-4, $45.00, x+635pp.

    Reviewed by Ash Asudeh, Department of Linguistics, Stanford University.

    OVERVIEW This book consists of an introductory chapter by the editors and Paul Boersma, 16 papers, and three indexes (subject, language, name). The papers are divided into four areas: prosodic representation (4 papers), segmental phonology (3 papers), syntax (5 papers), and acquisition (4 papers).

    The introduction is a compact, but useful, overview of Optimality Theory (OT; Prince and Smolensky, 1993). The editors plus Boersma spend only a few pages discussing phonology, on the assumption that this is the best known domain for OT analyses. They discuss syntax somewhat more extensively, using principally Pesetsky (1998) and Grimshaw (1997) for exposition. The topic of acquisition takes up about half the introduction. There is an extensive comparison of OT learning algorithms to Principles and Parameters learning algorithms (particularly the Trigger Learning Algorithm of Gibson and Wexler (1994) and Berwick and Niyogi's (1996) refinement of it), based on Pulleyblank and Turkel's (1996) discussion of Tongue Root Harmony languages. Throughout the introduction, the editors tie in the contents of the volume as appropriate.

    The volume proper begins with the section on "Prosodic Representation" and Burzio's paper, 'Cycles, Non-Derived Environment Blocking, and Correspondence'. Burzio argues for output-output Correspondence (Burzio, 1994; Benua, 1995), whereby there is a faithfulness relation between morphologically related output forms. He argues that this dispenses with the notion of 'Underlying Representation'.

    Hayes ('Gradient Well-Formedness in Optimality Theory') considers the problem of gradient grammaticality (Schütze, 1996; Keller, 2000; Pullum and Scholz, 2001). He proposes a model in which constraints are associated with strictness bands. This is clearly related to the stochastic constraint evaluation and continuous constraint ranking of the Boersma contribution to the volume, and indeed many of the results in this article are incorporated in Boersma and Hayes (2001). In that article, as in this one, gradience is related directly to frequency, a view challenged by Keller (2000).

    Kager ('Stem Stress and Peak Correspondence in Dutch') considers the question of word-level stress assignment in Dutch. He argues that a Correspondence Theory account (McCarthy and Prince, 1995, 1999) is conceptually and empirically superior to a Lexical Phonology alternative.

    McCarthy's paper ('Faithfulness and Prosodic Circumscription') also concerns Correspondence Theory. McCarthy argues, principally from reduplication and infixation data, that a theory of prosodic faithfulness, which is independently required, eliminates the need for operational prosodic circumscription (McCarthy and Prince, 1990).

    Jacobs and Gussenhoven ('Loan Phonology: Perception, Salience, the Lexicon and Optimality Theory') begin the section on "Segmental Phonology". They consider loanword phonology, especially in Cantonese. Their paper is based largely on Silverman (1992) and Yip (1993). Unlike these papers, they argue that the notion of 'phonetic salience' is unnecessary and that a pure OT grammar makes all the necessary distinctions, so long as one assumes Smolensky's (1996) interpretive parsing and lexicon optimization (Prince and Smolensky, 1993).

    The paper by LaCharité and Paradis ('Derivational Residue: Hidden Rules in Optimality Theory') also concerns loanword phonology. They argue for a Correspondence Theory of faithfulness rather than a Containment Theory (Prince and Smolensky, 1993). Based on this conclusion, they argue that OT has a crucial derivational core, in GEN, the mapping from inputs to candidates.

    Smith ('Dependency Theory Meets OT: A Proposal for a New Approach to Segmental Structure') presents a model in which Dependency Phonology characterizes the GEN and candidate set in an OT model.

    The "Syntax" section starts with the paper by Ackema and Neeleman ('Absolute Ungrammaticality'). They tackle the problem of ineffability (an input that should have no output). They propose that such cases all involve selection of the null parse as the optimal candidate. Although the null parse fails PARSE constraints, which require morphological/featural information in the input to be realized in the output, contentful candidates violate higher-ranked constraints such as EPP or PROJECT. In order for their proposal to generalize, A&N must assume that there is a series of evaluations, each containing exactly one parse constraint (starting with the highest-ranked one and so on down the hierarchy), where evaluation n+1 takes the output of evaluation n as its input.

    Anderson ('Towards an Optimal Account of Second-Position Phenomena') argues that 1) second position phenomena do indeed target the second position (it is not an epiphenomenon) and 2) second position clitics and verb second (V2) are related phenomena.

    Bresnan introduces the possibility of using Lexical Functional Grammar (LFG; Bresnan, 1982, 2001; Dalrymple, 2001) in OT syntax to characterize GEN and the candidate set, and to provide the set of formatives to which the constraints make reference. She shows how GEN can be characterized monostratally rather than derivationally, such that there is parallel and possibly imperfect correspondence between multiple grammatical representations and she uses the system to recast Grimshaw's (1997) pioneering OT syntax analysis. OT-LFG is explored further in other recent papers by Bresnan, and in the recent volume edited by Sells (2001).

    Legendre ('Morphological and Prosodic Alignment of Bulgarian Clitics'), like Anderson, argues that clitics are not syntactic elements, but are rather phrasal affixes that are the morphological realization of functional features. Also like Anderson, she argues that this characterization explains the fact that clitics orient to edges. She proposes a characterization of Bulgarian second-position clitics that uses certain alignment constraints with syntactic domains and others with prosodic domains.

    Boersma ('Learning a Grammar in Functional Phonology') begins the final section, "Acquisition". He presents what is essentially a condensed version of his thesis and book (Boersma, 1998). In his model, constraint ranking is continuous, with constraints assigned numerical rankings rather than hierarchical positions. Evaluation is stochastic: there is a slight probabilistically-conditioned reranking of constraints at evaluation time. Another major contribution of this paper is a model in which constraints are not innate, but are rather grounded functionally and learned from articulatory and acoustic data. Boersma provides a detailed example: the learning of Wolof tongue root harmony.
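
    The combination of continuous ranking and stochastic evaluation described above can be illustrated with a minimal sketch. It is not Boersma's implementation: the constraint names C1 and C2, the ranking values, the noise level, and the two candidates are all hypothetical, and Gaussian noise is assumed for the perturbation.

```python
import random

def noisy_ranking(ranking_values, noise_sd, rng):
    """Order constraints by ranking value plus Gaussian noise (highest first)."""
    noisy = {c: v + rng.gauss(0.0, noise_sd) for c, v in ranking_values.items()}
    return sorted(noisy, key=noisy.get, reverse=True)

def evaluate(candidates, ranking_values, noise_sd=2.0, rng=random):
    """Return the optimal candidate under one noisy sampling of the ranking."""
    order = noisy_ranking(ranking_values, noise_sd, rng)

    # Strict domination: violation profiles are compared lexicographically,
    # starting from the highest-ranked constraint.
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in order)

    return min(candidates, key=profile)

# Hypothetical grammar: C1 is ranked slightly above C2.
ranking_values = {"C1": 100.0, "C2": 98.0}
# Hypothetical candidates: "x" violates C1 once, "y" violates C2 once.
candidates = {"x": {"C1": 1}, "y": {"C2": 1}}

rng = random.Random(0)
outcomes = [evaluate(candidates, ranking_values, 2.0, rng) for _ in range(1000)]
print(outcomes.count("x"), outcomes.count("y"))
```

    Because the two ranking values are close, the sampled order occasionally reverses, so the usually losing candidate "x" wins in a minority of evaluations. With the noise set to zero the sketch reduces to ordinary OT evaluation.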

    Ellison ('The Universal Constraint Set: Convention, Not Fact') also effectively challenges the innateness of constraints. He presents six arguments for the universality of constraints, as is usually assumed in OT, and rejects them all. He concludes that the only sense in which the universality of constraints should be maintained is as a convention, like the International Phonetic Alphabet, which will make the work of linguists easier, but has no substantive empirical content and no psychological reality.

    Pulleyblank and Turkel ('Learning Phonology: Genetic Algorithms and Yoruba Tongue Root Harmony') present an innovative OT model in which learning uses a Genetic Algorithm (Holland, 1973; Koza, 1992). Thus, as in the stochastic model that Boersma presents, speakers can have largely similar grammars, with minor differences. As the authors note, this begins to give some handle on the problem of variation and the continuous, rather than ordinal, differences between dialects.

    Tesar's paper ('On the Role of Optimality and Strict Domination in Language Learning') further develops the best-known work on OT learning (Tesar and Smolensky, 1998, 2000). It considers how design principles of Optimality Theory, particularly strict domination of constraints and optimization, can be exploited in providing a theory of language learning. The paper contains a nice overview of optimization-based learning algorithms, such as Hill-climbing, its specialization Gradient Ascent, and Expectation-Maximization. It concludes with a characterization of parsing as optimization, which consists of "production directed parsing", more commonly called generation, and "interpretive parsing", which consists of holding the overt form constant across all candidates in the competition and selecting the candidate that is optimal. The rest of the paper presents two algorithms, the Error-driven Constraint Demotion Algorithm (EDCDA) and the Iterative Learning Algorithm (ILA), which unlike the EDCDA does not rely on correct full structural descriptions being available to the learner.
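
    For readers unfamiliar with constraint demotion, a rough sketch of a single demotion step may help. It follows the general idea of constraint demotion over a stratified hierarchy as commonly described in the Tesar and Smolensky work cited above, not the specific formulation in this paper; the constraint names A, B, C and the function name demote are hypothetical.

```python
def demote(strata, winner, loser):
    """One constraint-demotion step on a winner/loser pair.

    strata: dict mapping constraint name -> stratum index (0 is the top).
    winner, loser: dicts mapping constraint name -> number of violations.
    Returns a new strata dict; the input is left unchanged.
    """
    # Constraints on which the loser does worse prefer the winner, and vice versa.
    prefers_winner = [c for c in strata if loser.get(c, 0) > winner.get(c, 0)]
    prefers_loser = [c for c in strata if winner.get(c, 0) > loser.get(c, 0)]
    if not prefers_winner:
        raise ValueError("no constraint prefers the winner; pair is uninformative")

    # Highest stratum that contains a winner-preferring constraint.
    top = min(strata[c] for c in prefers_winner)

    # Every loser-preferring constraint not already below that stratum
    # is demoted to the stratum immediately below it.
    updated = dict(strata)
    for c in prefers_loser:
        if updated[c] <= top:
            updated[c] = top + 1
    return updated

# Hypothetical initial state: all constraints share the top stratum.
initial = {"A": 0, "B": 0, "C": 0}
# The observed winner violates A; the learner's wrong guess violates B.
after = demote(initial, winner={"A": 1}, loser={"B": 1})
print(after)
```

    In an error-driven setting, a step like this would be applied only when the learner's current grammar selects a candidate other than the observed form.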

    CRITICAL EVALUATION This volume is quite heterogeneous. Since it is not possible to give each paper its due individually, I will mainly discuss certain issues that arise, commenting on particular papers where relevant. But first let me consider the book as a whole.

    The key strength of this volume is its very heterogeneity. The editors have collected a number of important contributions from leading scholars working in Optimality Theory in three central fields of linguistic study. The fact that seven of the sixteen papers are on phonology is a fair reflection of the field, as OT has made more inroads in that area than in any other. The editors should be commended on including sections on syntax and acquisition.

    Despite the variety of topics explored here, there are themes tying many of the papers together. The acquisition section, which I found the most interesting (even though I do not work in this area), could more appropriately be called "learnability". These papers all address formal and algorithmic issues of learning OT grammars (Boersma, Pulleyblank and Turkel, Tesar) or key foundational issues about the universality and innateness of aspects of OT architectures (Boersma, Ellison). Turning to phonology, the first section ("Prosodic Representation") largely explores aspects of Correspondence Theory (Burzio, Kager, McCarthy). The second phonology section ("Segmental Phonology") picks up on this in places (LaCharité and Paradis), but it is for the most part concerned with issues of loanword phonology (Jacobs and Gussenhoven, LaCharité and Paradis). As for the syntax section, there are two key concerns: the architecture of a syntactic OT grammar (Bresnan, Broekhuis and Dekkers), and the interaction of prosody and syntax in the distribution of clitics (Anderson, Legendre). In each of the last three sections, there are important papers that are thematically isolated. Hayes is to be commended on taking gradience as a serious grammatical phenomenon, Smith makes the simple, but important, point that OT can be married with already well-developed phonological theories, rather than merely replacing them, and Ackema and Neeleman take on the important problem of ineffability in OT syntax.

    The fact that the editors have brought together not only OT papers on phonology, but also papers on syntax and acquisition is all the more remarkable given that they started putting the volume together when OT was pretty new. One indication of this is that there are various references to Barbosa et al. (1998) or papers therein as "to appear". The drawback is that the volume thus has a dated feel, despite its 2000 copyright date (in fact the book was released on 12/28/2000). Most readers with an interest in OT will already be familiar with many of the papers (from the Rutgers Optimality Archive, or from circulating drafts). As for readers wishing to learn about OT, an edited volume of this sort is the wrong place to start (although the introduction is worth a look). A better place would be the excellent Kager (1999) textbook, or the CD-ROM put together by John McCarthy, available from the University of Massachusetts Graduate Linguistics Student Association.

    Let me now turn to three interesting theoretical issues that I believe this book raises.

    I. COMPETITION AND EVALUATION Three foundational questions for Optimality Theory are:

    1. What is the nature of inputs?

    2. What is the nature of candidates?

    3. How can we tell if a constraint should apply to a candidate and whether it is violated by a candidate?

    The first two questions have received a fair amount of attention in the OT literature and are discussed explicitly in various papers here, particularly in the syntax section (part three). According to what in the introduction is called the "Semantic Identity Approach", the input to syntax is a bag of lexemes (cf. the related notion of Numeration in the Minimalist Program; Chomsky, 1995), and the candidates must be "truth-functional equivalents" (Broekhuis and Dekkers, p. 409). This approach is pursued in the papers by Ackema and Neeleman, Broekhuis and Dekkers, and possibly implicitly by Anderson (his paper leaves most of the details of his formal analysis unspecified).

    On the other approach, the "Structured Inputs Approach", which is exemplified by Bresnan's paper and is implicit in Legendre's, the input is a predicate-argument structure (be it an underspecified Logical Form (LF) or an LFG functional-structure) and the candidates are some more articulated version of the input (more fully-specified LFs or LFG constituent-structure/functional-structure pairs).

    However, I think that it is the third question above that really needs to be answered, as evidenced by various conceptual confusions that arise throughout this book. So far, there has been very little work on providing a logic and semantics for OT constraints, and most of the work postdates this volume. A recent paper by Hammond (2000) addresses the logic of the OT architecture, but stops shy of providing a logic for the constraints and their evaluation. Ellison (1994), Eisner (1997a,b) and Kuhn (2001a,b) have done some work on this front in a more computational setting, but their results need to find their way into the theoretical literature.

    But, why does it matter? Consider the constraint LE(CP) from Pesetsky (1998), which is used by Broekhuis and Dekkers:

    (1) Left Edge CP: CP starts with a lexicalized head from the extended projection of the verb.

    After much argumentation, based on assumptions of the Minimalist Program, B&D conclude that the structure of English subject relatives is:

    (2) the man [IP who saw Bill]

    It was surprising to find that this structure violates LE(CP), even though it does not even contain a CP! Of course, this was after reading the constraint as universally quantifying over CPs ("Every CP starts with . . . "). I still think this is the preferred reading for the statement above. But, it is possible to read (1) as existentially quantifying over CPs ("There is a CP such that it starts with . . . "). The point is that if the constraints were made explicit in some logic, as discussed in some of the works cited above, it would be absolutely clear whether the quantification is universal or existential.
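
    The difference between the two readings can be made concrete with a small sketch. The encoding is an assumption made purely for illustration: a structure is a flat list of phrases, and the boolean flag lexical_head_first stands in for "starts with a lexicalized head from the extended projection of the verb". Neither the class nor the function names come from Pesetsky or from Broekhuis and Dekkers.

```python
from dataclasses import dataclass

@dataclass
class Phrase:
    label: str                # category label, e.g. "CP" or "IP"
    lexical_head_first: bool  # hypothetical stand-in for the condition in (1)

def le_cp_universal(phrases):
    """'Every CP starts with ...': one violation per CP that fails the condition."""
    return sum(1 for p in phrases if p.label == "CP" and not p.lexical_head_first)

def le_cp_existential(phrases):
    """'There is a CP that starts with ...': violated if no such CP exists."""
    satisfied = any(p.label == "CP" and p.lexical_head_first for p in phrases)
    return 0 if satisfied else 1

# Structure (2): the man [IP who saw Bill]. It contains an IP but no CP.
subject_relative = [Phrase("IP", False)]

print(le_cp_universal(subject_relative))    # no CP, so nothing can violate it
print(le_cp_existential(subject_relative))  # no CP, so the requirement is unmet
```

    Under the universal reading the CP-less structure satisfies the constraint vacuously, while under the existential reading it incurs a violation, which is the behavior the analysis in the book relies on.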

    Well, perhaps I'm demanding too much of a paradigm at such an early stage of development. OT is a relatively new framework (and it was really quite new when the papers in this book were written), and shouldn't we judge linguistic theories based on the predictions they make, their empirical consequences, rather than fussy details of formulation? I submit that this would be a grave error: it is impossible to judge the consequences of a theory if its content is not established first. More importantly, a formalism should make theoretical claims explicit and verifiable; it should throw light on the theory by making generalizations concisely and clearly. If the formalism creates more questions than it helps to answer, why have it? Why not just state the generalizations in clear, ordinary language?

    This is not any kind of damning criticism of OT. The kind of logic required seems to be only a first order predicate calculus, nothing complicated, and as I noted, this work has been initiated. However, the OT constraints in this book (and in general) are in need of an explicit formalization of the kind provided for other constraint-based theories, such as Head-driven Phrase Structure Grammar (King, 1989, 1994; Richter, 2000) and LFG (Johnson, 1995; Kaplan and Bresnan, 1982).

    Similarly, the authors should have made explicit the formatives (i.e. linguistic primitives) of their versions of OT and the extent of interaction between different grammatical subsystems (phonology, morphology, syntax, etc.). Legendre's paper is a step in the right direction. She proposes that syntactic constraints outrank prosodic constraints which outrank morpho-prosodic constraints (prosodic alignment constraints for morphological features) which outrank morphological constraints. She proposes the "Constraint Intermixing Ban", which states that "Constraints belonging to different modules of the grammar may not intermix" (p. 458). Of course, this is more an instruction to the OT grammarian than a theoretical construct, and we would like it to somehow be derived from the nature of the constraints or the OT architecture, but it is a start.

    However, many of the papers are really quite unclear about the status of the formatives they assume and about the interaction of grammatical systems. For example, Anderson proposes a constraint EDGEMOST(clitic,left,S), that requires a clitic to be at the left edge of a clause (S). But, it is uncertain that the term "clitic" even has any theoretical content (Sadock, 1995; Zwicky, 1994). If so, how can it be a formative in a formal theory?

    Another example comes from Kager's paper. He notes that although the stress system of Dutch motivates a constraint LEFTMOST ("Align(PrWd, L, peak, L)"; i.e. the stress peak is on the leftmost syllable of the prosodic word), certain adjectives have their stress peak on the rightmost stem, motivating a constraint ADJ-PK ("Align(Adjective, R, peak, R)"; i.e. the stress peak is on the rightmost syllable of an adjective). But, note that this means that the inputs and candidates to a morphophonological process would have to contain syntactic category information. This has serious consequences for the theory of grammatical architecture, but its consequences are masked by the informal nature of OT constraints.

    By no means do I mean to single out just the papers I've mentioned here. Many of the papers in the phonology and syntax sections suffer from either the problem of uncertain constraint evaluation or potentially problematic claims about the formatives (the structure of inputs and candidates). Again, these are not problems for OT analyses of linguistic phenomena necessarily, but they are definitely areas that should receive much more serious attention than they receive in this volume.

    II. INPUTS ARE NOT OUTPUTS The second issue has to do with interpretive parsing, as introduced in Smolensky (1996). (This point owes a lot to discussions with Ida Toivonen, although any conceptual or factual errors are solely my own.) Interpretive parsing is adopted in the Jacobs and Gussenhoven paper and exploited in Tesar's Interpretive Parsing Algorithm. Consider an OT grammar on this view. In generation, or the production direction, we have the following (x_prod should be read as "x of production"; similarly for x_comp ("x of comprehension")):

    (3) Production: Input_prod -> GEN -> {x|x is a candidate_prod} -> EVAL -> Output

    Now, consider comprehension as envisioned by interpretive parsing. Tesar (p.601) writes, "The proposal is that language comprehension, like production, is an optimization process. The hearer is presented with an overt form, and selects the description of that overt form that is optimal with respect to his or her current constraint ranking. The difference is that here the candidate structural descriptions competing for optimality are candidates whose overt portions match the observed overt form. The interpretation assigned to an observed overt form is that structural description which, out of all descriptions whose overt portion matches the observed form, best satisfies the ranked universal constraints." In other words, the proposal is:

    (4) Comprehension (interpretive parsing):

    Input_prod -> GEN -> {x|x is a candidate_prod such that its overt portion matches the observed overt form} -> EVAL -> Output

    Thus, according to interpretive parsing, although comprehension is also an optimization process, it is a different kind of optimization process, because there is something held constant about the candidates.
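
    Schemas (3) and (4) can be contrasted in a toy sketch. Everything in it is hypothetical and chosen only for illustration: the three-syllable input, the candidate footings, and the constraint names TROCHAIC, IAMBIC and ALIGN-LEFT with their ranking are not taken from Tesar's paper.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    overt: str        # the pronounced form (here: stress pattern only)
    structure: str    # full structural description, including hidden feet
    violations: dict  # constraint name -> number of violations

def eval_optimal(candidates, ranking):
    """EVAL: strict domination as lexicographic comparison down the ranking."""
    return min(candidates, key=lambda c: tuple(c.violations.get(k, 0) for k in ranking))

def produce(gen, inp, ranking):
    """Schema (3): every candidate generated from the input competes."""
    return eval_optimal(gen(inp), ranking)

def interpret(gen, inp, observed_overt, ranking):
    """Schema (4): only candidates whose overt portion matches compete."""
    matching = [c for c in gen(inp) if c.overt == observed_overt]
    return eval_optimal(matching, ranking)

def toy_gen(inp):
    # Hypothetical candidate set for a three-syllable input; inp is ignored.
    return [
        Candidate("TAtata", "(TA ta) ta", {"IAMBIC": 1}),
        Candidate("taTAta", "(ta TA) ta", {"TROCHAIC": 1}),
        Candidate("taTAta", "ta (TA ta)", {"IAMBIC": 1, "ALIGN-LEFT": 1}),
    ]

ranking = ["TROCHAIC", "IAMBIC", "ALIGN-LEFT"]
produced = produce(toy_gen, "tatata", ranking)
interpreted = interpret(toy_gen, "tatata", "taTAta", ranking)
print(produced.structure, "|", interpreted.structure)
```

    In this toy case the structure assigned in comprehension is one that would lose in production, because restricting the competition to candidates that match the overt form removes the production winner from the candidate set.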

    However, we might otherwise expect the comprehension direction to look like this (Anttila and Fong, 2000; Asudeh, 2001):

    (5) Comprehension (no interpretive parsing):

    Observed overt form -> GEN -> {x|x is a candidate_comp} -> EVAL -> Output

    That is, we might expect comprehension to simply be the reverse of production. One of the benefits of constraint-based theories, such as OT is purported to be, is that grammars are reversible (Strzalkowski, 1993; Copestake et al., 1995, 1999): the same grammar can be used for comprehension and production. There may have to be production- or comprehension-specific processes that interface with the grammar, but the core grammatical system could be the same. This has the benefit that only one grammar needs to be learned, represented, and used in computation. Reversibility is an important research area in the implementation of Lexical-Functional Grammars and Head-driven Phrase Structure Grammars, and other constraint-based architectures.

    By contrast, in OT with interpretive parsing, the grammar is not reversible: production and comprehension are handled differently. More importantly, it should be noted that the alternative without interpretive parsing (in (5)) is also not straightforwardly reversible. In particular, the formatives for the inputs of production/comprehension will not be the same as the formatives of the outputs. Thus the elements that constraints target in the output of production would be in the input to comprehension, rather than being in the output, where the constraints need to evaluate them. Faithfulness constraints, with their inherent asymmetry of evaluating outputs against inputs, would find the inputs and outputs reversed. Similarly, the formatives that Markedness constraints target may be in one set of outputs, but not the other.

    Thus, OT-grammars seem to be non-trivially non-reversible, whether they use interpretive parsing or not. This is in conflict with the reversibility prized by other constraint-based architectures and is tantamount to having separate grammars for production and comprehension, which is conceptually undesirable.

    III. FORMALISM, FUNCTIONALISM, AND NATIVISM Ellison's paper considers the question of the universality of constraints in OT phonology. He considers six arguments (structured arguments, with premises and conclusions that follow from the premises) for the view common in the OT literature, which he calls UNIV-FACT (p. 526):

    (6) UNIV-FACT: There is (at least) one hierarchy of constraints objectively present in the mind of each language-user. Furthermore, the same constraint set is used in each hierarchy of each and every user.

    Ellison rejects each of the six arguments for UNIV-FACT that he considers, because in each argument at least one premise does not hold.

    Instead, he presents an argument that the constraint set should be treated universally as a convention (much like the International Phonetic Alphabet is a convention for phonetic transcription):

    (7) Languages should be analyzed (as much as possible) using a constraint set common to the community of phonologists.

    Ellison's argumentation is quite convincing; his paper certainly merits a reply if UNIV-FACT is still to be assumed.

    In fact UNIV-FACT is not the only universalist assumption in OT. It is also commonly assumed that the set of linguistic inputs is universal, due to richness of the base (Prince and Smolensky, 1993). One need look no further than this volume: "According to the principle of 'richness of the base' (Prince and Smolensky, 1993), the set of linguistic inputs is universal" (Tesar, p. 616). Let us call this the Strong Richness of the Base Hypothesis. A Weak Richness of the Base Hypothesis would merely hold that there can be no constraints on inputs in any given grammar, but not all inputs need be present in all languages. However, it is certainly the strong version that is prevalent in the literature. Lastly, the function that maps inputs to candidate sets, GEN, and the evaluation function, EVAL, are considered universal.

    Thus, OT holds that inputs are universal, the mapping from inputs to candidate sets is universal, and the constraint set is universal. A standard linguistic hypothesis is that universals are innately specified. This is in effect an inductive inference: if x is innate, x is universal. That is, innateness, or nativism, is considered the best explanation of universality (and of course there is a long-standing tradition in the field of language acquisition seeking to demonstrate linguistic nativism empirically).

    So, given that Ellison's paper makes us reconsider the universality of the constraint set, it also makes us reconsider whether it is innate. Boersma's paper takes up this question. He presents a detailed model of the acquisition of the constraints of segmental phonology from overt data, without presupposing innateness of the constraints. That is, the constraints themselves are learned, not just their ranking. He shows how his Maximized Gradual Learning Algorithm works in general, and in the particular case of learning Wolof tongue root harmony.

    Boersma's model, which is set forth in much greater detail in his book (Boersma, 1998), is functional: its constraints are grounded in perception and articulation. Thus, Boersma's theory is not only non-innatist, but also functional.

    As mentioned above, most OT models are innatist. Are they functional? In practice, they often are, as argued by Newmeyer (to appear). But this is only a contingent fact about the OT literature, that constraints are often functionally motivated. For most OT analyses, one could just as well strip away the functionalist rhetoric and take all constraints to be purely formal. If the constraints "look" functional, one could claim that it is just a coincidence. The majority of OT analyses are innatist and can be construed as being formal or functional in nature (in reality most OT analyses contain a mix of purely formal and functionally-motivated constraints).

    This allows us to build the following table:

    (8)               INNATIST          NON-INNATIST
        FUNCTIONAL    Most OT models    Boersma's OT model
        FORMAL        Most OT models    ???

    What about the cell marked with question marks? Can there be an OT model that is formal and non-innatist?

    I believe that there can be, and I will give a tentative sketch of such a model here. First, though, a few words about innateness. In their thought-provoking book, 'Rethinking Innateness', Elman et al. (1996) distinguish between representational nativism and architectural nativism. The former view postulates innate knowledge/content: dedicated, cortical microcircuitry whose developmental schedule and organization is specified in the genome. The latter view postulates innate systems or capabilities: specific knowledge or content is not innate, but the kind of information that a cortical system can handle is constrained, which indirectly constrains its representational capacity. Architectural nativism is the weaker stance. With respect to language, it means that the capacity for language is considered innate, and the overall language architecture is constrained, but the representational content of the language faculty must be learned. On this view there would be no literally innate principles or parameters, although the mature language faculty could be characterized as such descriptively. It is in the sense of architectural nativism that Boersma's model is non-innatist, as his language acquisition model does not postulate that the entire OT architecture is learned. The functions GEN and EVAL are still assumed to be given or innate.

    Now suppose that we are trying to give a formal, non-innatist OT model. Let us help ourselves to the existence of GEN and EVAL. But, let us not postulate any innate linguistic primitives (phonological features, syntactic features, etc.) or specific constraints. Let us suppose all constraints can be classified as FAITHFULNESS or MARKEDNESS constraints (Prince and Smolensky, 1993). This is architectural nativism: we are assuming that the language faculty has the structure of an OT grammar, such that a grammar is a triple of <GEN, EVAL, CON>, where the set CON can be partitioned into the two kinds of constraint. Let us further assume a general-purpose learning mechanism, such as a neural network, or a Bayesian learner, that learns the formatives of a language from the ambient linguistic data. Each time a new formative is identified, a FAITH and a MARK constraint are added to CON. For example, if the learner identifies a formative which we might call [±voiced] which distinguishes certain sounds, a constraint FAITH(voiced) and a constraint MARK(X voiced) will be added to CON (which value of voiced X is set to depends on how markedness is measured). Constraints are thus automatically generated as new formatives are distinguished. Interleaved with the learning of constraints is ranking of constraints using some OT learning algorithm (any of the three discussed in the book would do). This would yield a purely formal grammar: none of the constraints are functionally grounded; they are generated automatically as new formatives are identified. However, the grammar is based on the assumption of architectural nativism, and therefore could count as non-innatist.
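
    The constraint-generation step of this sketch is mechanical enough to write down. The following is only an illustration under stated assumptions: the class name Grammar and the method add_formative are hypothetical, the marked value is simply passed in rather than computed, and both the identification of formatives and the ranking of constraints are left to whatever learner is assumed.

```python
class Grammar:
    """Toy container for CON; GEN and EVAL are taken as given and not modeled."""

    def __init__(self):
        self.con = []  # constraints in the order in which they were generated

    def add_formative(self, name, marked_value):
        """Generate one FAITH and one MARK constraint for a new formative.

        marked_value ("+" or "-") is an assumption supplied by the caller;
        how markedness would actually be measured is left open in the text.
        """
        generated = [f"FAITH({name})", f"MARK({marked_value}{name})"]
        for constraint in generated:
            if constraint not in self.con:  # a known formative adds nothing new
                self.con.append(constraint)
        return generated

grammar = Grammar()
grammar.add_formative("voiced", "+")  # the learner has identified [±voiced]
print(grammar.con)
```

    A ranking algorithm would then operate on grammar.con each time it grows; none of the generated constraints makes any reference to articulation or perception.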

    I want to state explicitly that I am not necessarily endorsing the view I have sketched. First of all, there isn't much to endorse, as the proposal is vague and has many hidden assumptions. Secondly, there are many poverty of the stimulus arguments in the literature for the stronger, representational nativist stance, although Elman et al. (1996), Ellison (this volume), Pullum and Scholz (to appear), and many others have challenged these arguments.

    But I think it is to the credit of OT, and in the context of this book to the credit of Ellison and Boersma, that it is possible to think about language as a formal system, without simultaneously taking a strong innatist stance.

    FINAL REMARKS The first sentence of this book reads, "The introduction of Optimality Theory by (Prince and Smolensky, 1993) can be considered the single most important development in generative grammar in the 1990s." Although this particular book may seem slightly dated, and much of its contents highlights the need for serious foundational thinking about OT, the claim just quoted is not easily refuted. Optimality Theory has in a short time swept the field of phonology and made inroads into other subfields of linguistics.

    OT raises many interesting issues and promises to be a fruitful research program for some time. The papers in this book represent early attempts (especially in the syntax and acquisition domains) to apply the theory to various phenomena and to address certain fundamental issues in OT, many of which have been addressed in more current research. Nevertheless, this book constitutes an important historical milestone in the development of Optimality Theory.

    ACKNOWLEDGMENTS I am grateful to Farrell Ackerman, Andrew Koontz-Garboden, Charles Reiss, Peter Sells, and especially Ida Toivonen for their comments. All remaining errors are my own.

    REFERENCES Anttila, Arto, and Vivian Fong (2000). The partitive constraint in Optimality Theory. Journal of Semantics, 17, 281-314.

    Asudeh, Ash (2001). Linking, optionality, and ambiguity in Marathi. In Sells (2001), (pp. 257-312).

    Barbosa, Pilar, Danny Fox, Paul Hagstrom, Martha McGinnis, and David Pesetsky (eds.) (1998). Is the best good enough? Cambridge, MA: MIT Press.

    Beckman, Jill, Laura Walsh Dickey, and Suzanne Urbanczyk (eds.) (1995). Papers in Optimality Theory, vol. 18 of University of Massachusetts Occasional Papers in Linguistics. Amherst, MA: Graduate Linguistic Student Association.

    Benua, Laura (1995). Identity effects in morphological truncation. In Beckman et al. (1995), (pp. 77-136).

    Berwick, Robert, and Partha Niyogi (1996). Learning from triggers. Linguistic Inquiry, 27, 605-622.

    Boersma, Paul (1998). Functional Phonology: Formalizing the interactions between articulatory and perceptual drives. The Hague: Holland Academic Graphics.

    Boersma, Paul, and Bruce Hayes (2001). Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32, 45-86.

    Bresnan, Joan (ed.) (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.

    Bresnan, Joan (2001). Lexical-Functional Syntax. Oxford: Blackwell.

    Burzio, Luigi (1994). Principles of English stress. Cambridge: Cambridge University Press.

    Chomsky, Noam (1995). The minimalist program. Cambridge, MA: MIT Press.

    Copestake, Ann, Dan Flickinger, Robert Malouf, Susanne Riehemann, and Ivan A. Sag (1995). Translation using Minimal Recursion Semantics. In Proceedings of the Sixth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI-95), Leuven, Belgium.

    Copestake, Ann, Dan Flickinger, Ivan A. Sag, and Carl Pollard (1999). Minimal recursion semantics: An introduction. Ms., Stanford University and Ohio State University.

    Dalrymple, Mary (2001). Lexical Functional Grammar. Academic Press.

    Eisner, Jason (1997a). Efficient generation in primitive Optimality Theory. In Proceedings of the 35th annual ACL and 8th EACL, (pp. 313-320).

    Eisner, Jason (1997b). What constraints should OT allow? Talk handout, Linguistic Society of America, Chicago.

    Ellison, T. Mark (1994). Phonological derivation in Optimality Theory. In Proceedings of COLING, (pp. 1007-1013).

    Elman, Jeffrey, Elizabeth Bates, Mark Johnson, Annette Karmiloff-Smith, Domenico Parisi, and Kim Plunkett (eds.) (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.

    Gibson, Ted, and Kenneth Wexler (1994). Triggers. Linguistic Inquiry, 25, 407-454.

    Grimshaw, Jane (1997). Projection, heads, and optimality. Linguistic Inquiry, 28, 373-422.

    Hammond, Michael (2000). The logic of Optimality Theory. ROA 390-0400.

    Holland, John (1973). Genetic algorithms and the optimal allocation of trials. SIAM Journal on Computing, 2, 88-105.

    Johnson, Mark (1995). Logic and feature structures. In Mary Dalrymple, Ronald M. Kaplan, John T. Maxwell, and Annie Zaenen (eds.), Formal issues in Lexical-Functional Grammar, (pp. 369-380). Stanford, CA: CSLI Publications.

    Kager, René (1999). Optimality Theory. Cambridge: Cambridge University Press.

    Kaplan, Ronald M., and Joan Bresnan (1982). Lexical-Functional Grammar: A formal system for grammatical representation. In Bresnan (1982), (pp. 173-281).

    Keller, Frank (2000). Gradience in grammar: Experimental and computational aspects of degrees of grammaticality. Ph.D. thesis, University of Edinburgh.

    King, Paul (1989). A logical formalism for Head-Driven Phrase Structure Grammar. Ph.D. thesis, University of Manchester.

    King, Paul (1994). An expanded logical formalism for Head-Driven Phrase Structure Grammar. Arbeitspapiere des SFB 340, University of Tübingen.

    Koza, John R. (1992). Genetic programming: On the programming of computers by natural selection. Cambridge, MA: MIT Press.

    Kuhn, Jonas (2001a). Formal and computational aspects of optimality-theoretic syntax. Ph.D. thesis, Universität Stuttgart.

    Kuhn, Jonas (2001b). Generation and parsing in Optimality Theoretic syntax: Issues in the formalization of OT-LFG. In Sells (2001), (pp. 313-366).

    McCarthy, John, and Alan Prince (1990). Foot and word in prosodic morphology: The Arabic broken plurals. Natural Language and Linguistic Theory, 8, 209-282.

    McCarthy, John, and Alan Prince (1995). Faithfulness and reduplicative identity. In Beckman et al. (1995), (pp. 249-384).

    McCarthy, John, and Alan Prince (1999). Faithfulness and identity in prosodic morphology. In René Kager, Harry van der Hulst, and Wim Zonneveld (eds.), The prosody-morphology interface, (pp. 218-309). Cambridge: Cambridge University Press.

    Newmeyer, Frederick J. (to appear). Optimality and functionality: A critique of functionally-based optimality-theoretic syntax. Natural Language and Linguistic Theory.

    Pesetsky, David (1998). Some optimality principles of sentence pronunciation. In Barbosa et al. (1998), (pp. 337-383).

    Prince, Alan, and Paul Smolensky (1993). Optimality Theory: Constraint interaction in generative grammar. Tech. rep., RuCCS, Rutgers University, New Brunswick, NJ. Technical Report #2.

    Pulleyblank, Douglas, and William Turkel (1996). Optimality Theory and learning algorithms: The representation of recurrent featural asymmetries. In J. Durand and B. Laks (eds.), Current trends in phonology: Models and methods. Salford, UK: University of Salford Press.

    Pullum, Geoffrey K., and Barbara C. Scholz (2001). On the distinction between model-theoretic and generative-enumerative syntactic frameworks. In Philippe de Groote, Glyn Morrill, and Christian Retoré (eds.), Logical aspects of computational linguistics (Lecture Notes in Artificial Intelligence, 2099), (pp. 17-43). Berlin: Springer Verlag.

    Pullum, Geoffrey K., and Barbara C. Scholz (to appear). Empirical assessment of stimulus poverty arguments. Linguistic Review.

    Richter, Frank (2000). A mathematical formalism for linguistic theories with an application in Head-Driven Phrase Structure Grammar. Ph.D. thesis, Eberhard-Karls-Universität Tübingen.

    Sadock, Jerrold (1995). Multi-hierarchy view of clitics. In Papers from the 31st regional meeting of the Chicago Linguistic Society, part 2: Parasession on clitics. Chicago, IL: CLS.

    Schütze, Carson T. (1996). The empirical base of linguistics: Grammaticality judgments and linguistic methodology. Chicago: University of Chicago Press.

    Sells, Peter (ed.) (2001). Formal and empirical issues in optimality-theoretic syntax. Stanford, CA: CSLI Publications.

    Silverman, Daniel (1992). Multiple scansions in loanword phonology: Evidence from Cantonese. Phonology, 9, 289-328.

    Smolensky, Paul (1996). On the comprehension/production dilemma in child language. Linguistic Inquiry, 27, 720-731.

    Strzalkowski, Tomek (ed.) (1993). Reversible grammar in natural language processing. Boston: Kluwer.

    Tesar, Bruce, and Paul Smolensky (1998). Learnability in Optimality Theory. Linguistic Inquiry, 29.

    Tesar, Bruce, and Paul Smolensky (2000). Learnability in Optimality Theory. Cambridge, MA: MIT Press.

    Yip, Moira (1993). Cantonese loanword phonology and Optimality Theory. Journal of East Asian Linguistics, 2, 261-291.

    Zwicky, Arnold (1994). What is a clitic? In Joel Nevis, Brian D. Joseph, Dieter Wanner, and Arnold Zwicky (eds.), Clitics: A comprehensive bibliography, (pp. xii-xx). Amsterdam: John Benjamins.

    BIOGRAPHICAL SKETCH I am in the fourth year of my Ph.D. studies at Stanford. I received a Master of Philosophy from the Centre for Cognitive Science, University of Edinburgh. My research interests are the syntax-semantics interface, grammatical theory, and psycholinguistics.