Epstein, Samuel David, and Norbert Hornstein, eds. (1999) Working Minimalism. MIT Press.
Reviewed by: Robin Schafer, University of Canterbury
Syntacticians are created through a series of small epiphanies: the discovery of principled patterns in a randomly selected grammar; the identification of a seemingly idiosyncratic construction with a cross-language phenomenon; the realization of a prediction. The beauty of a theory like Government and Binding (GB) or Principles and Parameters (P&P) is that it is possible to lead students to such points. Most importantly, it is possible to ask them to apply the theory to data they have not seen before. One of the frustrations of living in the age of minimalism is that these experiences are unavailable to the uninitiated: students cannot be told to apply the Minimalist Program (MP) to data of their own choosing. It is simply not clear how to start.
The solution to this problem is not found in the book Working Minimalism, despite its title. The work leans more to philosophy of linguistic analysis than to linguistics itself. It is a compendium of the obsessions of current theory, primarily answering the question, 'How does the MP address X treated previously as Y?' As such, it is highly readable and informative, but imparts the attitudes more than the procedures employed in working with the MP. The papers involve a rethinking of the operations, relations, and constraints of GB/P&P. Thus, the work suggests that the starting point of MP analyses is all and only the data questions raised by GB. A reworking of what has been done before is welcome if the result is a neater or more discerning account. But as a reader looking for such results, I was disappointed.
Working Minimalism consists of an introduction and 12 papers. The introduction by Epstein and Hornstein presents, in about five pages, a clear and useful synopsis of the MP. They provide readers with an annotated list of 10 key ingredients of the MP, which I repeat here in two sets: 5 bits that come in pairs and 5 conditions and postulates:
Two levels: LF & PF
Recursion through generalized transformations (plus Strict Cycle Condition)
Two basic options: Merge and (Copy) Move
Two distinctions among features: Interpretability and Strength
Two checking configurations: Spec-head and Head-head (NOTE: locality is defined on these configurations; there is no government)

Full Interpretation: Interpret fully at interfaces
Shortest Move Condition
Last Resort Condition (motivated by Greed)
Inclusiveness Condition: operations don't add features
Theta role postulate: Roles are special: they are not features, they are assigned lexically.
The approach is derivational, but it is not clear whether derivations proceed serially or in parallel, and the different authors make different assumptions in their individual works.
In addition, Epstein and Hornstein set out their view of the relation between GB/P&P and the MP. They explain that the successes of GB/P&P brought to the fore issues of theory evaluation like parsimony and simplicity. These economy conditions constitute the MP principles. It is the change in the role that economy plays -- from an evaluative metric in late GB to the basis of the principles in the MP -- that accounts for the level of abstraction that is consequently introduced into syntax. In reading this book I was hoping to find justification for this change to a less accessible method of linguistic analysis.
The twelve chapters that make up the book include one by Hornstein and one by Epstein as well as the work of 10 other authors. The chapters are not organized into sections, but can be divided into three types: half the chapters argue how to get rid of a GB/P&P device; four offer new explanations for handling already defined phenomena; and two update the reader on the operations of an MP grammatical device. I will not go through each chapter in detail, but I include mention of all the authors and topics below as I describe these three groups of papers.
The two papers that discuss MP-particular devices are Jairo Nunes's paper on Linearization and Juan Uriagereka's on Multiple Spell-Out. Both these papers are elaborations of operations familiar from earlier versions of the MP. The Nunes paper is straightforward, addressing the question why, if we adopt the copy theory of movement, traces are silent. Nunes starts by noting that traces cannot be primitives, since they are not in the numeration. He sets out to account for the facts that not all links of all non-trivial chains are phonetically realized, that certain links often are, and that sometimes more than one is. His approach is to let linearization be a PF convergence requirement (following Kayne 1994). Copies are subject to the Linear Correspondence Axiom (LCA) and deletion is triggered by linearization considerations. Basically, copies must delete because copies themselves are non-distinct from each other. So when the phonological component attempts to process COPY A - X - COPY B, it is forced to order the copied element both before and after X. Since this is impossible (i.e. since there is no output of linearization), a derivation in which deletion occurs converges as a minimal derivation. Which copy is deleted is determined by economy of formal feature elimination.
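The ordering contradiction behind this argument can be made concrete with a toy sketch. The Python below is an illustration only, not Nunes's formalism. It assumes, hypothetically, that the output of the LCA can be approximated by left-to-right order over terminal labels, and that non-distinct copies share one label. The function names precedence and is_linearizable are invented for the example. Economy of formal feature elimination, which decides which copy deletes, is not modeled.

```python
def precedence(labels):
    """Return every pair (a, b) where some occurrence of a comes before some occurrence of b.

    Assumption: non-distinct copies carry the same label, so they
    collapse into a single element of the relation.
    """
    return {(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]}


def is_linearizable(labels):
    """A usable linear order must be irreflexive and asymmetric."""
    pairs = precedence(labels)
    return all(a != b and (b, a) not in pairs for a, b in pairs)


# COPY A - X - COPY B, where the two copies are non-distinct (both labeled "A")
both_copies = ["A", "X", "A"]
# The same string after the lower copy has been deleted
one_copy = ["A", "X"]

print(is_linearizable(both_copies))  # False: "A" must both precede and follow "X"
print(is_linearizable(one_copy))     # True: a single consistent order exists
```

With both copies present the relation contains (A, X), (X, A) and (A, A), so no linear order exists. Deleting either copy removes the conflict, which is the sense in which deletion is forced by linearization.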
This account strikes me as technical: it's clever and innocuous, but it doesn't capture intuitions or offer insight into why there are traces. It doesn't appear to tie into the interactions between (or the convergence of) precedence and information structure, gapping and relevance, locality and the number of roles an expression is assigned in an event, or any other larger view of what characterizes traces.
The Uriagereka paper has more far-reaching consequences. It is often observed that if Spell-Out applies only once, then the MP admits an s-structure level, despite its commitment to only two levels. This is a matter of mis-conceptualization according to Uriagereka. Spell-Out is like any other rule; it applies as much as it can subject to economy conditions.
Uriagereka, like Nunes, starts with Kayne's LCA, since linearization operates at Spell-Out. The LCA has two parts: the first states a relationship between command and precedence (command is sufficient for precedence), and the second cleans things up when a command relation doesn't directly hold between the two chunks to be linearized. If we have multiple Spell Out, the second step follows. That is, linearization applies only to chunks in which a command relation holds, so sub-sentential pieces are fed into the interfaces independently. Superiority and CED effects are discussed using this machinery. The output of linearization re-creates domains of government without reference to what minimalism views as a redundant relation.
All sorts of questions are raised here (many of which must have been discussed since this article was first distributed). I mention the obvious one here: how do the various chunks spewed into PF get into the 'right' order? Some discussion of this with respect to antecedence and the role agreement plays is supplied. I am most disconcerted by the result that the syntax doesn't actually derive anything that corresponds to the actual stuff that is said: nothing syntactic corresponds to the sentence. It's not that the actual empirical data is lost: it corresponds to the numeration and the PF representations. This is analogous to the change in status of constructions from primitive operations to a collection of independently required feature specifications, conditions and operations. What concerns me is that a lot of language structure is actually encoded in the string itself. We spend more time thinking about the non-evident structural relations, but it seems wrong to lose the connection between syntactic structure and what is said. This paper definitely leaves the reader with a good deal to think about.
The papers that attempt to get rid of some GB/P&P device are crucial to the claim that previously employed grammatical constructs are too rich. Roger Martin targets uninterpretable features for elimination from the theory, Norbert Hornstein targets Quantifier Raising, Hisatsugu Kitahara, the 'or' feature, Robert Freidin argues cyclicity is unnecessary, Howard Lasnik argues against reconstruction with A chains, and Sam Epstein proposes that c-command is reducible to Merge.
Several papers stand out in this group. Martin's paper is an excellent example of a Get Rid Of It paper that doesn't succeed in eliminating the device investigated. Martin points out that the only two levels in an optimal language faculty are the two interfaces, LF and PF, so there should be only features interpretable at one of these two interfaces. No uninterpretable features should exist. Martin shows that some uninterpretable features can be eliminated, but Case cannot be. He concludes that Case must be conceptually necessary. The discussion is clear and well presented, but what is also fascinating is the nature of the argument, particularly this conclusion.
One might have thought that the conclusion should be that the language component is not optimal. Given the relevant conception of 'optimal,' few linguists would be surprised at such a conclusion. Formulated in the MP, what Martin's investigation reveals is what is minimally necessary. The question is: assume syntax is minimal, what must be in it? Martin provides a partial answer: Case. The minimal syntax required is viewed as conceptually necessary, hence as part of an optimal system.
The Hornstein paper on Quantifier Raising is a very fine example of a Get Rid Of It paper that arguably succeeds in eliminating a GB device. QR is a syntactic operation with a semantic motivation. In an MP world where all movement serves to check morphological features, this is unexpected. It is also unexpected under any theory that embraces the idea that semantics is essentially read off a structure determined by syntax. In either case, rules that fix quantifier scope are anathema.
Hornstein argues that certain basic aspects of the MP allow us to get rid of QR. Crucial are the fact that Case checking (in fact all feature checking) occurs outside of VP, and the view that movement involves copy and deletion. The proposal is remarkably simple and well supported: the scope an expression has depends on which member of an A-chain survives deletion, the copy internal to VP or the copy in an Agr specifier. Scope, in this conceptualization, is a property of a member of a chain, not of a syntactic category.
The beauty of this paper lies in the discussion of the separation of Case checking from Theta properties, a move which began in late GB (Principles and Parameters) work. Hornstein's discussion of the theoretical ramifications of QR is not only clear with respect to the desiderata of the MP, but also contrasts these with the position adopted in GB and so actually unveils the development in thinking on QR from GB to the MP.
Hornstein ends the paper with a defense of lowering which might seem odd from the commonly held perspective that optimal grammars eschew lowering. However, when deletion targets the higher copy or chain member, the result is reminiscent of lowering. So Hornstein argues there is reconstruction in A-chains (that is, he argues against arguments against reconstruction in A-chains). He's looking for other instances where we might find deletion of the highest member of a chain. I point this out because it is an aspect of this group of papers that there is a trade-off or cost associated with the minimizing therein.
Papers like Hornstein's that argue for the elimination of some device seem to be the product of a program embracing economy, but they do not seem to require that economy constitute the principles of the theory (rather than serve as an evaluative metric). Papers like Martin's, on the other hand, really do seem to be the product of a program distinct from GB/P&P. If economy conditions were an evaluative metric, then an analysis involving Case features would be evaluated less optimally than one without them, all other things being equal. By making the economy principles the theory itself, the non-optimal bits that are necessary are rendered critical, or conceptually necessary. It is here that we observe a real difference in the operation of the MP.
The third group of papers, those concerned with how to handle specific phenomena within the MP, includes Erich Groat on expletive there, Norvin Richards on multiple specifiers in WH-Questions and other constructions, Zeljko Boskovic on multiple WH-fronting and multiple head movement, and Amy Weinberg on sentence processing. Groat's paper is truly enjoyable, if for no other reason than that it provides a score card of the various incarnations of the expletive there analysis since 1991. But the question raised is also one that reveals a lot about MP conceptualizations. Full Interpretation is a cornerstone of the Minimalist Program. Expletives are simply not compatible with Full Interpretation. So, how do we analyse expletives so that they can be conceived of as compatible with the theory (in fact, conceptually necessary)?
Groat argues, contra the analysis in Chomsky 1995, that expletives must bear Case. Under his analysis, D-features assigned to there are unnecessary. The expletive raises from a small clause in which it is generated together with its associate to check nominative Case. Through this analysis, two complications of assuming expletives do not bear Case are eliminated. First, if the expletive just checks an EPP feature (Chomsky 1995), feature checking occurs when the expletive is Merged into the representation. Thus this account requires that features can be checked either through Merge or Move. Groat points out that all other features are checked via Move. If the expletive raises to check Case, as Groat proposes, all checking is uniform and takes place only through Move. This seems to me to be a conceptually critical prohibition: if Merge is cheaper than Move, and both operations were available for feature checking, but most features were checked via Move, the result would certainly raise eyebrows. Second, Chomsky's 1995 analysis also requires a constraint that Move is possible only if it is the cheapest among competing, potentially convergent, derivational "steps." Since steps that lead to non-convergent derivations have to be excluded, this constraint introduces a global look-ahead. Groat's analysis doesn't require this constraint. Admittedly, however, it introduces its own non-minimal bits: the raising of the expletive is less optimal than simply Merging in the expletive.
Just as in the Martin paper, the point of the Groat paper is to look at an identifiable non-optimal part of the grammar. I wonder whether the conceptual necessity of Case à la Martin isn't tied to the conceptual necessity, or at least self-evident existence, of expletive there. Interestingly, Groat's discussion doesn't touch on matters of conceptual necessity. In fact this paper seems much more concerned with the phenomenon of expletives; here the theoretical concerns are secondary. This is a general property of the How to Handle It group of papers. In this same way, Norvin Richards's paper is a very neat piece of exposition concerned with the nested dependencies evident in movement to multiple specifiers.
The theoretical matter that Richards couches his work in is cyclicity. In Chomsky 1993 cyclicity was captured in the condition that all operations must expand a tree. But head movement had to be treated as a widespread exception to this condition. Chomsky's 1995 proposal that strong features must be checked ASAP avoids this problem, and Richards argues that it also predicts that paths to multiple specifiers ought to cross, rather than nest. Richards shows this prediction is borne out in a discussion of Multiple WH-movement in scrambling and question formation. He then considers other instances of crossed paths (Object Shift, Negative Fronting, etc.) and argues that these too involve "multiple attraction by a single attractor" or head.
These How to Handle It papers are instructive instances of applied (or working) minimalism, but do not seem to be particularly indebted to the MP. Their focus is phenomenological; the theoretical application seems to be secondary, in some cases almost an afterthought. The facts and generalizations persist independent of the theory within which the data are currently discussed, and so support it only indirectly.
In this context it is important to mention Amy Weinberg's contribution, A Minimalist Theory of Human Sentence Processing. This paper stands out from the others in the volume in subject matter; it is also an instance of the How to Handle It group of papers that is intimately connected with the MP, particularly with Uriagereka's revision of the Linear Correspondence Axiom and his notion of multiple Spell-Out. Weinberg uses the MP to explain initial analyses of ambiguous structures and to provide a theory of the revision that occurs when the processor encounters disconfirming data. It is an interesting re-working of the data.
For example, it is well established that a noun phrase following a potentially transitive verb is preferentially interpreted as a direct object (see Frazier and Rayner 1982. Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology 14, 178-210.). A sentence like (a) contains the ambiguous string the girl knows the answer.
a) The girl knows the answer to the problem was correct
The noun phrase the answer is preferentially interpreted as the object of knows, and Frazier and Rayner showed higher reading time per character in sentences of this sort, as well as a higher probability of regressive eye movements, than in corresponding cases lacking the ambiguity.
This preference for an argument analysis Weinberg captures via feature checking. Weinberg states that under a direct object attachment, the noun phrase will be assigned more features at the point of attachment than under the subject of complement clause analysis. Thus the object attachment is preferred. In her discussion of these sorts of cases, Weinberg must assume that theta-features are checked like other features. She also makes extensive use of Larsonian shells (including for adverbial phrases).
The idea that the processor is (economically) driven to check as many features as possible in as few operations as possible contrasts with a Garden Path/Construal view that deterministic, structural parsing principles like Minimal Attachment account for this preference. How does the system know how many features will be checked prior to making the attachment? It is very disappointing that there is no comparison between what Weinberg proposes and this more widely known Garden Path alternative, in which questions of this sort could have been directly answered. Weinberg does provide a comparison with Constraint Satisfaction models. It is intriguing in that she concedes that a verb's frequency of occurrence as a main verb vs. as a head of a reduced relative may well be the determining factor underlying processing behaviors observed with classic garden path sentences like the horse raced past the barn fell. The role of grammatical constraints, she asserts, is to determine whether reanalysis is available.
Weinberg's proposal concerning reanalysis is built on the notion of multiple Spell-Out: reanalysis is available where it is triggered before Spell-Out has eliminated access to structure. So in the example in (a) above, there is reanalysis. Where there is no reanalysis we find a garden path effect, as in traditional Late Closure examples like (b).
b) Since Jay always jogs a mile seems like a short distance to him.
Here, in the absence of punctuation, the string since Jay always jogs a mile is ambiguous, and people show a preference for attaching a mile as the object of jogs. Again for Weinberg, this preference is captured by feature checking as in example (a), but she further argues that the attachment of the disambiguating element seems triggers Spell-Out of the initial portion of the input, thereby eliminating the structure and preventing reanalysis. It was not clear to me what she predicts to be the difference in behavioral indicators (reading times, eye movements) of reanalysis (as in (a)) vs. garden path (as in (b)).
I hope this discussion has indicated that there are a number of worthwhile papers in the collection. Overall, my disappointment with this book might be attributed to an expectation for something more instructive than this collection of essays or a failure to appreciate the distinction between a theory and a program. But I don't think that my disappointment is ultimately due to my ignorance, inasmuch as I think my expectations for the book are fair.
The work should come together better as a collection. Most of the essays are nice independent pieces of research, but more needs to be done to fit them together. A discussion, for instance, of the inconsistencies in the papers is required: Lasnik argues against reconstruction in A-chains; Hornstein argues against this position. Hornstein deletes movement copies freely at LF; Nunes deletes them at PF according to formal feature content. Epstein and Hornstein state that theta roles do not check features; Weinberg claims they do. Is everyone consistent with Epstein's reformulation of c-command? Is (the often employed) deletion an operation in the MP? Does the system compare multiple derivations from a numeration, or (as Weinberg asserts) work serially? And finally, does the MP require only principles operating at the interfaces and reducible to economy conditions, as Uriagereka indicates, or is it a means for critiquing the devices of actual theories, as Freidin suggests? It isn't incumbent upon the editors to resolve all issues, but it is important to present the points of debate and do so coherently. In addition, a synopsis of where we are at the end of this work is necessary: what do the writers agree is conceptually necessary? What elements have been introduced in eliminating the old, unnecessary devices? I think deliberation of this sort would give readers of Working Minimalism a sense that the MP is moving forward rather than looking backwards.
Robin Schafer is an assistant professor in Linguistics at the University of Canterbury teaching syntax, morphology and psycholinguistics.