Date: Mon, 8 Sep 2003 23:33:47 +0200
From: Gisbert Fanselow
Subject: Movement in Language: Interactions and Architectures
Richards, Norvin (2001) Movement in Language: Interactions and Architectures, Oxford University Press.
Gisbert Fanselow, University of Potsdam
The insights of Chomsky (1964) and, in particular, Ross (1967) led to the establishment of a new research topic in syntax: constraints on movement. This new line of research generated an impressive number of empirical insights, and culminated in attempts such as Chomsky (1981), Chomsky (1986), or Baker (1988) to find one or two simple principles from which all constraints on movement can be derived.
However, empirical data that might prove fatal for a purely syntactic account of the restrictions on movement already came in around 1980. Huang (1982) observed that some of the local domains that restrict movement in English also constrain scope assignment in Chinese questions, in spite of the fact that question words do not undergo (visible/audible) wh-movement in this language but may stay in situ. Earlier, Erteschik (1973) had made the discovery that the degree of acceptability of extractions from certain domains is a function of information structure. These findings did not lead to a dismissal of syntax-based accounts of constraints on movement, however. Rather, models were developed in which the assignment of semantic scope to operators is conceived of as the construction of a formal level of representation (viz., Logical Form, LF), which involves essentially the same type of operations that we find in visible syntax, including movement (see, e.g., Huang (1982)). According to the GB-model (Chomsky (1981)), the ultimate target of a syntactic derivation is Logical Form. Much of the derivation of LF consists of a sequence of movement operations. During the derivation, there is a point (identified as a level of representation, S-structure, in early approaches, and simply called Spellout nowadays) at which the phonological and the syntactic aspects of the derivation split up. Movement taking place before Spellout has a phonological effect (visible displacement, overt movement); movement taking place after Spellout (covert movement) has no phonological effect: it is invisible/inaudible.
1. Overview of Movement in Language
This classical view of the distinction between covert and overt movement prevailed through the eighties, but in the nineties, its assumptions were questioned: is the difference between overt and covert movement really expressible in terms of a Spellout point in the derivation, or does it have to be specified independently, so that covert operations may precede overt ones? Are overt and covert movement really identical? Norvin Richards has written his book Movement in Language. Interactions and Architectures (MiL) as a contribution to this discussion (chapter 6), and he argues for a neo-classical concept of movement, in which the difference between overt and covert operations is (in principle) one of the timing relative to Spellout.
MiL claims that the neoclassical view is supported by the existence of similarities among languages that only have overt (Bulgarian) or covert (Chinese, Japanese) wh-movement, respectively, as opposed to languages such as English that employ both types of movement. Models in which the difference between overt and covert movement is one of timing are distinctive in predicting that Bulgarian and Chinese type languages have common properties (because all wh-movement steps take place at the same point in the derivation, before Spellout in Bulgarian, after Spellout in Chinese), whereas the different instances of wh-movement in a multiple question are carried out in different parts of the derivation in English type languages.
In addition, MiL offers analyses for a number of phenomena that are formulated in terms of the neoclassical view and support it to the extent that these are compelling. These detailed analyses of various phenomena related to movement make the book extremely interesting and valuable. Chapter 2 presents evidence for the idea that UG allows two different types of multiple questions: those in which all wh-phrases cluster in the CP-domain, and those in which this clustering happens within IP. Chapter 3 discusses strict ordering effects among multiple specifiers of the same category (wh-phrases in Bulgarian, clitic sequences, certain types of A-movement, etc.) and argues that they can be derived from the Shortest Move condition and a particular way of encoding cyclicity in grammar.
Chapter 4 is concerned with a fundamental problem of the (neo-) classical model: there seem to exist positions P in natural languages that are normally targeted by covert movement, but are passed through by overt movement to higher positions Q. How can movement (to P) applying after Spellout precede movement (from P to Q) before Spellout? Richards solves this problem by formulating a model in which Pesetsky's Earliness Principle and a constraint related to the phonological realization of links in a chain imply that some instances of "covert" movement may take place before Spellout.
Chapter 5 gives a detailed discussion of "minimal compliance": sometimes, constraints such as subjacency or the superiority condition do not have to be fulfilled by all links created by movement - rather, it suffices that one dependency is in line with the constraint and thereby licenses the later creation of dependencies violating it.
2. Two ways of forming multiple questions
The theory developed in MiL presupposes and elaborates on a proposal originally made by Rudin (1988): in multiple questions, the wh-phrases may either be all adjoined to IP (as in Serbo-Croatian), or they may be made multiple specifiers of CP (as in Bulgarian). If long distance wh-movement proceeds via the specifier position of CP only, we understand why CP-absorption languages tolerate extractions from wh-clauses, while IP-absorption languages do not. According to MiL, "IP-absorption" languages are further characterized by allowing scrambling. They lack superiority effects with local wh-movement (wh-objects may be placed in front of wh-subjects in multiple questions), and they do not show weak crossover effects (as English does in ?who does his mother like). When multiple wh-phrases from the same clause interact, they have the same scope in IP-absorption languages, but CP-absorption languages are different: wh-phrases with different scope are possible, and they prefer crossing dependencies. Richards argues that this distinction also applies to languages with covert wh-movement only, and to languages such as German and English which combine overt and covert wh-movement in multiple questions.
The discussion in MiL sheds an interesting new light on the possible scope of a proposal that was originally made for languages with multiple fronting of wh-phrases. Two remarks are in order, however. First, some of the properties by which IP- and CP-absorbing languages are distinguished are straightforward consequences of scrambling. That scrambling languages show neither superiority nor weak crossover effects, was, e.g., observed by Haider (1986), and he related this property to the additional ordering possibilities created by scrambling. Since objects may be scrambled to a position P c-commanding the subject, the data in (1) have a derivation compatible with the conditions responsible for superiority and weak crossover: whenever object wh-movement starts in the position P c-commanding the subject, it neither crosses a wh-subject nor a pronoun which it binds.
(1) a. ich weiss wen t-WH [wer t-SCRA liebt]
       I know who.acc who.nom loves
    b. wen liebt [t-WH [seine Mutter t-SCRA]]
       who.acc loves his mother
The question arises, then, whether the differences between German and English with respect to the descriptive properties of wh-movement do not just reduce to the fact that German is a free word order language, while English is not. Such a solution would be incorrect only if one could show that A-scrambling must not precede wh-movement, so that wh-movement in (1) cannot start from the position t-WH preceding the subject, but must originate in the object position t-SCRA c-commanded by the subject. Such a constraint on the interaction of scrambling and wh-movement has in fact been postulated by Müller & Sternefeld (1993), and it seems to be a consequence of the general approach pursued in MiL, since a chain resulting from a succession of A-scrambling and wh-movement contains two strong positions (see below). But empirically, the claim that wh-movement must not be preceded by A-scrambling is hard to defend, given data such as (2), in which the movement of the wh-operator "was" strands the rest of the object noun phrase in front of the subject, i.e., in a scrambling position (see Fanselow 2001).
(2) Wasi hätte denn [DP,acc t für Aufsätze] selbst Hubert nicht rezensieren wollen
    what had PTC [ t for papers ] even Hubert not review wanted
    'What kind of paper would even Hubert not have wanted to review?'
Apart from the question of whether the CP-IP-absorption distinction is really supported by English-German contrasts, it also cannot be taken for granted that the properties in the clusters associated with CP- vs. IP-absorption always go hand in hand. Swedish does not show superiority effects in simple multiple questions (so it should be an IP-absorption language), but it is quite liberal with respect to wh-islands (a property claimed to be characteristic of CP-absorption languages) and does not have scrambling (but Object Shift). Spanish is like Swedish in this respect, but its word order is much more flexible.
Of course, one cannot exclude that the existence of languages that do not fall in line with the clustering of properties predicted in MiL is due to additional parameters and further structural distinctions. Nevertheless, the above remarks concerning German, English, and Swedish relativize the merits of an attempt to extend Rudin's proposal beyond the multiple wh-movement languages.
3. Tucking in
The superiority effect observed in English multiple questions has been a topic of syntactic theorizing for more than thirty years, and a number of diverging theories have been proposed. It was again Rudin (1988) who enriched this discussion with new data from multiple filler languages: in Bulgarian double questions, both wh-phrases must be fronted in overt syntax, and the order in which they appear in clause initial position must be identical to the order in which they were merged in IP. Rudin's own account involves the adjunction of wh-phrases to the specifier position of CP, which is unsatisfactory from a theoretical point of view, given that this analysis violates the strict cyclicity of derivations (but see Grewendorf 2001 for a modern version of this account).
MiL accounts for the contrast in (3) in the following way (chapter 3). Movement to Spec,CP is subject to a Shortest Move/Minimal Link Condition requirement: only the wh-phrase closest to the attracting position moves. Therefore, the subject koj is the first category in the derivation of (3) that moves to Spec,CP. Since Bulgarian is a multiple wh-movement language, the object kogo must be moved as well. The strict order effects in (3) follow if the second specifier position created by moving kogo must be created _below_ the position of the XP moved first, i.e., if XPs are "tucked in" below the phrase moved previously in multiple specifier constructions. This presupposes a specific definition of cyclicity that Richards takes over from Chomsky (1995).
(3) a. koj kogo vizda
       who whom sees
    b. *kogo koj vizda
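The derivation just described can be made concrete with a small sketch. This is only an illustration of the reasoning summarized above, not a formalization taken from MiL: the function name `derive_specifier_order`, the `tuck_in` switch, and the encoding of c-command as list position (highest phrase first) are all assumptions introduced here.

```python
from typing import List


def derive_specifier_order(base_order: List[str], tuck_in: bool = True) -> List[str]:
    """Return the wh-phrases in the specifiers of CP, highest specifier first.

    base_order lists the wh-phrases from structurally highest to lowest
    inside IP (assumption: list position stands in for c-command).
    """
    remaining = list(base_order)
    specifiers: List[str] = []  # index 0 is the highest specifier of CP
    while remaining:
        # Shortest Move: attract the wh-phrase closest to C, i.e. the highest one.
        closest = remaining.pop(0)
        if tuck_in:
            # Tucking in: the new specifier is created below the earlier ones.
            specifiers.append(closest)
        else:
            # Contrast case: the new specifier is created above the earlier ones.
            specifiers.insert(0, closest)
    return specifiers


print(derive_specifier_order(["koj", "kogo"]))                 # order of (3a)
print(derive_specifier_order(["koj", "kogo"], tuck_in=False))  # order of (3b)
```

With tucking in, the order of the fronted phrases mirrors their order in IP, as in (3a). Without it, the same sequence of movement steps would yield the ungrammatical order of (3b).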
Chapter 3 is particularly interesting because Richards shows that the scope of the phenomenon captured by the tucking-in operation goes beyond multiple questions in Bulgarian: Object shift, cliticization, and certain types of A-scrambling and quantifier raising are further cases in point. It is a fairly new discovery that in quite a number of constructions, the c-command relations among moved phrases must be the same before and after movement!
Unfortunately, MiL does not contain a detailed comparison of its strictly derivational "tucking in"-model with the strictly representational accounts offered by Müller (2001) and Williams (2003). E.g., Müller proposes a (violable) constraint according to which c-command relations among phrases must be identical at all levels of representation (PF, LF, etc.). The representational models are compatible with a derivation of (3a) proceeding in a traditional way (kogo moves first). Independent evidence for the tucking in-idea thus seems to be called for, and would be extremely valuable, since it would strongly support a derivational model of grammar. MiL contains a brief discussion (pp. 49-53) of Bulgarian constructions in which local wh-movement mitigates subjacency violations of later non-local wh-movement. If the licensing local movement must precede the licensed long movement, insights into the order in which wh-phrases move to the specifiers of a CP seem possible, and Richards claims the empirical facts support his view. However, I do not find the "crucial" contrast between a "*" sentence (his (21) on p. 53) and a "??"-sentence (his (20) on p. 52) too impressive, in particular since it is based on the intuitions of a single native speaker only.
4. Strong and weak features
A timing model of the contrast between overt and covert movement in terms of a Spellout point in the derivation is confronted with the problem that some constructions seem to involve an application of covert movement that precedes overt movement steps. Richards dedicates the fourth chapter of his book to a discussion of this problem. His approach is framed in terms of the standard minimalist assumption that movement serves the purpose of feature checking, and that there are two types of features: strong and weak ones. In contrast to "standard" minimalism, the strong-weak distinction is framed in terms of effects on PF-chains (p. 105): a strong feature is an instruction that the position (in the chain) that checks this feature must be pronounced. Weak features do not imply any constraints in terms of pronunciation. On this basic assumption, Richards builds a simple and elegant algorithm for determining whether movement is overt or covert. The key idea lies in the assumption (p. 105) that PF must receive unambiguous instructions about which element in the chain must be pronounced.
ABOUT THE REVIEWER:
Gisbert Fanselow is a professor of syntax at the University of Potsdam, Germany. His research focuses on free word order phenomena (scrambling, discontinuous noun phrases) and aspects of wh-movement (scope marking constructions, MLC). He has done some experimental work on preferences in local ambiguities and processing influences on grammaticality judgements.