LINGUIST List 9.44

Tue Jan 13 1998

Review: Kitahara: Elementary Operations

Editor for this issue: Andrew Carnie

What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Andrew Carnie at


  1. jlegate, Kitahara review.

Message 1: Kitahara review.

Date: Mon, 15 Dec 1997 22:15:54 EST
From: jlegate <jlegateMIT.EDU>
Subject: Kitahara review.

Hisatsugu Kitahara (1997) Elementary Operations and Optimal Derivations,
 MIT Press, Cambridge, Mass. 140 pages, $15.00.

Reviewed by Julie Legate

This book is very firmly situated within the Minimalist approach to syntactic 
theory that was begun by Chomsky (1991) and perhaps most fully articulated in
Chomsky (1995). It adopts much of the basic architecture of the 1995 version 
of Minimalism (henceforth MP), while deriving several of its principles and 
assumptions. The first three chapters of the book propose some comparatively
minor alterations to the MP system, and demonstrate that these alterations
allow several stipulations of the MP to be dropped, while retaining or 
improving the framework's empirical coverage. The final chapter retains the
proposals of the previous chapters while putting forth a new condition that is 
a clearer departure from the MP approach. With this condition, Kitahara is
able to account for several notoriously problematic wh-constructions.

Kitahara's first chapter consists of a review of the Minimalist syntactic
framework. He discusses the conceptual foundation of the approach (the 
unflinching application of Occam's Razor to every aspect of the 
computational system), the guiding principles of global economy, as well as 
the internal mechanisms of the computation (including the creation of 
syntactic trees through successive operations of Merge and of morphological 
feature-driven Move). 

His second chapter contains the core proposals of the book. Kitahara
replaces the operations of Merge, Move, and Erase by two "Elementary 
Operations": "Concatenation" and "Replacement". Concatenation is the
procedure which joins two objects alpha and beta to form a new object K. 
Replacement, on the other hand, substitutes an object alpha for an object 
beta, where beta is contained within a larger object sigma. The MP operations 
are redefined using these Elementary Operations as follows (p35):

(i) Cyclic Merge = Concatenation
 Cyclic Move = Concatenation
 Noncyclic Merge = Concatenation + Replacement
 Noncyclic Move = Concatenation + Replacement
 Erase = Replacement

He further redefines the Shortest Derivation Condition (Chomsky 1991, 1993; 
Epstein 1992) in terms of these operations, as in (ii) (p26):

(ii) Shortest Derivation Condition (SDC)
 Minimize the number of elementary operations necessary for
 convergence.

Given this technology, he proceeds to derive several of the principles and
assumptions of MP. 

First, he considers cyclicity, providing a detailed summary and comparison 
of the approaches taken in Chomsky 1993, 1994, and 1995. He derives the
result that, in simple cases, cyclic convergent derivations should be preferred 
over noncyclic convergent ones, since cyclic operations yield a shorter 
derivation. As shown in (i) above, cyclic operations involve only one 
Elementary Operation: Concatenation, whereas noncyclic ones require two:
Concatenation and Replacement. 
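
The comparison can be made concrete with a small sketch. The Python below is only an illustration, not Kitahara's formalism: the table transcribes (i), while the names COST, derivation_length, and shortest, and the two toy derivations, are assumptions introduced here.

```python
# Elementary operations per MP operation, transcribed from (i).
# All names below are illustrative assumptions, not Kitahara's notation.
COST = {
    "cyclic Merge": ["Concatenation"],
    "cyclic Move": ["Concatenation"],
    "noncyclic Merge": ["Concatenation", "Replacement"],
    "noncyclic Move": ["Concatenation", "Replacement"],
    "Erase": ["Replacement"],
}


def derivation_length(steps):
    """Count the elementary operations used by a sequence of MP operations."""
    return sum(len(COST[step]) for step in steps)


def shortest(derivations):
    """Return the names of the derivations the SDC selects (ties are kept)."""
    lengths = {name: derivation_length(steps) for name, steps in derivations.items()}
    best = min(lengths.values())
    return sorted(name for name, n in lengths.items() if n == best)


# Two hypothetical convergent derivations that differ only in one Move step.
candidates = {
    "cyclic": ["cyclic Merge", "cyclic Merge", "cyclic Move"],
    "noncyclic": ["cyclic Merge", "cyclic Merge", "noncyclic Move"],
}

print(shortest(candidates))  # ['cyclic']
```

The cyclic derivation uses three elementary operations and the noncyclic one four, so the SDC selects the former.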

Next, he turns to the MP principle of Procrastinate: covert movement 
(i.e. movement which occurs in the computation after the derivation is sent to
PF for pronunciation) is preferable to overt movement. He separates the
discussion into head movement, object shift, and expletive insertion. He 
adopts the MP assumptions that (a) movement results in two instances of 
identical elements, one in the merged position and the other in the moved 
position (i.e. the copy theory of movement); (b) if an element is overtly 
attracted, the entire category moves, whereas if an element is covertly 
attracted, only the formal features move; and (c) only one of the identical 
elements created by movement is interpreted at LF. Finally, he proposes a 
novel interpretation of the effect of strong features in the grammar, as
in (iii) (p37):

(iii) Strong Feature Condition
 Spell-Out applies to sigma only if sigma contains no category with
 a strong feature.

Regarding head movement, in languages with overt verb raising, T has a strong 
V feature and thus overt raising is necessary for convergence. In languages 
without overt verb raising, the SDC selects derivations with covert, rather 
than overt, verb raising. Although both covert and overt head movement 
involve one operation of Replacement (since head movement is necessarily 
non-cyclic), Kitahara claims that overt head movement, being category movement,
requires an additional instance of Replacement, thus resulting in a longer 
derivation. His reasoning is as follows. If a verb moves overtly, its 
semantic features are carried along. Therefore, at LF it will be necessary to 
delete one of the instances of the semantic features, since elements are only 
interpreted once. This requires an application of Replacement that is not 
needed for covert movement, since covert movement only affects the formal 
features, not the semantic features as well. 

Notice that the same reasoning will not extend to phrasal movement. Overt 
movement will require one instance of Replacement to delete the semantic 
features of one member of the chain (just considering a simple two-membered 
chain); however, covert movement will also require one instance of Replacement, 
since covert movement is necessarily non-cyclic. Thus, the SDC cannot choose 
between derivations with overt object shift and those without. This predicts 
the optionality of object shift in languages like Icelandic, without resorting 
to the MP 'optional strong feature' analysis, which is a simple restatement of 
the facts. 
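
The bookkeeping behind the head movement and object shift results can be laid out as a short sketch. This is my reconstruction of the reasoning summarized above, not code or notation from the book; the function elementary_ops is hypothetical, and it assumes that overt phrasal movement applies cyclically while covert movement and head movement do not.

```python
def elementary_ops(cyclic, duplicates_semantic_features):
    """Elementary operations for one movement step (illustrative assumption).

    Every movement uses one Concatenation. A noncyclic step adds one
    Replacement, and a step that copies semantic features adds another
    Replacement to delete the extra copy at LF.
    """
    ops = ["Concatenation"]
    if not cyclic:
        ops.append("Replacement")
    if duplicates_semantic_features:
        ops.append("Replacement")
    return ops


# Head movement is always noncyclic; only overt (category) movement
# carries semantic features along.
overt_head = elementary_ops(cyclic=False, duplicates_semantic_features=True)
covert_head = elementary_ops(cyclic=False, duplicates_semantic_features=False)

# Phrasal movement: overt movement is assumed cyclic but copies semantic
# features; covert feature movement is noncyclic but copies none.
overt_phrase = elementary_ops(cyclic=True, duplicates_semantic_features=True)
covert_phrase = elementary_ops(cyclic=False, duplicates_semantic_features=False)

print(len(overt_head), len(covert_head))      # 3 2
print(len(overt_phrase), len(covert_phrase))  # 2 2
```

On these assumptions covert head movement is one operation shorter than overt head movement, whereas overt and covert phrasal movement tie, which is the optionality result for object shift.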

In languages without overt verb movement, object shift is predicted to be 
impossible, assuming that the object shifts to the outside specifier of vP, 
the inside specifier being the merged position of the subject, and that 
multiple specifiers are not equidistant from a higher head, unless head
movement renders them equidistant. Thus, the shifted object would block
movement of the subject to TP, unless the verb has raised to T. Kitahara
acknowledges (p144, fn26), however, that he cannot explain languages like 
French, that display overt verb raising but prohibit object shift.

Finally, Kitahara considers the timing of expletive insertion. Notice that 
although MP assumes that Merge is cheaper than Move, Kitahara's reanalysis 
predicts that Merge be equally economical to cyclic Move, since both consist 
of one application of Concatenation. Since the timing of expletive insertion 
is the primary empirical motivation for the MP assumption, Kitahara 
demonstrates that this timing is equally captured within his system. 

Consider the familiar sentences (iv) and (v).

(iv) There seems to be a man in the room.
(v) *There seems a man to be in the room.

In (iv), the expletive was inserted in the embedded [spec, T] and raised to 
the matrix TP. In (v), on the other hand, "a man" was raised to the embedded 
TP and the expletive was inserted directly into the specifier of the matrix 
TP. MP claimed that (iv) is preferred over (v) because it is cheaper to merge 
the expletive into the embedded TP than to move the associate. Kitahara 
claims that these facts follow from his SDC. Since expletives are assumed to 
have no semantic features, overt raising of "there" will not require an 
application of Replacement to delete an instance of semantic features in (iv). 
In (v), on the other hand, overt raising of "a man" will require Replacement 
to delete one of the resulting two instances of the semantic features of "a 
man". Therefore, the derivation in (iv) is shorter than that in (v) and thus 
is preferred by the SDC.

As a side point, notice that this analysis requires that the formal features 
of "a man" raise covertly directly to the matrix T. If these features raised 
to the embedded TP covertly, the non-cyclic movement would result in an 
additional application of Replacement, and (iv) and (v) should be equally 
economical (an equivalent situation to object shift). 

Kitahara concludes Chapter Two with a note about the timing of expletive 
insertion in Icelandic Transitive Expletive Constructions. He assumes the MP 
analysis that the associate (i.e. the subject of the transitive) moves into 
the inner specifier of TP and the expletive merges into the outer specifier of 
TP, the verb appearing between the two as a verb-second phenomenon. He notes 
that although this situation is the opposite of the English situation 
discussed above, i.e. here category movement precedes expletive insertion, 
this is predicted by the restriction against downwards movement. Assuming the 
associate must move to adjoin to the expletive at LF, the associate must 
appear lower than the expletive, in order for this movement to be raising 
rather than lowering. 

In Chapter Three, Kitahara demonstrates that Chomsky's (1995) Minimal Link 
Condition can explain phenomena which had previously received disparate 
analyses in the literature. 

(vi) Minimal Link Condition (MLC)
 H(K) attracts alpha only if there is no beta, beta closer to H(K) than 
 alpha, such that H(K) attracts beta. 

Kitahara begins with Relativized Minimality (Chomsky 1993) violations, as in
(vii), and Superiority Condition violations, as in (viii). 

(vii) *John seems it is t(John) certain to be here.

(viii) *What did you persuade whom to buy t(what)?
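
The blocking effect that the MLC in (vi) imposes on (vii) and (viii) can be pictured with a small sketch. It is only an illustration: the functions attract and obeys_mlc are hypothetical, candidates are simply listed from closest to farthest from the attracting head, and the feature labels "D" and "wh" are placeholders rather than Kitahara's feature inventory.

```python
def attract(feature, candidates):
    """Return the closest candidate bearing `feature`, or None.

    `candidates` is a list of (name, features) pairs ordered from closest
    to farthest from the attracting head (an illustrative simplification).
    """
    for name, features in candidates:
        if feature in features:
            return name
    return None


def obeys_mlc(moved, feature, candidates):
    """A movement obeys the MLC only if the moved element is the closest match."""
    return attract(feature, candidates) == moved


# (vii): matrix T attracts; "it" is closer than "John".
vii = [("it", {"D"}), ("John", {"D"})]
# (viii): matrix C attracts; "whom" is closer than "what".
viii = [("whom", {"wh"}), ("what", {"wh"})]

print(obeys_mlc("John", "D", vii))    # False
print(obeys_mlc("what", "wh", viii))  # False
print(obeys_mlc("whom", "wh", viii))  # True
```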

The MLC accounts naturally for these facts: in (vii), "it" is closer
than "John" to the matrix T, and thus blocks attraction of "John"; in (viii),
"whom" is closer to the matrix CP than "what" and thus blocks attraction of 
"what".

Next, Kitahara considers Proper Binding Condition violations, like that shown 
in (x).

(ix) Proper Binding Condition
 Traces must be bound.

(x) *Which picture of t(who) do you wonder who John likes t(which picture
 of t(who))?

He argues that a Proper Binding Condition analysis of (x) is no longer
available in Minimalist approaches. This condition can no longer apply at 
S-structure, since S-structure has been eliminated from the model, and LF
reconstruction of "picture of t(who)" could create a configuration in which
the trace of "who" is bound. 

Instead, Kitahara offers an MLC solution. He observes that (x) involves two 
violations of the Minimal Link Condition. First, "which" is closer to the 
embedded CP than "who" and thus blocks the attraction of "who" (note that 
"picture of who" would be necessarily carried along with "which" to the 
embedded CP by an independent convergence condition). Second, given the 
illegitimate attraction of "who" to the embedded CP, "who" becomes closer to 
the matrix CP than "which", and thus blocks the attraction of "which". 
Assuming that 

(xi) A derivation employing a greater number of illegitimate steps
 induces a greater degree of deviance (p72)

the derivation in (xii) below is preferred over (x) because (x) involves two 
violations of the MLC whereas (xii) involves only one.

(xii) ??Who do you wonder which picture of t(who) John likes t(which
 picture of t(who))?

Kitahara extends this analysis to crossing versus nesting dependency data. 

(xiii) Nested Dependency Condition (Pesetsky 1987)
 If two "wh"-trace dependencies overlap, one must contain the other.

The paradigm cases are those in (xiv) and (xv):

(xiv) ??What did you wonder whom John persuaded t(whom) to buy t(what)?
(xv) ?*Whom did you wonder what John persuaded t(whom) to buy t(what)?

In (xiv), Kitahara observes, the MLC is disobeyed once, in the raising of 
"what" over "whom" to the matrix CP. In (xv), on the other hand, the MLC is 
disobeyed twice, once in the raising of "what" over "whom" to the embedded CP, 
and a second time in the raising of "whom" over "what" to the matrix CP. Thus, 
following (xi), the grammar prefers (xiv) over (xv). 
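
The violation counts for (xiv) and (xv) can be tallied mechanically. The sketch below is an illustration under simplifying assumptions: each step records the moved wh-phrase and the wh-phrases available at that point, ordered from closest to farthest from the attracting C, and the function name mlc_violations is hypothetical.

```python
def mlc_violations(steps):
    """Count steps in which the moved element is not the closest candidate."""
    return sum(1 for moved, by_closeness in steps if by_closeness[0] != moved)


# Each step: (moved wh-phrase, candidates ordered closest-first).
# Step 1 targets the embedded CP, step 2 the matrix CP.
xiv = [
    ("whom", ["whom", "what"]),  # legitimate: "whom" is closest
    ("what", ["whom", "what"]),  # violation: "what" crosses "whom"
]
xv = [
    ("what", ["whom", "what"]),  # violation: "what" crosses "whom"
    ("whom", ["what", "whom"]),  # violation: "whom" crosses raised "what"
]

scores = {"xiv": mlc_violations(xiv), "xv": mlc_violations(xv)}
print(scores)                       # {'xiv': 1, 'xv': 2}
print(min(scores, key=scores.get))  # xiv
```

Given the assumption that more illegitimate steps yield greater deviance, the derivation with the lower count is the less deviant one.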

Finally, Kitahara considers scrambling and topicalization in German and 
Japanese, demonstrating that certain restrictions on these phenomena can also 
receive an MLC treatment. He assumes that both phenomena are feature driven, 
and that the scrambling/topicalization feature of the attracted element is 
interpretable, and thus remains accessible to the computation after checking.
The basic pattern considered is that it is not possible to scramble an element 
from a constituent and then scramble the remnant; it is possible, however, to
topicalize the remnant. German examples are provided in (xvi) and (xvii):

(xvi) scrambling + scrambling of remnant
 *dass [t(das Buch) zu lesen] keiner [das Buch] t(t(das Buch) zu lesen)
 that (the book) to read no one the book (the book to read) 
 versucht hat
 tried has
 "that no one has tried to read the book"

(xvii) scrambling + topicalization of remnant
 [t(das Buch) zu lesen] hat keiner [das Buch] t(t(das Buch) zu lesen)
 (the book) to read has no one the book (the book to read)
 "No one has tried to read the book"

Under these assumptions, (xvi) violates the MLC twice, first by scrambling the 
DP "the book" over the closer VP "the book to read", and second by 
scrambling the VP "t(the book) to read" over the now-closer DP "the book". 
(xvii), on the other hand, does not violate the MLC at all. Assuming that the 
features that drive topicalization and scrambling are distinct, "the book" 
would be the closest available element with the scrambling feature to the 
attracting head, since "the book to read" would have a topicalization feature 
rather than a scrambling feature. Similarly, the VP "t(the book) to read" is 
the closest available element to be attracted for topicalization, since "the 
book" has a scrambling feature, not a topicalization feature. 
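
The role of distinct features in (xvi) versus (xvii) can be sketched as follows. This is an illustration only: the labels "scr" and "top", the function violations, and the closest-first candidate lists are assumptions made here for exposition, with the VP counted as closer than the DP it contains before any movement.

```python
def violations(steps):
    """Count MLC violations in a derivation.

    Each step is (feature, moved, candidates); candidates run from closest
    to farthest from the attracting head and pair a label with one feature.
    Every step is assumed to have at least one candidate with the feature.
    """
    count = 0
    for feature, moved, candidates in steps:
        closest = next(label for label, f in candidates if f == feature)
        if closest != moved:
            count += 1
    return count


# (xvi): DP and remnant VP both bear the scrambling feature.
xvi = [
    ("scr", "DP", [("VP", "scr"), ("DP", "scr")]),
    ("scr", "VP", [("DP", "scr"), ("VP", "scr")]),
]
# (xvii): the DP scrambles, the remnant VP topicalizes.
xvii = [
    ("scr", "DP", [("VP", "top"), ("DP", "scr")]),
    ("top", "VP", [("DP", "scr"), ("VP", "top")]),
]

print(violations(xvi), violations(xvii))  # 2 0
```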

In Chapter Four, Kitahara discusses the differences in deviance between 
derivations which involve one violation of the MLC by a wh-element. He 
provides the generalization in (xviii), and examples in (xix)-(xxii) (p83-85):

(xviii) An MLC violation involving adjuncts, subjects, or quasi objects [i.e.
 "how many" phrases] is far more severe than an MLC violation involving
 objects.

(xix) Adjunct
 *How do you wonder [whether John fixed the car t(how)]? 

(xx) Subject
 *What do you wonder [whether t(what) was fixed t(what)]?

(xxi) Quasi-object
 *How many pounds do you wonder [whether John weighed t(how many)]?

(xxii) Object
 ??What do you wonder [whether John fixed t(what)]?

In order to explain this phenomenon, Kitahara proposes the following condition 

(xxiii) Chain Formation Condition
 An application of Move forms 1 or >1 chain(s) only if it is legitimate

and assumes that traces may be attracted (at least covertly). He claims that
the illegitimate wh-movements in (xix)-(xxii) do not form a chain. 
Therefore, the wh-elements will not be able to be interpreted at LF, 
causing the derivation to crash. This accounts for the ungrammaticality of
(xix)-(xxi). In (xxii), however, Kitahara claims that the formal features of 
the trace of "what" raise covertly to check accusative case, and that it is 
this movement that saves the derivation. (Notice that covert movement of the 
traces of the wh-elements in (xix)-(xxi) will not occur. Adjuncts and 
quasi objects do not check case, and subjects move overtly to check case.) 
According to the Chain Formation Condition, the movement of the formal 
features of the object, being legitimate, may form one or more chains; in 
particular, it forms a chain between the raised position of "what" in the 
matrix CP and the merged position of "what". Thus, the derivation can be 
interpreted at LF, and has only the status of an MLC violation. 

Kitahara extends the analysis to (xxiv).

(xxiv) "where"/"when" adjuncts
 ?? Where/when do you wonder [whether John fixed the car t(where/when)]?

He assumes that these adjuncts are the complement of a null preposition. 
Therefore, the formal features of the trace of "where"/"when" must raise 
covertly to check case with the null preposition, again creating the necessary 
chain between the moved position of "where"/"when" in the matrix CP and its 
merged position. 

This chapter concludes the book.

Although this book stands solidly on the foundations of previous Minimalist
syntactic research, it remains accessible to those who are not well-versed in 
Minimalist theory. It provides very clear explanations of the details of 
previous Minimalist approaches, as well as Kitahara's own proposals. 
Furthermore, all relevant derivations are presented step-by-step, at a pace 
designed to accommodate the non-specialist. Thus, it presents a good 
opportunity for those interested to learn about research and issues in
Minimalist syntax. 

Those who are familiar with Minimalist research should find this to be an 
interesting reworking and application of 1995-style Minimalism. Anyone 
convinced by recent discussions of computational complexity and local economy 
(see Collins 1997, Johnson & Lappin 1997, Yang 1997, among others), however, 
will be dissatisfied with the approach, as it continues to rely on global 
economy conditions. Since, of course, not everyone has been convinced by the 
discussion, this is more a note to prospective readers than a criticism. On
a similar note, a crucial assumption for the analyses is that the grammar can 
count, which is controversial, but not obviously false. 

Note that, regarding cyclicity, the notoriously problematic case of head-
movement, which Chomsky (1995) managed to incorporate into "cyclicity"
requirements (forcing it to apply before introduction of another head into the 
derivation), again falls outside the analysis of "cyclicity". Since all head- 
movement will require an operation of Replacement, there is no longer any 
clear way to force it to apply before another head is introduced. 

Perhaps more serious is the reformulation of the Strong Feature Condition. 
The various Strong Feature Conditions of previous Minimalist approaches are 
simplified in the second chapter to (iii) above, repeated in (xxv) below.

(xxv) Strong Feature Condition
 Spell-Out applies to sigma only if sigma contains no category with a
 strong feature.

The difficulty with this formulation is that it renders the Strong Feature
Condition an S-structure condition and thus anti-Minimalist, since Minimalism 
took great pains to eliminate all S-structure conditions. Notice that it 
would be trivial to reformulate all previous S-structure conditions in a 
manner parallel to (xxv)--"Spell-Out applies to sigma only if sigma contains 
no traces which are not bound"--thus reducing the Minimalist claim that 
S-structure is redundant to a matter of terminology only. 

Furthermore, in the last chapter, a further condition involving strong 
features had to be introduced in order to rule out certain noncyclic
derivations. This additional condition, given in (xxvi), is essentially a
weakened version of Chomsky's (1995) formulation of the Strong Feature
Condition.

(xxvi) alpha and beta cannot be concatenated if some sublabel of alpha and 
 some sublabel of beta are both strong (p95)

Thus, the proposed simplification of the Strong Feature Condition actually
results in positing two conditions, one of which is an S-structure condition.

Another seemingly anti-Minimalist proposal is the Chain Formation Condition,
given in (xxiii) above and repeated in (xxvii) below. 

(xxvii) Chain Formation Condition
 An application of Move forms 1 or >1 chain(s) only if it is legitimate

Minimalist theory claims that the computation of human language meets the
Inclusiveness Condition, i.e. no new elements are added during the course of
the computation. Instead, the computation arranges and rearranges items 
selected from the lexicon. Therefore, under Minimalist theory, the notion of a
"chain" as an independent entity does not exist, as it would have to be added
during the course of the derivation, violating Inclusiveness. Instead, 
"chain" is simply a convenient term used to refer to the identical elements in 
a derivation. The Chain Formation Condition, however, crucially requires
chains to have an independent existence in the computation.

These comments aside, this book does represent a step forward in the 
Minimalist research program. Kitahara is able to derive several assumptions/
principles which previously could only be stipulated. The account of 
optionality in Icelandic object shift is more satisfying than the "optional 
strong feature" approach, although, as was noted, it does raise some cross-
linguistic considerations (e.g. French). The systematic application of the 
Minimal Link Condition to data captured by various other conditions was sorely 
needed, if only to confirm the intuition that the MLC would have equal, or 
superior, empirical coverage. Finally, the analysis of "wh" extraction
asymmetries presented in the final chapter is one of the few Minimalist 
treatments of this phenomenon. All in all, the reader will find this book
to be very well considered, clearly explained, and thought-provoking. 


Julie Anne Legate is a PhD student in the Department of Linguistics and
Philosophy at MIT. Her research interests include syntactic theory and Irish 


Chomsky, Noam. (1991) Some notes on economy of derivation and 
 representation. In Principles and parameters in comparative grammar, ed. 
 Robert Freidin, 417-454. MIT Press, Cambridge, Mass.

Chomsky, Noam. (1993) A minimalist program for linguistic theory. In The 
 view from building 20, eds Kenneth Hale & Samuel Jay Keyser, 1-52. MIT 
 Press, Cambridge, Mass.

Chomsky, Noam. (1994) Bare phrase structure. MIT Occasional Papers in 
 Linguistics 5. MITWPL, Cambridge, Mass.

Chomsky, Noam. (1995) The Minimalist Program. MIT Press, Cambridge, Mass.

Collins, Chris. (1997) Local Economy. MIT Press, Cambridge, Mass.

Epstein, Samuel D. (1992) Derivational constraints on A'-chain formation.
 Linguistic Inquiry 23, 135-159. 

Johnson, David & Shalom Lappin. (1997) A Critique of the Minimalist Program.
 Linguistics and Philosophy.

Yang, Charles D. (1997) Minimal Computation. Master's Thesis, Department 
 of Electrical Engineering and Computer Science, MIT.
