Review of Parameters and Universals

Reviewer: Denis Bouchard
Book Title: Parameters and Universals
Book Author: Richard S. Kayne
Publisher: Oxford University Press
Linguistic Field(s): Syntax
Subject Language(s): French
Issue Number: 12.1602

Kayne, Richard S. (2000) Parameters and Universals, Oxford
University Press, hardback, viii, 369 pp., Oxford Studies in
Comparative Syntax.

Reviewed by Denis Bouchard, Université du Québec à Montréal

This book is a collection of 15 papers by Kayne, previously
published from 1985 onward. The papers all deal
with issues of comparative syntax, and how they interact
with principles of Universal Grammar. The book is organized
thematically rather than strictly chronologically. This allows us
to follow Kayne as, progressively, by reworking his
analyses, he tries to narrow down a parameter to the minimal
units of syntactic variation. We see how this quest has
brought him over the years from broad comparisons between
French and English, to comparisons among the Romance
languages and among the Germanic languages, and ultimately
to microparametric syntax, in which several very closely
related dialects are studied. These fine-grained comparisons
approximate an experiment in which a syntactician would
"take a language, alter a single one of its observable
syntactic properties, examine the result to see what, if
any, other property has changed as a consequence. If some
property has changed, conclude that it and the property that
was altered are linked to one another by some abstract
parameter" (p. 5). Kayne enthusiastically compares
microparametric syntax to the development of the earliest
microscopes: it "allows us to probe questions concerning the
most primitive units of syntactic variation" (p.9).

The book contains some remarkably precise descriptions of
distributional properties (I particularly enjoyed chapter 8
on person morphemes and reflexives in Romance) and
challenges us to explore why and how many closely related
languages show shades of variation. As always in Kayne's
work, we are provided with numerous references to works of
authors both current and from earlier periods: this in
itself is enough to make this book an invaluable reference
work for anyone working on clitics, participle agreement in
Romance, or agreement in English. I need not emphasize how
influential a figure Kayne is, with major contributions to
the description of distributional properties of various
languages. It is very useful to have several of his
papers under one cover rather than dispersed in publications
of varying accessibility.

With a tool such as microparametric syntax that can reveal
clusterings of syntactic properties, we are in a position to
devise a theory that accounts for the facts "by showing that
the several properties in question can all be traced back to
a single relatively more abstract parameter setting" (p.3).
To be truly successful, the account must not simply restate
the facts, but show how the properties follow from deeper
assumptions that explain which variations are possible, and
why. We can evaluate the success of Kayne's approach by
examining how he generally proceeds, which is very consistent
across the chapters.

Consider the variation in Romance concerning the
distribution of object clitics which is presented in chapter
5. We are first given a list of observations about the data.

List A
-French tensed clause: clitic precedes V
-Italian tensed clause: clitic precedes V
-French infinitival clause: clitic precedes V
-Italian infinitival clause: clitic follows V

The account proposes that clitics must adjoin to an I-type
functional head. Only left-adjunction is allowed. A tensed V
is assumed to move through both AGR and T in order to pick
up a suffix in each case, as in (1):

(1) ...[[Cl+V+T] AGR] ... [T e] ... [V e]

Assuming in addition that a Cl cannot adjoin to a trace, the
only possibility in a tensed clause is for the clitic to
"adjoin to the I-position in which the verb finds itself at

In the Italian infinitive, V adjoins to Infn, a functional
head with nominal properties, and then V+Infn adjoins to T'.
The Cl can adjoin to T because T is not a trace since an
infinitive V is not obliged to merge with T and AGR.

(2) ... V+Infn ... Cl+T ... [Infn e] ... [V e]

"The precise identity of the free abstract I node in [(2)]
is not immediately clear, however, since the infinitive verb
shows neither an overt AGR suffix or an overt T suffix in
Italian." (p.63)

French infinitives also involve raising V to Infn, but no
additional movement of V. Furthermore, Cl adjoins to V+Infn
rather than to T:

(3) T ... Cl+V+Infn ... [V e]

This account actually amounts to a second list as follows:

List B:
-French tensed clause: Cl left-adjoins to the inflected V;
there are functional heads to the right of V, but these are
excluded for Cl because they are traces.
-Italian tensed clause: Cl left-adjoins to the inflected V.
-French infinitival clause: Cl left-adjoins to V+Infn
-Italian infinitival clause: Cl left-adjoins to T; V+Infn
precedes T.

List B merely restates in other terms the observations from
list A. Moreover, while the description of the variation
between French and Italian infinitives is very easy to state
as in list A, list B requires additional stipulations:
-V+Infn adjoins to T' in Italian
-the Cl must be precluded from adjoining to V+Infn in (2)
(either after V+Infn has moved, or before, with pied-piping)
-V+Infn must be precluded from adjoining to T in (3)

When other facts from Romance are added to list A, list B is
directly augmented accordingly.

List A (augmented):
-Occitan infinitive: Cl precedes T-ADVs, which precede V
-Sardinian infinitive: Cl precedes V, which precedes T-ADVs

List B (augmented):
-Occitan infinitival clause: Cl left-adjoins to T; Cl is
precluded from adjoining to V+Infn
-Sardinian infinitival clause: V+Infn left-adjoins to T; Cl
left-adjoins to V+Infn+T.

In all these cases, list B just restates the facts listed in
A. Why a different positioning occurs in a given language is
not explained, but stipulated.

As another example, consider the analysis of the scope of NO
ONE and its interaction with particles, presented in chapter
13. The observations are as follows:

List A:
-When in object position in an embedded clause, NO ONE may
have scope in this clause (narrow scope), or scope in a
clause containing this embedded clause (wide scope)
-a particle may appear next to the V or after the object
-wide scope for NO ONE is difficult if a particle follows it

Kayne accounts for these facts by assuming that a negative
phrase like NO ONE must move overtly to the Spec of a Neg
functional category, followed by movement of the remnant VP
to the Spec of a WP (mnemonic for 'word order'). The scopal
ambiguity comes from the assumption that the abstract Neg
category may appear in the embedded or the matrix clause.
This provides two derivations as follows:

Narrow scope: I will force you to turn down no one
... force you to no one turn down (NO ONE preposing)
... force you to turn down no one (embedded VP-preposing)

Wide scope: I will force you to turn down no one
... no one force you to turn down (NO ONE preposing)
... force you to turn down no one (matrix VP-preposing)

The fact that a particle may follow an object is derived by
raising the particle out of the VP before the latter is
fronted to a position crucially ordered after NegP, as follows:

Narrow scope: I will force you to turn down no one
... force you to down turn no one (particle preposing)
... force you to no one down turn (NO ONE preposing)
... force you to turn no one down (embedded VP-preposing)

Wide scope: I will force you to turn down no one
... down force you to turn no one (particle preposing)
... no one down force you to turn (NO ONE preposing)
... force you to turn no one down (matrix VP-preposing)

This last case, with wide scope for NO ONE followed by a
particle, is less felicitous. Kayne attributes this deviance
to the assumption that long-distance particle preposing
"cannot readily apply out of an infinitival complement" of
this sort.

Again, the account simply replaces list A by a list B that
restates the facts (in a more convoluted way):

List B:
-a sentence containing NO ONE in an embedded clause also
contains an abstract NegP that may appear in embedded or
main clause; NO ONE moves to Spec,NegP; remnant VP-preposing
is obligatorily triggered.
-a particle may raise to some position left of VP; remnant
VP-preposing is then obligatorily triggered.
-long-distance particle preposing is difficult. (Note that
this seems to apply only if NO ONE is the object, not if it
is SOMEONE, for example. Moreover, we are given no reason
why long distance is difficult for particles, but not for NO
ONE.)

What do you do in this kind of approach if you find a new
distributional observation in list A? You add one or more
elements corresponding to it to list B (a functional
category, a feature, a movement, etc.). The only difference
is that instead of asking why X appears in position P,
assumptions are added and the question changes to why X
moves to P. But no answer is provided to this second
question, so we are left with no answer to the question of
the distribution of X. The trigger for moving X is the
position it occupies on the surface, as Kayne unabashedly
indicates by using the functional category W, without an
indication of what makes X occupy position P rather than any
other.

Engineering solutions of this type that restate the
distributional observations are not a characteristic of
Kayne's work alone. The whole approach evolving under the
Minimalist Program functions under exactly the same
premisses. Thus, though the features that trigger movement
sometimes get more specific labels, the labels actually play
no role and could be left unspecified (as is often done)
since the features are always -Interpretable. As Chomsky
(1995:278) indicates, "the sole function of these feature
checkers is to force movement..."

For Kayne, for Chomsky and their followers, there is a
correlation between the position of the proposed feature or
category and the position of some X, but it is actually a
correlation of a fact with itself, since a distributional
fact is correlated with a W-element whose sole purpose is to
encode this fact. This is ad hoc, just a delayed
stipulation. It is not revealing since it does not
anticipate any new fact. Since there is no restriction on
introducing W-elements other than the final surface result
to be attained, it is always possible to get a derivation
that will rule the sentence in or out, whatever is required.
The theory faces a serious problem of restrictiveness: the
link between the interpretive representation and the surface
form can vary arbitrarily. The trigger controlling the
movement is determined by the response itself.

This kind of approach that allows W-elements to be
introduced very freely is subject to a criticism very
similar to one leveled against Skinner's approach. Borrowing
from the terminology of Chomsky (1959), we could say that a
typical example of 'trigger control' for checking theory
would be the response to a W trigger in position P by a
movement of an X. Suppose instead of an X we had a Y move to
P in another language. We could only say that each of these
responses is under the control of some other triggering
property of a functional category. This device is as simple
as it is empty. Since properties are free for the asking, we
can account for a wide class of responses in terms of
functional category analysis by identifying the 'controlling
triggers'. But the word 'trigger' has lost all objectivity
in this usage. Triggers are no longer part of the
independent properties of the construction; they are driven
back into the construction. We identify the trigger when we
hear the response. In short, movability makes X move.

At a recent conference, Kayne said that remnant VP-movement
is so fruitful that the ball is in the court of its opponents
to show it is wrong. It is indeed quantitatively fruitful in
the sense that, given any input structure, various movements
applied to various categories, triggered by various
features, will eventually correctly arrive at any desired
surface order. It may take many "constrained" steps, but
overall, the system always allows it since triggering
features and categories are inserted at will. But without a
precise indication of what triggers the mechanisms in those
particular cases, the analysis is not very revealing. The
goal is not just to propose a way to derive the correct
order. How it is arrived at is as important. Remnant
movement is a device that 'corrects' structure in a way
almost identical to that of 'predicate raising' in
Generative Semantics. Chomsky (1972: 79) comments on
McCawley's analysis of 'kill' as follows: "In the proposed
underlying structure, John caused Bill to die (or John
caused Bill to become not alive), the unit that is replaced
by kill is not a constituent, but it becomes one by the
otherwise quite unnecessary rule of predicate raising. Such
a device will always be available, so that the hypothesis
that Q is a constituent has little empirical content." As
Chomsky (1970) also remarks in a similar context, it is
difficult to argue against the claim that there exist
relations between some kind of representations of meaning
and of form, and such an open theory makes it possible
to simulate rules of semantic interpretation in the syntax
by generating constituents of arbitrary structure in the
base, claiming that they are associated with the desired
semantic property, and then using a filtering transformation at
the desired point in the derivation to match these arbitrary
structures with the surface structure.

These "engineering solutions" provide a certain degree of
description, but they are not genuine explanations. In fact,
they cannot be explanatory in principle since W-elements are
ad hoc by design, and the rest of the theory is
totally geared to them. That is why it is possible to take
every one of the propositions of one of these accounts, take
the contrary propositions, look at languages, and be
unable to tell which theory is right since the surface
orders are properly ruled in or out just as easily. For
instance, instead of a basic order Spec-Head-Compl, leftward
movement and a restriction that heads adjoin to heads and
phrases to phrases, assume the order Compl-Head-Spec (or any
other), rightward movement and a restriction that heads
adjoin to phrases and phrases to heads. It is just as easy
to get the ordering results of the examples above, as the
reader can verify.

It is impossible to calculate the consequences of the theory
since the key W-elements rest on observational correlations,
not on logical or causal relations. The task has been
reduced to translating lists of the A type to lists of the B
type by suggesting which formal feature appears in which
construction and in which language--in essence then, a
taxonomy of features. This kind of know-how is not very
different from knowing how to change centigrade to
Fahrenheit: it adds little to our understanding. However,
it is possible to extricate the study of language from this
descriptivist mood. The solution is to follow the normal
scientific practice of relying on properties which are
logically anterior to those under study, that is, on properties
external to language that determine its functioning.

The brain in which the L-system is represented also contains
a conceptual system with its own properties, and this brain
is set in human bodies that have particular sensorimotor
systems that determine the kind of form which can
participate in the L-system. The L-system would not have the
properties it has if this other aspect of the brain or the
physiology of humans were significantly different: UG is a
mental state shaped by these logically anterior properties.

The role of external systems is acknowledged in principle in
the minimalist program through the notion of interfaces.
However, the proposals stop short when it comes to actually
linking analyses directly to anterior properties. Two
examples will illustrate this. First, consider Chomsky's
attempt to provide external motivation for the property of
"displacement." Very little is actually said about it. In
Chomsky (1995: 317), he only makes the very general comment
that displacement could be motivated by invoking
"considerations of language use: facilitation of parsing on
certain assumptions, the separation of theme-rheme
structures from base-determined semantic relations, and so
on." He goes a bit further in Chomsky (1998): in standard
terms of Deep and Surface structure, "the former enters into
determining quasi-logical properties such as entailment and
theta structure; the latter properties such as topic-
comment, presupposition, focus, specificity, new/old
information, agentive force, and others that are often
considered more discourse-oriented, and appear to involve
the "edge" of constructions.
Theories of LF and other approaches sought to capture the
distinctions in other ways. The "deep" (LF) properties are
of the general kind found in language-like systems; the
"surface" properties appear to be specific to human
language. If the distinction is real, we would expect to
find that language design marks it in some systematic way --
perhaps by the dislocation property, at least in part."

The rationale for separating these two kinds of semantics is
not given, but we may assume that the separation has a
functional role: the distinction is better
expressed/perceived by the language user if each type is
associated with a different position. However, it remains to
be shown that the distinction is real in the first place.

Note that the proposal is very tentative here: there is no
precise external notion from which linguistic properties
could be derived by logical or causal relations. So far, the
only effect of the proposal seems to be that some followers
propose that many external notions of a pragmatic or
discourse nature get syntactified as functional categories.
For instance, in his analysis of adjectives, Cinque (1994)
assumes functional categories such as Quality, Size, Shape,
Color, Nation, Speaker-oriented. Beghelli & Stowell (1997)
suggest that different scopings of various QP-types
correspond to different Spec positions of functional phrases
such as WhP, NQP (negation), Distributive Phrase, RefPhrase,
Share Phrase ("interpreted with "dependent" specific
reference"). Munaro & Obenauer (2000) propose a category
'Evaluative CP' which they say captures "the fact that the
speaker, in the lively expression of a feeling of
surprise/annoyance/disapproval, conveys his personal
evaluation of the event referred to" (p. 22). Adopting
Rizzi's (1997) proposal of a split-CP and suggestions by
Pollock et al. (1999), they also adopt syntactic categories
such as Interrogative Force, Focus, Operator, Topic. This
use of functional categories that extends to discourse
notions like various illocutionary forces and pragmatic
notions such as speaker attitudes is eerily similar to
various performative categories proposed in Generative
Semantics. Several arguments have been presented against
representing illocutionary force or presuppositions of
sentences in their syntactic structure (Chomsky 1970, 1972,
Anderson 1971, Jackendoff 1972, among many). Any attempt to
revive this syntactification should take these objections
into account in order not to repeat past errors.

Second, consider Kayne's use of the Linear Correspondence
Axiom. The LCA approach is a step in the direction of
linking the analytical apparatus directly to interface
elements, to motivate it on external grounds. The general
idea is that the articulatory apparatus of human beings
which produces the sounds of language has physical
properties which force strings of sounds forming words to
be produced sequentially: words occur in an irreversible,
asymmetric temporal sequence. Syntactic hierarchical
structure derives from attributing a functional significance
to the adjacency of words: adjacency then translates
structurally into sisterhood (an idea already present in
Tesnière 1959:20). The proposal is appealing since it
crucially relies on a very salient property of languages
which derives from the physiology of humans. However, the
external condition only forces two elements like a head and
a complement to be ordered: it does not single out one
particular order. Yet Kayne assumes that the order Head-
Complement is universal.

In order to have PRECEDE as the crucial relation rather than
FOLLOW, Kayne must postulate an abstract node A for every
phrase marker, with A asymmetrically c-commanding every
other node. The terminal element dominated by A is a, an
abstract terminal that precedes all the other terminals.
Therefore, a crucial element of the ordering component of
the analysis does not arise from an observational
proposition about the physical properties of the AP system.
Instead, this is an abstract element with no physical
reality in the phonetic output. Moreover, this abstract
element a is ad hoc: its existence and the stipulation that
it precedes all other terminals serve no other purpose but
to introduce an organizing concept that allows a correlation
to be made between precedence and asymmetric c-command.

The proposals of both Chomsky and Kayne do not yet provide a
way to rely on properties which are logically anterior to
those under study. However, strong external motivation is
available on which to base explanations and go beyond
description. A famous example from the past may help us
find a path towards a solution.

The variation under discussion above concerns how units
combine in the syntax, how relations between these units may
be expressed differently in the surface forms. But variation
also arises in the expression of the individual units: the
form used to express a particular semantic unit like APPLE
varies considerably from one language to another. As
Saussure put it, a meaning unit is arbitrarily paired with a
form. This is hardly discussed anymore because it is
considered more or less as a solved problem. Interestingly,
the reason is that the solution is based on external,
anterior properties. Adding to the interest for the problem
at hand, Saussure came up with his proposal as he tried to
bring some intelligibility to comparative studies.

For decades, comparatists had established similarities
between languages strictly on the basis of what they called
phonetic laws. Saussure wanted to know what properties
language must have in order for forms to be able to obey
strictly phonetic laws, without any relation to meaning. His
answer is that the relation between meaning and form must be
arbitrary. Extrapolating, we can say that the laws are
"natural" in the sense that they derive from elementary
conditions imposed by our phonatory system: changes occur
between sounds which are closely related in their
articulation. The arbitrariness of the meaning-form pairings
derives from the fact that the forms must come from our AP
system in order to be usable and that the sounds produced by
our phonatory articulators are such that they cannot have a
meaningful, iconic relation with the meanings to be
expressed. As long as it satisfies elementary conditions of
articulation, any form is a valid 'signifiant' for any
particular 'signifié'. Moreover, since the meaning-form
pairings are arbitrary, the sign must be constant,
conventionalized. In short, the kind of variation found at
the level of units is delimited by arbitrariness and
conventionalization of the sign. This in turn follows from
the physiological makeup of human beings, hence from
properties logically anterior to linguistic theory.

If we look in a similar fashion at variation in how
relations between units may be expressed in the surface
forms, we see that a similar external basis may be found for
the variation. It so happens that our sensorimotor system
provides diverse means to encode the fundamental associative
function which is required in the recursive system to obtain
semantic combination. As indicated in Bouchard (1996), this
gives rise to an arbitrariness which is simply another facet
of Saussurean arbitrariness. In a typical task of Grammar,
there are two units of meaning A and B, with a relation that
holds between them, and this must be given a perceptual
form. Having associated each of A and B with a form, there
are four ways to physically indicate that a relation is
being established between the two elements, and language
uses all four.

JUXTAPOSITION: A and B are ordered temporally next to one
another, deriving the structural relations of sister and
immediately contain.
SUPERIMPOSITION: B is a modulation superimposed on A, such
as intonations to express affirmation, question, exclamation
in English, or lexical meanings and grammatical functions in
tone languages.
MARKING of A: the dependent gets a marking, such as Case
marking for an argument.
MARKING of B: the head gets a marking, such as predicate
marking in polysynthetic languages.

Linearization is but one of these modes of coding relations
(Juxtaposition), and the choice among them is arbitrary, and
conventionalized once it is made (cf. order vs. morphological
Case marking to express grammatical relations). This is so
because any of these modes can satisfy the requirement to
encode a semantic relation in a sign. When Juxtaposition is
chosen, whether A precedes or follows B is also arbitrary,
and so must be conventionalized. Thus, a head parameter is
optimal since it derives from external properties. All that
is required is the necessary rule MAP, which maps a unit or
relation of meaning with a unit or relation of form,
respectively: all of the available forms are on a par. A
head parameter, i.e., juxtaposition and conventionalization
of an order, is just one of the forms to express a relation.

A theory based on fixed positions and movement stipulates
that a particular mode of coding, i.e., Juxtaposition, is a
property which all languages have. I suggest as a universal
the 'choice' among the modes of coding which derive from
unavoidable properties of the articulatory-perceptual and
the conceptual-intentional systems of human beings. This is
a way out of lists and taxonomies which weaken a theory. An
analysis strongly based on logically anterior properties
accounts for facts not by adding to the theory, but by
restricting its axiomatic notions. This strengthens the
theory. It leads to more fruitful accounts of linguistic
properties, beyond engineering descriptions. The hypothesis
is that this kind of analysis based on external properties
can extend to microparametric variations like those
described by Kayne. This, however, is not the place to
attempt such a demonstration.

Denis Bouchard is a professor at the Université du Québec à
Montréal. His research interests include syntactic theory,
French syntax, comparative syntax, and lexical semantics.
Collaborative work on Sign languages has led him to take
fully into account the fact that language is a physical
activity and that this partly determines the nature of
Universal Grammar.


