  1. Simon Musgrave, Word: A cross-linguistic typology

Message 1: Word: A cross-linguistic typology

Date: Thu, 11 Sep 2003 18:18:59 +0000
From: Simon Musgrave
Subject: Word: A cross-linguistic typology

Dixon, R.M.W. and Aikhenvald, Alexandra, eds. (2002) Word: A
cross-linguistic typology Cambridge UK: Cambridge University Press
ISBN 0 521 81899 0 (Hardback), pp. xiii, 290

Simon Musgrave, Monash University

This book collects some of the papers presented at a workshop in 2000.
It consists of an introduction by the editors which sets out a
framework for identifying words cross-linguistically, nine papers each
using that framework to deal with data from a specific language or
family of languages, and a concluding chapter. Indexes of authors, of
languages and families, and of subjects are included.

Chapter 1 (R.M.W. Dixon and Alexandra Aikhenvald, D&A) proposes a
framework for the typological study of word. D&A survey previous
approaches from the 1930s on, contrasting those which accept word as a
valid unit of analysis with those which reject it entirely. They also
draw attention to some confusions to be avoided: word and lexeme, and
word and orthographic word. The approach to word which D&A set out
relies on the distinction between phonological word (p-word) and
grammatical word (g-word). A p-word is a unit of at least one syllable
defined on the basis of one or more criteria drawn from three sets:

i) segmental restrictions
ii) prosodic features
iii) constituting a domain for phonological processes.

A g-word is a grouping of grammatical elements which always occur
together, in a fixed order, and have a conventionalized coherence and
meaning. Additional diagnostics which D&A treat as tendencies include:
word-level morphological processes are non-recursive; inflectional
affixes are limited to one per word; pauses may occur between words
but not within them; and words can form complete utterances on their
own. The default, according to D&A, is that p-words and g-words
coincide, but mismatches are possible. The obvious case is that of
clitics, which in D&A's terms (and for many others) are g-words but
not p-words. They form part of p-words which contain two or more
g-words. Examples of other possible mismatches are given: compounds,
which are single g-words but may be two or more p-words; and rare
instances of more complex relations between the two sorts of word. The
chapter concludes with comments on orthography, whether word should be
taken as a universal of grammar, and the social status of the concept
word in speech communities (it is common for languages with no written
form to lack a lexeme for the concept). An appendix illustrates the
application of D&A's criteria to Fijian.

The second chapter is by Alexandra Aikhenvald (A), and proposes a set
of parameters which can be used to situate clitics on a gradient
between more word-like elements and more affix-like elements. The
second part of the chapter illustrates the application of this
approach to Tariana (Arawak, Amazonia). A suggests 15 parameters which
together can give a 'scalar definition of clitics' (43). The
discussion did not make clear to me exactly how this was to be
interpreted. At least one parameter (A -- DIRECTIONALITY) seems to me
to be a matter of discrete categories (proclisis, enclisis and
mesoclisis), and a value must be assigned for every clitic (excepting
sign language). How this parameter contributes to a scalar definition
is therefore puzzling. Also, other parameters seem to be scalar in
themselves (e.g. B -- SELECTIVITY and F -- PHONOLOGICAL COHESION). A
multidimensional scalar variable can be based on several
unidimensional scalar variables, but the mathematical possibilities
rapidly become daunting. Finally, it seems to me to be a weakness of
this approach that it will not easily handle interactions between
parameters. For example, a language might exist in which determiner
clitics occur at the left boundary of nominal constituents, but attach
phonologically to a preceding element. In such a case, selectivity
would be overridden by a phonological preference general in the
language. A seems to treat all fifteen parameters as on one level, and
this sort of effect might therefore be difficult to accommodate.
However, A's approach demonstrates its usefulness as a tool in the
second part of the chapter, where it provides the categories for
comparisons between affixes and clitics, and between various types of
clitics in Tariana.

Chapters 3-5 and 7-10 deal with the word in various languages as set
out here:

Ch  Language                              Author
 3  Cupik (Eskimo, Alaska)                Woodbury (W)
 4  Eastern/Central Arrernte              Henderson (He)
    (Arandic, Australia)
 5  Jarawara (Arawá, Amazonia)            Dixon (D)
 7  Siouan (N. America)                   Rankin, Boyle, Graczyk, Koontz (RBGK)
 8  Dagbani (Gur, Ghana)                  Olawsky (O)
 9  Georgian (Kartvelian, Georgia)        Harris (Ha)
10  Modern Greek (Indo-European, Greece)  Joseph (J)

All of these chapters, except 10, use the approach set out in chapter
1. I give brief summaries of each below, with a little more detail for
chapter 10 which raises important theoretical questions. Chapter 6 is
also discussed in more detail after these summaries; it deals with the
concept of word in sign languages and is quite distinct in approach.

Woodbury (W) argues that inflectional morphology provides an adequate
definition of g-words in Cupik, with only one complication in which
phrasal units (not compounds) can be treated as bases. W also argues
that stress assignment rules define the p-word in this language.
However, stress assignment and other phonological processes operate
differently where enclitics are present. W therefore proposes that the
maximal domain of lexical phonology, in this case a g-word plus
enclitics, should be considered the p-word. This necessitates the use
of another category, which W calls PW- to account for the g-word
phonological domain.

He's discussion of Eastern/Central Arrernte is not always easy to
follow, as it is based on Breen's analysis of the language having
underlying VC(C) syllables (see Breen and Pensalfini 1999). The
relation between underlying syllabification and surface patterns is
not easy to understand. Nevertheless, He establishes that various
phonological processes do operate in a domain which can be identified
as p-word. He admits that 'there is no simple definition of
grammatical word in ECA' (107), but describes a range of mismatches
between g-word and p-word. Of particular interest are complex
predicate structures, which often behave as two g-words (intervening
material is sometimes allowed), but one p-word for most processes.
Some evidence, though, points to the presence of two p-words, and He
suggests that it may be necessary to allow for a higher level p-word,
consisting of two simple p-words.

The most interesting point in D's discussion of Jarawara is that the
language has no clitics. Also of interest are the complexities of the
suffix system of this language. The interaction of lexical verbs,
suffixes and auxiliary verbs (which follow the main verb) is complex
and fascinating, and D's description sets out the details with great
clarity. The issue which is raised by this complexity is whether the
whole predicate structure (main verb, auxiliaries and associated
morphology) should be treated as a single g-word. D argues that the
definition of g-word in Jarawara is straightforward and the relevant
criteria rule out the analysis of predicates as single g-words. On
this basis, the mismatches between p-word and g-word in this language
are limited, with three possibilities for two p-words to match one
g-word (including reduplication and compounding), and one case where two
g-words form a single p-word.

RBGK present data from various Siouan languages and introduce an
explicitly diachronic thread in their discussion (in chapter 2, A has
some discussion of the development of clitics and affixes from free
forms, and in chapter 10, J discusses the relevance of the notion word
in analyzing diachronic processes). They argue that D&A's criterion of
fixed order within the g-word is problematic for at least some Siouan
languages, where variation in the order of morphological material
(especially locative prefixes and incorporated nouns) precludes any
templatic analysis. Some cases also seem to allow the possibility of
recursive morphological processes within a g-word. P-words are less
problematic: the domain defined by primary accent can be taken as a
p-word, although this can include an entire incorporated relative clause
in Crow and Hidatsa, and exact boundaries between words may not be
clear. Enclitics are also a subject of debate for Siouan languages,
with some authors treating them as affixes. Of especial interest here
is RBGK's discussion of the work of Boas and Deloria (1940). Ella
Deloria was a native speaker of Dakota, and the orthography of the
Dakota Grammar, which reflects her intuitions, presents a different
and not fully consistent view of the phenomenon. RBGK suggest that
synchronic complexity must be viewed from a diachronic perspective,
and that progress towards identifying words can be made on this basis,
even in complex and inconsistent language systems.

P-words in Dagbani are unproblematic, according to O. The domain of
stress assignment can be identified with the p-word, and other
phonological processes also take this as their domain. Morphological
criteria suffice to establish g-words in most cases, but compounds
raise some problems. Stress treats compounds as a single unit (i.e. a
p-word), but the elements are separate domains for vowel harmony, and
lexical tone is maintained. This suggests that an analysis such as
that He proposes for ECA in chapter 4, with two levels of p-word,
might be useful here also. Adjectives cause two further complications.
Firstly, there are lexicalized noun-adjective compounds with
fossilized meaning, but there are also similar structures, used
productively with compositional meaning. O argues that as the
properties of the two are identical, they must be analyzed
identically. Secondly, Dagbani has a small class of bound adjective
morphemes. These cannot occur in isolation, but take number inflection
like regular adjectives. O argues that they cannot be clitics, on the
grounds of their morphological complexity, but also that they are not
words or affixes. O's discussion of this class is not entirely clear.
In one place he appears to say that the whole construction forms a
single p-word and a single g-word, but elsewhere says that they
'display a significant mismatch between phonological and grammatical
word: while [bound adjectives] are grammatical words, they cannot
constitute a separate phonological word' (216). Dagbani also has a
range of clitics, whose behaviour varies when they are tested against a
set of properties (roughly a subset of A's parameters, but the
correspondence is not exact). Some psycholinguistic experiments on the
acceptability of pseudowords are briefly mentioned in an appendix;
fuller discussion of this material would have been valuable.

Ha's account presents Georgian as a language which fits extremely well
into the model proposed by D&A. Cohesion, the fixed order of elements,
and conventionalized coherence and meaning are reliable criteria for
identifying g-words. Only one morphological process (circumfixing)
seems problematic, with cases of circumfixes not surrounding all of
the material within their semantic scope, and with a bracketing
paradox in ordinal number formation. P-words can be identified as the
domain of primary stress, with additional weak phonological evidence
for the location of boundaries. Georgian also has a variety of
clitics. Aside from these, the only mismatch between p-words and
g-words noted by Ha is the case of the compounds which can be identified
as one g-word because of the lack of case marking on the first
element, but which bear two stresses.

In his discussion of the word in Modern Greek, J questions the
assumptions of D&A's model from chapter 1. J argues that word and
affix are a sufficient inventory of morphological entities; the
argument ultimately rests on Ockham's Razor. Having done away with
clitics, J also questions the value of p-words as a unit of analysis:
'if [little elements] are inflectional affixes, then much of what
might be called a 'phonological word' is simply created by regular
word-formation and inflectional processes' (248-9). J discusses
several types of evidence (nasal-induced voicing, irregularity, vowel
insertion, and stress) and concludes in each case that a) the evidence
does not support the cliticization analyses argued for by previous
researchers, and b) that the evidence is at least not inconsistent
with an analysis limited to words and affixes. In one case
(nasal-induced voicing), J's examination of the data reveals a
difference between two classes of 'little elements' (weak pronouns and
possessive pronouns) which are formally identical and have therefore
traditionally been taken together. In another case (vowel insertion),
he shows that correct statement of the generalization provides
confirmatory evidence of the affixal status of the weak pronouns. In
this fashion, J certainly makes the case that adoption of an analysis
which uses clitics may sometimes be an easy way out. However, he
admits that stipulations are required in order to make the more
rigorous approach work, and that both words and affixes must be
assigned degrees of typicality to capture their varied behaviour. This
is justified for J because a) the typicality scale is independently
necessary, and b) even when clitics are allowed as part of the
system, stipulation is still necessary. This raises the interesting
question of how this approach compares to that of A in chapter 2; I
return to this below.

In chapter 6, Zeshan (Z) addresses the issue of whether there is a
unit of organization in sign language (SL) which is equivalent to the
word. She suggests that signers recognize the sign as a salient unit
of language, and that equivalence between the sign and the word can
usefully be assumed. Z argues that the concept of p-word can also be
safely transferred, as signing has a temporal dimension. However, she
notes that the characteristics relevant to identifying units in this
organizational dimension of SLs have not yet been worked out. The
criteria D&A suggest for the identification of g-words are of limited
applicability to SLs, due to the nature of the languages. The elements
of a sign are produced simultaneously, therefore they must be
cohesive. But this property is due to the medium and cannot be taken
as evidence for grammatical status. Simultaneity also rules out order
as a criterion for g-words. However, D&A's third criterion,
conventionalized meaning, does apply. Z suggests that the property of
simultaneity minimizes the possibility of mismatches between the two
levels of organization in SLs: 'the question of word boundaries hardly
ever arises because each sign, simple or complex, is a self-contained
unit' (161). This does not rule out compounding and cliticization, and
Z presents evidence that both of these phenomena exist in SLs. More
serious challenges to standard concepts of the word come from the
properties of signs, according to Z. She discusses two problems which
arise from the gestural medium. Firstly, the two hands can act
independently making simultaneous signs (the signs are not truly
simultaneous, as Z emphasizes, rather one hand holds a position while
the other hand moves). How can we conceive of the relation between
p-words and g-words in this case? Z suggests (and I agree) that the
descriptive model used in the book is not adequate for this problem.
The second issue Z raises is that of iconicity. She argues that a
large part of the vocabulary of SLs is iconically motivated, that is,
the sign contains at least some component whose meaning is iconic.
Whilst phonaesthetic phenomena in spoken language can be treated as
marginal (but see Klamer 2002 for a recent view), iconically motivated
signs make up 50% of the vocabulary of some SLs and therefore must be
seen as a central part of these languages. The sub-category of iconic
signs which Z sees as most challenging to linguistic theory is the
class of signs which are partly iconic, that is, there is a sublexical
unit whose iconic motivation can be identified, but it cannot be
analyzed as a morpheme. Families of such signs can be identified with
meaningful elements in common but without morphological connections. Z
suggests that such SL phenomena throw in doubt central concepts in
linguistics, such as that of arbitrariness and double articulation,
and also many views of morphemes and phonemes.

The final chapter of the volume, by P.H. Matthews (M), is a summary and
a revisiting of the main issues raised. M takes the Latin grammarians'
view of the word as a starting point and argues that wordhood was not
problematic in Latin, but that the extension of the ancient
grammarians' views has been problematic. M then considers whether
linguists should take word as a universal concept, or a concept with
theoretical standing, and suggests that neither move is very useful.
He also revisits the issue of clitics, questioning how much different
linguists' uses of the term have in common, and concluding that it is
used whenever sentences cannot be divided into words and words into
roots and affixes. M closes by querying whether Z's doubts in chapter
6 about dual articulation as a feature of language are as serious as
she thinks, arguing that the difference in medium imposes different
requirements, and might be expected to result in different levels of
redundancy being necessary for effective communication.

All the papers in this volume are valuable and interesting. The
majority of them also present large quantities of fascinating data. On
that basis, the book will amply reward any linguist who reads it. But,
having said that, I want to express some reservations about the
structure of the book and about what it does and does not contain.

Firstly, the format of the volume clearly reflects its genesis. The
papers were presented at a very focused workshop, with the first
chapter serving as a guiding template for contributors. This has the
virtue of making comparison across the datasets straightforward. But I
don't think that it is accidental that the two most stimulating and
challenging contributions (those of Zeshan and Joseph) are the two
which do not take the first chapter as a road map. Taking a set of
assumptions and seeing how far they are useful when confronted with
varying data is a valuable exercise, but being made aware of where the
assumptions fall short and where alternative assumptions might be more
useful is even more valuable, and this element is not prominent in
most of the book.

Secondly, the title of the book (Word: a cross-linguistic typology)
does not match the contents well to my mind. What the book contains is
a framework which might be useful to construct a typology, and a large
quantity of wonderful data that would assist in testing a typology.
But the synthesizing overview that would justify the title is missing.
What types of language can be established on the basis of the approach
to words developed here? What factors correlate or interact, both at
the level of the word and in its relation to the wider language
system? These are the sort of questions to which I would hope to find
answers in a book with this title, but I did not.

Lastly, it is disappointing that the one issue on which (some)
contributors differ strongly could not be presented more as an
exchange of views. The opinions of Aikhenvald (and most of the other
authors, I suspect) and Joseph are strongly opposed on the status of
clitics. A was aware of J's contribution, but only mentions it very
briefly in her chapter and without addressing J's epistemological
arguments. As far as their view of the data is concerned, I think that
the two have a good deal in common: both would agree that in between
clear cases of words and clear cases of affixes there is a 'messy
reality' (J), and they agree that it is possible to impose some scalar
arrangement on that messy reality. But that still leaves the question
of whether to segment the mess in two places or only in one, and the
theoretical and methodological implications of this issue are
important. J justifies his position with one methodological stance,
and it would have strengthened the volume greatly if this challenge
had been taken up by A or another contributor.

The standard of editing of the volume is high, with only a few
typographical errors. There is, however, what I can only interpret as a
slip of the keyboard on p31, where D&A give a surprising version of
the hierarchy of phonological units.


Simon Musgrave is an Australian Research Council Postdoctoral fellow
at Monash University. He works in the project Endangered Maluku
Languages: Eastern Indonesia & the Dutch Diaspora. Besides
Austronesian languages, his interests include language typology,
constraint-based formalisms, language contact phenomena and data
management practice for linguists.
