Review of  Linguistic Evidence

Reviewer: Elke Gehweiler
Book Title: Linguistic Evidence
Book Author: Stephan Kepser Marga Reis
Publisher: De Gruyter Mouton
Linguistic Field(s): Applied Linguistics
Computational Linguistics
Discourse Analysis
Linguistic Theories
Text/Corpus Linguistics
Subject Language(s): Dutch
German, Old High
Issue Number: 17.1540

EDITORS: Kepser, Stephan; Reis, Marga
TITLE: Linguistic Evidence
SUBTITLE: Empirical, Theoretical and Computational Perspectives
SERIES: Studies in Generative Grammar 85
PUBLISHER: Mouton de Gruyter
YEAR: 2005

Elke Gehweiler, Freie Universität Berlin and Berlin-Brandenburgische
Akademie der Wissenschaften


The volume 'Linguistic Evidence', edited by Stephan Kepser and
Marga Reis, is based on the conference 'Linguistic Evidence.
Empirical, Theoretical, and Computational Perspectives' that took
place in Tübingen from January 29 to February 1, 2004. It contains a
short introduction by the editors and 26 papers.


The introduction discusses several issues related to linguistic
evidence. As the central objects of linguistic enquiry -- ''language,
languages, and the factors/mechanisms systematically (co-) governing
language acquisition, language processing, language use, and
language change'' (1) -- cannot be directly accessed, they have to be
reconstructed from the manifestations of linguistic behaviour. As there
are many possible data types, e.g. introspection, corpus data, data
from (psycho-) linguistic experiments, synchronic vs. diachronic data,
typological data, neurolinguistic data, data from first and second
language learning, data from language disorders, gaining linguistic
evidence from the potentially available data is no trivial matter.
Linguistic evidence is quite a new topic of linguistic discussion. Until
the mid nineties there were essentially two ways of gathering data:
generativists largely relied on introspective data, whereas
non-generative linguists relied on informally gathered corpus data. But this
has begun to change. The authors attribute this turning point to the
book by Schütze (1996), who demanded a systematic approach to
speaker judgements. Since then, many scholars have shown that it is
necessary to control the many factors that influence speaker
judgements in order to obtain more reliable data. Furthermore, the size
and availability of corpora have grown since the mid nineties, and with them
the importance of corpora as a source of evidence. Both
developments, Kepser/Reis claim, have paved the way for a
rapprochement between introspective and corpus linguistics and ''[i]t is
one of the main aims of this volume to overcome the corpus data
versus introspective data opposition and to argue for a view that
values and employs different types of linguistic evidence each in their
own right. Evidence involving different domains of data will shed
different, but altogether more, light on the issues under investigation,
be it that the various findings support each other, help with the correct
interpretation, or by contradicting each other, lead to factors or
influence so far overlooked. This ties in naturally with the fact ... that
there are more domains and sources of evidence that should be taken
into account than just corpus data and introspective data.'' (3).

In the first article 'Gradedness and Consistency in Grammaticality' Aria
Adli argues for graded grammaticality judgements. Adli criticises the
fact that in theoretical studies questionable introspective judgements
are quoted without prior empirical verification. One of the examples
Adli discusses in detail is the case of the 'que' --> 'qui' rule in French,
which is much cited in syntactic theorising. It essentially states that ''an
ECP [Empty Category Principle-EG] violation can be avoided in
French if 'qui' is used instead of the usual complementizer 'que' in
sentences where a wh-phrase has been extracted from the subject
position'' (7), and that there are clear differences in grammaticality
between such sentences with 'qui' and 'que'. Using data from a
controlled experiment with a graded concept of grammaticality Adli
shows that the 'que' --> 'qui' rule is largely a myth and suggests that
instead psycholinguistic factors are responsible for the differences in
(un)grammaticality of different sentence types containing these forms.

Katrin Axel's paper 'Null Subjects and Verb Placement in Old High
German' deals with Old High German (OHG) time and weather
expressions without the quasi-argument 'iz' ('it') and with constructions
where a referential subject is not overtly realised. Using three major
prose texts as her empirical basis, she shows that earlier OHG (8th
and 9th century) allowed genuine pro-drop and should therefore not
be classified as a semi pro-drop language. Her data show that null
subjects are (largely) restricted to root clauses in early OHG, which
are distinguished from subordinate clauses by the position of the finite
verb (verb-first/verb-second vs. sentence-final/sentence late). She
claims that this main/subordinate asymmetry can be accounted for if
we assume that null subjects are only licensed in post-finite position,
i.e. ''it is highly plausible that null subjects are only licensed in
configuration [sic] in which they are c-commanded by a leftward
moved finite verb: [V+AGR]k [pro ... tk]]. In OHG, the only way to
obtain the required configuration for null-subject licensing is verb
movement to C0'' (34). Axel further suggests that the distribution of
null subjects is influenced by morphological factors. In OHG there
were two alternative verb endings in the 1st person plural: a short '-m'
and a long '-mês'. Pronouns are virtually always overt with the short
variant but are frequently omitted with the long ending, though
only in post-finite position. Axel claims that although the Latinised
writing tradition may have had a certain impact, the widely-held
assumption that the omission of referential subject pronouns in earlier
OHG is a foreign feature cannot be upheld as it fails to explain why
null subjects were largely banned from pre-finite environments and
from contexts with 1st person plural endings in '-m'. Modern Standard
German does not allow referential pro-drop anymore, despite its
comparatively 'rich' verbal inflection. Referring to Sprouse and Vance
(1999), Axel argues that the replacement of null subjects by overt
pronouns need not be related to any grammar-internal changes, but
rather to differences in parsing success, based on the assumption that
utterances with null pronouns are more difficult to parse. Axel finally
argues that the case of the OHG null subjects puts into doubt the
assumed incompatibility of referential pro-drop and verb second.
Neither does it confirm the relation between morphological richness
and null subjects.

The authors of 'Beauty and the Beast: What Running a Broad-
Coverage Precision Grammar over the BNC Taught Us about the
Grammar - and the Corpus' (Timothy Baldwin, John Beavers, Emily M.
Bender, Dan Flickinger, Ara Kim, Stephan Oepen) argue for a hybrid
approach to grammar engineering (referring to Fillmore 1992). After
reviewing some of the arguments for and against corpus data and
introspective data, they present their methodology for building a
broad-coverage precision grammar. In a first step they apply the English
Resource Grammar (ERG) to a sample of the BNC. The grammar was
able to generate at least one parse for 57% of the sentences. The
43% that did not receive a parse were diagnosed and classified
manually. The authors distinguished seven categories of parsing
failure, which either represent gaps in the grammar (''missing lexical
entry'', ''missing construction'', ''fragment''), are due to preprocessing
errors or parser resource limitations, or represent noise
(''ungrammatical string'', ''extragrammatical string''). They then discuss
these categories further, and explain why the respective sentences
could not be parsed. Missing lexical entries for example fall into two
basic categories: missing lexical types for a given word token (e.g. the
grammar contains the noun 'table', but not the verb) and missing
multiword expressions. The authors argue that combining the two
sources of linguistic evidence - using corpora as primary source of
data, and enhancing and expanding that data with native speaker
judgments - can be of much use to grammar developers. The corpus
provides linguistic variety and authenticity, revealing new syntactic
constructions, which can then be analyzed with the grammar. Here,
insisting on a notion of grammaticality helps to recognise and
categorise the noise in the corpus. According to Baldwin et
al., ''precision grammar engineering serves both as a means of
linguistic hypothesis testing and as an effective way to bring new data
into the arena of syntactic theory'' (64).
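
The bookkeeping behind such a diagnosis can be sketched in a few lines of Python. This is only an illustration of the workflow described above, not the authors' tooling: the grouping of the seven categories into "grammar gap", "processing" and "noise" follows the review's summary, while the function name `summarise` and the toy sample of ten sentences are invented for the example.

```python
from collections import Counter

# Grouping of the seven failure categories as summarised in the review.
# The group labels themselves are assumptions made for this sketch.
FAILURE_GROUPS = {
    "missing lexical entry": "grammar gap",
    "missing construction": "grammar gap",
    "fragment": "grammar gap",
    "preprocessing error": "processing",
    "parser resource limitation": "processing",
    "ungrammatical string": "noise",
    "extragrammatical string": "noise",
}


def summarise(diagnoses, total_sentences):
    """Return (coverage, failures per group).

    diagnoses: one manually assigned category per unparsed sentence.
    total_sentences: size of the corpus sample, parsed or not.
    """
    unknown = set(diagnoses) - set(FAILURE_GROUPS)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    coverage = (total_sentences - len(diagnoses)) / total_sentences
    by_group = Counter(FAILURE_GROUPS[label] for label in diagnoses)
    return coverage, by_group


# Invented toy sample: 10 sentences, 4 of which received no parse.
toy_failures = [
    "missing lexical entry",
    "fragment",
    "ungrammatical string",
    "preprocessing error",
]
coverage, by_group = summarise(toy_failures, total_sentences=10)
print(f"coverage: {coverage:.0%}")
print(dict(by_group))
```

Only the "grammar gap" group points to work for the grammar developer; the "noise" group is what a notion of grammaticality allows one to set aside.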

In 'Seemingly Indefinite Definites' Greg Carlson and Rachel Shirley
Sussman use experimental and non-experimental methods to show
that there is a sub-class of English definite articles which in their
interpretations are similar to indefinite articles, such as 'the' in ''Mary
went to the store'', where the identity of the store is not especially
important, in contrast to 'the' in ''Mary went to the desk''. First, the
authors show that weak definites have the same distributional
properties as bare singular count nouns (''He was in bed''). They are
lexically restricted, i.e. it is a lexical feature of the noun itself that
determines whether it can function as a bare singular/weak definite,
they do not allow any modification, a certain degree of semantic
enrichment is added to them, they only co-occur with lexical items of
certain classes, and their distributional properties preclude application
of the usual tests for definiteness/indefiniteness. In the second part of
the paper Carlson/Sussman show that experimental evidence
supports the existence of a separate class of weak definites. For their
experiment they selected six nouns that often function as indefinite
definites and matched them with comparable regular definite nouns
(e.g. ''After she finishes her breakfast, Lydia will read the newspaper''
vs. ''the book''). Each noun was put into a sentence containing a verb
that was known to support the indefinite definite reading. For each
sentence pair a visual context was created which depicted the scene
just before the action depicted in the sentence is carried out. The
participants saw this scene on a computer screen, while they heard a
spoken version of the sentence. They then had to choose the item on
display that they thought was most likely to be involved in the
upcoming action. In addition, their eye movements were monitored while
they were listening to the sentence. Both target choice and eye
movement supported the existence of two separate classes of

Sonia Cyrino and Ruth Lopes ('Animacy as a Driving Cue in Change
and Acquisition in Brazilian Portuguese') use both diachronic data and
data from language acquisition to show that a feature that was
relevant for a change in Brazilian Portuguese is still operative in
language acquisition. Looking at historical data they first discuss the
grammatical change in object constructions where the 3rd person
neuter clitic 'o' is gradually replaced by a null element, leading to a
change in the grammar. They then go on to examine the present-day
acquisition of the null category, arguing that this shift became critical
for language acquisition, cuing a new grammar, and that it was the
semantic features of the antecedent that were the driving cue and
played a role in the acquisition of the object pronominal paradigm in
Brazilian Portuguese. The more general theoretical conclusions they
draw from this is that firstly, ''we may take cue-based theories
seriously and try to show how a cue can be operative after a change
occurred in a language, explaining the change itself'' (102), and
secondly, that this ''places some questions about acquisition proper
within the generative framework'' (102).

In 'Aspectual Coercion and On-line Processing: The Case of Iteration'
Sacha DeVelle discusses the phenomenon of iteration, which is a
prime example of aspectual coercion. Iteration ''describes the
encoding of a series of repetitions within a given situation'' (106). The
iterative interpretation is enhanced by the semantic punctual feature
of point action verbs ('jump'), which can reflect a single act ('dive') or
an iterative act ('knock'). Two studies (Piñango, Zurif, and Jackendoff
(1999), using a cross modal lexical decision (CMLD) interference task;
Todorova, Straub, Badecker, and Frank (2000), using a reading time
task) have shown that if a point action verb is combined with the
durational adverbials 'for' or 'until' (e.g. ''The girl dived in the pool for
five minutes'') there is an increased processing load, which is
demonstrated by longer reaction times and emerges at or just after
the durational adverbial. The authors of both studies argue that this is
evidence for an enriched compositional operation. DeVelle however
argues that the processing differences between activity verbs and
point action verbs may also be due to the sentence stimuli used in the
two studies. A repetition of Piñango et al.'s (1999) study showed one
significant difference from the original study: the point
action/durational adverbial sentence pairs were overall interpreted as
more difficult to understand and less plausible than their activity
sentence counterparts. DeVelle claims that this may have influenced
Piñango et al.'s findings.

Studies on child language acquisition have argued that the acquisition
of epistemic expressions begins between two-and-a-half and three
years of age, but that epistemic expressions remain very rare until 4;5
(year; month) or later. Experiments have however shown that the
linguistic epistemic system is not fully understood until the age of 8;0
or later, and that weak epistemic expressions like 'können'
or 'vielleicht' are still not understood by 6- and 7-year-olds. These
findings suggest that children understand (weak) epistemic terms
much later than they begin to use them. In 'Why Do Children Fail to
Understand Weak Epistemic Terms? An Experimental Study' Serge
Doitchinov presents the results of two experiments he has conducted
in order to find out whether children's late understanding of epistemic
terms is related to the development of their ability to understand
epistemic uncertainty (inference based hypothesis) or to their ability to
recognise scalar implicature (implicature based hypothesis). His first
experiment consisted of three tasks: (i) the 'modal expression task'
which investigated children's ability to understand weak epistemic
expressions correctly; (ii) the 'implicature task', to assess the
children's understanding of scalar implicatures; and (iii)
the 'interference task' which examined their ability to deal with
epistemic uncertainty. The second experiment was conducted to
further assess the children's ability to recognise scalar implicatures.
The results of the two experiments suggest that the acquisition of
epistemic terms depends on the development of children's ability to
understand epistemic uncertainty; this ability seems not yet fully
mastered by eight years of age. Doitchinov argues that younger
children's capacity to use weak epistemic terms is limited. They
probably first use weak epistemic terms only in very familiar situations -
this does not contradict previous claims. According to Doitchinov the
results however also suggest that they have difficulties in inferring
epistemic possibility, and that they occasionally overgeneralise the
use of strong epistemic terms in their talk.

Linguistic descriptions of negative polarity items agree that the
occurrence of polarity items is licensed by semantic and/or pragmatic
properties. Furthermore it was argued that a negative polarity item is
only licensed if it occurs in the scope of a negator (cf. e.g. Haegeman
(1) a. Kein Mann, der einen Bart hatte, war jemals glücklich.
       'No man who had a beard was ever happy'
    b. *Ein Mann, der einen Bart hatte, war jemals glücklich.
       'A man who had a beard was ever happy'
    c. *Ein Mann, der keinen Bart hatte, war jemals glücklich.
       'A man who had no beard was ever happy'

The paper 'Processing Negative Polarity Items: When Negation
Comes Through the Backdoor' by Heiner Drenhaus, Stefan Frisch and
Douglas Saddy presents the results of two psycholinguistic studies
(speeded acceptability judgment tasks and event-related brain
potentials (ERPs)). They used structures such as those in (1) to
examine the specific lexical properties of negative polarity items
like 'jemals' ('ever') and the licensing conditions that are due to
hierarchical constituency. Both experiments confirmed that there are
two licensing conditions for negative polarity items: the
semantic/pragmatic and the structural/syntactic condition. Both
experiments however also showed that violation with inaccessible
negation (1c) was more often accepted as correct than violation
without negation (1b), indicating that the negator is (wrongly) used to
license the polarity item even if it is not in a c-commanding position.
Drenhaus et al. claim that this might be due to a ''competition between
semantic/pragmatic information and hierarchical constituency'' (159),
but that further systematic investigations of polarity constructions are

Veronika Ehrich's paper 'Linguistic Constraints on the Acquisition of
Epistemic Modal Verbs' discusses constraints on the acquisition of
epistemic modal verbs (MVs) in German. Ehrich first gives a detailed
description of the relevant semantic and syntactic properties of
German MVs and reviews some of the main findings of MV-acquisition
research. She then compares the results of her corpus study to
different competing (psycho-) linguistic approaches to epistemicity in
language and language development. Ehrich concludes that syntactic
progress, semantic diversification and cognitive development are all
necessary prerequisites for the rise of epistemicity, but none of them
seems to be sufficient by itself.

In 'The Decathlon Model of Empirical Syntax' Sam Featherston
describes a new model of grammar, the 'Decathlon Model'.
Featherston has conducted studies on frequency (based on corpus
data) and studies on grammaticality (based on native speakers'
judgments, using a procedure which ''allowed informants to express all
the differences in ''naturalness'' that they perceive, with no coercion to
a given scale'' (189)). The grammaticality-judgment study has yielded
the following results: (i) judged well-formedness is a continuum - a cut-
off point between well-formed and not well-formed cannot be located,
(ii) each linguistic factor has an effect on well-formedness - more
violations cause a structure to be evaluated worse, and (iii) there are
no 'hard' constraints - no violation excludes a structure from the
grammar. The frequency data shows a different picture. Of the 16
structures tested in the judgments, one occurs once in the corpus (the
one judged second best), one occurs 14 times (the one judged best);
the remaining 14 structures do not occur at all. This shows that the
two data types are in fact not measuring the same factor and that
relative judgments say nothing about the probability of occurrence of a
structure. Featherston then introduces the Decathlon Model, which is
supposed to be both ''an outline architecture of a grammar and at the
same time an account of the differences between data types'' (196).
The Decathlon Model's 'Constraint Application' module ''applies
constraints, assigns violation costs, and outputs form/meaning pairs,
weighted with violation costs'' (197). These form/meaning pairs are
then sent to the 'Output Selection' module, which basically contains
the grammar and which selects the best candidate for output. The
existence of these two modules explains the different results for the
different data types: With judgments, what is returned is the output of
the Constraint Application function, whereas frequency measures
measure the output of the Output Selection module. Featherston then
goes on to discuss the advantages of the Decathlon Model over other
theories of syntax, the notion and the nature of well-formedness, and
the implications of his findings for the choice of data types in syntax.
Here he concludes that the data type for syntax must be relative
judgments: ''Frequency measures give us the same information as
relative judgments about the best (couple of) structural alternatives in
each comparison set, but they give us no information about any of the
others.'' (205) For syntactic theory this means that one has to choose
what one wants to model, as output selection and the grammar are
two separate processes.
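
The two-module architecture can be made concrete with a toy sketch in Python. It is a deliberately simplified reading of the review's summary, not Featherston's own formalisation: the constraint names, the violation costs and the three candidate structures are invented, and output selection is modelled as strictly winner-takes-all, whereas the corpus counts reported above (14 vs. 1 occurrence) suggest that the second-best candidate is occasionally produced as well.

```python
from dataclasses import dataclass

# Hypothetical constraints and violation costs; the review gives no numbers.
VIOLATION_COSTS = {"C_a": 1.0, "C_b": 2.5, "C_c": 0.5}


@dataclass(frozen=True)
class Candidate:
    form: str
    violations: tuple  # names of the constraints this structure violates


def constraint_application(candidates):
    """Module 1: weight every candidate with its cumulative violation cost."""
    return {
        c.form: sum(VIOLATION_COSTS[v] for v in c.violations)
        for c in candidates
    }


def output_selection(costs):
    """Module 2: competitively select the lowest-cost candidate for output."""
    return min(costs, key=costs.get)


candidates = [
    Candidate("structure 1", ()),
    Candidate("structure 2", ("C_c",)),
    Candidate("structure 3", ("C_a", "C_b")),
]

# Relative judgments would reflect the graded costs of all candidates ...
costs = constraint_application(candidates)
# ... whereas corpus frequency would reflect only what wins the competition.
winner = output_selection(costs)
print(costs)
print(winner)
```

In this sketch every candidate receives a graded cost, but only one survives selection, which mirrors the contrast between the continuum found in the judgment data and the near-absence of most structures from the corpus.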

In her paper 'Examining the Constraints on the Benefactive Alternation
by Using the World Wide Web as a Corpus' Christiane Fellbaum asks
whether data gathered from the web can give us new insights into
speakers' grammars and serve as evidence for linguistic theories. She
contrasts the constraints for the Benefactive alternation (consisting of
the PP alternant (''Chris bought a cake for Kim'') and the direct object
(DO) alternant (''Chris bought Kim a cake'')) that were formulated on
the basis of introspective data, with the data found on the web. Her
data show that the previously proposed constraints cannot fully
account for the data found on the web, although ''most data fall into
the kinds of patterns that previous researchers have suggested''
(237). For example, Fellbaum shows (i) that the DO alternant can occur with
verbs of destruction, (ii) that it does not necessarily require
a ''created/prepared/obtained entity that becomes the Beneficiary's
possession'' (222) as had been claimed by other scholars, and (iii)
that there is no ''Latinate Constraint'', i.e. there is ''no restriction on the
Benefactive alternation that can be formulated in terms of etymology
or morphophonological properties of the verb'' (225). She further
shows (iv) that restrictions concerning the Benefactive cannot be
formulated in terms of aspect, and (v) that the constraints that had
been formulated concerning the nominal arguments of the Benefactive
seem to be no ''hard'' constraints. Fellbaum argues that although web
data do not permit us to formulate any hard constraints, two
observations can be made: in the DO alternant, the subject has to
have control over the event, and, unlike in the PP alternant, ''a benefit
is necessarily bestowed, resulting in a change of state of the affected
entity, the Beneficiary'' (237). She concludes with the observation that
constructed data often fails to capture the fuzzy nature of real
constraints and argues that all those grammatical phenomena that
could previously only be studied using one's intuition should now be
re-examined using naturally occurring data, i.e. corpus data.

In 'A Quantitative Corpus Study of German Word Order Variation' Kris
Heylen attempts to overcome the limitations of ''traditional'' data
(introspection and ''encountered'' examples) by using a corpus-based
approach to study the word order variation in the German Mittelfeld.
Heylen first discusses the problems with traditional data types for
studying word order variation, arguing that they are unreliable and not
able to deal with gradient and multifactorial phenomena. He then
discusses the advantages of corpora over other data types and proposes
a corpus-based approach, arguing that (i) corpus data is primary data
in linguistics, (ii) corpora give us easy access to large amounts of
data, (iii) corpus-data reflects gradient effects through relative
frequencies, and (iv) multiple factors can be studied directly by looking
at actual usage data. Heylen then presents the results of a corpus-
based study on word order, where he has examined ''the variation that
occurs when both a full NP-subject and a pronominally realised object
are present in the Mittelfeld'' (244). He takes into account seven
factors that might influence word order, and, using various statistical
models, examines the correlations between word order and these
factors (for each factor separately and for multiple factors
simultaneously). Although his analysis shows that the seven factors
investigated can explain some of the variation (e.g. the strong effect of
clause-type: ''the 'marked' order subject-first is especially common in
subordinate clauses'' (261)), Heylen argues that additional factors
have to be tested in order to be able to fully account for the variation.
He concludes by arguing that the results of the study are ''not yet
explanations'' (261) and that in order to formulate an explanatory
model for the variation corpus-data alone may not be sufficient as it is
only ''part of a whole set of data types that are necessary for sound
empirical language research'' (261).

There are a number of statistical word similarity measures, which are
based on fundamentally different assumptions. The paper 'Which
Statistics Reflects Semantics? Rethinking Synonymy and Word
Similarity' by Derrick Higgins presents yet another model - local
context-information retrieval (LC-IR), which ''is based on web search
statistics regarding the frequency with which words appear adjacent to
one another'' (280). Higgins shows that LC-IR outperforms any other
purely statistical model and ascribes this to the fact that as it uses web
data there is no problem of data sparsity, and to the fact that it uses
the parallelism assumption, i.e. it ''predicts that similar words will occur
in grammatically parallel constructions'' (275). Other models, on the
other hand, are either based on the idea that similar words occur near
the same set of other words (the topicality assumption) or that words
occur near those words which are most similar to them (the proximity
assumption). Higgins goes on to discuss the implications his approach
may have for a theory of lexical semantics and acquisition, arguing for
example that grammatical parallelism is a cue used by language
learners to identify words as semantically similar or synonymous.
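
To give a rough idea of what an adjacency-based similarity score looks like, here is a minimal Python sketch. It should not be read as Higgins's LC-IR formula, which the review does not reproduce: the normalisation (the smaller of the two ordered adjacency counts, divided by the product of the word frequencies) is an assumption made for illustration, and a tiny in-memory token list stands in for web search hit counts.

```python
from collections import Counter


def adjacency_similarity(tokens, w1, w2):
    """Toy score: how often w1 and w2 occur directly next to each other,
    in both orders, relative to how frequent each word is on its own.

    Assumed form for illustration only, not the published LC-IR measure.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    if unigrams[w1] == 0 or unigrams[w2] == 0:
        return 0.0
    adjacent = min(bigrams[(w1, w2)], bigrams[(w2, w1)])
    return adjacent / (unigrams[w1] * unigrams[w2])


# Invented toy "corpus" standing in for web-scale counts.
tokens = "big large house large big dog small cat big large tree".split()

print(adjacency_similarity(tokens, "big", "large"))
print(adjacency_similarity(tokens, "big", "cat"))
```

Requiring adjacency in both orders is one simple way of favouring pairs that can swap places in the same slot, which is the intuition behind the parallelism assumption; with real web counts the data sparsity visible in this toy list would largely disappear.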

The paper 'Language Production Errors as Evidence for Language
Production Processes - the Frankfurt Corpora' (Annette Hohenberger,
Eva-Maria Waleschkowski) compares ''slips'' in German Sign
Language (DGS) to ''slips'' in spoken German in order to answer the
question ''which aspects of language production and monitoring are
modality-dependent and which are not'' (287). Using data from a DGS
corpus and a corpus of spoken German, as well as experimental data
from what they call ''the slip experiment'' to supplement the corpus
data, Hohenberger/Waleschkowski show that ''language processing is
basically modality independent'' (300), i.e. the fact that there are
identical types of slips in DGS and spoken German indicates
that ''producing speech and sign proceeds through the same planning
stages and involves the same computational vocabulary'' (300). The
observed differences in slip-types are argued to be related to
differences in information packaging strategies in DGS and spoken

The aim of Mary Aizawa Kato and Carlos Mioto's paper 'A Multi-
Evidence Study of European and Brazilian Portuguese wh-Questions'
is to compare contemporary European Portuguese (EP) and Brazilian
Portuguese (BP) wh-questions using equivalent written corpora as
well as speakers' intuition. They then aim to provide a theoretical
interpretation of the results, using Lightfoot's Principles and Parameters
(PP) model of language change (Lightfoot 1999) as their framework.
Their empirical research showed that there is an intersection of
licensed patterns in EP and BP, but that there are also differences.
Compared to what had been found in previous studies, their empirical
study revealed two facts: (i) ''spoken EP does not exhibit VS [verb-
subject - EG] order in non-cleft questions'' (316) and (ii) ''BP VS order
in non-cleft questions is not restricted to unaccusative verbs'' (316).
Kato/Mioto's most important theoretical conclusion is that the VS order
in EP wh-questions reflects the derivation of thetic sentences in

Gerard Kempen and Karin Harbusch ('The Relationship between
Grammaticality Ratings and Corpus Frequencies: A Case Study into
Word Order Variability in the Midfield of German Clauses') compare
the results of a graded grammaticality-study on word order in the
German Mittelfeld (Keller 2000) to data from two corpora. Keller had
found that none of the constraints (C1) Pronominal < Nominal, (C2)
Nominative < Non-nominative, and (C3) Dative < Accusative
are ''absolute'' in that their violation gave rise to extremely low
grammaticality judgments (C1 and C2 were found to have equal
strength, whereas C3 was very weak). If such constraints
were ''psychologically real'', it could be assumed, the differences in
acceptability would be reflected by different corpus frequencies.
Kempen/Harbusch however found that this is not the case: ''a
systematic discrepancy emerged between the frequency counts and
the grammaticality ratings'' (330). The argument orderings that were
rated average or low were absent from the corpora, i.e. ''the
grammaticality judgments tend to be more lenient than the corpus
data'' (337). The authors claim that this discrepancy exists because
what was rated in Keller's study was actually the discrepancy between
the to-be-judged argument ordering and the order(s) licensed by
the ''strict production-based linearization rule'', a mechanism which
yields equivalent output, i.e. ''the grammaticality ratings appear
sensitive to the number and seriousness of violations of the rule''
(342). There seems to be a critical value, the ''production threshold'',
which separates the grammaticality continuum. Structures with
grammaticality values above this threshold will occur in corpora with
moderate-to-high frequencies, all other structures will have zero or
very low frequencies.
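
The idea of a production threshold can be illustrated with a small Python sketch. The threshold value, the 0 to 1 rating scale and the three argument orders are all hypothetical; the review reports no such numbers, so the example only shows how a graded rating collapses into a coarse prediction about corpus frequency.

```python
# Hypothetical threshold on an arbitrary 0-1 grammaticality scale.
PRODUCTION_THRESHOLD = 0.7


def predicted_corpus_presence(rating, threshold=PRODUCTION_THRESHOLD):
    """Map a graded grammaticality rating to a coarse frequency prediction."""
    if rating > threshold:
        return "moderate-to-high"
    return "zero or very low"


# Invented ratings for three argument orders.
ratings = {"order A": 0.9, "order B": 0.6, "order C": 0.3}
predictions = {
    order: predicted_corpus_presence(rating)
    for order, rating in ratings.items()
}
print(predictions)
```

Orders B and C differ clearly in their ratings but receive the same frequency prediction, which is the sense in which the judgments are "more lenient" than the corpus data.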

In 'The Emergence of Productive Non-Medical '-itis': Corpus Evidence
and Qualitative Analysis' Anke Lüdeling and Stefan Evert use the
German suffix '-itis' to show that the problem of (morphological)
productivity can only be understood when different types of evidence -
quantitative and qualitative - are combined. Medical '-itis' is
rule-based, or categorial, and therefore fully productive. It originated
in medical contexts, where it means 'inflammation (of)'; it is bound and
combines with neoclassical elements denoting body parts
(e.g. ''Arthritis'' 'inflammation of the joints'). Non-medical '-itis' is
similarity-based, and difficult to characterise in categorial terms. Its
meaning can be described as 'doing too much of X'; Lüdeling/Evert
argue that it likely developed from medical '-itis', the meaning of which
was generalised to mean 'illness'. Their qualitative analysis of '-itis'
shows that there is evidence for two morphological processes
with different properties. Lüdeling/Evert then use corpus data to find
out (i) whether the two processes differ with respect to productivity - here
it could be expected that the productivity for the rule-based process
should be higher, and (ii) whether (and how) the productivity of each
process changes over time - here one would expect that ''the
established medical rule-based use of '-itis' does not change over
time, but non-medical '-itis', which is similarity-based and therefore
dependent on the stored examples, can show short-term qualitative
changes as well as changes in productivity'' (356f). They apply and
discuss different statistical models to test the synchronic and
diachronic productivity of both types of '-itis'. The quantitative
properties of the two processes however do not confirm the two initial
hypotheses, which leads Lüdeling/Evert to suggest that
''probably, morphological theory does not need to make a distinction
between rule-based and similarity-based processes'' (366).

Wiltrud Mihatsch's paper 'Experimental Data vs. Diachronic
Typological Data: Two Types of Evidence for Linguistic Relativity'
explores the interaction of perceptual and typological factors in lexical
change, comparing diachronic data (from a database containing paths
of lexical change in the domain of body parts in a sample of over 30
languages) with experimental data from the psycholinguistic literature.
Lucy (1992) and Imai/Gentner (1997) had found that ''the number
marking system may influence the categorisation of entities that are
ambiguous between a classification according to shape and one
according to substance with respect to their shape'' (373). Speakers
of languages with obligatory number marking (e.g. English) tend to
classify such objects according to shape, whereas speakers of languages
without obligatory number marking (e.g. Japanese) tend to classify them
according to material. Presupposing that ''lexical change reflects
fossilized categorization processes'' (375), i.e. that concepts are
always conceptualised via existing labels for other concepts and some
of these new concepts get lexicalised, Mihatsch looks at whether the
concepts of EYEBALL, EYELID, EYEBROW, and EYELASH, the words
for which tend to be less stable and change over time (in contrast to
e.g. HAIR, EYE, or SKIN), are conceptualised according to substance
or according to shape in different languages. EYEBALL is virtually
always named on the basis of round objects, whereas in the case of
EYELID, EYEBROW, and EYELASH there are different naming
strategies. EYEBROW and EYELASH, for example, can be
conceptualised on the basis of HAIR or WOOL, i.e. in terms of material
(mostly in languages without obligatory plural marking), but also via
their elongated, arc-like shape (in languages with obligatory plural
marking). The results indicate a very strong interaction between noun
type and conceptualisation, and therefore, according to Mihatsch,
point towards ''a moderate version of linguistic relativity'' (381).

In 'Reflexives and Pronouns in Picture Noun Phrases: Using Eye
Movements as a Source of Linguistic Evidence' Jeffrey T. Runner,
Rachel S. Sussman, and Michael K. Tanenhaus first show that native
speaker judgments on binding in picture NPs, i.e. noun phrases
headed by a ''representational'' noun such
as 'photograph', 'picture', 'film', are not solid. Reflexives in picture NPs
lacking a possessor may violate Binding Theory (BT) (e.g. ''John
knows that there is a picture of himself in the morning paper''). These
reflexives have been called logophors (cf. Reinhart/Reuland 1993),
i.e. ''reflexive noun phrases which are not ... subject to structural
Binding Theory, but rather are constrained at least in part by
discourse variables'' (395). Picture NPs with possessors appear to
show the complementary distribution predicted by BT, but two studies
by Keller and Asudeh (2001) have shown that native speakers
accepted reflexives and pronouns bound to the subject of the sentence
equally in examples like ''Hanna found Peter's picture of herself/her''.
The three authors then present the results of an experiment that
investigated the use of reflexives and pronouns in possessed picture
NPs. In the experiment participants had to work with a display and
three dolls, Ken, Harry, and Joe, which each had three pictures, one
of himself and one of each of the others. The participants were then
presented with potentially ambiguous instructions like ''Have Joe touch
Ken's picture of himself''. Thus, participants' target choice provided a
kind of judgment. ''If a participant chooses a picture indicating a
particular reading, this means that reading is acceptable or possible.''
(398) In addition to target choice the eye movements of the
participants were being monitored, to see which potential referents
were being considered by them. The authors found that ''pronouns in
picture NPs with possessors are constrained by Binding Theory and
that reflexives are not'' (403), and that ''instead these reflexives
behave like logophors'' (404). Runner et al. furthermore show that
BT ''cannot be viewed as an early filter that constrains the set of
potential referents'' (408) as BT-inappropriate referents were
considered early on in the processing for both reflexives and
pronouns. They conclude with two more general implications of their
study: (i) reflexives in picture NPs should all be treated as logophors,
and (ii) their experiment could serve as an example for other studies
that aim at complementing introspective data with psycholinguistic data.

Uli Sauerland, Jan Anderssen, and Kazuko Yatsushiro ('The Plural is
semantically unmarked') first show that the 'Strong Theory' of the
plural - the plural implies cardinality greater than one and is marked -
does not hold, and that there are many cases where ''the plural does
not mean the same as explicitly adding 'two or more''' (414) (consider
for example ''You're welcome to bring your children'' vs. ''You're
welcome to bring your two or more children''). Using evidence from
adult competence and from adult and child performance, the authors
instead argue for a 'Weak Theory' of the plural, which ''is
characterized by the assumption that the plural is not subject to an
inherent lexical restriction as the singular is'' (429). According to
Sauerland et al. the plural is rather subject to pragmatic comparison
with the singular, and can therefore not be used in most examples
where the singular is possible. Their findings, according to the
authors, imply (i) that ''semantic and morphological markedness need
to be distinguished'' (430), and (ii) ''that the interpretation of the plural
always involves an implicit comparison'' (430).

Tanja Schmid, Markus Bader, and Joseph Bayer present the results of
an experiment based on a questionnaire that compared German
infinitival non-coherent constructions, where the infinitival complement
forms an independent constituent which may be extraposed (e.g. ''...
dass Maria prahlt, alle Verwandten zu kennen'') and coherent
constructions, where the infinitival complement does not form an
independent constituent (e.g. ''*... dass Maria scheint, alle Verwandten
zu kennen''). Their paper 'Coherence - an Experimental Approach'
addresses the questions (i) whether experimental evidence verifies
the validity of their (non-) coherence-tests and the verb class
differences proposed in the literature, and (ii) what the factors are that
give rise to coherence. Four constructions - topicalisation of the verbal
complex, 'long' scrambling of a pronoun, 'long-distance' passive, and
wide scope of negation - were used as tests for coherence; two
configurations - extraposition of the infinitival complement, and narrow
scope of negation - were used to test non-coherence. The intraposed
construction, which is assumed to be structurally ambiguous (''... dass
Max mir [nur das Lexikon zu kaufen] empfohlen hat'' vs. ''... dass Max
mir nur das Lexikon [zu kaufen empfohlen hat]''), was tested, too.
Schmid et al. report the following findings: (i) their coherence tests can
be considered valid as the different results correlate significantly, (ii)
the ambiguous intraposed construction patterns with the coherence
tests, and (iii) there is evidence that verbs within a given class behave

In his paper 'Thinking About What We Are Asking Speakers to Do'
Carson T. Schütze argues that it is important to evaluate the status
and quality of the various types of linguistic evidence. Specifically he
asks whether the data obtained from ''naive'' speakers is reliable,
i.e. ''whether we are asking them to do things that they can
understand and are capable of doing, and whether we can be
confident that they are actually doing what we have asked of them''
(457). Schütze examines a number of case studies in detail, finding
that in particular experiments that ''address our questions of interest ...
directly'' (477), i.e. experiments where the linguist has a particular
hypothesis in mind, can yield questionable results. Schütze shows that
these ''bad'' results can have various causes: in one example the
instructions for the participants were unclear and inconsistent; in
other cases researchers did not take into account that certain
''scenarios'' evoked by their elicitation tests could influence the
results, or they failed to see that factors other than the ones tested
influenced the participants' answers. Schütze argues that these
shortcomings can be overcome by sticking ''as closely as possible to
the ways in which language is actually used for everyday purposes,
rather than contriving artificial unfamiliar tasks'' (477) and that
experiments that are used to gain direct information about underlying
linguistic knowledge have to be improved.

The question Augustin Speyer pursues in his paper 'A Prosodic Factor
for the Decline of Topicalisation in English' is whether there is a
connection between the loss of the verb-second constraint (V2) and
the decline of topicalisation - ''the movement of a non-subject
constituent to the left edge of a sentence'' (487) as in ''Beans, John
likes'' -, which both occurred at about the same time in the history of
English (starting between 1150 and 1250). The fact that pronouns
behave differently from full NPs (the use of pronouns in topicalised
sentences remains stable after a sharp drop after 1250 whereas the
use of full NPs gradually declines) suggests, according to
Speyer, ''that the connection might have something to do with one of the
properties that pronouns have, but not full noun phrases, or vice
versa'' (490). Speyer then goes on to discuss the pragmatic and
prosodic properties of topicalised sentences, and introduces a
constraint which he thinks might have caused the decline of
topicalisation, the 'Trochaic Requirement' (TR), which indicates
that ''some weak element ... between two accents is compulsory''
(494). In German topicalised constructions this constraint is naturally
fulfilled, due to V2 (''Bohnen hasst Maria''), but Present Day English
speakers have to either (i) insert an empty timing slot (after 'beans'
in ''Beans, John likes''), ''thus creating a dummy weak element'' (496),
or (ii) avoid topicalised constructions. Speyer argues that the TR
constraint also held in the history of English. As in the Middle English
Period V2-word order became more and more marked and was
therefore used less and less, speakers avoided ''accent clash'' by
avoiding topicalised constructions - the rate of topicalisations
decreased. This is confirmed by the fact that pronouns, which are
naturally weak elements, do not seem to be affected by the avoidance
of topicalisation.

There are three different analyses of coordination. The ''deletion
analysis'' (cf. e.g. Chomsky 1957) assumes that conjuncts are derived
via a deletion mechanism, e.g. ''[The man is carrying the ladder] and
[THE MAN IS CARRYING the bucket]'' (caps indicate deleted material).
In the ''phrasal analysis'', ''coordinate phrases ... are base-generated
directly by phrase structure rules'' (507), which either results in multi-
headed constructions (cf. e.g. Jackendoff 1977) or in analyses that
treat conjunctions as heads (cf. e.g. Kayne 1994). The ''node-sharing
analysis'' allows for three-dimensional syntactic-structures with single
nodes being shared by more than one phrase marker (cf. e.g.
Moltmann 1992). Using data from two comprehension studies in
agrammatism, and data from reading-time experiments, Ilona Steiner
('On the Syntax of DP Coordination: Combining Evidence from
Reading-Time Studies and Agrammatic Comprehension') aims at
finding out which of the three analyses is most plausible. The results
of the two comprehension studies in agrammatism allowed her to
discard the deletion approach; the reading-time data provided
evidence for the node-sharing analysis and allowed her to distinguish
between the phrasal analysis and the node-sharing analysis. Taken
together, both types of evidence indicated that the node-sharing
analysis is most plausible.

The paper 'Lexical Statistics and Lexical Processing: Semantic
Density, Information Complexity, Sex, and Irregularity in Dutch' by
Wieke M. Tabak, Robert Schreuder, and R. Harald Baayen combines
a survey of the distributional properties of regular and irregular verbs
in Dutch with an experimental lexical decision study, which
addressed the predictability of these properties for lexical processing
in reading. The authors established various factors with the help of
which the regularity of a verb can be predicted, e.g. lemma frequency,
family size, neighbourhood density, argument structures, auxiliaries,
inflectional entropy, noun-verb frequency ratio, spoken-written
frequency ratio. To test whether these systematic differences between
regular and irregular verbs are reflected in on-line processing, the
authors conducted a lexical decision study the results of which
challenge many previous hypotheses about regular vs. irregular
verbs. Tabak et al. for example found that error analysis and
response latencies pointed to a processing advantage for regulars. In
both analyses, this advantage was most prominent for past tense
forms. This finding challenges Pinker's model (1991, 1997), which
predicts that ''regulars should be more difficult to process than
irregulars, because regulars would require decomposition into stem
and affix in addition to lexical lookup, and therefore should elicit longer
instead of shorter latencies'' (550). The more general picture that,
according to Tabak et al., emerged from the study is ''that the
distinction between regular and irregular verbs is not a simple one.
Regulars and irregulars differ not only with respect to their formal
properties, but also with respect to their semantic properties and the
information structure of their inflectional paradigms'' (552). The
authors conclude that ''the fascinating and enigmatic phenomenon of
regularity and irregularity in the mental lexicon'' (552) requires further
research.

In his paper 'The Double Competence Hypothesis: Diachronic
Evidence' Helmut Weiß shows how the ''writing-competence'' that
underlies the production of historical texts (which are performance
data) can be modelled by combining two independently developed
approaches to theoretical and historical linguistics: the double
competence hypothesis (cf. e.g. Kroch 2001) - which assumes that
the competence underlying writing (''second order natural languages''
(N2)) is different from the competence underlying speaking (''first
order natural languages'' (N1)) since (i) it is acquired later and
independently of the latter, and (ii) it is functionally different - and the
hypothesis that there are several grades in languages' naturalness
(cf. e.g. Ferguson 1959), which assumes that in a monolingual speech
community the low variety (often a dialect) is acquired as native
language and spoken in everyday communication, whereas the high
variety is learned as second, non-native language, and only used in
writing and formal communication. In the 14th and 15th centuries,
when NHG started to evolve, the distance between these two
competences was still very great, whereas in the 19th and 20th
centuries, when NHG first became spoken and was acquired as native
language, the distance began to decrease. Weiß shows that
the ''mixed language'', which is characteristic of OHG texts, is a
consequence of a diglossic double competence, and ''that a historical
syntactic pattern can be analysed in three ways: as the output of (i)
the N1 competence, (ii) the N2 competence, or (iii) as a hybrid form''
(570). He concludes with the claim that in modern historical linguistics
combining quantitative and theoretical tools is ''the right and only way
to overcome the weaknesses of diachronic data in general and the
consequences of double competence'' (571).


Most papers in the volume 'Linguistic Evidence' address issues
concerning linguistic evidence in relation to specific linguistic
problems, using and combining various data types (experimental data
and corpus data are perhaps the most frequently used data types
here). The volume shows that the question of how to gain linguistic
evidence is (or should be) important for all linguists and that linguists
can only gain when they use more than one data type. Evidence
involving more than one type of data provides a different, but definitely
a more comprehensive perspective on a given linguistic phenomenon -
whether it confirms one's hypothesis, or whether it contradicts it.
There are only a few papers that explicitly address methodological and
theoretical questions concerning linguistic evidence (e.g. Featherston,
Kempen/Harbusch and Schütze), but as linguistic evidence is quite a
new topic of linguistic discussion it may well be hoped that we will get
more linguistic evidence-theory and -methodology in the near future.


Chomsky, Noam (1957) Syntactic Structures. The Hague: Mouton.

Ferguson, Charles (1959) 'Diglossia'. In: Word 15, 325-340.

Fillmore, Charles J. (1992) '''Corpus Linguistics'' or ''computer-aided
armchair linguistics'''. In: Jan Svartvik (ed.) Directions in Corpus
Linguistics: Proceedings of Nobel Symposium 82, Stockholm, 4-8
August, 1991. Berlin: de Gruyter, 35-60.

Haegeman, Liliane (1995) The Syntax of Negation [=Cambridge
Studies in Linguistics 75]. Cambridge: Cambridge University Press.

Imai, Mutsumi; Gentner, Deirdre (1997) 'A Cross-Linguistic Study of
Early Word Meaning: Universal Ontology and Linguistic Influence'. In:
Cognition 62, 169-200.

Kayne, Richard (1994) The Antisymmetry of Syntax. Cambridge, MA:
MIT Press.

Keller, Frank (2000) Gradience in grammar: Experimental and
computational aspects of degrees of grammaticality. PhD thesis.
University of Edinburgh.

Keller, Frank; Asudeh, Ash (2001) 'Constraints on linguistic
coreference: Structural vs. pragmatic factors'. In: Moore,
J.D./Stenning, K. (eds.) Proceedings of the 23rd Annual Conference
of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum.

Kroch, Anthony S. (2001) 'Syntactic Change'. In: Baltin, Mark/Collins,
Chris (eds.) The Handbook of Contemporary Syntactic Theory.
Oxford: Blackwell, 699-729.

Jackendoff, Ray (1977) X' Syntax. Cambridge, MA: MIT Press.

Lightfoot, David (1999) The Development of Language: acquisition,
change and evolution. Oxford: Blackwell.

Lucy, John A. (1992) Grammatical Categories and Cognition: A Case
Study of the Linguistic Relativity Hypothesis. [Studies in the social and
cultural foundations of language 13]. Cambridge: Cambridge
University Press.

Moltmann, Friederike (1992) Coordination and Comparatives.
Cambridge, MA: MIT Press.

Piñango, Maria; Zurif, Edgar; Jackendoff, Ray (1999) 'Real-time
processing implications at the syntax-semantics interface'. In: Journal
of Psycholinguistic Research 28 (4), 395-414.

Pinker, Steven (1991) 'Rules of language'. In: Science 253, 530-535.

Pinker, Steven (1997) 'Words and rules in the human brain'. In:
Nature 387, 547-548.

Reinhart, Tanya; Reuland, Eric (1993) 'Reflexivity'. In: Linguistic
Inquiry 24, 657-720.

Schütze, Carson T. (1996) The Empirical Basis of Linguistics:
Grammaticality Judgments and Linguistic Methodology. Chicago:
University of Chicago Press.

Sprouse, Rex; Vance, Barbara (1999) 'An explanation for the decline
of null pronouns in certain Germanic and Romance languages'. In:
DeGraff, Michael (ed.). Language Creation and Language Change:
Creolization, Diachrony and Development. Cambridge, MA: MIT Press,

Todorova, Marina; Straub, Kathleen; Badecker, William; Frank, Robert
(2000) 'Aspectual coercion and the on-line computation of sentential
aspect'. In: Proceedings of the twenty-second annual conference of
the Cognitive Science Society. Philadelphia, PA.

Elke Gehweiler is a research associate in the project Collocations in
the German Language at the Berlin-Brandenburgische Akademie der
Wissenschaften, Berlin, Germany, and in a project on
grammaticalization at the Freie Universität Berlin, where she is
currently preparing her Ph.D. thesis on the grammaticalization of
adjectives in English and German.

Format: Hardback
ISBN: 3110183129
ISBN-13: N/A
Pages: viii, 584
Prices: U.S. $ 132.30