Review of Polysemy

Reviewer: Eleni Koutsomitopoulou
Book Title: Polysemy
Book Author: Claudia Leacock Yael Ravin
Publisher: Oxford University Press
Linguistic Field(s): Computational Linguistics
Linguistic Theories
Book Announcement: 14.2534

Date: Tue, 23 Sep 2003 02:27:57 -0700 (PDT)
From: Eleni Koutsomitopoulou
Subject: Polysemy: Theoretical and Computational Approaches (paperback ed.)

Ravin, Yael and Leacock, Claudia, ed. (2002, paperback ed., 1st ed.
2000) Polysemy: Theoretical and Computational Approaches. Oxford
University Press.

Eleni Koutsomitopoulou, Georgetown University, Washington DC,
and LexisNexis Butterworths Tolley London, UK.


This book is a broad survey of the issue of polysemy in theoretical and
computational linguistics. It is a collection of 11 papers including an
overview of the subject by Ravin & Leacock.

What each paper is about: Each paper in this edition sheds light on a
different aspect of this multifarious issue. The theoretical approaches
deal with the issue of polysemy as part of semantics (see the papers by
Pustejovsky and Dowty), cognitive semantics (Fillmore & Atkins, and
Goddard), discourse (Cruse) and grammar (Fellbaum). The computational
approaches cover almost the entire spectrum of computational
methodologies: from lexical solutions à la WordNet, to NLP and

Ravin & Leacock's overview is a thorough survey of the issue and a
preliminary introduction to the various approaches that are presented
in the book. For instance, in the editors' review, polysemy is
discussed vis-à-vis homonymy and indeterminacy. Also discussed is the
role of context in sense disambiguation, as well as the various
underlying (formal as well as cognitive semantic) theories of meaning
and computational practices for word sense disambiguation.

Cruse's paper focuses on the role of context in polysemy, degrees of
word dependency on context, and semantic discontinuity and
"distinctness". Words that stand alone semantically (enough to be
relatively unaffected by context) are called "discrete words", whereas
words of lower "semantic density" are more easily affected and defined
by context. Polysemy in this paper is explored from a lexical-semantic
point of view as the result of a "wide spectrum of possibilities for
context-dependency" for individual words. The paper is a great
typology of context-word relationships with plenty of examples. An
interesting ramification of the lexical-semantic perspective is that
antonyms and hyponyms cannot assert context-independent meaning, or,
worse, that there is no such thing as an absolute hyponymic or absolute
antonymic sense or term. Cruse jokingly calls this new realization about
word meaning "soft semantics", which is definitely on a par with
structuralism and perhaps even formalism. At the same time, Cruse also
appears pessimistic about prototype theories of meaning, as prototypes
are again representations of a chaotically behaving system of word

Fellbaum discusses "autotroponymy" (from the Greek "tropos", meaning
"manner"), which she treats as a form of polysemy, in the English verb
and noun systems.
She argues that in English some verbs refer to specific ways/manners of
performing actions denoted by other verbs ("stammering" for "talking",
"sneaking" for "walking" etc). She points out that the "manner"
relation between verbs is highly polysemous in the English verb system
when compared, for instance, to the semantic relation of causative
verbs to the corresponding inchoatives (John opened the door. The door
opened.) This paper is a typology of changes in syntactic behavior in
alignment with the various meanings of polysemous verb and noun forms
(The kids behaved. vs. The kids behaved badly.). An interesting aspect
of this study is that it examines polysemy/autotroponymy as the
conflation between a "semantically specified sense" and its "more
general superordinate". The troponyms (i.e. the polysemous terms)
differ from their homophonous superordinates in their syntactic
arguments. They also differ from their co-troponyms either in their
syntactic properties or in their particular lexicalization ways (or
both). For instance, compare "behave" (semantically specific sense) with
"behave well/badly/etc." (superordinate/troponym) and "be a good/bad/etc.
boy" (co-troponym).

Pustejovsky zooms in on the issue of argument structure vis-à-vis polysemy
within his generative lexicon theory. He argues that the known
phenomenon of lexical shadowing typically occurring in the case of
cognate object verbs such as "butter" (butter the bread) and "dance"
(dance a dance) also shows up in other classes of verbs such as those
noted in Fillmore and Atkins (1992) where the expression of an argument
completely shadows the expression of another argument to the verb (risk
my health/life -- risk illness/death).

Pustejovsky also discusses various types of relations as denoted by
verb argument structures, such as "containment relation" ((in a) book,
(on a) disc etc.) and "complex relation" (read the book, read the
articles, read the articles in the book, read the book of articles).
Since polysemy from this point of view refers to the semantic nuances
that are due to the presence and configuration of arguments in the
argument structure, Pustejovsky also proposes a typology of
"optionality" of arguments, which defines the types of arguments that
are optionally expressed in a predicate. The article includes a general
overview of the basic premises of the author's theoretical framework.
Although this is a highly technical discussion that presupposes a fair
amount of familiarity on the reader's part with Pustejovsky's theory of
the Generative Lexicon (1991, 1995), the article is relatively simple
conceptually and the points it makes are well known in the literature.

Fillmore and Atkins provide a lexicographic analysis of word sense
variety by examining the contents and structure of four British English
language dictionaries (CIDE, COBUILD, LDOCE, OALD). They make the point
that the number of different senses corresponding to a unique term in
actual corpora far exceeds the number of sense variations pinpointed in
the dictionaries. Also absent from the dictionaries, according to
Fillmore & Atkins, are metaphorical senses of terms. This study also
includes crosslinguistic data, examining "matching senses" of a term
through its equivalents in bilingual corpora. They close by criticizing
traditional lexical-semantic approaches to word sense disambiguation and
proposing the methods of word sense analysis of the Berkeley FrameNet project
(See general info about the project at:

Dowty casts doubt on the traditional view, which he calls the "fallacy
of argument alternation". According to this fallacy, differing
constructions (syntactic forms) may express identical intended meanings
and correspond to identical propositions, which is taken as an argument
for the universal nature of semantic structure in natural language.
Dowty instead points
out that syntactic permutations serve to convey significant semantic or
conceptual variations, and hence they should not be discounted in the
name of propositional equivalence. To prove his point he examines a
number of argument permutation phenomena such as passivization, tough-
construction, middle construction, raising etc. In particular, he
focuses on comparing constructions such as the intransitive "swarm"-
alternation (Bees swarm in the garden. The garden swarms with bees.)
and the transitive "spray-load"-alternation (Mary sprayed paint on the
wall. Mary sprayed the wall with paint. Mary loaded hay onto the truck.
Mary loaded the truck with hay.) from Fillmore 1968. After presenting
the superficial commonality between these two different types of
constructions, Dowty argues that they are fundamentally different and
focuses on the former. The author goes as far as to claim that the
intransitive "swarm"-alternation is a phenomenon of semantic extension,
and offers some pertinent historical linguistic evidence from German
and French.

Goddard is a proponent of Wierzbicka's Natural Semantic
Metalanguage theory (NSM). He points out the capacity of NSM theory to
tackle both word-level and syntactic-level polysemy. The entire theory
is based on the notion of semantic primes that supposedly safeguard the
lexicon from obscurity and circularity in lexical sense definition.
Substitution is one of the tests for the validity of periphrases used
to express alternate meanings corresponding to a unique term in the
lexicon. The paper claims to offer a "semantic methodology" for lexical
definition and consequently for polysemy. The paper makes the
interesting point that grammatical constructions may also manifest
polysemy, and it proposes a treatment for figurative language (within
the same NSM framework) in relation to polysemy. A drawback of the NSM
approach seems to be that meaning is treated as a tractable phenomenon
and hence it is considered "accessible, concrete, and determinate", a
perception that classical meaning typologies have repeatedly failed to
prove true.

In computational linguistics, the treatment of polysemy falls into the
class of issues that are tackled under the term "word sense
disambiguation". Unlike their theoretical counterparts, the
computational approaches are more interested in developing
efficient methods for word sense disambiguation than in justifying
the various historical, stylistic and theoretical issues surrounding

Miller & Leacock focus on lexical representations for sentence
processing. They argue that what is missing from dictionaries and
semantic theories is a "satisfactory treatment of the lexical aspects
of sentence processing". They reduce this problem to an examination of
various methods for a more efficient representation of context. "Local
context" is defined primarily by the syntactic categories of a term,
i.e. the noun category of contexts, the verb category of contexts etc.
Some terms may belong to more than one syntactic category and hence to
more than one local context. Simple rule-based systems may address
this issue. Miller & Leacock recognize the role of semantic information
in determining the local context of a term's sense, and the fact that
semantic information is not always present in the local context. For
this reason they define a broader or "topical context". Topical context
is defined as the general topic of a text or discourse, and the same
term may mean different things under different topics.

For instance, consider the different meanings of "shot" in
marksmanship, in a chat with a bartender, or a photographer, in a
hospital, or in the context of a game of golf or basketball. The basic
hypothesis of the authors is that if the linguistic context provides a
clue about the primary discourse topic we can easily decide on the
intended meaning of "shot" in the particular linguistic context. They
then proceed to discuss how people determine the topic of a discourse,
and present some theories that determine the topic based on a statistical
classification of the vocabularies and sub-vocabularies of a polysemous
word in a discourse (although initial attempts have been applied to
homonymous terms such as "crane" and "bass"). The problem is then that
polysemy allows for finer distinctions between senses than homonymy
does (for instance, "bass" involves not only a distinction between fish
and deep voice but also between deep voice and the man who has it, the
lowest frequencies in musical harmony, a bass horn or a bass violin and
so on). In other words, in the case of polysemy the information of
topical context alone may not always be sufficient.

Additional experimental comparison of three different statistical
classifiers (a Bayesian classifier, a content-vector classifier and a
back-propagation neural network) showed that as the number of different
senses of a term increases, so does the difficulty the algorithm has in
making accurate distinctions between them. In addition, some contexts
seem to be inherently harder to identify than others. Compared to
humans the three tested classifiers performed at about the same level
of accuracy. In addition, topical information proved to be useful
when the polysemous terms were presented in sentences rather than in
the context of co-occurring terms. Combined local and topical
information methods may yield better results, but still not as good as
those obtained in human comprehension tests. The authors suggest that
research in sentence processing, in particular on argument structure and
coreference, would help elucidate sense disambiguation issues.
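
To make the idea of topical context concrete, here is a minimal sketch of a Bayesian (naive Bayes) sense classifier for "shot". It is only an illustration under stated assumptions: the training sentences, the sense labels and the add-one smoothing are invented for this example and are not the data or the exact model evaluated by Miller & Leacock.

```python
import math
from collections import Counter, defaultdict

# Toy, invented training data: (sense label, words of the surrounding context).
TRAINING = [
    ("firearm", "the hunter fired a shot from his rifle".split()),
    ("firearm", "a single shot hit the target at the range".split()),
    ("drink", "the bartender poured a shot of whiskey".split()),
    ("drink", "she ordered a shot of tequila at the bar".split()),
    ("injection", "the nurse gave him a flu shot at the hospital".split()),
    ("injection", "the doctor recommended a tetanus shot".split()),
]

def train(examples):
    """Count how often each word occurs in the contexts of each sense."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def classify(context, model):
    """Return the sense with the highest log-probability given the context words."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense in sorted(sense_counts):
        score = math.log(sense_counts[sense] / total)  # prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][word] + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train(TRAINING)
print(classify("he drank a shot at the bar".split(), model))
```

With so little data the topical words ("bar", "nurse") carry the whole decision, which also illustrates the reviewer's point: once senses are more finely divided than the topics are, topical evidence alone stops being enough.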

Stevenson and Wilks are concerned with polysemy (or Word Sense
Disambiguation, WSD) in large corpora. They point out in particular that
evaluation methods for WSD are usually based on small trial selections
of text rather than on large corpora, which leaves the generality of
results and performance in doubt. Another problem with current
approaches that Stevenson and Wilks point out is the increased chance
of meeting novel word senses in large corpora, senses not yet
lexicalized in existing dictionaries. Finally, the authors of this
paper recognize that most research in NLP may use different ways of
encoding or conceptualizing information, but in the case of WSD the
variety of tools and techniques applied seems to be taken as
representing different types of WSD information themselves.
The above three issues render WSD a hard problem to solve.

For their experiments the authors used the machine-readable version of
the LDOCE dictionary in order to make use of both a large-scale
inventory of senses and a broad knowledge base for sense
disambiguation. In the process of analyzing the lexical knowledge
sources they faced the question of what counts as context, which they
answered by selecting "larger linguistic structures", such as sentences
and/or entire discourses, that offer the pertinent topics. For their
experiments they also focused on the issue of combining various
knowledge sources, and they used a "memory-based learning algorithm"
that provided a filter that removed senses from consideration, thereby
simplifying the WSD tasks, and also made use of various partial taggers
which use different knowledge sources from the lexicon in order to
suggest a set of possible senses for an ambiguous term. For the
evaluation of their experimental results they merged a WordNet list of
manually tagged content words with the ontological hierarchy of the
LDOCE dictionary, which they used as a "gold standard" of texts.

The authors conclude that both high-level (word-level) and fine-grained
(sense-level) WSD can be achieved with over 90% accuracy, with the
high-level WSD tests obtaining the higher accuracy of the two.
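
The general architecture described above (a filter that removes senses from consideration, followed by partial taggers that each suggest senses from a different knowledge source) can be sketched roughly as follows. This is a hypothetical illustration only: the sense inventory, cue words and topic vocabularies are invented, and simple majority voting is swapped in where Stevenson and Wilks use a memory-based learning algorithm.

```python
from collections import Counter

# Hypothetical mini sense inventory for "bank": sense id -> (part of speech, cue words).
# The cue words stand in for dictionary-derived knowledge; they are invented here.
SENSES = {
    "bank_n_money": ("noun", {"money", "loan", "account"}),
    "bank_n_river": ("noun", {"river", "water", "shore"}),
    "bank_v_tilt": ("verb", {"plane", "pilot", "turn"}),
}
# Hypothetical topic vocabulary per sense: a second, independent knowledge source.
TOPICS = {
    "bank_n_money": {"interest", "loan", "deposit"},
    "bank_n_river": {"fishing", "river", "boat"},
    "bank_v_tilt": {"flight", "runway", "pilot"},
}

def pos_filter(observed_pos):
    """Filter step: keep only senses compatible with the tagged part of speech."""
    return {s for s, (pos, _) in SENSES.items() if pos == observed_pos}

def cue_tagger(candidates, context):
    """Partial tagger 1: propose senses whose cue words occur in the context."""
    return {s for s in candidates if SENSES[s][1] & context}

def topic_tagger(candidates, context):
    """Partial tagger 2: propose senses whose topic vocabulary occurs in the context."""
    return {s for s in candidates if TOPICS[s] & context}

def disambiguate(context_words, observed_pos):
    """Filter the senses, collect one vote per tagger proposal, return the top sense."""
    context = set(context_words)
    candidates = pos_filter(observed_pos)
    votes = Counter()
    for tagger in (cue_tagger, topic_tagger):
        votes.update(tagger(candidates, context))
    if not votes:
        return None  # no tagger proposed anything
    top = max(votes.values())
    # Ties are broken alphabetically so the result is deterministic.
    return min(s for s, v in votes.items() if v == top)

print(disambiguate("she took out a loan at the bank".split(), "noun"))
```

The filter does the cheap work first (a verb reading is never considered once the word is tagged as a noun), so the taggers only have to choose among the senses that remain.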

Dolan, Vanderwende and Richardson present MindNet, part of MS-NLP,
which is a "broad-coverage, application-agnostic" NLP system developed
by Microsoft Research. MindNet "provides the representation
capabilities needed to capture sense modulation". Acquisition of new
senses and new words is also possible via MindNet. Context is crucial
in MindNet, and understanding the meaning of a term amounts to
"producing a response that has been tied to linguistically similar
occurrences of that word." The system learns by example (it is
characterized as a "highly processed example base"). Inferencing via
structured representations is also possible. These representations are
"directed labelled graphs" that help overcome the limitations of word
order and take advantage of hierarchical relationships outside the
realms of syntactic relations (e.g. in order to show the indirect
relationship between "car" and "truck", they use the graph:
car -Hypernym-> vehicle <-Hypernym- truck, where "car" and "truck" are
connected by virtue of their relationship to the same hypernym
"vehicle"). The paths between the terms are weighted in a way that
reflects their salience. A known current weakness of MindNet is that
it is a static representation of relations with fixed weights that
depend on the current associations the system contains. This means that
anything beyond the level of individual words and at best sentences
(for instance, inter-sentential relations and hence context and
discourse) lies beyond the capabilities of MindNet.
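
The kind of directed labelled graph described here can be illustrated with a small sketch. It is not MindNet's implementation: the edge list, the weights and the choice to score a path as the product of its edge weights are all assumptions made for this example.

```python
from collections import deque

# Hypothetical labelled, weighted edges: (source, relation, target, weight).
EDGES = [
    ("car", "Hypernym", "vehicle", 0.9),
    ("truck", "Hypernym", "vehicle", 0.8),
    ("car", "Part", "wheel", 0.6),
    ("vehicle", "Purpose", "transport", 0.7),
]

def neighbours(node):
    """Yield (next_node, step_label, weight), following edges in either direction."""
    for src, rel, dst, weight in EDGES:
        if src == node:
            yield dst, f"-{rel}->", weight
        elif dst == node:
            yield src, f"<-{rel}-", weight

def find_path(start, goal):
    """Breadth-first search for a shortest path; returns (rendered path, path weight)."""
    queue = deque([(start, start, 1.0)])
    seen = {start}
    while queue:
        node, rendered, weight = queue.popleft()
        if node == goal:
            return rendered, weight
        for nxt, label, w in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                # Assumption: a path's weight is the product of its edge weights.
                queue.append((nxt, f"{rendered} {label} {nxt}", weight * w))
    return None  # the two words are not connected

path, weight = find_path("car", "truck")
print(path)
```

Because edges may be traversed against their direction, "car" and "truck" are linked through the shared hypernym even though neither points at the other, which is the indirect relationship the authors' example is meant to show.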

Schütze's paper offers us a glimpse of the phenomenon of polysemy from
a connectionist point of view. The author reminds us that models such
as those of Rumelhart et al. 1986 and McClelland et al. 1986 aim
first of all to design a disambiguation algorithm that is psychologically
plausible and also applicable on a large scale. The author explains
the notion of semantic priming ("flower" will be read more quickly
after the sentence "They held the rose" is presented than after a
sentence like "They all rose.", which contains a homonymous term) for
sentence processing and connectionist methods for disambiguation. He then
proceeds to explain word vectors, context vectors and sense vectors of
activations in his proposed algorithm. He concludes that similarity in
contexts is a crucial factor for determining word-level similarity and
hence a reliable guide for grouping (clustering) and disambiguation of
word-level senses. This paper presents promising work on polysemy in a
manner that is psychologically plausible, but it fails to view polysemy
as a generalized phenomenon affecting natural language not only at
the word level but also at the level of sentences and discourses.
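
A rough sketch may help to show how word vectors, context vectors and sense vectors relate to one another. It rests on several assumptions: the three-dimensional word vectors are invented (in practice they would be derived from co-occurrence counts), a context vector is taken to be the sum of the vectors of the words in the context, and a plain two-cluster k-means with fixed seeds is swapped in for whatever clustering procedure the paper itself uses.

```python
import math

# Hypothetical word vectors; real ones would come from corpus co-occurrence counts.
WORD_VECTORS = {
    "garden": (3.0, 0.0, 1.0),
    "petal": (4.0, 0.0, 0.0),
    "thorn": (3.0, 1.0, 0.0),
    "prices": (0.0, 4.0, 1.0),
    "sharply": (0.0, 3.0, 0.0),
    "stocks": (1.0, 4.0, 0.0),
}

def context_vector(words):
    """Context vector: sum of the word vectors of the known words in the context."""
    total = (0.0, 0.0, 0.0)
    for word in words:
        if word in WORD_VECTORS:
            total = tuple(a + b for a, b in zip(total, WORD_VECTORS[word]))
    return total

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))

def induce_senses(contexts, iterations=5):
    """Cluster context vectors into two groups; each centroid is a sense vector."""
    vectors = [context_vector(c) for c in contexts]
    senses = [vectors[0], vectors[-1]]  # fixed seeds keep the sketch deterministic
    for _ in range(iterations):
        groups = [[], []]
        for v in vectors:
            nearest = max(range(2), key=lambda i: cosine(v, senses[i]))
            groups[nearest].append(v)
        senses = [centroid(g) if g else s for g, s in zip(groups, senses)]
    return senses

def disambiguate(words, senses):
    """Return the index of the sense vector closest to the new context."""
    v = context_vector(words)
    return max(range(len(senses)), key=lambda i: cosine(v, senses[i]))

# Unlabelled contexts of the ambiguous word "rose" (the word itself is left out).
CONTEXTS = [
    ["garden", "petal"], ["thorn", "garden"], ["petal", "thorn"],
    ["prices", "sharply"], ["stocks", "prices"], ["stocks", "sharply"],
]
senses = induce_senses(CONTEXTS)
print(disambiguate(["garden", "thorn"], senses), disambiguate(["stocks", "sharply"], senses))
```

Note that the senses here are induced, not named: the procedure only says that two contexts belong to the same group, which matches the author's conclusion that similarity of contexts is the guide for grouping word senses.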

Finally, each paper comes with a wealth of bibliographical references
pertinent to the particular model and strategy of analysis.


The book contains a wealth of useful information (principles, data
configurations, methods, strategies and viewpoints) on the manifold
problem of word sense ambiguity.

The theoretical linguistics papers focus on offering a typology of
linguistic data, which they examine and then group in order to
pinpoint the apparent regularities in them. At times (as in
Pustejovsky's work) theoretical approaches additionally offer a formal
descriptive representation of the regularities in the data. The
undoubted merit of such a theoretical approach to polysemy lies in the
precision of the description and analysis of a phenomenon in natural
language that is otherwise neither very systematic nor very
homogeneous. The obvious defect of such an approach lies in the nature
of polysemy in natural language. Rules cannot adequately describe
future behavior or presently undetected patterns in the data, since new
rules need to be invented to encompass new data. In addition, rules
have no explanatory power or value, and describing natural language
phenomena such as polysemy in formal rules offers no real understanding
of the way language works.

The computational linguistics papers in this book focus on the
applicability of various methods and tools proposed from various
theoretical and computational sources, and they bring the pertinent
issues of performance and evaluation to the fore. Word sense
disambiguation traditionally has been examined within the realms of
word-level analysis, lexica, corpora, thesauri, knowledge bases and
related tools and representations for paradigmatic relations. Most
researchers in polysemy point out the obvious inadequacy of this
computational approach, i.e. that it fails to take into account crucial
factors such as (linguistic and pragmatic) context, and instead is
tied to a word-level partial solution of the problem. Computational
systems and theories that incorporate disambiguation efforts as part of
a broader tool set are usually more successful for this reason. In such
systems, broader linguistic context is taken into account during the
disambiguation process.

ABOUT THE REVIEWER Eleni Koutsomitopoulou is a PhD candidate in Computational Linguistics at Georgetown University (Washington DC) and a senior Indexing Analyst at LexisNexis Butterworths Tolley in London, Great Britain, where she currently lives. Her main research interests include neural network models for Natural Language Processing, cognitive linguistics, indexing and pattern recognition applications for natural language.

Format: Paperback
ISBN: 0199250863
ISBN-13: N/A
Pages: 240 pp.
Prices: U.S. $ 24.95