Review of Anaphora Resolution

Book Title: Anaphora Resolution
Book Author: Ruslan Mitkov
Publisher: Pearson Linguistics
Linguistic Field(s): Sociolinguistics
Issue Number: 14.2512

Date: Mon, 22 Sep 2003 17:26:32 +0200
From: Peter Kühnlein
Subject: Anaphora Resolution

Mitkov, Ruslan (2002) Anaphora Resolution, Longman, Studies in
Language and Linguistics.

Peter Kühnlein, SFB 360, Univ. Bielefeld

The book is suitable for everybody interested in the topic
of anaphora resolution, "including ... researchers,
lecturers, students and NLP software developers", as the
preface correctly announces. The book is divided into an
introduction and nine chapters, which will be treated in
turn below. A detailed index at the end of the book contains
all the central keywords.

The layout of the book is as follows: after an introduction
to the linguistic fundamentals (Ch.1) concerning anaphora,
some difficulties for automatic (i.e., computer based)
resolution are highlighted in Ch.2. Ch.3 reviews briefly a
number of theories and formalisms that are related to
anaphora resolution. A historical overview (Ch.4), spanning
the 60s, 70s and 80s, is followed by a discussion of what
the author calls the main trends in recent anaphora
resolution (Ch.5). The role of corpora in that area is
treated in Ch.6, while Ch.7 is devoted to the explication of
the author's own approach, characterized as a robust,
knowledge-poor algorithm. A chapter on evaluation (Ch.8) and
one on outstanding issues (Ch.9) close the book. Each
chapter is closed by a summary and endnotes. The chapters
are to some degree self-contained, so that it is sufficient
to read only specific parts of the book because the details
needed are repeated. Chs.1-4 are introductory and suitable
for students without previous knowledge. Chs.5-8 contain a
discussion of recent work on the subject. Ch.9 gives an
impression of future research that will have to be done.
From Ch.3 onward, the focus is put on the resolution of
nominal anaphora.

Ch.1, which is devoted to the introduction of linguistic
fundamentals, is a good primer for students who have to
start from scratch. Cohesion, co-reference, and the notion
of a discourse entity are related to various forms of
anaphora. Mitkov frequently quotes examples from different
languages to support his claims. Intra- and extra-sentential
anaphora are distinguished, followed by a short discussion
of indirect anaphora. A distinction is drawn between
identity-of-sense and identity-of-reference anaphora. Types
of antecedents are introduced and the effects of their
different locations discussed. Anaphora are related to other
linguistic phenomena (cataphora, deixis, ambiguity). The
question of when anaphora are resolved in human language
processors is touched upon.

Ch.2 contains a discussion of different sources of knowledge
that have to be drawn upon by automatic anaphora resolution
and relates them to the linguistic basics that were
introduced in the first chapter. The knowledge sources
mentioned are: morphology, lexicon, syntax, semantics,
discourse and common-sense knowledge. Tools and resources
that are needed to implement the introduced anaphora
resolution factors are listed.

Ch.3 introduces some of the theories that have been used in
anaphora resolution, mainly Centering Theory, Binding
Theory, the work on focus done by Grosz and Sidner in the
70s, and Discourse Representation Theory. Here, the
discussion of Centering Theory and Binding Theory gives a
good overview of some of the recent developments. The
discussion of "other related work" introduces only the
fundamentals of the respective theories, e.g., Kamp &
Reyle's (1993) basic framework.

Ch.4 is intended as a historical excursion. This comes as
something of a surprise, as the author pointed out in his
preface that he would not cover work prior to 1986 in
detail, referring instead to Hirst's (1981) book "Anaphora in
Natural Language Understanding" and Carter's "Interpreting
Anaphora in Natural Language" (1987). Nevertheless, this chapter
revisits earlier work in at least some detail. As a
concise description of past developments, the
chapter is certainly of use to an impatient student. The
chapter covers STUDENT, SHRDLU, LUNAR, Hobbs's algorithm,
BFP, SPAR, as well as distributed architectures as suggested
by Rich & LuperFoy and Carbonell & Brown. A section on
other work briefly summarizes alternative solutions. In
total, the work done in the early period of automatic
anaphora resolution is characterized as dominated by
knowledge-rich, i.e., costly, strategies.

Chapters 1 - 4 are evidently intended as introductions to
the respective topics. The following chapters 5 - 8 present
the state of the art in anaphora resolution.

Ch.5, in contrast to Ch.4, deals with present-day research,
which is oriented toward knowledge-poor and
corpus-based work. The first section identifies the main
trends in present research, the following sections elaborate
on that work. Here, the book follows a mixed strategy,
presenting the approaches partly according to themes
("Collocation patterns-based approach", etc.), partly
according to researchers (Lappin and Leass, etc.). The
relevant algorithms are explained, and the evaluations,
wherever possible, discussed.

Ch.6 motivates the use of corpora in anaphora resolution and
surveys recent corpora that are appropriately annotated. The
survey is followed by an overview of annotation schemes that
are in use (UCREL, MUC, DRAMA, Bruneseaux & Romary, Poesio &
Vieira, MATE, Tutin, Rocha, Botley). The use of each
in tagging texts is exemplified. The author adds a
comparison of tools that are available or prospective for
the task of actually tagging texts. They comprise XANADU,
DTTool, Alembic Workbench, Referee, CLinkA, FAST, the tools
to be implemented in the ATLAS group, and a set of tools
that has been suggested by Day. He discusses the
necessity of settling on an adequate annotation strategy and
gives some examples of resulting coding guidelines. The
chapter ends in a discussion of the topic of inter-annotator
agreement and respective measures.

In Ch.7 the author presents his own algorithm, which he
describes as robust and knowledge-poor. The domain for which
the algorithm is developed is that of hardware and
software manuals. The presentation is made in two broad steps.
First, the "original" algorithm is introduced and
discussed at some length. The pre-processing of the data and
the strategy for anaphora resolution are presented, and a
description of the algorithm, an example and an evaluation are
given. As indicators for candidates for antecedents, (i) a
class of Indicating Verbs is defined; (ii) lexical
reiteration counts as an indicator; (iii) NPs in section
headings are given a bonus; (iv) collocation patterns are
matched; (v) NPs in coordinate constructions are assigned
higher plausibility; (vi) in certain ("sequential")
constructions, primacy counts as an indicator; (vii) for the
domain of manuals, indefiniteness is counted against a
candidate, as well as (viii) the status of being a
prepositional noun phrase.
The algorithm always identifies a single antecedent
as the most plausible candidate, which accounts for
its robustness. The chapter describes modifications
for the treatment of anaphora in multiple languages
and the corresponding evaluations. Mitkov stresses the
point that a bilingual implementation of his algorithm is
superior to a monolingual one and surveys the work done here.
In the second step, a modified version of
the resolver, called MARS, is introduced. MARS is a fully
automatic implementation of an improved version of the
original resolver, where "fully automatic" means that there
is no human intervention at any stage of the resolution of
the anaphora. MARS uses a Functional Dependency Grammar
parser as a pre-processing tool. As for indicators, three more
are used than in the original implementation: (ix) pronouns
are allowed as possible antecedents and given a bonus; (x)
syntactic parallelism is awarded a boosting score; (xi)
frequent candidates are preferred. The chapter describes the
algorithm which uses these indicators, and a genetic
optimization algorithm is introduced that leads to an
improvement in performance. The resolver is then evaluated
according to different criteria, e.g., with and without
optimization by the genetic algorithm. As with the previous
implementation, a version for non-English anaphora
resolution is described, this time for Bulgarian, together
with its evaluation.
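
To make the indicator idea concrete, here is a minimal sketch
(not taken from the book) of how such a scoring scheme might
look. The indicator names follow the list (i)-(viii) above; the
weights, the Candidate class and the tie-breaking rule (prefer
the most recent NP) are assumptions made purely for
illustration, since this review does not report the actual
scores.

```python
from dataclasses import dataclass, field

# Hypothetical weights, invented for illustration only. Positive values
# boost a candidate, negative values count against it.
WEIGHTS = {
    "indicating_verb": 1,
    "lexical_reiteration": 1,
    "section_heading": 1,
    "collocation_match": 1,
    "coordinate_np": 1,
    "sequential_primacy": 1,
    "indefinite": -1,
    "prepositional_np": -1,
}


@dataclass
class Candidate:
    text: str
    position: int  # order of the NP in the text; larger = closer to the pronoun
    indicators: set = field(default_factory=set)


def score(candidate):
    """Sum the weights of all indicators that fire for this candidate."""
    return sum(WEIGHTS[name] for name in candidate.indicators)


def resolve(candidates):
    """Return exactly one antecedent: the best-scoring candidate.

    Ties are broken in favour of the most recent NP (an assumption),
    so a single answer is always produced.
    """
    if not candidates:
        raise ValueError("no candidate antecedents")
    return max(candidates, key=lambda c: (score(c), c.position))


# Invented example for a pronoun in a printer manual.
candidates = [
    Candidate("the printer", 0, {"section_heading", "lexical_reiteration"}),
    Candidate("a cable", 1, {"indefinite"}),
    Candidate("the tray", 2, {"prepositional_np"}),
]
print(resolve(candidates).text)
```

With these invented weights, "the printer" scores 2 and the
other two candidates score -1 each, so it is selected. Because
resolve() always returns the maximum, it commits to a single
antecedent even when the evidence is weak.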

Ch.8 contains a discussion of evaluation in anaphora
resolution. A distinction is drawn between the evaluation of
the resolution algorithm and of the system as a whole. A
number of measures are introduced and their applicability to
algorithm and system is discussed. The measures are introduced in
contrast to earlier proposals by Aone and Bennett (1995) and
Baldwin (1997). For evaluation of the algorithm, "success
rate", "critical success rate" and "non-trivial success
rate" are distinguished. They are proposed for measuring the
performance of the algorithm. Once this is achieved,
comparative evaluations are envisaged (and indeed a couple
of comparisons made). Finally, the possibility of establishing
the "decision power" or "relative importance" of individual
components of the algorithm is discussed. The measures
"(non-trivial/critical) success rate" then are applied to
the resolution system as a whole.
Additionally, "resolution etiquette" is proposed as an
indicator for the efficiency of determining non-nominal
anaphora. The topic of reliability of an evaluation in the
context of anaphora resolution is discussed. The author
proposes an evaluation workbench (which has already been
implemented by Catalina Barbu) as a tool for "fair"
evaluation. At the end of this chapter, other work on
evaluation is surveyed.
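
The three algorithm-level measures can be illustrated with a
short sketch. The definitions used here are assumptions, as
this review does not spell them out: "success rate" is taken as
the share of all anaphors resolved correctly, "non-trivial
success rate" restricts this to anaphors with more than one
candidate antecedent, and "critical success rate" restricts it
further to anaphors that still have more than one candidate
after gender and number filtering. The Outcome class and the
sample data are invented.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    correct: bool           # was the anaphor resolved correctly?
    n_candidates: int       # candidate antecedents before any filtering
    n_after_agreement: int  # candidates left after gender/number filtering


def rate(outcomes):
    """Fraction of correct outcomes, or None if there is nothing to count."""
    outcomes = list(outcomes)
    if not outcomes:
        return None
    return sum(o.correct for o in outcomes) / len(outcomes)


def success_rate(outcomes):
    return rate(outcomes)


def non_trivial_success_rate(outcomes):
    # Only anaphors with a real choice between several candidates.
    return rate(o for o in outcomes if o.n_candidates > 1)


def critical_success_rate(outcomes):
    # Only anaphors that agreement filtering alone cannot settle.
    return rate(o for o in outcomes if o.n_after_agreement > 1)


# Invented sample of four resolved anaphors.
outcomes = [
    Outcome(True, 1, 1),
    Outcome(True, 3, 1),
    Outcome(False, 4, 2),
    Outcome(True, 2, 2),
]
print(success_rate(outcomes))
print(non_trivial_success_rate(outcomes))
print(critical_success_rate(outcomes))
```

On this invented sample the rates are 3 of 4, 2 of 3 and 1 of
2, which shows how the stricter measures discount the easy
cases that inflate the plain success rate.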

The previous chapters 1 - 8 are presentations of previous
and recent work. The last chapter, Ch.9, concentrates on
outstanding issues.

Ch.9 briefly summarizes central topics of the book. This is followed
by a detailed discussion of three issues the author views as central
for the future development of the field: research in the factors that
are used by resolution algorithms, improvement of pre-processing, and
the need for annotated corpora. A number of other outstanding issues
are raised in the last section of the book. The author hints at the
freely available material that can be obtained from his project's URL
(which has meanwhile changed to


The chapters are to some extent self-contained, i.e., most
of the knowledge one needs to comprehend each chapter is
introduced right there. This has both the advantage that
each chapter (and, in some cases - e.g., the surveys of
implementations in Ch.4 - each section within a chapter) can
be read in isolation and the disadvantage of being redundant
to the same degree.

As to the layout of the book, it is somewhat strange that
Ch.8 on evaluation does not precede Ch.7, which introduces
Mitkov's own approach and puts much emphasis on evaluation.
Given the degree of self-containedness of the chapters,
however, there is no actual loss of readability.

With regard to future work in the area of automatic anaphora
resolution, it could be added that more diverse domains
would be desirable than those which are currently treated.
I would like to take an extreme example: There are
difficulties in, e.g., spoken language that never occur
in written text like manuals. Here, it would be interesting
to see which strategy would have to be pursued in order to come to
grips with multi-speaker sequences such as the following
taken from our own corpus. (aX) marks utterance X from
speaker a, (bY) utterance Y from speaker b.

(a1) Well, now you take
(b1) a bolt
(a2) an orange one with a slit
(b2) yes
(a3) and you put it through there
(b3) from above
(a4) from above
so that the three get fixed then
(b4) yes

One of the problems with this short stretch of discourse is
that the utterances in the dialog do not consist of complete
sentences or illocutionary acts. So many of the indicators
that have been used in the accounts that are detailed in the
book cannot be applied in this case. But note that the
pronoun "it" in a3 has either "a bolt" (b1) or "an orange
one..." (a2) as an antecedent. Another interesting feature
of the sample discourse is the connection between a3 and b3,
which is clearly anaphoric in that b wants to know in which
direction the bolt has to be put through some hole in a bar.
(This, of course, is not a case of NP anaphora.) At the
moment, it seems clear that resolving anaphora in spoken
language would be too big a task.

The desire for more diversity by no means diminishes the
value of the book under review. To modify a quote from the
book that is intended to exemplify a case of anaphora:
"|The book| is not merely a survey of anaphora resolution:
|it| also presents the latest research by the author." I
would add: "Every library should have a copy of |it|."

ABOUT THE REVIEWER Peter Kühnlein is a PhD student at the University of Bielefeld. His main interests are theories of reference and multi-modal dialog as well as philosophy of language and philosophy of science. He is a research assistant at the collaborative research center SFB 360.