Tue Dec 14 2004

Diss: Computational Ling: De Boni: 'A relevance-based..'

Date: 13-Dec-2004
From: Marco De Boni
Program: Department of Computer Science
Dissertation Status: Completed
Degree Date: 2004

Author: Marco De Boni

Dissertation Title: A relevance-based theoretical foundation for question answering

Dissertation URL:

Linguistic Field(s): Applied Linguistics
                            Computational Linguistics

Dissertation Director:
Suresh Manandhar

Dissertation Abstract:

While there is a very large amount of written information available in
electronic format, there is no easy way to automatically find a reliable
answer to simple questions such as "Who is the president of the US?".
Research on Question Answering (QA) systems addresses this problem by trying
to find a method for answering a question by searching for a precise
response in a collection of documents. Current systems are no more than
prototypes, and, while there is agreement amongst researchers on the
generic aim of QA systems, little work has been done on clarifying the
problem beyond the establishment of a standard evaluation framework. There
is consequently a significant lack of theoretical understanding of QA
systems and a considerable amount of confusion about their aims and evaluation.

This thesis addresses the need for a theoretical investigation into QA
systems by employing the notion of relevance to clarify their purpose and
elucidate their constituent structure, showing how the theory developed can
be applied in practice.

Initially we examine the concept of answerhood as it applies to open-domain QA
systems and we argue that there are limits as to what can be considered an
answer to a question. In order to understand the nature of these limits we
examine the concept of relevance, showing that to talk about an answer is
really to speak about the relevance of that answer in relation to a
question; we maintain that it is misleading to talk about absolutely
correct or incorrect answers: we should instead be referring to answers
which are more or less relevant to a question. We then analyse the concept
of relevance, illustrating how it is composed of semantic relevance,
dealing with the relationship in meaning between question and answer;
goal-directed relevance, dealing with questioner and answerer goals;
logical relevance, dealing with the more formal relationship which
considers whether an answer provides the information which the question
sought; and morphic relevance, dealing with the form an answer takes in
relation to a question.

From the notion of relevance we build a model of QA systems which
illustrates the constraints under which they operate: we show how an answer
is constrained by the questioner's and the answerer's prior knowledge, goals,
rules of inference, and answer form preferences, as well as by the questioner's
and the answerer's approach to giving relevance judgements from the point of
view of semantic, goal-directed, logical, morphic and overall relevance.

We then illustrate how the framework can be used to improve current
TREC-style QA systems by considering each component of relevance
individually and implementing that component starting from a “standard”
TREC-style QA system, YorkQA.

