Title:
|
From Extracts to Abstracts: Human summary production operations for computer-aided summarisation
|
|
Author:
|
Laura Hasler
|
|
Degree Awarded:
|
University of Wolverhampton, School of Humanities, Languages and Social Sciences
|
|
Degree Date:
|
2007
|
|
Linguistic Subfield(s):
|
Computational Linguistics
Text/Corpus Linguistics
|
|
Director(s):
|
Ruslan Mitkov
Constantin Orasan
Michael Hoey
|
|
|
Abstract:
|
|
This thesis is concerned with the field of computer-aided summarisation,
which has emerged at the confluence of the separate but related fields of
human and automatic summarisation. Because the readability and coherence of
automatically produced extracts are poor, computer-aided summarisation (CAS)
is a viable working alternative to fully automatic summarisation. CAS allows
a human summariser to post-edit
automatically produced extracts to improve their readability and coherence.
In order to best utilise the concept of computer-aided summarisation,
reliable ways of improving the coherence and readability of extracts when
transforming them into abstracts must be established.
To achieve this, a corpus-based analysis of the operations a human
summariser applies to extracts to transform them into abstracts is
presented. The corpus developed here is a corpus of pairs of news texts
annotated for important information (i.e., human-produced extracts) and the
human-produced abstracts corresponding to these extracts. The creation of
this corpus simulates the computer-aided summarisation process to enable a
reliable investigation into the operations used. A detailed classification
of human summary production operations is proposed, with examples which
highlight the common linguistic realisations and functions of the
operations identified in the corpus. The classification is then used as a
basis for guidelines which can be given to users of computer-aided
summarisation systems in order to ensure that the summaries they produce
are of a consistently high quality.
The human summary production operations are applied to extracts using the
guidelines in order to evaluate them. Evaluation is performed using a
metric developed for Centering Theory, a discourse theory of local
coherence and salience, which constitutes a new evaluation method. This is
appropriate because existing methods of evaluating summaries are
unsuitable. A set of automatic and human-produced extracts and their
corresponding abstracts is evaluated, and a comparison is made with
evaluations given by a human judge. The evaluation shows that when the
operations are applied to extracts using the guidelines, there is an
improvement in the readability and coherence of the resulting abstracts.
Page Updated: 27-Nov-2009
