Marcu, Daniel (2000) The Theory and Practice of Discourse Parsing and Summarization, The MIT Press, A Bradford Book, ISBN: 0-262-13372-5
Reviewed by Catalina Barbu, School of Humanities, Languages and European Studies, University of Wolverhampton, UK
It is widely acknowledged that discourse exhibits internal structure, which contributes to its cohesiveness and texture. It has also been argued that discourse structure can be used in a range of natural language processing applications, such as summarization, machine translation, natural language generation, and anaphora and coreference resolution. Although research has been conducted on the theory of discourse parsing, previous attempts at building automatic discourse parsers have failed. This book represents the first serious attempt at tackling the discourse parsing problem from both a theoretical and a practical point of view. In contrast with other attempts at deriving discourse structure (see [Kurohashi&Nagao 1994], [Asher&Lascarides 1993]), Marcu's model employs only surface-form methods for determining the discourse markers and the textual units, without the need for deep syntactic and semantic analysis; it also employs a shift-reduce model for constructing discourse structures, which comes closer to the way humans construct discourse trees than the incremental method proposed by other researchers ([Polanyi 1988], [Cristea&Webber 1997]).
The book is organised in three main parts: Part I describes the linguistic and mathematical theories behind discourse representation, Part II tackles the discourse parsing problem from a computational point of view, and Part III discusses the applications of discourse parsing in text summarisation.
Part I - Theoretical Foundations
This first part introduces the concept of discourse structure as a factor of cohesion in text. Most linguistic theories of discourse structure rely on the following assumptions: that text can be split into sequences of elementary units, that some units are more important than others, that discourse relations hold between units of different sizes, and that trees can be used to model the structure of a discourse. The author investigates the problem of discourse parsing from two points of view: linguistic and mathematical formalization. Chapter 2 describes one of the most popular theories of discourse structure, Rhetorical Structure Theory (RST), analysing the mechanism one employs to build a valid representation of a discourse. The author formulates two compositionality criteria for valid text structures, which explain the relationship between discourse relations that hold between large spans of text and discourse relations that hold between elementary discourse units. Chapter 3 introduces a mathematical formalization of valid discourse trees, expressed in the language of first-order logic. The author then proposes a proof theory that provides support for deriving the valid text structures. He proves that the proof theory is sound and complete with respect to the axiomatization of text structures.
Part II - The Rhetorical Parsing of Free Texts
This part concentrates on the problem of identifying the rhetorical relations that hold between two units of text. Two methods of rhetorical parsing are presented: one based on manually derived rules and one based on a machine learning approach.
1. The cue-phrase based rhetorical parsing algorithm
The first approach is based on the assumption that cue phrases can be used as a sufficiently accurate indicator of the boundaries between elementary textual units and of the rhetorical relations that hold between them. The design of the algorithm was informed by an extensive corpus analysis of cue phrases. The algorithm first identifies all the potential discourse markers of a text, then determines the elementary units of the text, and finally builds the valid discourse structures. Three levels of granularity are considered: sentence, paragraph and section. Rhetorical relations holding between elementary units are hypothesised on the basis of the corpus analysis of the cue phrases, and the trees for each level of granularity are built using one of the algorithms previously described. The final discourse trees are built by merging the trees corresponding to each level.
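The first step of the algorithm above, spotting potential discourse markers and hypothesising a relation for each, can be illustrated with a minimal sketch. The cue-phrase table and relation names below are hypothetical placeholders chosen for illustration; they are not taken from the book's corpus analysis, and the function name find_markers is likewise an assumption.

```python
import re

# Hypothetical cue-phrase table: the phrases and relation names are
# illustrative assumptions, not the inventory derived in the book.
CUE_PHRASES = {
    "although": "concession",
    "because": "cause",
    "but": "contrast",
}

# One alternation over all cue phrases, matched on word boundaries.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in CUE_PHRASES) + r")\b",
    re.IGNORECASE,
)

def find_markers(text):
    """Return (offset, cue phrase, hypothesised relation) for each match."""
    markers = []
    for match in _PATTERN.finditer(text):
        phrase = match.group(1).lower()
        markers.append((match.start(), phrase, CUE_PHRASES[phrase]))
    return markers

text = "Although the parser is simple, it works because cue phrases are informative."
for offset, phrase, relation in find_markers(text):
    print(offset, phrase, relation)
```

A surface-form matcher of this kind cannot tell discourse uses of a word from sentential ones, which is one reason a corpus analysis of each cue phrase is needed before the matches can be trusted.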
The ambiguity of discourse
Discourse is inherently ambiguous: more than one correct structure can usually be produced for a single text. One method of disambiguation is to give preference to trees that are skewed to the right, following the assumption that human readers tend to interpret new textual units as continuations of the topic of previous units. In practice, this preference is expressed as a weight associated with each tree, which grows with the development of the tree's right branches.
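The right-skew preference can be made concrete with a small sketch. The weight function below is one plausible formulation, assumed here for illustration rather than quoted from the book: a leaf weighs 0, and an internal node adds the difference between the depth of its right and left subtrees to the weights of its children. Trees are represented as nested pairs, with strings as leaves; this representation is also an assumption.

```python
def depth(tree):
    """Depth of a binary tree given as nested (left, right) pairs; leaves are strings."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return 1 + max(depth(left), depth(right))

def skew_weight(tree):
    """Assumed weight: larger when the right branches are more developed."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    return skew_weight(left) + skew_weight(right) + depth(right) - depth(left)

# Three candidate structures over the same four units.
right_branching = ("a", ("b", ("c", "d")))
left_branching = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

candidates = [left_branching, balanced, right_branching]
preferred = max(candidates, key=skew_weight)
print(preferred)
```

Under this assumed weighting the fully right-branching candidate is selected, the balanced tree is neutral, and the left-branching tree is penalised.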
Evaluation
The parser is evaluated against human-built trees and with respect to its suitability for text summarization. Precision and recall are calculated for each step in the building of the tree: identification of elementary units, spans, nuclearity and rhetorical relations. The results show that the parser's performance is consistently below that of humans. However, the author shows that it is still useful in text summarisation for selecting the most salient units of discourse.
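For readers unfamiliar with the measures, precision and recall at any one of these steps can be computed by comparing the set of items the parser proposes with the set found in the human-built tree. The sketch below uses made-up text spans, written as (first unit, last unit) pairs, purely as an illustration; the data are not from the book.

```python
def precision_recall(predicted, gold):
    """Precision and recall of a predicted item set against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Illustrative spans only: (first unit, last unit) pairs.
gold_spans = {(1, 1), (2, 2), (3, 3), (1, 2), (1, 3)}
parser_spans = {(1, 1), (2, 2), (2, 3), (1, 3)}

precision, recall = precision_recall(parser_spans, gold_spans)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The same function applies unchanged to elementary units, nuclearity assignments or relation labels, provided each is encoded as a hashable item.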
2. Rhetorical Parsing by means of automatically derived rules
The second method for discourse parsing presented in the book uses decision-tree classifiers for deriving the discourse structure of unrestricted texts. The corpus used for training contains 90 manually built discourse trees for texts extracted from 3 corpora. The first task is discourse segmentation. A C4.5 classifier is used for classifying lexemes as boundaries of sentences, elementary discourse units, parenthetical discourse units or non-boundaries. The features used in learning model both:
- the local context (characteristics of the lexemes surrounding the one under consideration: the part-of-speech tags of the lexemes in a window of size 5, the potential of a lexeme to be a discourse marker, an estimate of the likelihood that a lexeme is an abbreviation) and
- the global context (existence of certain punctuation marks before the estimated end of sentence, existence of a verb in the current unit).
The evaluation of the discourse segmenter shows impressive results, with an accuracy in the range of 92.4-97.87% (depending on the corpus). Sentence boundaries were identified with a precision of 98.55%, which is similar to that obtained by specialised sentence splitters. For modelling the parsing of discourse trees, a shift-reduce parsing model is employed, where elementary discourse trees are processed at each step, either by promoting them through a shift operation or by combining them through a reduce operation. One shift and six reduce operations are implemented to enable the derivation of any discourse tree. A C4.5 program is used for learning decision trees and rules that specify how discourse segments should be assembled into trees, i.e. what action is taken at each step in building the tree. The learning cases are generated by decomposing the training trees into sequences of shift-reduce actions and associating a learning case with each action.
Four classes of features are used for learning:
- structural features (relating to the structures of the trees in focus)
- lexical and syntactic features (regarding the lexemes delimiting the text span subsumed by the trees in focus)
- operational factors (regarding the operation previously performed on the trees in focus)
- semantic-similarity factors (similarity between the text segments subsumed by the trees in focus and similarity between words contained in the trees).
The functioning of the classifier is explained by examples from the MUC corpus. Unfortunately, the low results reported show that the parser in many cases fails to identify the elementary discourse units and the rhetorical relations holding between discourse segments.
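The decomposition of a training tree into shift-reduce actions, and the reverse step of rebuilding a tree from such a sequence, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single generic reduce action carrying a nuclearity label (NS, SN or NN) rather than the six reduce operations of the book, the tuple representation of trees is invented for the example, and the action sequence is given directly instead of being predicted by a learned classifier.

```python
def tree_to_actions(tree):
    """Decompose a tree into a post-order sequence of shift-reduce actions.

    Leaves are strings; internal nodes are (label, left, right) tuples,
    where label is an illustrative nuclearity tag such as "NS" or "NN".
    """
    if isinstance(tree, str):
        return ["SHIFT"]
    label, left, right = tree
    return tree_to_actions(left) + tree_to_actions(right) + ["REDUCE-" + label]

def apply_actions(units, actions):
    """Rebuild a tree by running the actions over the input units."""
    stack = []
    queue = list(units)
    for action in actions:
        if action == "SHIFT":
            # Promote the next elementary unit onto the stack.
            stack.append(queue.pop(0))
        else:
            # Combine the two topmost trees under the given label.
            label = action.split("-", 1)[1]
            right = stack.pop()
            left = stack.pop()
            stack.append((label, left, right))
    if len(stack) != 1 or queue:
        raise ValueError("action sequence does not yield a single tree")
    return stack[0]

training_tree = ("NS", "u1", ("NN", "u2", "u3"))
actions = tree_to_actions(training_tree)
print(actions)
rebuilt = apply_actions(["u1", "u2", "u3"], actions)
print(rebuilt == training_tree)
```

In a learned parser, each action in the sequence would become one training case, paired with features describing the stack and input at that point.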
Chapter 8 is a discussion of previous research on empirical discourse analysis. Apart from briefly reviewing work on discourse segmentation, cue-phrase disambiguation and the discourse function of cue phrases, the author refers to previous attempts at building discourse parsers, comparing them with the discourse parsers presented in the book. This is followed by a discussion of possible further developments that would allow the performance of the discourse parser to approach human performance levels.
Part III - Summarization
This part shows how discourse parsers such as those described previously in the book can be used for selecting the most salient units of discourse in order to produce summaries. The idea behind discourse-based summarizers, which has been hinted at previously, is that the nuclei of a discourse tree correlate with what human judges consider to be important in a text, and should therefore appear in a summary. The discourse parser provides a way of computing the importance of textual units algorithmically, by associating weights according to the depth in the discourse tree of the node where the unit first occurs as a promotion unit. The author evaluates the suitability of using the discourse structure for selecting the most important units in a text, reporting accuracies close to the level of human-constructed summaries. Furthermore, the performance of a summariser that uses the cue-phrase based discourse parser for building the discourse structure is evaluated. Interestingly, although the overall performance of the parser is quite low, the performance of the summarizer (which uses only part of the information supplied by the parser) is still close to human performance. In addition, the discourse-based summarizer outperforms two baseline models and the Microsoft Office summarizer. The following chapters describe ways of combining traditional indicators of textual importance, like word frequency, with indicators given by the discourse structure.
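The promotion-based weighting described above can be illustrated with a short sketch. It assumes a simplified tree representation invented for this example: leaves are unit names, and internal nodes are (label, left, right) tuples whose two-letter label marks each child as nucleus (N) or satellite (S). A node promotes the units promoted by its nucleus children, and each unit is scored by the shallowest depth at which it appears in a promotion set.

```python
def promotion_set(tree):
    """Units promoted to this node: the unit itself for a leaf, otherwise
    the union of the promotion sets of the nucleus children."""
    if isinstance(tree, str):
        return {tree}
    label, left, right = tree
    promoted = set()
    if label[0] == "N":
        promoted |= promotion_set(left)
    if label[1] == "N":
        promoted |= promotion_set(right)
    return promoted

def first_promotion_depth(tree, depth=0, result=None):
    """Map each unit to the shallowest depth at which it is a promotion unit."""
    if result is None:
        result = {}
    # Parents are visited before children, so the first depth recorded
    # for a unit is the shallowest one.
    for unit in promotion_set(tree):
        result.setdefault(unit, depth)
    if not isinstance(tree, str):
        _, left, right = tree
        first_promotion_depth(left, depth + 1, result)
        first_promotion_depth(right, depth + 1, result)
    return result

tree = ("NS", ("NS", "u1", "u2"), ("NN", "u3", "u4"))
depths = first_promotion_depth(tree)
# Units promoted closer to the root are treated as more important.
ranking = sorted(depths, key=lambda unit: (depths[unit], unit))
print(ranking)
```

A summary of a given length can then be formed by taking units from the front of the ranking and presenting them in their original text order.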
Bibliography
[Kurohashi&Nagao 1994] Sadao Kurohashi and Makoto Nagao. "Automatic detection of discourse structure by checking surface information in sentences". In Proceedings of the 15th International Conference on Computational Linguistics (Coling94), Kyoto, Japan, 1994
[Asher&Lascarides 1993] Alex Lascarides and Nicholas Asher. "Temporal interpretation, discourse relations and common sense entailment". Linguistics and Philosophy, 16(5), 1993
[Polanyi 1988] Livia Polanyi. "A formal model of the structure of discourse". Journal of Pragmatics, 12, 1988
[Cristea&Webber 1997] Dan Cristea and Bonnie Webber. "Expectations in incremental discourse processing". In Proceedings of ACL/EACL-97, Madrid, Spain, 1997
Catalina Barbu is a PhD student in Computational Linguistics at the University of Wolverhampton, UK. Her field of research is multilingual anaphora resolution.