LINGUIST List 12.688

Mon Mar 12 2001

Review: Reiter & Dale, Building NL Generation Systems

Editor for this issue: Terence Langendoen <terry@linguistlist.org>

What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in.

If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Simin Karimi at simin@linguistlist.org or Terry Langendoen at terry@linguistlist.org.

Directory

Kornél R. Bangha, review: Reiter, Ehud, and Robert Dale (2000) Building Natural Language Generation Systems

Message 1: review: Reiter, Ehud, and Robert Dale (2000) Building Natural Language Generation Systems

Date: Mon, 12 Mar 2001 14:52:42 -0500
From: Kornél R. Bangha <banghak@MAGELLAN.UMontreal.CA>
Subject: review: Reiter, Ehud, and Robert Dale (2000) Building Natural Language Generation Systems

Reiter, Ehud, and Robert Dale (2000) Building Natural Language Generation Systems, Cambridge University Press, 248pp.

Reviewed by: Kornel Bangha, University of Montreal

SYNOPSIS

This book explains how to build Natural Language Generation (NLG) systems - computer software systems which use techniques from artificial intelligence and computational linguistics to automatically generate understandable texts in English or other human languages. Typically starting from some non-linguistic representation of information as input, NLG systems use knowledge about language and the application domain to automatically produce documents, reports, explanations, help messages, and other kinds of texts. The book is based on one particular architectural decomposition of the NLG task, which consists of three modules: document planning, microplanning, and surface realisation.

Chapter 1 - Introduction

NLG is presented both from a research perspective and from an applications perspective. From a research perspective, NLG is a subfield of natural language processing, which in turn can be seen as a subfield of both computer science and cognitive science. The authors compare generation with understanding, the other part of natural language processing. From an applications perspective, most current NLG systems are used either to present information to users or to (partially) automate the production of routine documentation. Six examples of NLG systems are presented in this chapter, followed by a short history of NLG.

Chapter 2 - Natural Language Generation in Practice

This chapter considers alternatives to NLG and examines the circumstances under which it is appropriate to use NLG systems. The first question considered is when an NLG system is indeed the most appropriate solution. The advantages and disadvantages of alternatives such as graphics, mail-merge systems and human authoring are weighed. A related issue is how to build a corpus from which user requirements can be determined. Evaluating and fielding NLG systems are also discussed.

Chapter 3 - The Architecture of a Natural Language Generation System

This chapter gives an overview of the inputs and the outputs of NLG and of a specific architecture that embodies one particular decomposition of the process into distinct modules. One can characterise the input of an NLG system as a four-tuple containing the knowledge source, the communicative goal, the user model and the discourse history. The output of the generation is a text. However, this is much more than just a stream of ASCII text. The generation process can be decomposed into three component modules, which will be referred to as the document planner, the microplanner, and the surface realiser. Each of these three modules accomplishes a content task and a structure task. An overview of these modules is presented here; each of them is studied in the following chapters.
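
To make the decomposition concrete, here is a minimal sketch in Python of a four-tuple input passed through three modules in sequence. It is only an illustration: the class name NLGInput, the field names, the "interests" key in the user model and the toy weather data are my own assumptions, not code or terminology taken from the book, and each module is reduced to a few lines.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class NLGInput:
    """Four-tuple input; all field names are illustrative assumptions."""
    knowledge_source: dict[str, Any]
    communicative_goal: str
    user_model: dict[str, Any] = field(default_factory=dict)
    discourse_history: list[str] = field(default_factory=list)


def document_planner(inp: NLGInput) -> list[dict[str, Any]]:
    # Content task: keep only the attributes this user cares about.
    # Structure task: impose a fixed (alphabetical) order on the messages.
    wanted = inp.user_model.get("interests", list(inp.knowledge_source))
    return [
        {"attribute": key, "value": inp.knowledge_source[key]}
        for key in sorted(wanted)
        if key in inp.knowledge_source
    ]


def microplanner(messages: list[dict[str, Any]]) -> list[dict[str, str]]:
    # Toy lexicalisation: every message becomes "the X is Y".
    return [
        {"subject": f"the {m['attribute']}", "verb": "is", "complement": str(m["value"])}
        for m in messages
    ]


def surface_realiser(specs: list[dict[str, str]]) -> str:
    # Turn each phrase specification into a capitalised, punctuated sentence.
    sentences = []
    for spec in specs:
        text = f"{spec['subject']} {spec['verb']} {spec['complement']}."
        sentences.append(text[0].upper() + text[1:])
    return " ".join(sentences)


def generate(inp: NLGInput) -> str:
    # The three modules run as a simple pipeline.
    return surface_realiser(microplanner(document_planner(inp)))
```

Calling generate on an input whose knowledge source holds a temperature, a rainfall figure and a station identifier, with a user model interested only in the first two, yields two sentences and silently drops the identifier. That small omission is the kind of decision the document planner is there to make.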

Chapter 4 - Document planning

This chapter is about document planning: it presents what the task of the first component of an NLG system is and how it works. The document planner is responsible for deciding what information - messages - to communicate (this being the task of content determination) and determining how this information should be structured for presentation (this being the task of document structuring). Content determination usually involves one or more of selecting, summarising and reasoning with data. Very few NLG systems simply generate messages that communicate all the input data. Document structuring is important because documents are not just random collections of sentences. They possess coherence and thematic structure, which is to say that the content is expressed in a way that is easy for humans to read and understand.
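
The difference between summarising data and merely echoing it can be shown with a small sketch. The function names, the message types "average" and "departure", and the threshold of 2.0 are hypothetical choices made for this illustration; the book's own examples are considerably richer.

```python
from statistics import mean


def determine_content(readings, normal, threshold=2.0):
    """Content determination: summarise raw readings into a few messages."""
    average = mean(readings)
    # Summarising: one message for the average instead of one per reading.
    messages = [{"type": "average", "value": round(average, 1)}]
    # Reasoning with data: mention the departure from normal only if notable.
    if abs(average - normal) >= threshold:
        direction = "above" if average > normal else "below"
        messages.append({"type": "departure", "direction": direction})
    return messages


def structure_document(messages):
    """Document structuring: fixed order, general statement before detail."""
    order = {"average": 0, "departure": 1}
    ordered = sorted(messages, key=lambda m: order[m["type"]])
    return {"type": "paragraph", "children": ordered}
```

With readings of 10.0, 12.0 and 14.0 against a normal value of 9.0, the planner produces two messages (the average and an "above normal" departure) rather than three raw values; with a normal value of 11.5 it produces only the average.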

Chapter 5 - Microplanning

This chapter explains why microplanning is important and what it is concerned with: lexicalisation, aggregation and referring expression generation. Lexicalisation is the process of choosing words and syntactic structures to communicate the information. Aggregation takes a set of simple phrase specifications and combines them to permit the generation of more complex sentence structures. Referring expression generation turns knowledge base entities into the semantic content of noun-phrase referring expressions that will be sufficient to identify the intended referents to the hearer.
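
Two of these tasks lend themselves to a short sketch. The first function below merges phrase specifications that share a subject and a verb; the second picks attributes of a target entity until every competing entity has been ruled out. Both are simplified illustrations under my own assumptions (the dictionary keys, the default preference order of attributes), not the algorithms as given in the book.

```python
def aggregate(specs):
    """Merge specs with the same subject and verb into one spec."""
    merged = {}
    for spec in specs:
        key = (spec["subject"], spec["verb"])
        merged.setdefault(key, []).append(spec["object"])
    # Dicts preserve insertion order, so the original ordering is kept.
    return [
        {"subject": subject, "verb": verb, "objects": objects}
        for (subject, verb), objects in merged.items()
    ]


def refer(target, distractors, preferred=("type", "colour", "size")):
    """Choose attributes of target that together exclude all distractors."""
    chosen = {}
    remaining = list(distractors)
    for attribute in preferred:
        if not remaining:
            break  # the description already identifies the target
        value = target.get(attribute)
        if value is None:
            continue  # the target has no value for this attribute
        # Use the attribute only if it rules out at least one distractor.
        if any(d.get(attribute) != value for d in remaining):
            chosen[attribute] = value
            remaining = [d for d in remaining if d.get(attribute) == value]
    return chosen
```

Given "John bought a book" and "John bought a pen", aggregate returns a single specification with two objects, which a realiser could render as "John bought a book and a pen". For a small black dog among a small white dog and a small black cat, refer selects type and colour but not size, since size distinguishes nothing.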

Chapter 6 - Surface Realisation

This chapter describes surface realisation, the third module of the NLG architecture proposed in this book. It has two parts: linguistic realisation and structure realisation. Linguistic realisation is the task of converting abstract representations of sentences into real text; it corresponds to the content task of surface realisation. Structure realisation is the task of converting abstract structures such as paragraphs and sections into the mark-up symbols understood by the document presentation system being used; it corresponds to the structure task of surface realisation. The tasks accomplished by surface realisation are illustrated here by three NLG systems.
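
The two halves of surface realisation can be separated in a few lines of Python. The sketch below assumes HTML as the target mark-up and uses a deliberately naive agreement rule (add "s" for a singular subject); the specification keys and both function names are hypothetical, and the three systems discussed in the chapter work quite differently.

```python
from html import escape


def realise_sentence(spec):
    """Linguistic realisation: abstract sentence spec to a text string."""
    # Naive agreement: third-person singular present adds "s" to the verb.
    singular = spec.get("number", "singular") == "singular"
    verb = spec["verb"] + ("s" if singular else "")
    text = " ".join([spec["subject"], verb, spec["object"]])
    return text[0].upper() + text[1:] + "."


def realise_structure(document):
    """Structure realisation: abstract title and paragraphs to HTML."""
    parts = [f"<h1>{escape(document['title'])}</h1>"]
    for paragraph in document["paragraphs"]:
        body = " ".join(realise_sentence(spec) for spec in paragraph)
        parts.append(f"<p>{escape(body)}</p>")
    return "\n".join(parts)
```

Swapping realise_structure for a function that emits LaTeX or plain text would leave realise_sentence untouched, which is the practical benefit of keeping the two tasks apart.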

Chapter 7 - Beyond Text Generation

This final chapter looks beyond text generation and examines some of the issues that arise when one considers the generation of text contained within some medium. The authors present the role of typography, graphics and hypertext in NLG and the problems that arise when these are implemented in real systems. Speech output is also considered.

CRITICAL EVALUATION

The authors are concerned with both theoretical and practical questions about NLG. It is clear, however, that their main interest is practical: how to construct NLG systems which are useful and work efficiently. This also means that the authors are concerned less with fundamental linguistic questions than with knowledge representation and with the mapping of knowledge structures into linguistic representations.

For the most part, the book is clear and easy to read. Many examples are provided.

The scope of the book is very broad, so the specific aspects of NLG cannot each be discussed in detail: the book is intended rather as a general introduction to the topic. Pointers to further reading, provided at the end of each chapter, can help the reader.

About the reviewer: Kornel Bangha is preparing a Ph.D. in linguistics and Artificial Intelligence at the University of Montreal. His research concerns how the interpretation of linguistic units in discourse is influenced not only by semantic factors but also by the context and by knowledge about the world.