EDITORS: Srinivas Bangalore and Aravind K. Joshi
SUBTITLE: Using Complex Lexical Descriptions in Natural Language Processing
PUBLISHER: MIT Press
William Corvey, Department of Linguistics, University of Colorado at Boulder
As the editors note in the Introduction, an encoding of linguistic information
can be accomplished using either simple or complex primitives. Simple primitives
have the advantage of being straightforward to annotate, but require more
complex operations to compose larger structures; more complicated primitives
require more advanced local annotation of linguistic features but require only
general operations for composition. The approach taken in this text is the
latter, to “complicate locally, simplify globally (CLSG)” (2).
Supertagging is an approach to “almost parsing” (Joshi and Srinivas, 1994;
Srinivas, 1996; Srinivas and Joshi, 1998), whereby statistical techniques from
part of speech disambiguation may be applied to the elementary trees associated
with lexical items in Lexicalized Tree Adjoining Grammars (LTAGs) (e.g. Joshi
1985), and this processing allows for more efficient complete parsing. LTAGs
differ from context-free grammars (CFGs) in several important ways. First, LTAGs
and CFGs contain distinct primitive elements. Whereas the domain of locality of
a CFG is expressed as a rule, an LTAG instead encodes elementary trees, which
associate a verb with its arguments. This is done in order to localize all
dependencies within a single domain; the editors show that constraint
specification is often spread over several domains of locality in CFGs, which is
counter to the CLSG approach. Second, LTAGs and CFGs accomplish phrase
composition differently. CFGs derive phrases via the application of rules; to
form larger LTAG trees, elementary trees may be composed by operations,
particularly adjoining (9). In LTAG, the resulting parse trees are called
“derived trees,” as they are the result of attaching several elementary trees
together. LTAG “derivation trees” provide a record of the combining operations
performed to produce a derived tree (35). Therefore, while CFGs are defined by
a set of primitives and context-free rules, an LTAG specification instead
contains elementary trees and derivation trees.
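To make the composition step concrete, the following is a minimal Python sketch and not the editors' implementation. It shows adjoining together with substitution, the other standard TAG operation. The `Node` class, the `substitute`, `adjoin`, and `words` helpers, and the toy trees for "John", "sleeps", and "soundly" are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False          # open substitution site (an argument slot)
    foot: bool = False           # foot node of an auxiliary tree
    word: Optional[str] = None   # lexical item, set only on word leaves


def substitute(tree: Node, initial: Node) -> bool:
    """Fill the first open substitution site whose label matches the root of `initial`."""
    for i, child in enumerate(tree.children):
        if child.subst and child.label == initial.label:
            tree.children[i] = initial
            return True
        if substitute(child, initial):
            return True
    return False


def find_foot(node: Node) -> Optional[Node]:
    """Return the foot node of an auxiliary tree, or None if there is none."""
    if node.foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(target: Node, auxiliary: Node) -> None:
    """Adjoin `auxiliary` at `target`: the material under `target` moves under the foot."""
    foot = find_foot(auxiliary)
    if foot is None or not (foot.label == auxiliary.label == target.label):
        raise ValueError("root and foot of the auxiliary tree must match the target label")
    foot.children, foot.word, foot.foot = target.children, target.word, False
    target.children, target.word = auxiliary.children, auxiliary.word


def words(node: Node) -> List[str]:
    """Read the words off the frontier of a tree, left to right."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in words(child)]


# Toy elementary trees (hypothetical): the verb tree localizes its subject argument.
sleeps = Node("S", [Node("NP", subst=True), Node("VP", [Node("V", word="sleeps")])])
john = Node("NP", [Node("N", word="John")])
soundly = Node("VP", [Node("VP", foot=True), Node("Adv", word="soundly")])

substitute(sleeps, john)             # fill the subject slot
adjoin(sleeps.children[1], soundly)  # adjoin the modifier at the VP node
derived_words = words(sleeps)
```

In this toy case the derived tree spells out "John sleeps soundly". A derivation tree in the sense described above would simply record that the "John" tree was substituted into, and the "soundly" tree adjoined into, the tree anchored by "sleeps".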
The book contains nineteen chapters, including the Introduction, divided into
five sections. I will first summarize the five parts of the book and then turn
to a discussion of the text.
Part One describes supertag creation and the arrangement of supertags in an
inventory, focusing on the development of tree adjoining grammars (TAGs).
Because TAGs are difficult to construct manually, automatic extraction of
grammars from large-scale lexical resources is an appealing alternative where
feasible. Part One describes two systems for the extraction of TAGs from
Chapter 2 (“From Treebanks to Tree-Adjoining Grammars,” Fei Xia and Martha
Palmer) describes LexTract, a system that produces both CFGs and LTAGs from
treebanks. Xia and Palmer first describe the three user-supplied input tables
required by the system; these tables provide the information to construct
elementary trees, which mark heads and distinguish between arguments and
adjuncts. The chapter then proceeds with a description of a three-step
extraction algorithm, which (1) converts a Treebank tree into an LTAG derived
tree; (2) decomposes this derived tree into elementary trees; and (3) creates
derivation trees showing how elementary trees are combined to produce the
derived tree. The resulting set of derivation trees is then used to train
statistical LTAG parsers. The authors then compare the system to other
extraction algorithms for CFG and LTAG and describe a variety of applications
using output from LexTract.
In Chapter 3 (“Developing Tree-Adjoining Grammars with Lexical Descriptions,”
Fei Xia, Martha Palmer, and K. Vijay-Shanker), the authors present LexOrg, a
system that produces LTAG grammars from abstract specifications (73). This
system is similar to LexTract in that it employs user-supplied information in
order to implement parsing. However, in the case of LexOrg, this information
comes in the form of abstract specifications encoding specific linguistic
information. The chapter describes a method of eliciting this linguistic
information from the user whereby users are required to enter feature equations,
rather than tree templates. This process ensures consistency across all
templates in the XTAG grammar (e.g. XTAG-Group, 1998) while requiring less
effort on the part of the user. These abstract specifications inform key modules
within LexOrg. The chapter also includes experimental results and comparison to
Part Two describes implementations of supertag parsers and the use of
supertagging in other parsing applications.
Chapter 4 (“Complexity of Parsing for Some Lexicalized Formalisms,” Giorgio
Satta) describes an innovative parsing algorithm for processing LTAGs more
efficiently. Satta presents LTAG parsing in the broader context of parsing for
lexicalized formalisms. The work extends an algorithm developed for lexicalized
context-free grammars to LTAGs, resulting in a significant increase in the
efficiency of an LTAG parsing algorithm.
In Chapter 5 (“Combining Supertagging and Lexicalized Tree-Adjoining Grammar
Parsing,” Anoop Sarkar), the author explores two factors that impact TAG parser
efficiency: syntactic lexical ambiguity and sentence complexity. Supertagging is
shown to reduce lexical ambiguity and thus improve parser efficiency. The
chapter also details a co-training design using a supertagging parser and a
statistically-trained LTAG parser.
Chapter 6 (“Discriminative Learning of Supertagging,” Libin Shen) gives an
overview of a system for supertagging based on discriminative learning, which
overcomes the problem of noise generated by a trigram supertagger. Using NP
chunking as an evaluation task, Shen shows that supertagging, correctly
implemented, can yield an improvement in NP chunking accuracy (where experiments
using a trigram supertagger showed lower performance). Shen demonstrates
techniques for overcoming problems of sparse data, taking advantage of the rich
feature sets provided by supertags, and forcing the learning algorithm to zero
in on the most difficult classification cases.
Chapter 7 (“A Nonstatistical Parsing-Based Approach to Supertagging,” Pierre
Boullier) proposes a disambiguation model using structural constraints as
opposed to statistical modeling. The type of model Boullier outlines is
automatically deduced from an LTAG and does not require additional training
data. In contrast to statistical systems choosing one or n-best tags during
parsing, the system outlined here ensures that all supertags needed to parse a
sentence will be available for processing; this approach therefore gives 100%
recall. The author compares several supertaggers constructed using this
paradigm and evaluates them.
Chapter 8 (“Nonlexical Chart Parsing for TAG,” Alexis Nasr and Owen Rambow)
describes a Generative Dependency Grammar (GDG) parser for supertags. GDG is a
type of nonlexicalized chart parser that produces dependency parse output from
supertagged input. Nasr and Rambow provide evaluation on the Penn Treebank data
and describe parser efficiency. The chapter also details distinctions between
this approach, previous work in the area of dependency parsing, and the
Lightweight Dependency Analyzer (LDA; Bangalore, 2000).
Part Three gives an overview of supertags and supertag utilization in
Chapter 9 (“Supertagging for Efficient Wide-Coverage CCG Parsing,” Stephen Clark
and James R. Curran) describes a supertagging implementation in Combinatory
Categorial Grammar (CCG; Steedman, 2000). The authors present a system
incorporating supertagging into a CCG parser. The chapter also illustrates a
model for CCG supertag disambiguation that yields multiple supertags per word;
this allows the system to find more supertags for a given span if the parser
fails to cover the entire span with the supertags currently under consideration.
The authors conclude by presenting parser evaluation and discussing the role of
supertagging in CCG parsing.
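The fallback behaviour described for this chapter can be sketched roughly as follows. This is a hypothetical illustration, not Clark and Curran's code: the `multitag` function, the probability-ratio cutoff `beta`, the particular cutoff values, the toy tag distributions, and the `can_parse` stand-in for a real parser are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Sequence

TagDist = Dict[str, float]  # supertag -> probability, for one word


def multitag(dists: Sequence[TagDist], beta: float) -> List[List[str]]:
    """For each word, keep every supertag within a factor `beta` of the best one."""
    result = []
    for dist in dists:
        best = max(dist.values())
        ranked = sorted(dist.items(), key=lambda kv: -kv[1])
        result.append([tag for tag, p in ranked if p >= beta * best])
    return result


def parse_with_fallback(
    dists: Sequence[TagDist],
    can_parse: Callable[[List[List[str]]], bool],
    betas: Sequence[float] = (0.5, 0.05),
) -> Optional[List[List[str]]]:
    """Start with few supertags per word; relax the cutoff only if parsing fails."""
    for beta in betas:
        candidates = multitag(dists, beta)
        if can_parse(candidates):
            return candidates
    return None


# Toy distributions for a two-word input (made-up numbers).
dists = [
    {"NP": 0.95, "N/N": 0.05},
    {"NP": 0.90, "S\\NP": 0.08, "N": 0.02},
]


def toy_can_parse(candidates: List[List[str]]) -> bool:
    """Stand-in for a parser: succeeds only if a subject and a verb category are available."""
    return "NP" in candidates[0] and "S\\NP" in candidates[1]


tight = multitag(dists, 0.5)
chosen = parse_with_fallback(dists, toy_can_parse)
```

With the tight cutoff each word receives a single supertag and the stand-in parser fails; relaxing the cutoff admits the verb category for the second word, which is the kind of behaviour the chapter summary describes.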
Chapter 10 (“Constraint Dependency Grammars: SuperARVs, Language Modeling, and
Parsing,” Mary P. Harper and Wen Wang) describes supertagging with constraint
dependency grammars (Maruyama, 1990). Parsing is viewed as a constraint
satisfaction problem. Harper and Wang introduce super abstract role values
(SuperARVs) to encode morphosyntactic information (which is analogous to the
supertag encoding of syntactic information). SuperARVs are gleaned from the Penn
Treebank, and the authors test two parsing methods. The first method performs
SuperARV disambiguation prior to dependency parsing and the second method
performs disambiguation and linking in conjunction with one another. The parsers
are evaluated using the Wall Street Journal and a speech recognition task.
In Chapter 11 (“Guiding a Constraint Dependency Parser with Supertags,” Kilian
Foth, Tomas By, and Wolfgang Menzel), the authors describe a constraint
dependency parser incorporating supertags. Supertags are formed from dependency
trees taken from the German NEGRA and TIGER corpora. These supertags are
designed to be especially information-rich, and thus increase the size of the
supertag vocabulary. The authors note that this increase in vocabulary size does
not cause a proportional increase in supertagger error, and this motivates an
exploration of methods of feature encoding to maximize the performance of a
rule-based weighted constraint dependency parser using rich supertags.
Chapter 12 (“Extraction of Type-Logical Supertags from the Spoken Dutch Corpus,”
Richard Moot) describes supertags in type-logical grammars (e.g. Lambek, 1958).
Moot first provides an introduction to type-logical grammars, which parse via
theorem proving. The chapter then proceeds by detailing extraction of a
type-logical treebank from the Spoken Dutch Corpus; the resulting lexicon forms
the basis of a supertagging vocabulary. The system is trained on a filtered
version of this dataset and evaluation of a supertagging system is presented.
Chapter 13 (“Extracting Supertags from HPSG-Based Treebanks,” Günter Neumann and
Berthold Crysmann) describes supertagging in the context of Head-Driven Phrase
Structure Grammar (HPSG; Pollard and Sag, 1994). The authors extract a
Lexicalized Tree Insertion Grammar (LTIG) from a German Treebank included in
Verbmobil. Trees from this grammar form supertags in an LTIG parser. The authors
present parser evaluation on both the Verbmobil and NEGRA corpora.
Chapter 14 (“Probabilistic Context-Free Grammars with Latent Annotations,”
Takuya Matsuzaki, Yusuke Miyao, and Jun’ichi Tsujii) gives a method for
extending localization in context-free grammars. The authors augment
nonterminals in a CFG with latent annotations, each taking a value from a fixed
set. These latent variables allow the CFG to capture some dependencies across
rules. The chapter presents
evaluation results and a discussion of variations in the values associated with
each latent variable.
Chapter 15 (“Computational Paninian Grammar Framework,” Akshar Bharati and
Rajeev Sangal) describes a supertag implementation in Paninian Grammar. First
developed for Sanskrit, computational implementations of Paninian Grammar have
found utility for describing a variety of Indian languages (e.g. Bharati et
al., 1995; Narayana, 1994), as well as other languages (Bharati et al. 1997,
for English; Pedersen et al. 2004, for Arabic). The authors discuss a parser
implementation using Computational Paninian Grammar and compare this
implementation and its performance with those of LTAG.
Part Four is dedicated to exploring linguistic and psycholinguistic issues
related to supertagging.
Chapter 16 (“Lexicalized Syntax and Phonological Merge,” Robert Frank) argues
that elementary trees are sufficient for encoding all the syntactic features of
a lexical item. The chapter tests this Syntactic Lexicalization Hypothesis (373)
in the phonological domain. The author illustrates the effectiveness of
elementary trees in describing the total syntactic representations for a variety
of linguistic phenomena.
Chapter 17 (“Constraining the Form of Supertags with the Strong Connectivity
Hypothesis,” Alessandro Mazzei, Vincenzo Lombardo, and Patrick Sturt) provides a
model of incremental sentence processing using supertags. The authors introduce
a Dynamic Version of TAG (DVTAG) equipped to handle predicted heads. The chapter
illustrates an extraction of a DVTAG from the Penn Treebank and the authors
discuss the resultant number of predicted tags and the plausibility of an
extracted model as a proxy for human language processing.
Part Five describes several applications of supertagging.
Chapter 18 (“Semantic Labeling and Parsing via Tree-Adjoining Grammars,” John
Chen) details semantic role labeling (SRL) systems utilizing deep linguistic
features and compares performance to a model using surface features only. The
target labels of the SRL system are PropBank roles, annotated over the Penn
Treebank. Chen extracts LTAGs from the treebank to build supertaggers and an
LDA, which uses PropBank information expressed as part of the syntactic
constituent label. The chapter presents evaluation indicating that a
supertagging approach can improve SRL performance.
Chapter 19 (“Applications of HMM-Based Supertagging,” Karin Harbusch, Jens
Bäcker, and Saša Hasan) describes two applications of Hidden Markov Model
(HMM)-based supertagging: a dialog system using supertags and a system for
disambiguation on small device keyboards. Both applications use a combination of
a supertagger with an LDA. The HMM approach to supertagging is motivated by the
efficiency of decoding algorithms and by an improvement in performance for both
German and English data. The authors present implementation details of the
system components and use the applications for evaluation.
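As a rough illustration of why HMM decoding is efficient, the sketch below runs Viterbi decoding over a toy bigram HMM whose hidden states are supertags. It is not the chapter's system: the bigram simplification, the three tag names (stand-ins for elementary trees), and all probabilities are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Table = Dict[str, Dict[str, float]]


def viterbi(
    words: List[str],
    tags: List[str],
    start: Dict[str, float],
    trans: Table,
    emit: Table,
) -> List[str]:
    """Return the most probable supertag sequence under a bigram HMM."""
    # Each column maps a tag to (best path probability ending in that tag, previous tag).
    columns: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {t: (start.get(t, 0.0) * emit[t].get(words[0], 0.0), None) for t in tags}
    ]
    for word in words[1:]:
        prev_col = columns[-1]
        column = {}
        for t in tags:
            column[t] = max(
                ((prev_col[p][0] * trans[p].get(t, 0.0) * emit[t].get(word, 0.0), p)
                 for p in tags),
                key=lambda pair: pair[0],
            )
        columns.append(column)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: columns[-1][t][0])]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    return path[::-1]


# Toy model (made-up numbers): "chase" is ambiguous between two verb trees.
tags = ["NP", "INTRANS", "TRANS"]
start = {"NP": 1.0}
trans = {
    "NP": {"INTRANS": 0.5, "TRANS": 0.5},
    "INTRANS": {"NP": 0.1},
    "TRANS": {"NP": 0.9},
}
emit = {
    "NP": {"dogs": 0.5, "cats": 0.5},
    "INTRANS": {"chase": 0.5, "sleep": 0.5},
    "TRANS": {"chase": 0.5, "see": 0.5},
}

best_tags = viterbi(["dogs", "chase", "cats"], tags, start, trans, emit)
```

Here the following noun phrase pushes the decoder toward the transitive tree for "chase". The running time grows linearly with sentence length and quadratically with the number of supertags, which is one reason efficient decoding matters when the supertag inventory is large.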
The text reviewed here provides a detailed overview of the theory,
implementation, and applications of supertagging in a variety of domains within
computational linguistics and related disciplines. Each of the five sections
provides knowledge critical to the reader’s ability to understand and use
supertags. Chapters in Part One describe methods for building LTAG grammars,
which are precursors to many supertagging systems, from treebanked data or
user-entered specifications. Papers in Part Two provide an outline of how to
implement supertag parsers in a variety of formats. Part Three illustrates
supertag implementations in a variety of linguistic formalisms to suit the needs
of many systems. Finally, Parts Four and Five provide empirical and
application-based justification for the supertagging approach. The text remains
coherent, despite covering a wide range of topics.
While the text provides a detailed introduction to supertagging and Tree
Adjoining Grammars, the editors presuppose some reader knowledge of many
linguistic subfields and formalisms. While each paper includes an introduction
to the formalism or methods included, the book might not be readily accessible
to an audience outside of the computational linguistics community or to those
unfamiliar with the intricacies of parsing tasks. However, most chapters provide
references to introductory materials for the motivated reader.
In general, the text provides an empirical justification of the efficacy of
supertags for a variety of tasks and provides implementation examples that
inspire future work in this area. The book would be a valuable resource for
linguists interested in computational grammars and parsing, and for machine
learning researchers interested in linguistic formalisms and the design of
complex syntactic features.
Bangalore, S. (2000). A lightweight dependency analyzer for partial parsing.
Journal of Natural Language Engineering: 6(2):113-138.
Bharati, A., Bhatia, M., Chaitanya, V., and Sangal, R. (1996). Paninian Grammar
Framework Applied to English. Technical Report TRCS-96-238, CSE, IIT Kanpur.
Bharati, A., Chaitanya, V., and Sangal, R. (1995). Natural Language Processing:
A Paninian Perspective. New Delhi: Prentice Hall of India.
Joshi, A. K. (1985). Tree Adjoining Grammars: How Much Context-Sensitivity Is
Required to Provide Reasonable Structural Descriptions? Natural Language
Parsing: Psychological, Computational and Theoretical Perspectives, pp. 206-250.
Joshi, A.K. and Srinivas, B. (1994). Disambiguation of super parts of speech
(supertags): Almost parsing. In Proceedings of the 1994 International Conference
on Computational Linguistics (COLING), Kyoto, Japan. pp. 154-160.
Lambek, J. (1958). The mathematics of sentence structure. American Mathematical
Maruyama, H. (1990). Constraint Dependency Grammar. Technical Report #RT0044,
IBM, Tokyo, Japan.
Narayana, V.N. (1994). Anusarak: A Device to Overcome the Language Barrier. PhD
thesis, Department of CSE, IIT Kanpur, January, 1994.
Pedersen, M. J., Eades, D., Amin, S.K. and Prakash, L. (2004). Parsing Arabic
relative clauses: A paninian dependency grammar approach. In: S. Shah and S.
Hussain, Proceedings of the Eighth International Multitopic Conference. The
Eighth International Multitopic Conference (INMIC 2004), Lahore, Pakistan,
(573-578). 24-26 December, 2004.
Pollard, C. and Sag, I. (1994). Head-Driven Phrase Structure Grammar. Chicago:
University of Chicago Press.
Srinivas, B. (1996). “Almost Parsing” Technique for Language Modeling. In
Proceedings of ICSLP96 Conference, Philadelphia, PA, pp. 1173-1176.
Srinivas, B. and Joshi, A.K. (1998). Supertagging: An approach to almost
parsing. Computational Linguistics: 22:1-29.
Steedman, M. J. (2000). The Syntactic Process. Cambridge, MA: The MIT Press.
XTAG-Group. (1998). A Lexicalized Tree Adjoining Grammar for English. Technical
Report IRCS 98-18, University of Pennsylvania.
ABOUT THE REVIEWER
William Corvey is a PhD student in the Department of Linguistics and the
Institute of Cognitive Science at the University of Colorado at Boulder.
His main research interests are in discourse processing, computational
applications of Conversation Analysis, and the construction and use of
large-scale lexical resources, particularly VerbNet.