LINGUIST List 23.2359

Thu May 17 2012

Review: Computational Linguistics; General Ling.: Bangalore & Joshi (2010)

Editor for this issue: Joseph Salmons <jsalmons@linguistlist.org>



Date: 17-May-2012
From: William Corvey <william.corvey@colorado.edu>
Subject: Supertagging

Announced at http://linguistlist.org/issues/21/21-1245.html

EDITORS: Srinivas Bangalore and Aravind K. Joshi
TITLE: Supertagging
SUBTITLE: Using Complex Lexical Descriptions in Natural Language Processing
PUBLISHER: MIT Press
YEAR: 2010

William Corvey, Department of Linguistics, University of Colorado at Boulder

INTRODUCTION
As the editors note in the Introduction, an encoding of linguistic information can be accomplished using either simple or complex primitives. Simple primitives have the advantage of being straightforward to annotate, but require more complex operations to compose larger structures; more complex primitives demand richer local annotation of linguistic features but need only general operations for composition. The approach taken in this text is the latter: to "complicate locally, simplify globally (CLSG)" (2).

Supertagging is an approach of "almost parsing" (Joshi and Srinivas, 1994; Srinivas, 1996; Srinivas and Joshi, 1998), whereby statistical techniques from part-of-speech disambiguation may be applied to the elementary trees associated with lexical items in Lexicalized Tree Adjoining Grammars (LTAGs) (e.g. Joshi, 1985), and this processing allows for more efficient complete parsing. LTAGs differ from context-free grammars (CFGs) in several important ways. First, LTAGs and CFGs contain distinct primitive elements. Whereas the domain of locality of a CFG is expressed as a rule, an LTAG instead encodes elementary trees, which associate a verb with its arguments. This is done in order to localize all dependencies within a single domain; the editors show that constraint specification is often spread over several domains of locality in CFGs, which runs counter to the CLSG approach. Second, LTAGs and CFGs accomplish phrase composition differently. CFGs derive phrases via the application of rules; to form larger LTAG trees, elementary trees may be composed by operations, particularly adjoining (9). In LTAG, these parse trees are called "derived trees," as they are the result of attaching several elementary trees together. LTAG "derivation trees" provide a record of the combination operations performed to produce a derived tree (35). Therefore, while CFGs are defined by a set of primitives and context-free rules, an LTAG specification instead contains elementary trees and derivation trees.
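To make the composition operations concrete, a minimal Python sketch follows (my own illustration, not code from the book; the tuple encoding and the "!"/"*" markers for substitution sites and foot nodes are invented here):

  # Toy LTAG composition. A node is (label, children); children is None
  # at the frontier. 'X!' marks a substitution site, 'X*' a foot node.

  def substitute(tree, site, initial):
      """Plug an initial tree into every matching substitution site."""
      label, children = tree
      if children is None:
          return initial if label == site else tree
      return (label, [substitute(c, site, initial) for c in children])

  def adjoin(tree, target, aux):
      """Splice an auxiliary tree in at each node labeled `target`; the
      subtree rooted there moves to the auxiliary tree's foot node."""
      label, children = tree
      if label == target:
          return _fill_foot(aux, tree)
      if children is None:
          return tree
      return (label, [adjoin(c, target, aux) for c in children])

  def _fill_foot(aux, subtree):
      label, children = aux
      if label.endswith('*'):
          return subtree
      if children is None:
          return aux
      return (label, [_fill_foot(c, subtree) for c in children])

  # The elementary tree anchored by "sleeps" localizes the verb and its
  # subject argument within a single domain:
  sleeps = ('S', [('NP!', None), ('VP', [('V', [('sleeps', None)])])])
  harry = ('NP', [('Harry', None)])
  soundly = ('VP', [('VP*', None), ('Adv', [('soundly', None)])])

  derived = adjoin(substitute(sleeps, 'NP!', harry), 'VP', soundly)
  print(derived)  # derived tree for "Harry sleeps soundly"

The corresponding derivation tree would simply record that the tree for "Harry" substituted into the subject slot of "sleeps" and that "soundly" adjoined at its VP node.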

The book contains nineteen chapters, including the Introduction, divided into five sections. I will first summarize the five parts of the book and then turn to a discussion of the text.

SUMMARY
Part One describes supertag creation and the arrangement of supertags in an inventory, focusing on the development of tree adjoining grammars (TAGs). Because TAGs are difficult to construct manually, automatic extraction of grammars from large-scale lexical resources is an appealing alternative where feasible. The two chapters in this part describe systems for extracting TAGs from pre-existing resources.

Chapter 2 ("From Treebanks to Tree-Adjoining Grammars," Fei Xia and MarthaPalmer) describes LexTract, a system that produces both CFGs and LTAGs fromtreebanks. Xia and Palmer first describe the three user-supplied input tablesrequired by the system; these tables provide the information to constructelementary trees, which mark heads and distinguish between arguments andadjuncts. The chapter then proceeds with a description of a three-stepextraction algorithm, which (1) converts a Treebank tree into an LTAG derivedtree; (2) decomposes this derived tree into elementary trees; and (3) createsderivation trees showing how elementary trees are combined to produce thederived tree. The resulting set of derivation trees is then used to trainstatistical LTAG parsers. The authors then compare the system to otherextraction algorithms for CFG and LTAG and describe a variety of applicationsusing output from LexTract.

In Chapter 3 ("Developing Tree-Adjoining Grammars with Lexical Descriptions," Fei Xia, Martha Palmer, and K. Vijay-Shanker), the authors present LexOrg, a system that produces LTAG grammars from abstract specifications (73). This system is similar to LexTract in that it employs user-supplied information in order to implement parsing. However, in the case of LexOrg, this information comes in the form of abstract specifications encoding specific linguistic information. The chapter describes a method of eliciting this linguistic information from the user whereby users enter feature equations rather than tree templates. This process ensures consistency across all templates in the XTAG grammar (e.g. XTAG-Group, 1998) while requiring less effort on the part of the user. These abstract specifications inform key modules within LexOrg. The chapter also includes experimental results and a comparison to related systems.

Part Two describes implementations of supertag parsers and the use of supertagging in other parsing applications.

Chapter 4 ("Complexity of Parsing for Some Lexicalized Formalisms," GiorgioSatta) describes an innovative parsing algorithm for processing LTAGs moreefficiently. Satta presents LTAG parsing in the broader context of parsing forlexicalized formalisms. The work extends an algorithm developed for lexicalizedcontext-free grammars to LTAGs, resulting in a significant increase in theefficiency of an LTAG parsing algorithm.

In Chapter 5 ("Combining Supertagging and Lexicalized Tree-Adjoining Grammar Parsing," Anoop Sarkar), the author explores two factors that impact TAG parser efficiency: syntactic lexical ambiguity and sentence complexity. Supertagging is shown to reduce lexical ambiguity and thus improve parser efficiency. The chapter also details a co-training design using a supertagger and a statistically trained LTAG parser.

Chapter 6 ("Discriminative Learning of Supertagging," Libin Shen) gives anoverview of a system for supertagging based on discriminative learning, whichovercomes the problem of noise generated by a trigram supertagger. Using NPchunking as an evaluation task, Shen shows that supertagging, correctlyimplemented, can yield an improvement in NP chunking accuracy (where experimentsusing a trigram supertagger showed lower performance). Shen demonstratestechniques for overcoming problems of sparse data, taking advantage of the richfeature sets provided by supertags, and forcing the learning algorithm to zeroin on the most difficult classification cases.

Chapter 7 ("A Nonstatistical Parsing-Based Approach to Supertagging," PierreBoullier) proposes a disambiguation model using structural constraints asopposed to statistical modeling. The type of model Boullier outlines isautomatically deduced from an LTAG and does not require additional trainingdata. In contrast to statistical systems choosing one or n-best tags duringparsing, the system outlined here ensures that all supertags needed to parse asentence will be available for processing; this approach therefore gives 100%recall. The author provides comparison among several supertaggers constructedusing this paradigm and provides evaluation.

Chapter 8 ("Nonlexical Chart Parsing for TAG," Alexis Nasr and Owen Rambow)describes a Generative Dependency Grammar (GDG) parser for supertags. GDG is atype of nonlexicalized chart parser that produces dependency parse output fromsupertagged input. Nasr and Rambow provide evaluation on the Penn Treebank dataand describe parser efficiency. The chapter also details distinctions betweenthis approach, previous work in the area of dependency parsing, and theLightweight Dependency Analyzer (LDA; Bangalore, 2000).

Part Three gives an overview of supertags and supertag utilization in alternative formalisms.

Chapter 9 ("Supertagging for Efficient Wide-Coverage CCG Parsing," Stephen Clarkand James R. Curran) describes a supertagging implementation in CombinatorialCategorical Grammar (CCG; Steedman, 2000). The authors present a systemincorporating supertagging into a CCG parser. The chapter also illustrates amodel for CCG supertag disambiguation that yields multiple supertags per word;this allows the system to find more supertags for a given span if the parserfails to cover the entire span with the supertags currently under consideration.The authors conclude by presenting parser evaluation and discussing the role ofsupertagging in CCG parsing.

Chapter 10 ("Constraint Dependency Grammars: SuperARVs, Language Modeling, andParsing," Mary P. Harper and Wen Wang) describes supertagging with constraintdependency grammars (Mayurama, 1990). Parsing is viewed as a constraintsatisfaction problem. Harper and Wang introduce super abstract role values(SuperARVs) to encode morphosyntactic information (which is analagous to thesupertag encoding of syntactic information). SuperARVs are gleaned from the PennTreebank, and the authors test two parsing methods. The first method performsSuperARV disambiguation prior to dependency parsing and the second methodperforms disambiguation and linking in conjunction with one another. The parsersare evaluated using the Wall Street Journal and a speech recognition task.

In Chapter 11 ("Guiding a Constraint Dependency Parser with Supertags," Kilian Foth, Tomas By, and Wolfgang Menzel), the authors describe a constraint dependency parser incorporating supertags. Supertags are formed from dependency trees taken from the German NEGRA and TIGER corpora. These supertags are designed to be especially information-rich, which increases the size of the supertag vocabulary. The authors note that this increase in vocabulary size does not cause a proportional increase in supertagger error, and this motivates an exploration of feature encodings that maximize the performance of a rule-based weighted constraint dependency parser using rich supertags.

Chapter 12 ("Extraction of Type-Logical Supertags from the Spoken Dutch Corpus,"Richard Moot) describes supertags in type-logical grammars (e.g. Lambek, 1958).Moot first provides an introduction to type-logical grammars, which parse viatheorem proving. The chapter then proceeds by detailing extraction of atype-logical treebank from the Spoken Dutch Corpus; the resulting lexicon formsthe basis of a supertagging vocabulary. The system is trained on a filteredversion of this dataset and evaluation of a supertagging system is presented.

Chapter 13 ("Extracting Supertags from HPSG-Based Treebanks," Günter Neumann andBerthold Crysmann) describes supertagging in the context of Head-Driven PhraseStructure Grammar (HPSG; Pollard and Sag, 1994). The authors extract aLexicalized Tree Insertion Grammar (LTIG) from a German Treebank included inVerbmobil. Trees from this grammar form supertags in an LTIG parser. The authorspresent parser evaluation on both the Verbmobil and NEGRA corpora.

Chapter 14 ("Probabilistic Context-Free Grammars with Latent Annotations,"Takuya Matsuzaki, Yusuke Miyao, and Jun'ichi Tsujii) gives a method forextending localization in context-free grammars. The authors embellishnonterminal s in a CFG with latent annotations taking a value from a fixed set.These latent variables allow some dependence in the CFG. The chapter presentsevaluation results and a discussion of variations in the values associated witheach latent variable.

Chapter 15 ("Computational Paninian Grammar Framework," Akshar Bharati andRajeev Sangal) describes a supertag implementation in Paninian Grammar. Firstdeveloped for Sanskrit, computational implementations of Paninian Grammar havefound utility for describing a variety of Indian languages (e.g. Bharati et.al., 1995; Narayana, 1994), as well as other languages (Bharati et. al. 1997,for English; Pedersen et. al. 2004, for Arabic). The authors discuss a parserimplementation using Computational Paninian Grammar and compare thisimplementation and its performance with those of LTAG.

Part Four is dedicated to exploring linguistic and psycholinguistic issues related to supertagging.

Chapter 16 ("Lexicalized Syntax and Phonological Merge," Robert Frank) arguesthat elementary trees are sufficient for encoding all the syntactic features ofa lexical item. The chapter tests this Syntactic Lexicalization Hypothesis (373)in the phonological domain. The author illustrates the effectiveness ofelementary trees in describing the total syntactic representations for a varietyof linguistic phenomena.

Chapter 17 ("Constraining the Form of Supertags with the Strong ConnectivityHypothesis," Alessandro Mazzei, Vincenzo Lombardo, and Partick Sturt) provides amodel of incremental sentence processing using supertags. The authors introducea Dynamic Version of TAG (DVTAG) equipped to handle predicted heads. The chapterillustrates an extraction of a DVTAG from the Penn Treebank and the authorsdiscuss the resultant number of predicted tags and the plausibility of anextracted model as a proxy for human language processing.

Part Five describes several applications of supertagging.

Chapter 18 ("Semantic Labeling and Parsing via Tree-Adjoining Grammars," JohnChen) details semantic role labeling (SRL) systems utilizing deep linguisticfeatures and compares performance to a model using surface features only. Thetarget labels of the SRL system are PropBank roles, annotated over the PennTreebank. Chen extracts LTAGs from the treebank to build supertaggers and anLDA, which uses PropBank information expressed as part of the syntacticconstituent label. The chapter presents evaluation indicating that asupertagging approach can improve SRL performance.

Chapter 19 ("Applications of HMM-Based Supertagging," Karin Harbusch, JensBäcker, and Saša Hasan) describes two applications of Hidden Markov Model(HMM)-based supertagging: a dialog system using supertags and a system fordisambiguation on small device keyboards. Both applications use a combination ofa supertagger with an LDA. The HMM approach to supertagging is motivated by theefficiency of decoding algorithms and by an improvement in performance for bothGerman and English data. The authors present implementation details of thesystem components and use the applications for evaluation.

EVALUATION
The text reviewed here provides a detailed overview of the theory, implementation, and applications of supertagging in a variety of domains within computational linguistics and related disciplines. Each of the five sections provides knowledge critical to the reader's ability to understand and use supertags. Chapters in Part One describe methods for building LTAG grammars, which are precursors to many supertagging systems, from treebanked data or user-entered specifications. Papers in Part Two outline how to implement supertag parsers in a variety of frameworks. Part Three illustrates supertag implementations in a variety of linguistic formalisms to suit the needs of many systems. Finally, Parts Four and Five provide empirical and application-based justification for the supertagging approach. The text remains coherent despite covering a wide range of topics.

While the text provides a detailed introduction to supertagging and Tree Adjoining Grammars, the editors presuppose some reader knowledge of many linguistic subfields and formalisms. Although each paper includes an introduction to the formalism or methods involved, the book might not be readily accessible to an audience outside the computational linguistics community or to those unfamiliar with the intricacies of parsing tasks. However, most chapters provide references to introductory materials for the motivated reader.

In general, the text provides an empirical justification of the efficacy of supertags for a variety of tasks and provides implementation examples that should inspire future work in this area. The book would be a valuable resource for linguists interested in computational grammars and parsing, and for machine learning researchers interested in linguistic formalisms and the design of complex syntactic features.

REFERENCES
Bangalore, S. (2000). A lightweight dependency analyzer for partial parsing. Journal of Natural Language Engineering, 6(2):113-138.

Bharati, A., Bhatia, M., Chaitanya, V., and Sangal, R. (1996). Paninian Grammar Framework Applied to English. Technical Report TRCS-96-238, CSE, IIT Kanpur.

Bharati, A., Chaitanya, V., and Sangal, R. (1995). Natural Language Processing: A Paninian Perspective. New Delhi: Prentice Hall of India.

Joshi, A. K. (1985). Tree Adjoining Grammars: How Much Context-Sensitivity Is Required to Provide Reasonable Structural Descriptions? In Natural Language Parsing: Psychological, Computational and Theoretical Perspectives, pp. 206-250.

Joshi, A.K. and Srinivas, B. (1994). Disambiguation of super parts of speech (supertags): Almost parsing. In Proceedings of the 1994 International Conference on Computational Linguistics (COLING), Kyoto, Japan, pp. 154-160.

Lambek, J. (1958). The mathematics of sentence structure. American Mathematical Monthly, 65:154-170.

Maruyama, H. (1990). Constraint Dependency Grammar. Technical Report RT0044, IBM, Tokyo, Japan.

Narayana, V.N. (1994). Anusarak: A Device to Overcome the Language Barrier. PhD thesis, Department of CSE, IIT Kanpur, January 1994.

Pedersen, M. J., Eades, D., Amin, S.K., and Prakash, L. (2004). Parsing Arabic relative clauses: A Paninian dependency grammar approach. In S. Shah and S. Hussain (eds.), Proceedings of the Eighth International Multitopic Conference (INMIC 2004), Lahore, Pakistan, pp. 573-578. 24-26 December, 2004.

Pollard, C. and Sag, I. (1994). Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.

Srinivas, B. (1996). "Almost Parsing" Technique for Language Modeling. In Proceedings of the ICSLP96 Conference, Philadelphia, PA, pp. 1173-1176.

Srinivas, B. and Joshi, A.K. (1998). Supertagging: An approach to almost parsing. Computational Linguistics, 22:1-29.

Steedman, M. J. (2000). The Syntactic Process. Cambridge, MA: The MIT Press.

XTAG-Group. (1998). A Lexicalized Tree Adjoining Grammar for English. Technical Report IRCS 98-18, University of Pennsylvania.

ABOUT THE REVIEWER
William Corvey is a PhD student in the Department of Linguistics and the Institute of Cognitive Science at the University of Colorado at Boulder. His main research interests are in discourse processing, computational applications of Conversation Analysis, and the construction and use of large-scale lexical resources, particularly VerbNet.
