Publishing Partner: Cambridge University Press CUP Extra Publisher Login
amazon logo
More Info


New from Oxford University Press!

ad

Oxford Handbook of Corpus Phonology

Edited by Jacques Durand, Ulrike Gut, and Gjert Kristoffersen

Offers the first detailed examination of corpus phonology and serves as a practical guide for researchers interested in compiling or using phonological corpora


New from Cambridge University Press!

ad

The Languages of the Jews: A Sociolinguistic History

By Bernard Spolsky

A vivid commentary on Jewish survival and Jewish speech communities that will be enjoyed by the general reader, and is essential reading for students and researchers interested in the study of Middle Eastern languages, Jewish studies, and sociolinguistics.


New from Brill!

ad

Indo-European Linguistics

New Open Access journal on Indo-European Linguistics is now available!


Academic Paper


Title: Structure-guided supertagger learning
Author: Yao-Zhong Zhang
Institution: University of Tokyo
Author: Takuya Matsuzaki
Institution: University of Tokyo
Author: Jun-ichi Tsujii
Institution: Microsoft Research Asia
Linguistic Field: Computational Linguistics
Abstract: As described in this paper, we specifically examine the structural learning problem of a supertagging task. Supertagging is a task to assign the most probable lexical entry to each word in a sentence. A supertagger is extremely important for a lexicalized grammar parser because an accurate supertagger can greatly reduce lexical ambiguity in downstream parser. Supertagging is more challenging than conventional sequence labeling tasks (e.g., part-of-speech tagging). First, the supertags are numerous. Supertags are the lexical entries defined in a lexicalized grammar, which consists of rich syntactic/semantic information. Second, the inter-supertag relation is more complex. A proper supertag assignment is expected to be compatible with other supertag assignments in a sentence to construct a parse tree. Commonly used adjacent label features (e.g., first-order edge feature) in a sequence labeling model are too rough for the supertagging task. Long-range information is extremely important for the supertagging task. Two approaches to consider long-range information in a supertagger's training stage are proposed. Specifically, we propose a dependency-informed supertagger to use word-to-word dependency derived from a dependency parser and generate long-range features as soft constraints in the training. In the forest-guided supertagger, we constrain the classifier to learn in a grammar-satisfying space and use a CFG filter to impose grammar constraints for the update of model parameters. The experiments show that the proposed structure-guided supertaggers perform significantly better than the baseline supertaggers. Based on the improved supertaggers, the F-score of the final parser is also improved. Using the forest-guided supertagger in a shift-reduce HPSG parser, we achieved a competitive parsing performance of 89.31% F-score with higher parsing speed than that of a state-of-the-art HPSG parser.

CUP at LINGUIST

This article appears in Natural Language Engineering Vol. 18, Issue 2, which you can read on Cambridge's site .



Back
Add a new paper
Return to Academic Papers main page
Return to Directory of Linguists main page