LINGUIST List 17.2660
Mon Sep 18 2006
FYI: Call for Collaboration: Latin Treebank
Editor for this issue: Hunter Lockwood
<hunter linguistlist.org>
Directory
1. David Bamman, Call for Collaboration: Latin Treebank
Message 1: Call for Collaboration: Latin Treebank
Date: 18-Sep-2006
From: David Bamman <David.Bamman tufts.edu>
Subject: Call for Collaboration: Latin Treebank
The Perseus Project has recently received a planning grant from the NSF to investigate the costs and labor involved in constructing a multimillion-word Latin treebank (a large collection of syntactically parsed sentences), along with its potential value for the linguistics and Classics community. While our initial efforts under this grant will focus on syntactically annotating excerpts from Golden Age authors (Caesar, Cicero, Vergil) and the Vulgate, a future multimillion-word corpus would comprise writings from the pre-Classical period up through the Early Modern era. To date we've annotated a total of 12,000 words in a style that's predominantly informed by two sources: the dependency grammar used by the Prague Dependency Treebank (itself based on Mel'cuk 1988) and the Latin grammar of Pinkster 1990.
While treebanks provide valuable training data for computational tasks such as grammar induction and automatic syntactic parsing, they also have the potential to be used in traditional research areas. Large collections of syntactically parsed sentences have the potential to revolutionize lexicography and philology, as they provide the immediate context for a word's use along with its typical syntactic arguments (this lets us chart, for example, how the meaning of a verb changes as its predominant arguments change). Treebanks enable large-scale research into structurally based rhetorical devices of particular interest to Classicists (such as hyperbaton), and they provide the raw data for research in historical linguistics (such as the move from classical Latin SOV word order to Romance SVO).
The eventual Latin treebank will be openly available to the public; we should, therefore, come to a consensus on how it should be built. To that end, we encourage input from the linguistics and Classics community on the treebank design (including the syntactic representation of Latin) and welcome contributions by annotators (for which limited funding is available). Interested collaborators should contact David Bamman (David.Bamman tufts.edu) at the Perseus Project.
Linguistic Field(s): Historical Linguistics; Syntax; Text/Corpus Linguistics