LINGUIST List 17.2660

Mon Sep 18 2006

FYI: Call for Collaboration: Latin Treebank

Editor for this issue: Hunter Lockwood <hunter@linguistlist.org>


To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.html.
Directory
        1.    David Bamman, Call for Collaboration: Latin Treebank


Message 1: Call for Collaboration: Latin Treebank
Date: 18-Sep-2006
From: David Bamman <David.Bamman@tufts.edu>
Subject: Call for Collaboration: Latin Treebank


Call for Collaboration: Latin Treebank

The Perseus Project has recently received a planning grant from the NSF to
investigate the costs and labor involved in constructing a
multimillion-word Latin treebank (a large collection of syntactically
parsed sentences), along with its potential value for the linguistics and
Classics community. While our initial efforts under this grant will focus
on syntactically annotating excerpts from Golden Age authors (Caesar,
Cicero, Vergil) and the Vulgate, a future multimillion-word corpus would
comprise writings from the pre-Classical period up through the Early
Modern era. To date we've annotated a total of 12,000 words in a style
that's predominantly informed by two sources: the dependency grammar used
by the Prague Dependency Treebank (itself based on Mel'čuk 1988), and the
Latin grammar of Pinkster 1990.

While treebanks provide valuable training data for computational tasks such
as grammar induction and automatic syntactic parsing, they can also serve
traditional research areas. Large
collections of syntactically parsed sentences have the potential to
revolutionize lexicography and philology, as they provide the immediate
context for a word's use along with its typical syntactic arguments (this
lets us chart, for example, how the meaning of a verb changes as its
predominant arguments change). Treebanks enable large-scale research into
structurally based rhetorical devices of particular interest to
Classicists (such as hyperbaton), and they provide the raw data for research
in historical linguistics (such as the move in Latin from classical SOV
word order to Romance SVO).
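
As a rough illustration of the two kinds of query described above, the
following Python sketch counts subject/object/verb orders and verb-object
pairings over dependency-annotated sentences. Everything in it is an
assumption made for illustration: the Token fields, the function names, the
three toy sentences (invented here, not drawn from the annotated corpus), and
the use of the Prague-style labels Pred, Sb and Obj, which may differ from
the labels the Latin treebank finally adopts.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int    # 1-based position in the sentence
    form: str   # surface form
    lemma: str  # dictionary form
    head: int   # idx of the governing token, 0 for the root
    rel: str    # dependency label (assumed Prague-style: Pred, Sb, Obj)


def clause_orders(sentence):
    """Yield (verb lemma, order string) for each token governing both Sb and Obj."""
    for verb in sentence:
        deps = {t.rel: t for t in sentence if t.head == verb.idx}
        if "Sb" in deps and "Obj" in deps:
            parts = [("S", deps["Sb"].idx), ("O", deps["Obj"].idx), ("V", verb.idx)]
            # Sort the three constituents by linear position to get e.g. "SOV".
            order = "".join(label for label, _ in sorted(parts, key=lambda p: p[1]))
            yield verb.lemma, order


def order_counts(sentences):
    """Count how often each constituent order occurs across all sentences."""
    return Counter(order for s in sentences for _, order in clause_orders(s))


def argument_profile(sentences):
    """Count (verb lemma, object lemma) pairs, a crude view of a verb's arguments."""
    profile = Counter()
    for s in sentences:
        by_idx = {t.idx: t for t in s}
        for t in s:
            if t.rel == "Obj" and t.head in by_idx:
                profile[(by_idx[t.head].lemma, t.lemma)] += 1
    return profile


# Toy data, invented for this sketch.
TOY = [
    [Token(1, "puer", "puer", 3, "Sb"),
     Token(2, "puellam", "puella", 3, "Obj"),
     Token(3, "amat", "amo", 0, "Pred")],
    [Token(1, "puer", "puer", 2, "Sb"),
     Token(2, "amat", "amo", 0, "Pred"),
     Token(3, "puellam", "puella", 2, "Obj")],
    [Token(1, "Caesar", "Caesar", 3, "Sb"),
     Token(2, "legatos", "legatus", 3, "Obj"),
     Token(3, "mittit", "mitto", 0, "Pred")],
]

if __name__ == "__main__":
    print(order_counts(TOY))
    print(argument_profile(TOY))
```

Run over texts grouped by period, counts of this kind are what would let one
trace a shift from SOV to SVO, or a change in the objects a verb typically
takes; a real study would also need to handle coordination, omitted subjects,
and clauses with more than one object.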

The eventual Latin treebank will be openly available to the public; we
should, therefore, come to a consensus on how it should be built. To that
end we encourage input from the linguistics and Classics community on the
treebank design (including the syntactic representation of Latin) and
welcome contributions by annotators (for which limited funding is
available). Interested collaborators should contact David Bamman
(David.Bamman@tufts.edu) at the Perseus Project.

Linguistic Field(s): Historical Linguistics; Syntax; Text/Corpus Linguistics
