LINGUIST List 22.4436
Mon Nov 07 2011
Confs: Text/Corpus Ling, Ling & Literature, Comp Ling/Germany
Editor for this issue: Amy Brunett
1. Marco Passarotti, Annotation of Corpora for Research in the Humanities
Message 1: Annotation of Corpora for Research in the Humanities
From: Marco Passarotti <marco.passarottiunicatt.it>
Subject: Annotation of Corpora for Research in the Humanities
Annotation of Corpora for Research in the Humanities
Short Title: ACRH
Date: 05-Jan-2012 - 05-Jan-2012
Location: Heidelberg, Germany
Contact: Marco Passarotti
Meeting URL: http://www.coli.uni-saarland.de/conf/ACRH10/
Linguistic Field(s): Computational Linguistics; Ling & Literature; Text/Corpus Linguistics
The workshop on ‘Annotation of Corpora for Research in the Humanities’ will be held on January 5, 2012 at the University of Heidelberg (Germany) (http://www.coli.uni-saarland.de/conf/ACRH10/).
The workshop aims to build closer collaboration between people working in various areas of the Humanities (such as literature, philology and history) and the research community involved in developing, using and making accessible annotated corpora.
Addressing topics related to annotated corpora for research in the Humanities is an interdisciplinary task, which involves corpus and computational linguists (mostly those working in literary computing), philologists, scholars in the Humanities and computer scientists. However, this interdisciplinarity is not fully realised yet. Philologists and scholars are not used to exploiting NLP tools and language resources such as annotated corpora; in turn, computational linguists tend to develop language resources for NLP purposes only. For instance, although many corpora that are relevant for research in the Humanities are now available in digital format (theatrical plays, contemporary novels, critical literature, literary reviews etc.), only a few of them are linguistically tagged.
Historical corpora are also a case of special interest, since their creation demands a strong interplay between computational linguistics and more traditional scholarship. Over the past few years a number of historical annotated corpora have been started, among which are treebanks for Middle, Early Modern and Old English, Early New High German, Medieval Portuguese, Ugaritic, Latin, Ancient Greek and several translations of the New Testament into Indo-European languages. The experience of this ever-growing group of projects can provide many suggestions on the methodology as well as on the practice of interaction between literary studies, philology and corpus linguistics.
Moreover, we believe that closer collaboration between people working in the Humanities and the research community involved in developing annotated corpora is needed: although annotating a corpus from scratch remains a labor-intensive and time-consuming task, it can now be made easier by drawing heavily on prior experience in the field.
The workshop will be co-located with the Tenth International Workshop on Treebanks and Linguistic Theories (TLT10), which will be held on January 6-7, 2012 (http://tlt10.cl.uni-heidelberg.de).
Invited lecture: Gregory Crane (Tufts University, Boston, USA)
Morphological and Part-of-Speech Tagging of Historical Language Data: A Comparison
Iris Hendrickx and Rita Marquilhas: From Old Texts to Modern Spellings: An Experiment in Automatic Normalisation
Eiríkur Rögnvaldsson, Anton Karl Ingason, Einar Freyr Sigurðsson and Joel C. Wallenberg: Creating a Dual-purpose Treebank
Kristin Bech and Kristine Eide: The Annotation of Morphology, Syntax and Information Structure in a Multilayered Diachronic Corpus
Asif Ekbal, Francesca Bonin, Sriparna Saha, Egon Stemle, Eduard Barbu, Fabio Cavulli, Christian Girardi, Filippo Nardelli and Massimo Poesio: Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation
Maria Sukhareva, Zahurul Islam, Armin Hoenen and Alexander Mehler: A Three-step Model of Language Detection in Multilingual Ancient Texts
Cerstin Mahlow and Britta Juska-Bacher: Exploring New High German Texts for Evidence of Phrasemes
Michael Piotrowski and Stefan Höfler: Building Corpora for the Philological Study of Swiss Legal Texts
Massimo Manca, Linda Spinazzé, Luigi Tessarolo, Paolo Mastandrea and Federico Boschetti: Musisque Deoque: Text Retrieval on Critical Editions
Timo Korkiakangas and Marco Passarotti: Challenges in Annotating Medieval Charters in Latin
Dain Kaplan, Ryu Iida, Kikuko Nishina and Takenobu Tokunaga: Slate -- A Tool for Creating and Maintaining Annotated Corpora
Voula Giouli: Annotating Corpora from Various Sources in the Humanities Domain: Shortcomings and Issues
Page Updated: 07-Nov-2011