LINGUIST List 22.4436

Mon Nov 07 2011

Confs: Text/Corpus Ling, Ling & Literature, Comp Ling/Germany

Editor for this issue: Amy Brunett <brunettlinguistlist.org>

1. Marco Passarotti, Annotation of Corpora for Research in the Humanities

Message 1: Annotation of Corpora for Research in the Humanities
Date: 07-Nov-2011
From: Marco Passarotti <marco.passarottiunicatt.it>
Subject: Annotation of Corpora for Research in the Humanities

Annotation of Corpora for Research in the Humanities
Short Title: ACRH

Date: 05-Jan-2012 - 05-Jan-2012
Location: Heidelberg, Germany
Contact: Marco Passarotti
Meeting URL: http://www.coli.uni-saarland.de/conf/ACRH10/

Linguistic Field(s): Computational Linguistics; Ling & Literature; Text/Corpus Linguistics

Meeting Description:

The workshop on ‘Annotation of Corpora for Research in the Humanities’ will be held on January 5, 2012 at the University of Heidelberg (Germany) (http://www.coli.uni-saarland.de/conf/ACRH10/).

The workshop aims to build closer collaboration between people working in various areas of the Humanities (such as literature, philology and history) and the research community involved in developing, using and making accessible annotated corpora.

Addressing topics related to annotated corpora for research in the Humanities is an interdisciplinary task that involves corpus and computational linguists (mostly those working in literary computing), philologists, scholars in the Humanities and computer scientists. However, this interdisciplinarity is not yet fully realised. Philologists and scholars are not used to exploiting NLP tools and language resources such as annotated corpora; in turn, computational linguists tend to develop language resources for NLP purposes only. For instance, although many corpora that play a relevant role for research in the Humanities are available today in digital format (theatrical plays, contemporary novels, critical literature, literary reviews, etc.), only a few of them are linguistically tagged.

Historical corpora are also a case of special interest, since their creation demands a strong interplay between computational linguistics and more traditional scholarship. Over the past few years a number of historical annotated corpora have been started, among which are treebanks for Middle, Early Modern and Old English, Early New High German, Medieval Portuguese, Ugaritic, Latin, Ancient Greek and several translations of the New Testament into Indo-European languages. The experience of this ever-growing group of projects can provide many suggestions on the methodology as well as on the practice of interaction between literary studies, philology and corpus linguistics.

Moreover, we believe that a tighter collaboration between people working in the Humanities and the research community involved in developing annotated corpora is needed: annotating a corpus from scratch remains a labor-intensive and time-consuming task, but it can be made considerably easier by drawing on prior experience in the field.

The workshop will be co-located with the Tenth International Workshop on Treebanks and Linguistic Theories (TLT10), which will be held on January 6-7, 2012 (http://tlt10.cl.uni-heidelberg.de).

Invited Speaker:

Gregory Crane (Tufts University, Boston, USA)

Workshop Program:

9:15-9:30

Invited lecture: Greg Crane

Coffee break

Stefanie Dipper: Morphological and Part-of-Speech Tagging of Historical Language Data: A Comparison

Iris Hendrickx and Rita Marquilhas: From Old Texts to Modern Spellings: An Experiment in Automatic Normalisation

Eiríkur Rögnvaldsson, Anton Karl Ingason, Einar Freyr Sigurðsson and Joel C. Wallenberg: Creating a Dual-purpose Treebank


Kristin Bech and Kristine Eide: The Annotation of Morphology, Syntax and Information Structure in a Multilayered Diachronic Corpus

Asif Ekbal, Francesca Bonin, Sriparna Saha, Egon Stemle, Eduard Barbu, Fabio Cavulli, Christian Girardi, Filippo Nardelli and Massimo Poesio: Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation

Maria Sukhareva, Zahurul Islam, Armin Hoenen and Alexander Mehler: A Three-step Model of Language Detection in Multilingual Ancient Texts

Coffee break

Cerstin Mahlow and Britta Juska-Bacher: Exploring New High German Texts for Evidence of Phrasemes

Michael Piotrowski and Stefan Höfler: Building Corpora for the Philological Study of Swiss Legal Texts

Poster session
Massimo Manca, Linda Spinazzé, Luigi Tessarolo, Paolo Mastandrea and Federico Boschetti: Musisque Deoque: Text Retrieval on Critical Editions
Timo Korkiakangas and Marco Passarotti: Challenges in Annotating Medieval Charters in Latin
Dain Kaplan, Ryu Iida, Kikuko Nishina and Takenobu Tokunaga: Slate -- A Tool for Creating and Maintaining Annotated Corpora
Voula Giouli: Annotating Corpora from Various Sources in the Humanities Domain: Shortcomings and Issues



Page Updated: 07-Nov-2011
