LINGUIST List 23.5221
Thu Dec 13 2012
Calls: Text/Corpus Linguistics/Spain
Editor for this issue: Alison Zaharee
From: Gisle Andersen <gisle.andersennhh.no>
Subject: Workshop: Compilation and Annotation of Spoken Corpora
Full Title: Workshop: Compilation and Annotation of Spoken Corpora
Date: 22-May-2013 - 22-May-2013
Location: Santiago de Compostela, Spain
Contact Person: Gisle Andersen
Meeting Email: gisle.andersennhh.no
Web Site: http://www.usc.es/en/congresos/icame34/workshops.html
Linguistic Field(s): Text/Corpus Linguistics
Call Deadline: 31-Jan-2013
This workshop provides a meeting ground for scholars involved in the creation of corpora of spoken language or with a more general interest in the representation of spoken data based on audio/video recordings. The workshop addresses the need to harmonise corpus-building methods by developing or utilising internationally recognised standards in corpus linguistics or best practice guidelines for the transcription and annotation of audio/video data. The aim is to facilitate the exchange of experience from large-scale and coordinated corpus-building efforts as well as small-scale and local initiatives. This includes accounts of, on the one hand, the practicalities encountered in corpus compilation, transcription and annotation, and on the other hand, how annotation decisions are grounded in linguistic theory. This will hopefully stimulate a fruitful discussion about whether and how cross-corpora comparison is hampered by a lack of uniformity in annotation schemes and procedures, what solutions corpus builders recommend at different annotation levels, practical experience with existing standards or de facto standards (e.g. COBUILD/NERC, TEI, XCES), and methods for testing and improving inter-annotator agreement.
Call for Papers:
Relevant topics include, but are not restricted to:
- Corpus design (techniques for capturing and linking text and audio/video data; ensuring consistency in transcription; ensuring inter-annotator agreement)
- Orthographic transcription (transcription of non-standard vocabulary, slang, swearing, neologisms; standardised vs. idiosyncratic orthography; standardised representation of pauses, backchannels and hesitation phenomena)
- Annotation of syntactic features (the relevance and reliability of part-of-speech tagging for informal or messy conversational data; syntactic parsing of speech; the ability of parsers and taggers to handle non-standard forms and neologisms)
- Annotation of prosodic, phonetic, or acoustic features (standardised vs. in-house annotation schemes, simple vs. detailed prosodic annotation; the relevance and reliability of phonetic annotation)
- Pragmatic or gestural annotation (standardised/in-house systems for annotation of speech act information, discourse functions, pragmatic markers, quotatives, anaphora and deixis; gestural annotation schemes)
We invite papers that discuss specific corpus initiatives dealing with any of the above topics, or that report on corpus-based case studies which illustrate or problematise the need for methodological harmonisation and standardisation in the field. The workshop will be organised as a series of thematic slots consisting of 15-minute papers followed by joint discussions.
The deadline for abstract submission is 31 January 2013. Abstracts of 300-400 words should be submitted by email to all three convenors: gisle.andersennhh.no, jketinu.com and susan.naceyhihm.no. Notifications of acceptance will be sent out in late February 2013.
Page Updated: 13-Dec-2012