LINGUIST List 10.389

Fri Mar 12 1999

Calls: Hispanic Ling., Discourse Tagging

Editor for this issue: Jody Huellmantel

As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text.


  1. Cristina Sanz, Reminder: 3rd Hispanic Linguistics Symposium
  2. Barbara DiEugenio, ACL' 99 workshop: "Towards standards and tools for discourse tagging"

Message 1: Reminder: 3rd Hispanic Linguistics Symposium

Date: Thu, 11 Mar 1999 15:44:28 -0500 (EST)
From: Cristina Sanz
Subject: Reminder: 3rd Hispanic Linguistics Symposium





OCTOBER 8 - 11, 1999

We cordially invite abstracts for either of the above conferences (that is,
ONE abstract per presenter). Please submit FIVE copies of a one-page
anonymous abstract (maximum 500 words plus bibliography/figures) in English
or in any Hispanic language. Include with your submission the following
information on a 4 x 6 index card: name(s), affiliation(s), title of your
paper, mailing address, e-mail address, phone and fax numbers, and the
name of the conference for which you would like to be considered. E-mail
or fax submissions will not be considered.

Presentation time for papers will be limited to 20 minutes plus 10 minutes
for discussion.

Authors will be notified by JUNE 1, 1999.

Submissions should be sent to:

Abstract Committee
1999 Spanish Linguistics Conferences
Department of Spanish & Portuguese
Georgetown University
Washington, DC 20057-1039


Cristina Sanz
Assistant Professor of Catalan & Spanish Applied Linguistics
Director, Intensive Spanish Program
Georgetown University

Message 2: ACL' 99 workshop: "Towards standards and tools for discourse tagging"

Date: Fri, 26 Feb 1999 20:08:54 -0600 (CST)
From: Barbara DiEugenio
Subject: ACL' 99 workshop: "Towards standards and tools for discourse tagging"


 June 22, 1999 
 University of Maryland
 College Park, MD, USA



Discourse tagging assigns labels from a tag set to discourse units in
texts or dialogues. The discourse units range from words and phrases,
such as referring expressions, to multi-utterance units identified by
criteria such as speaker intention or initiative. Just as the 
availability of syntactically annotated corpora has resulted in major
advances in sentence-level natural language processing, we expect that
corpora tagged for discourse features will lead to similar advances in
discourse processing.

Work on discourse tagging has gained momentum in the last 3-4 years.
Three major initiatives in this area are: the Discourse Resource
Initiative, which has organized yearly international workshops
addressing the standardization of discourse tagging schemes for
coreference, for dialogue acts, and for higher level discourse
structures; MATE, a project co-funded by the European Union, whose aim
is to develop tools and standards for tagging spoken dialogue corpora
at different levels, including the discourse level; and the Global
Document Annotation initiative, which aims at having Internet authors
annotate their documents with a common standard tag set that allows
machines to recognize the semantic and pragmatic structures of
documents.

Despite the progress made by these three initiatives, there is still
much work to be done before there are widely accepted (standardized)
discourse tagging schemes suitable for sharing and distribution across
sites and projects. Moreover, there has not yet been an open forum in
which researchers working in this area could participate. This
workshop will provide such a forum.

Submissions are invited on, but not limited to, the following topics
and issues:

1. How can standardization for discourse tagging concretely be
achieved? By developing a single coding scheme, or a set of coding
schemes, one for each phenomenon of interest? Or rather, by developing
some specification guidelines and mappings from one scheme to another?
In some other way?

2. Cross-level coding: All of the initiatives mentioned above promote
an approach in which coding schemes are developed at different levels,
rather than an approach in which a monolithic scheme addresses all
phenomena. Given this methodology, the issue of cross-level coding
arises, namely, how can coding schemes for different levels take
advantage of each other and allow coding of cross-level relationships?
Is it possible to use corpus annotations at different annotation
levels to examine the interdependence of linguistic phenomena?

3. Coding schemes and theories of discourse: Is it possible to develop
coding schemes that faithfully reflect a discourse theory? If yes,
is it desirable? Conversely, can corpora coded for discourse issues
help advance our theoretical understanding of discourse phenomena?

4. Coding schemes and applications: Is it possible to design discourse
coding schemes independently from the applications that the tagged
corpora may be used to inform (e.g., to train a speech act

5. Coding schemes and reliability: Thus far, experience in developing
schemes for discourse phenomena that can be coded reliably has been
mixed. Whatever the reason (e.g., lack of an overarching theory for
discourse, genuine ambiguity and misunderstandings in real dialogue
reflected in the coding, etc.), how can we devise reliable coding
schemes? What reliability measures should be used: are widely used
measures (Kappa, Alpha, precision and recall) and the corresponding
standards appropriate for discourse tagging? If not, what other
measures can we use? Is reliability affected by whether naive or
expert coders are used?

6. Tools for discourse tagging: What specific features of a tool does
discourse tagging require? Can we just extend tools developed for
other purposes, e.g. for syntactic tagging? Do we need to develop new

7. Some paradigms for evaluating dialogue systems take advantage of
the use of tagged corpora: How are discourse tagging and tagging for
evaluation purposes related? Are there some discourse tags that may be
used as evaluation tags or is it advisable to introduce another
dimension of tagging?
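
As a concrete illustration of the reliability question raised in
topic 5, the following is a minimal sketch of one of the measures
named there, Kappa, in its two-coder form (Cohen's kappa). It is
offered only as an example of what such a measure computes, not as
the measure the workshop endorses. The function name, the tag names
and the sample codings are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who tagged the same discourse units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must tag the same non-empty set of units")
    n = len(labels_a)
    # Observed agreement: fraction of units that received identical tags.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each coder's own tag distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one and the same tag throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical dialogue-act tags assigned by two coders to six utterances.
coder_1 = ["statement", "question", "statement", "answer", "question", "statement"]
coder_2 = ["statement", "question", "answer", "answer", "question", "statement"]
kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

Here the coders agree on five of six units (observed agreement 5/6)
and chance agreement is 1/3, so kappa is 0.75. Other variants of the
statistic estimate chance agreement from a pooled tag distribution
instead, and which variant suits discourse tagging is exactly the kind
of question topic 5 leaves open.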

In addition to papers, prospective participants may be asked to do a
small coding exercise before the workshop, in order to test out
various tagging schemes. Prospective participants who have developed
tools are welcome to bring a demo with them.


Authors are requested to submit an electronic version of their
papers. Send your electronic submission to both Marilyn Walker and
Morena Danieli. If electronic submission is impossible, please contact
the organizers to arrange for hardcopy submission (four hardcopies
will be required).
Maximum length is 6 pages including figures and references.

Please conform with the traditional two-column ACL Proceedings
format. Style files can be downloaded from


Paper submission deadline: March 26
Notification of acceptance: April 16
Camera ready papers due: April 30


Marilyn Walker (Contact Person)
ATT Labs - Research
180 Park Ave
Rm. E-103
Florham Park, N.J. 07932, USA

Morena Danieli (Contact Person)
CSELT-Centro Studi E Laboratori Telecomunicazioni
Via Reiss-Romoli, 274
I-10148 Torino, Italia

Johanna D. Moore
University of Edinburgh
Human Communication Research Centre
2, Buccleuch Place
Edinburgh EH8 9LW, UK

Barbara Di Eugenio
Department of Electrical Engineering and Computer Science
Science and Engineering Offices
851 South Morgan Street (M/C 154)
Chicago, Illinois 60607-7053, USA


Jean Carletta - HCRC, University of Edinburgh
Laila Dybkjaer - MIP, Odense University
Julia Hirschberg - AT&T
Diane Litman - AT&T
Masato Ishizaki - JAIST
David Novick - EURISCO
Silvia Quazza - CSELT
Daniel Jurafsky - University of Colorado (pending)
