LINGUIST List 10.311

Fri Feb 26 1999

Calls: Computational Linguistics, Formal Grammar

Editor for this issue: Jody Huellmantel

As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text.


  1. Priscilla Rasmussen, ACL'99 Workshop Announcements
  2. Richard Oehrle, Formal Grammar 99: Final Call for Papers

Message 1: ACL'99 Workshop Announcements

Date: Thu, 25 Feb 99 18:29:55 EST
From: Priscilla Rasmussen
Subject: ACL'99 Workshop Announcements

Below, separated by asterisks (*), are FIVE ACL'99 associated Workshop
announcements: 1) Coreference and Its Applications; 2) Joint EMNLP
and Very Large Corpora; 3) Relationship Between Discourse/Dialogue
Structure and Reference; 4) Toward Standards and Tools for Discourse
Tagging; and 5) SIGLEX'99. 


 ACL'99 Workshop on Coreference and Its Applications

 June 22, 1999

 University of Maryland

 College Park,


Coreference is in some sense nature's own hyperlink. It conveys how
individual statements are connected within documents, across documents
and across bodies of human knowledge. Consequently, coreference
resolution algorithms are at the core of Natural Language
Processing. Most of the work done on coreference deals with
a single language and a single text document (usually newswire).

As NLP research matures into "application" phases (as opposed to
theory-development), NLP systems are moving beyond traditional
research sources to document sets which reflect a more natural, 
research-oriented mix. This shift can be seen in both the document 
sets and tasks used in recent HUB, MET, and TDT evaluations. The
new sources consist of documents in several different languages, 
documents with data from noisy sources, and documents containing 
multimedia. For NLP systems to make a successful
transition to these new sources, coreference resolution
systems must work on them as well.

The workshop invites papers regarding the theory, design, and
evaluation of coreference resolution systems that deal with
non-traditional data sources. In particular, we encourage 
submission of papers for the following types of coreference:

 * Cross-document coreference
 * Coreference resolution in languages other than English
 * Coreference resolution on noisy data
 * Coreference resolution on non-text data (example: human speech)
 * Coreference resolution on multimedia data

In addition, the workshop also invites papers on innovative NLP
applications that rely heavily on coreference resolution systems.


Paper submissions should consist of a full paper (5000 words or fewer,
including references). Each submission should include a separate
title page providing the following information: the title, a short
abstract, names and affiliations of all the authors, the full address
of the primary author (or alternate contact person), including phone,
fax, and email.

Papers should be submitted as three hard copies to:

Amit Bagga
General Electric CRD 
Room K1-5C38B
1 Research Circle
Niskayuna, NY 12309. USA

phone: 1-518-387-7077



Paper submission deadline: March 29

Notification of acceptance: April 16

Camera ready papers due: April 30


Amit Bagga (Contact Person) 
General Electric Corporate
Research and Development 
1 Research Circle
Niskayuna, NY 12309. USA
518-387-7077 (voice)
518-387-6845 (fax)

Breck Baldwin 
Institute for Research in Cognitive Science
University of Pennsylvania 
3401 Walnut Street, #400C 
Philadelphia, PA 19104. USA 

Sara J. Shelton 
US Department of Defense 
9800 Savage Road, E24 
Ft Meade, MD 20755. USA 


Amit Bagga - GE CRD 
Breck Baldwin - University of Pennsylvania 
Branimir Boguraev - IBM T.J. Watson Research Center 
Ed Hovy - Information Sciences Institute (USC/ISI) 
Mark T. Maybury - MITRE 
Ruslan Mitkov - University of Wolverhampton 
Sara Shelton - DoD 


 First Call For Papers


 Sponsored by SIGDAT (ACL's Special Interest Group for Linguistic Data
 and Corpus-based Approaches to NLP)

 June 21-22, 1999
 University of Maryland

 In conjunction with
 ACL'99: the 37th Annual Meeting of the Association for Computational
 Linguistics

 This SIGDAT-sponsored joint conference will continue to provide a forum
 for new research in corpus-based and/or empirical methods in NLP. In
 addition to providing a general forum, the theme for this year is

 "Corpus-based and/or Empirical Methods in NLP for Speech, MT, IR, and
 other Applied Systems"

 A large number of systems in automatic speech recognition (ASR) and
 synthesis, machine translation (MT), information retrieval (IR), optical
 character recognition (OCR) and handwriting recognition have become
 commercially available in the last decade. Many of these systems use
 NLP technologies as an important component. Corpus-based and empirical
 methods in NLP have been a major trend in recent years. How useful are
 these techniques when applied to real systems, especially when compared
 to rule-based methods? Are there any new techniques to be developed in 
 EMNLP and from VLC in order to improve the state of the art in ASR, MT,
 IR, OCR, and other applied systems? Are there new ways to combine 
 corpus-based and empirical methods with rule-based systems?

 This two-day conference aims to bring together academic researchers and
 industrial practitioners to discuss the above issues, through technical
 paper sessions, invited talks, and panel discussions. The goal of the
 conference is to raise awareness of what kind of new EMNLP techniques
 need to be developed in order to bring about the next breakthrough in
 speech recognition and synthesis, machine translation, information
 retrieval and other applied systems.

 The conference solicits paper submissions in (but not limited to) the
 following areas:

 1) Original work in one of the following technologies and its relevance
 to speech, MT, or IR:
 (a) word sense disambiguation
 (b) word and term segmentation and extraction
 (c) alignment
 (d) bilingual lexicon extraction
 (e) POS tagging
 (f) statistical parsing
 (g) others (please specify)

 2) Proposals of new EMNLP technologies for speech, MT, IR, OCR, or other
 applied systems (please specify)

 3) Comparative evaluation of the performance of EMNLP technologies in
 one of the areas in (1) and that of its rule-based or knowledge-based
 counterpart in a speech, MT, IR, OCR, or other applied system

 Submission Requirements

 Submissions should be limited to original, evaluated work. All papers
 should include a background survey and/or reference to previous work. The
 authors should provide explicit explanation when there is no evaluation
 in their work. We encourage paper submissions related to the conference
 theme. In particular, we encourage the authors to include in their
 papers, proposals and discussions of the relevance of their work to the
 theme. However, there will be a special session in the conference to
 include corpus-based and/or empirical work in all areas of natural 
 language processing.

 Important Dates

 March 31 Submission of full-length paper
 April 30 Acceptance notice
 May 20 Camera-ready paper due
 June 21-22 Conference date

 Program Chair

 Pascale Fung
 Human Language Technology Center
 Department of Electrical and Electronic Engineering
 Hong Kong University of Science and Technology (HKUST)
 Clear Water Bay, Kowloon
 Hong Kong
 Tel: (+852) 2358 8537
 Fax: (+852) 2358 1485

 Program Co-Chair
 Joe Zhou
 LEXIS-NEXIS, a Division of Reed Elsevier
 9555 Springboro Pike
 Dayton, OH 45342



 ACL'99 Workshop on the Relationship Between
 Discourse/Dialogue Structure and Reference
 June 21 1999
 University of Maryland


The relationship between the structure of discourse and dialogue and
the use of referring expressions has been the focus of much research
in linguistics, computational linguistics, and psycholinguistics.
Although individual efforts have been couched in a variety of
frameworks ranging from (S)DRT and RST to Centering, they all share two
underlying assumptions:

 1. The structure of discourse affects the interpretation of
 referring expressions and the space of anaphoric accessibility.
 2. The use of referring expressions restricts the set of possible
 discourse interpretations.

However, most approaches address only one of these two views on the
relation between structure and reference. And although several
theories explaining this relationship exist, few have made a significant 
impact on practical applications such as discourse parsing, summarization,
generation, and named-entity recognition.

This workshop will provide a forum for researchers in all areas of
linguistics, psycholinguistics, and computational linguistics who are
interested in advancing the state of the art in understanding the
relationship between discourse/dialogue structure and reference.
Submissions are invited on, but not limited to, the following topics
and issues:

 1. Linguistic issues:
      + what is the relation between lexico-grammatical
        constructs, referring expressions, and the structure of
        discourse/dialogue?
 2. Psycholinguistic issues:
      + how does the use of referents affect the human
        interpretation of discourse/dialogue?
 3. Corpus-specific issues:
      + what coding schemata and annotation tools should one
        use in order to encode the relation between
        discourse/dialogue structure and reference?
 4. Representation issues:
      + how should discourse/dialogue structures and referents
        be represented?
      + how should one represent the relationship between them:
        as preferences; or as constraints?
 5. Algorithmic issues:
      + how can discourse/dialogue structures, referents, and
        co-referential links be identified and computed?
      + knowledge-intensive vs. shallow approaches
      + rule-driven vs. statistical vs. corpus-based approaches
      + Wordnet-based approaches
      + how do discourse/dialogue structure and referential
        expressions interact in natural language generation?
 6. General issues:
      + what are the commonalities of current approaches to
        studying the relation between discourse/dialogue and
        reference?
      + what are the differences?
      + what are the arguments against a relation between
        discourse/dialogue structure and reference?
      + how language-dependent is the relation between
        discourse/dialogue structure and reference?

 Post-Workshop Dissemination:

 Selected papers from the workshop will be compiled into a volume
 tentatively scheduled to appear in the Text, Speech, and Language
 Technology book series from Kluwer Academic Press.

 Submission Procedure:

 * Authors are requested to submit one electronic version of their
 papers OR four hardcopies. Please submit hardcopies only if
 electronic submission is impossible.
 * Maximum length is 8 pages including figures and references.
 * Please conform with the traditional two-column ACL Proceedings
 format. Style files can be downloaded from or from

 Submissions should be sent to:

 Nancy Ide
 Department of Computer Science
 Vassar College
 124 Raymond Avenue
 Poughkeepsie, New York 12604-0520 USA
 Fax: (+1 914) 437 7498


 Deadline for submissions: March 26, 1999.
 Notification of acceptance: To Be Announced.
 Camera ready copies due: To Be Announced.

 Organizing committee:

 * Dan Cristea - University "A.I. Cuza" of Iasi, Romania.
 * Nancy Ide - Vassar College, USA.
 * Daniel Marcu - Information Sciences Institute/University of
 Southern California, USA.

 Program Committee:

 * Nicholas Asher (University of Texas)
 * Eugene Charniak (Brown University)
 * Udo Hahn (Freiburg University)
 * Lynette Hirschman (MITRE Corp.)
 * Graeme Hirst (University of Toronto)
 * Massimo Poesio (University of Edinburgh)
 * Ehud Reiter (University of Aberdeen)
 * Michael Strube (University of Pennsylvania)
 * Wietske Vonk (Max Planck Institute)
 * Marilyn Walker (AT&T)

 Related Events

 * ACL'99
 * ACL'99 SIGDIAL Business Meeting
 * ACL'99 Workshop on Tagging
 * ACL'99 Workshop on Coreference and Its Applications
 * EuroLAN'99 Summer School


 TITLE: Towards Standards and Tools for Discourse Tagging


 Discourse tagging assigns labels from a tag set to discourse units in
 texts or dialogues. The discourse units range from words or referring
 expressions to multi-utterance units identified by criteria such as
 speaker intention or initiative. Since the emergence of syntactically
 annotated corpora has resulted in major advances in sentence-level
 natural language processing, the hope is that corpora of tagged
 discourse may lead to similar advances in the area of discourse

 Work on discourse tagging has gained momentum in the last 3-4 years.
 Three major initiatives in this area are: the Discourse Resource
 Initiative, which has organized yearly international workshops
 addressing the standardization of discourse tagging schemes for
 coreference, for dialogue acts, and for higher level discourse
 structures; MATE, a project co-funded by the European Union, whose
 aim is to develop tools and standards for tagging spoken dialogue
 corpora at different levels, including the discourse level; and the
 Global Document Annotation initiative, which aims at having
 Internet authors annotate their documents with a common standard
 tag set that allows machines to recognize the semantic and pragmatic
 structures of documents.

 Even with these three initiatives in place, there is still much work to
 be done before there are widely accepted (standardized) tagging
 schemes for various discourse phenomena that could be shared across
 sites; moreover, there has not yet been an open forum in which
 researchers working in this area could participate and
 contribute. This workshop will provide such a forum.

 Submissions are invited on, but not limited to, the following topics
 and issues:

 1. How can standardization for discourse tagging concretely be achieved?
 by developing a single coding scheme, or more likely, a set of coding
 schemes, one for each phenomenon of interest? or rather, by developing
 some specification guidelines and a way of mapping from one scheme to
 another? in some other way?

 2. Cross-level coding: all the initiatives mentioned above promote an
 approach in which coding schemes are developed at different levels,
 rather than an approach in which a monolithic scheme addresses all
 phenomena. Given this methodology, the issue of cross-level coding
 arises, namely, how can coding schemes for different levels
 take advantage of each other and allow coding of cross-level
 relationships? is it possible to relate corpus annotations at
 different annotation levels to examine the interdependence of
 linguistic phenomena?

 3. Coding schemes and theories of discourse: is it possible to develop
 coding schemes that faithfully reflect a discourse theory? if yes,
 is it desirable? conversely, can corpora coded for discourse issues
 help advance our theoretical understanding of discourse phenomena?

 4. Coding schemes and applications: is it possible to design
 discourse coding schemes independently from the applications tagged
 corpora are supposed to be used for (eg, to train a speech act

 5. Coding schemes and reliability: discourse categories are difficult
 to code for reliably. Whatever the reason (e.g., lack of an overarching
 theory for discourse, or genuine ambiguity and misunderstandings in real
 dialogue reflected in the coding), how can we devise reliable
 coding schemes? What reliability measures should be used: are
 widely used measures (Kappa, Alpha, precision and
 recall) appropriate in this case? If not, what other measures can
 we use? Is reliability affected by whether naive or expert coders
 are used?

 6. Tools for discourse tagging: what specific features of a tool
 does discourse tagging require? can we just extend tools developed
 e.g., for syntactic tagging? do we need to develop new tools?

 7. Some paradigms for evaluating dialogue systems take advantage of
 the use of tagged corpora: how are tagging for evaluation purposes and
 discourse tagging related? Are there some discourse tags
 that may be used as evaluation tags or is it advisable to introduce
 another dimension of tagging?

 In addition to papers, prospective participants may be asked to complete
 a small homework assignment before the workshop to test out various tagging
 schemes. Prospective participants who have developed tools are welcome
 to bring a demo with them.

 Submission Procedure:

 Authors are requested to submit one electronic version of
 their papers OR four hardcopies. Please submit
 hardcopies only if electronic submission is impossible.
 Send your electronic submission to both Marilyn Walker
 and Morena Danieli.

 If electronic submission is impossible, please contact the organizers
 to arrange for hardcopy submission.

 Maximum length is 6 pages including figures and references.

 Please conform with the traditional two-column ACL Proceedings
 format. Style files can be downloaded from


 Deadline for submissions: March 20, 1999.
 Notification of acceptance: April 16, 1999.
 Camera ready copies due: April 30, 1999

 WORKSHOP CHAIRS: Marilyn Walker, Morena Danieli, Johanna D. Moore, 
 Barbara Di Eugenio.


                        Standardizing Lexical Resources
                             June 21-22, 1999
                          University of Maryland


As our national interests become increasingly global, timely access to
information becomes more and more necessary. Many promising
strategies for information provision rely heavily on lexical
resources, including ontologies. Our next major challenge is
providing a standardized lexical resource: an inventory of word
meanings, or senses, associated with criteria for distinguishing them.
Currently there are several different on-line lexical resources that
are being used for English: WordNet, Longman's, the Oxford English
Dictionary (OED), CIDE from Cambridge University Press (CUP),
Collins, and Webster's, to name just a few, and they each use very
different approaches to making sense distinctions. Various
computational lexicons and related resources such as ontologies are
under development, including the European PAROLE/SIMPLE lexicons, the
Generative Lexicon, the SENSUS ontology, Mikrokosmos, WordNet,
Framenet, and the theory of Lexical Conceptual Structures. Each takes
a very different approach and makes reference to different underlying
theories of semantics. This divergence of resources has motivated the
efforts of the EAGLES Lexical Semantics Group, which is defining a
common format for lexical semantic representation for 12 languages.

In a recent evaluation of word sense disambiguation systems,
SIGLEX98-SENSEVAL (also supported by Euralex, Elsenet, ECRAN and
SPARKLE), the
training data and test data were prepared using a set of Oxford
University Press (OUP) senses. This made it difficult to evaluate the
performance of pre-existing systems that had been built using other
lexical resources. A mapping was made from the OUP senses to WordNet
senses, so that WordNet systems could be included, but this was
somewhat problematic as there were far fewer WordNet senses, and
frequently no direct mapping was possible. As with most dictionaries,
OUP and WordNet often make different decisions about how to structure
entries for the same words; these decisions are all equally valid, but
simply not compatible. Therefore, it becomes especially difficult to include
pre-existing systems in the evaluation that rely on a pre-existing
lexical resource other than the one used as the Gold Standard. The
question that arises here is the likelihood of making
performance-preserving mappings between lexical resources. Is it even
possible to treat one lexical resource as a standard that other
resources can be mapped to? (This is true even when focusing on just
one language - the
problem simply becomes more explosive when additional languages become
involved.) All of the participants in SIGLEX98-SENSEVAL agreed that
they would prefer evaluations based on running text rather than corpus
instances, but this is only feasible if the Gold Standard sense
inventory being used for tagging can be appropriately mapped onto
several different lexical resources.

The purpose of SIGLEX99 is to directly address the issue of
standardization of lexical resources, and performance-preserving
mappings between existing resources. As a spin-off from SENSEVAL, we
are investigating mapping the OUP SENSEVAL senses onto other lexical
resources. We will also be tagging running text with these senses,
and other senses, and will circulate this ahead of time to workshop
participants. There will be several working sessions focussed around
the mappings between lexical resources and the tagged samples.

Languages other than English will also be considered, in connection
with ROMANSEVAL, the subset of SENSEVAL for Romance languages (but
with no restriction to that language family). We will study the
relevance of EuroWordNet (EWN) sense distinctions for WSD systems, and
the applicability of the Interlingua Language Index (ILI) created
within EWN for cross-language sense-standardization. An issue of
particular interest is the mapping of existing resources to the ILI,
which could be an important step towards the development of a
standardized multilingual lexicon for WSD. Such a multilingual gold
standard could in turn be used to semantically tag parallel texts and
thus create standardized corpora useful for many multilingual
applications. There will also be a session to discuss the future of
American involvement in EAGLES, and how the workshop results and
conclusions can be incorporated.

We will have invited talks on ontologies and lexical resources, and we
welcome submissions on any areas in lexical semantics and
computational lexical semantics, but particularly on the acquisition
and use of lexical resources and ontologies and on word sense
disambiguation. There will be workshop proceedings, and as we have
done with our last two workshops, we will encourage participants to make
electronic versions of their papers available on the web prior to the
workshop. Likely invited speakers include Patrick Hanks (Oxford
University Press), Chuck Fillmore (Berkeley), and someone speaking on
WordNet or EuroWordNet and on SIMPLE (the European project for
building harmonized semantic lexicons for 12 European languages).

The schedule for paper submissions (ACL format, 6 pages):

SUBMISSION DEADLINE:                         March 29, 1999

CAMERA READY COPIES (and copyrights) DUE:    May 28, 1999

Please send submissions, hard copy or electronic (.ps or .doc), to:

Martha Palmer
Institute for Research in Cognitive Science
400A, 3401 Walnut Street/6228
University of Pennsylvania
Philadelphia, PA 19104
Telephone: (215) 898-0361
FAX No.: (215) 573-9247

Program Committee:

Nicoletta Calzolari, Istituto di Linguistica Computazionale, Pisa 
Bonnie Dorr, University of Maryland 
Chuck Fillmore, University of California, Berkeley 
Ralph Grishman, New York University
Patrick Hanks, Oxford University Press 
Eduard Hovy, USC Information Sciences Institute 
Nancy Ide, Vassar College 
Adam Kilgarriff, ITRI, University of Brighton 
Marc Light, MITRE Corporation 
Martha Palmer, University of Pennsylvania, CHAIR
James Pustejovsky, Brandeis University
Philip Resnik, University of Maryland 
Patrick St Dizier, IRIT-CNRS, Université Paul Sabatier
Antonio Sanfilippo, European Commission, DG XIII 
Frederique Segond, Xerox Research Centre, Grenoble 
Jean Véronis, Université de Provence
Evelyne Viegas, New Mexico State University 
Piek Vossen, University of Amsterdam 
Yorick Wilks, University of Sheffield
David Yarowsky, Johns Hopkins University
Antonio Zampolli, Istituto di Linguistica Computazionale, Pisa


Message 2: Formal Grammar 99: Final Call for Papers

Date: Wed, 24 Feb 1999 17:34:47 -0500 (EST)
From: Richard Oehrle
Subject: Formal Grammar 99: Final Call for Papers

 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 


 		 August 7-8, 1999,
		 Utrecht, The Netherlands


FG99 is the 5th conference on Formal Grammar held in conjunction with
the European Summer School in Logic, Language and Information, which
takes place in 1999 in Utrecht. Previous meetings were held in
Barcelona (1995), Prague (1996), Aix-en-Provence (1997), and as part
of the Joint Conference on Formal Grammar, Head-Driven Phrase
Structure Grammar, and Categorial Grammar (FHCG98) held in
Saarbruecken last August.


FG99 provides a forum for the presentation of new and original
research on formal grammar, especially with regard to the application
of formal methods to natural language analysis.

Themes of interest include, but are not limited to,

* formal and computational syntax, semantics, pragmatics, and phonology; 
* model-theoretic and proof-theoretic methods in linguistics; 
* constraint-based and resource-sensitive approaches to grammar;
* foundational, methodological and architectural issues in grammar.

Previous conferences in this series have welcomed papers from
a wide variety of frameworks.


	Grammatical Resources and Grammatical Inference
		 David Dowty (Ohio State)
		 Polly Jacobson (Brown)
		 Gerhard Jaeger (Berlin)
		 Reinhard Muskens (Tilburg)
		 Mark Steedman (Edinburgh)
	commentator: Johan van Benthem (Amsterdam) [tentative]


We invite E-MAIL submissions of abstracts for 30-minute papers (including
questions, comments, and discussion).

A submission should consist of two parts: 

- an information sheet (in ascii), containing the name of the author(s), 
 affiliation(s), e-mail and postal address(es) and a title; 

- an abstract, consisting of a description of not more than 5 pages 
 (including figures and references). Abstracts may be either in plain 
 ASCII or in (unix-compatible encoded) postscript, PDF, or DVI. 

Abstracts can be sent to Geert-Jan M. Kruijff.


March 1, 1999


April 30, 1999


A full version of each accepted paper will be included in the conference 
proceedings, to be distributed at the conference. Full papers are due
June 30, 1999.


Anne Abeillé     (Paris)         Gosse Bouma      (Groningen)
John Coleman     (Oxford)        Mary Dalrymple   (Xerox Parc)
David Dowty      (Ohio State)    Elisabet Engdahl (Göteborg)
Daniele Godard   (Lille)         Jack Hoeksema    (Groningen)
Polly Jacobson   (Brown)         Mark Johnson     (Brown)
Ruth Kempson     (London)        Shalom Lappin    (London)
Anton Nijholt    (Twente)        Owen Rambow      (Cogentex)
Mark Steedman    (Edinburgh)


Web site for ESSLLI XI:

Web site for FG99 :

The organizers:

Geert-Jan Kruijff
Glyn Morrill
Paola Monachesi
Dick Oehrle