Editor for this issue: Jody Huellmantel <jody@linguistlist.org>
Below, separated by asterisks (*), are FIVE ACL'99 associated Workshop announcements: 1) Coreference and Its Applications; 2) Joint EMNLP and Very Large Corpora; 3) Relationship Between Discourse/Dialogue Structure and Reference; 4) Toward Standards and Tools for Discourse Tagging; and 5) SIGLEX'99.

*********************************************************************

ACL'99 Workshop
COREFERENCE AND ITS APPLICATIONS
June 22, 1999
University of Maryland
College Park, MD. USA
http://www.cs.duke.edu/~amit/acl99-wkshp.html

WORKSHOP DESCRIPTION

Coreference is in some sense nature's own hyperlink. It conveys how individual statements are connected within documents, across documents and across bodies of human knowledge. Consequently coreference resolution algorithms are at the core of Natural Language Processing.

Most of the work done on coreference deals with a single language and a single text document (usually newswire). As NLP research matures into "application" phases (as opposed to theory-development), NLP systems are moving beyond traditional research sources to document sets which reflect a more natural, research-oriented mix. This shift can be seen in both the document sets and tasks used in recent HUB, MET, and TDT evaluations. The new sources consist of documents in several different languages, documents with data from noisy sources, and documents containing multimedia.

In order for NLP systems to make a successful transition to these new sources, it is critical for coreference resolution systems to also work on these new sources. The workshop invites papers regarding the theory, design, and evaluation of coreference resolution systems that deal with non-traditional data sources.
In particular, we encourage submission of papers for the following types of coreference:

*-Cross-document coreference
*-Coreference resolution in languages other than English
*-Coreference resolution on noisy data
*-Coreference resolution on non-text data (example: human speech)
*-Coreference resolution on multimedia data

In addition, the workshop also invites papers on innovative NLP applications that rely heavily on coreference resolution systems.

FORMAT FOR SUBMISSION

Paper submissions should consist of a full paper (5000 words or less, including references). Each submission should include a separate title page providing the following information: the title, a short abstract, names and affiliations of all the authors, the full address of the primary author (or alternate contact person), including phone, fax, and email. Papers may be submitted by sending three hard copies to:

Amit Bagga
General Electric CRD
Room K1-5C38B
1 Research Circle
Niskayuna, NY 12309. USA
phone: 1-518-387-7077
email: bagga@crd.ge.com

IMPORTANT DATES

Paper submission deadline: March 29
Notification of acceptance: April 16
Camera ready papers due: April 30

ORGANIZATION COMMITTEE

Co-Chairs:

Amit Bagga (Contact Person)
General Electric Corporate Research and Development
K1-5C38B
1 Research Circle
Niskayuna, NY 12309. USA
bagga@crd.ge.com
518-387-7077 (voice)
518-387-6845 (fax)

Breck Baldwin
Institute for Research in Cognitive Science
University of Pennsylvania
3401 Walnut Street, #400C
Philadelphia, PA 19104. USA
breck@linc.cis.upenn.edu

Sara J. Shelton
US Department of Defense
9800 Savage Road, E24
Ft Meade, MD 20755. USA

PROGRAM COMMITTEE

Amit Bagga - GE CRD
Breck Baldwin - University of Pennsylvania
Branimir Boguraev - IBM T.J. Watson Research Center
Ed Hovy - Information Sciences Institute (USC/ISI)
Mark T. Maybury - MITRE
Ruslan Mitkov - University of Wolverhampton
Sara Shelton - DoD

**********************************************************************

First Call For Papers (EMNLP/VLC-99)

JOINT SIGDAT CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND VERY LARGE CORPORA

Sponsored by SIGDAT (ACL's Special Interest Group for Linguistic Data and Corpus-based Approaches to NLP)

June 21-22, 1999
University of Maryland

In conjunction with ACL'99: the 37th Annual Meeting of the Association for Computational Linguistics

This SIGDAT-sponsored joint conference will continue to provide a forum for new research in corpus-based and/or empirical methods in NLP. In addition to providing a general forum, the theme for this year is

"Corpus-based and/or Empirical Methods in NLP for Speech, MT, IR, and other Applied Systems"

A large number of systems in automatic speech recognition (ASR) and synthesis, machine translation (MT), information retrieval (IR), optical character recognition (OCR) and handwriting recognition have become commercially available in the last decade. Many of these systems use NLP technologies as an important component. Corpus-based and empirical methods in NLP have been a major trend in recent years. How useful are these techniques when applied to real systems, especially when compared to rule-based methods? Are there any new techniques to be developed in EMNLP and from VLC in order to improve the state-of-the-art of ASR, MT, IR, OCR, and other applied systems? Are there new ways to combine corpus-based and empirical methods with rule-based systems?
This two-day conference aims to bring together academic researchers and industrial practitioners to discuss the above issues, through technical paper sessions, invited talks, and panel discussions. The goal of the conference is to raise an awareness of what kind of new EMNLP techniques need to be developed in order to bring about the next breakthrough in speech recognition and synthesis, machine translation, information retrieval and other applied systems. The conference solicits paper submissions in (and not limited to) the following areas: 1) Original work in one of the following technologies and its relevance to speech, MT, or IR: (a) word sense disambiguation (b) word and term segmentation and extraction (c) alignment (d) bilingual lexicon extraction (e) POS tagging (f) statistical parsing (g) others (please specify) 2) Proposals of new EMNLP technologies for speech, MT, IR, OCR, or other applied systems (please specify) 3) Comparative evaluation of the performance of EMNLP technologies in one of the areas in (1) and that of its rule-based or knowledge-based counterpart in a speech, MT, IR, OCR or other applied systems Submissions Requirements Submissions should be limited to original, evaluated work. All papers should include background survey and/or reference to previous work. The authors should provide explicit explanation when there is no evaluation in their work. We encourage paper submissions related to the conference theme. In particular, we encourage the authors to include in their papers, proposals and discussions of the relevance of their work to the theme . However, there will be a special session in the conference to include corpus-based and/or empirical work in all areas of natural language processing. 
Important Dates

March 31 Submission of full-length paper
April 30 Acceptance notice
May 20 Camera-ready paper due
June 21-22 Conference date

Program Chair

Pascale Fung
Human Language Technology Center
Department of Electrical and Electronic Engineering
University of Science and Technology (HKUST)
Clear Water Bay, Kowloon
Hong Kong
Tel: (+852) 2358 8537
Fax: (+852) 2358 1485
Email: pascale@ee.ust.hk

Program Co-Chair

Joe Zhou
LEXIS-NEXIS, a Division of Reed Elsevier
9555 Springboro Pike
Dayton, OH 45342 USA
Email: joez@lexis-nexis.com

**********************************************************************

CALL FOR PAPERS

ACL'99 Workshop on the Relationship Between Discourse/Dialogue Structure and Reference
June 21 1999
University of Maryland
http://www.isi.edu/~marcu/discourse-ref-acl99/

---------------------------------

The relationship between the structure of discourse and dialogue and the use of referring expressions has been the focus of much research in linguistics, computational linguistics, and psycholinguistics. Although individual efforts have been couched in a variety of frameworks ranging from (S)DRT and RST to Centering, they all share two underlying assumptions:

1. The structure of discourse affects the interpretation of referring expressions and the space of anaphoric accessibility.
2. The use of referring expressions restricts the set of possible discourse interpretations.

However, most approaches address only one of these two views on the relation between structure and reference. And although several theories explaining this relationship exist, few have made a significant impact on practical applications such as discourse parsing, summarization, generation, and named-entity recognition.

This workshop will provide a forum for researchers in all areas of linguistics, psycholinguistics, and computational linguistics who are interested in advancing the state of the art in understanding the relationship between discourse/dialogue structure and reference.

Submissions are invited on, but not limited to, the following topics and issues:

1. Linguistic issues:
   + what is the relation between lexico-grammatical constructs, referring expressions, and the structure of discourse/dialogue?
2. Psycholinguistic issues:
   + how does the use of referents affect the human interpretation of discourse/dialogue?
3. Corpus-specific issues:
   + what coding schemata and annotation tools should one use in order to encode the relation between discourse/dialogue structure and reference?
4. Representation issues:
   + how should discourse/dialogue structures and referents be represented?
   + how should one represent the relationship between them: as preferences; or as constraints?
5. Algorithmic issues:
   + how can discourse/dialogue structures, referents, and co-referential links be identified and computed?
   + knowledge-intensive vs. shallow approaches
   + rule-driven vs. statistical vs. corpus-based approaches
   + Wordnet-based approaches
   + how do discourse/dialogue structure and referential expressions interact in natural language generation?
6. General issues:
   + what are the commonalities of current approaches to studying the relation between discourse/dialogue and referents?
   + what are the differences?
   + what are the arguments against a relation between discourse/dialogue structure and reference?
   + how language-dependent is the relation between discourse/dialogue structure and reference?

Post-Workshop Dissemination:

Selected papers from the workshop will be compiled into a volume tentatively scheduled to appear in the Text, Speech, and Language Technology book series from Kluwer Academic Press.

Submission Procedure:

* Authors are requested to submit one electronic version of their papers OR four hardcopies. Please submit hardcopies only if electronic submission is impossible.
* Maximum length is 8 pages including figures and references.
* Please conform with the traditional two-column ACL Proceedings format. Style files can be downloaded from http://www.isi.edu/~marcu/stylefiles/ or from ftp://ftp.cs.columbia.edu/acl-l/Styfiles/Proceedings/.

Submissions should be sent to:

Nancy Ide
Department of Computer Science
Vassar College
124 Raymond Avenue
Poughkeepsie, New York 12604-0520 USA
Fax: (+1 914) 437 7498
WWW: http://www.cs.vassar.edu/~ide
E-mail: ide@cs.vassar.edu

Timetable:

Deadline for submissions: March 26, 1999.
Notification of acceptance: To Be Announced.
Camera ready copies due: To Be Announced.

Organizing committee:

* Dan Cristea - University "A.I. Cuza" of Iasi, Romania.
* Nancy Ide - Vassar College, USA.
* Daniel Marcu - Information Sciences Institute/University of Southern California, USA.

Program Committee:

* Nicholas Asher (University of Texas)
* Eugene Charniak (Brown University)
* Udo Hahn (Freiburg University)
* Lynette Hirschman (MITRE Corp.)
* Graeme Hirst (University of Toronto)
* Massimo Poesio (University of Edinburgh)
* Ehud Reiter (University of Aberdeen)
* Michael Strube (University of Pennsylvania)
* Wietske Vonk (Max Planck Institute)
* Marilyn Walker (AT&T)

Related Events

* ACL'99
* ACL'99 SIGDIAL Business Meeting
* ACL'99 Workshop on Tagging
* ACL'99 Workshop on Coreference and Its Applications
* EuroLAN'99 Summer School

**********************************************************************

TITLE: Towards Standards and Tools for Discourse Tagging

DESCRIPTION:

Discourse tagging assigns labels from a tag set to discourse units in texts or dialogues. The discourse units range from words or referring expressions to multi-utterance units identified by criteria such as speaker intention or initiative. Since the emergence of syntactically annotated corpora has resulted in major advances in sentence-level natural language processing, the hope is that corpora of tagged discourse may lead to similar advances in the area of discourse processing. Work on discourse tagging has gained momentum in the last 3-4 years.
Three major initiatives in this area are:

the Discourse Resource Initiative (http://www.georgetown.edu/luperfoy/Discourse-Treebank/), which has organized yearly international workshops addressing the standardization of discourse tagging schemes for coreference, for dialogue acts, and for higher level discourse structures;

MATE (http://mate.mip.ou.dk/), a project co-funded by the European Union, whose aim is to develop tools and standards for tagging spoken dialogue corpora at different levels, including the discourse level;

the Global Document Annotation initiative, which aims at having Internet authors annotate their documents with a common standard tag set that allows machines to recognize the semantic and pragmatic structures of documents (http://ww.etl.go.jp/etl/nl/GDA).

Even with these three initiatives in place, there is still much work to be done before there are widely accepted (standardized) tagging schemes for various discourse phenomena that could be shared across sites; moreover, there has not yet been an open forum in which researchers working in this area could participate and contribute. This workshop will provide such a forum.

Submissions are invited on, but not limited to, the following topics and issues:

1. How can standardization for discourse tagging concretely be achieved? by developing a single coding scheme, or more likely, a set of coding schemes, one for each phenomenon of interest? or rather, by developing some specification guidelines and a way of mapping from one scheme to another? in some other way?

2. Cross-level coding: all the initiatives mentioned above promote an approach in which coding schemes are developed at different levels, rather than an approach in which a monolithic scheme addresses all phenomena. Given this methodology, the issue of cross-level coding arises, namely, how can coding schemes for different levels take advantage of each other and allow coding of cross-level relationships?
is it possible to relate corpus annotations at different annotation levels to examine the interdependence of linguistic phenomena? 3. Coding schemes and theories of discourse: is it possible to develop coding schemes that faithfully reflect a discourse theory? if yes, is it desirable? conversely, can corpora coded for discourse issues help advance our theoretical understanding of discourse phenomena? 4. Coding schemes and applications: is it possible to design discourse coding schemes independently from the applications tagged corpora are supposed to be used for (eg, to train a speech act recognizer)? 5. Coding schemes and reliability: discourse categories are difficult to code for reliably. Whatever the reason (e.g., lack of an overarching theory for discourse, or genuine ambiguity and misunderstandings in real dialogue reflected in the coding), how can we devise reliable coding schemes? What reliability measures should be used: are widely used measures (Kappa, Alpha, precision and recall) appropriate in this case? If not, what other measures can we use? Is reliability affected by whether naive or expert coders are used? 6. Tools for discourse tagging: what specific features of a tool does discourse tagging require? can we just extend tools developed eg for syntactic tagging? do we need to develop new tools? 7. Some paradigms for evaluating dialogue systems take advantage of the use of tagged corpora: how are tagging for evaluation purposes and discourse tagging related? Are there some discourse tags that may be used as evaluation tags or is it advisable to introduce another dimension of tagging? In addition to papers, prospective participants may be asked to do a small homework before the workshop to test out various tagging schemes. Prospective participants who have developed tools are welcome to bring a demo with them. Submission Procedure: Authors are requested to submit one electronic version of their papers OR four hardcopies. 
Please submit hardcopies only if electronic submission is impossible. Send your electronic submission to both Marilyn Walker (walker@research.att.com) and Morena Danieli (morena.danieli@cselt.it). If electronic submission is impossible, please contact the organizers to arrange for hardcopy submission. Maximum length is 6 pages including figures and references. Please conform with the traditional two-column ACL Proceedings format. Style files can be downloaded from ftp://ftp.cs.columbia.edu/acl-l/Styfiles/Proceedings/.

Timetable:

Deadline for submissions: March 20, 1999.
Notification of acceptance: April 16, 1999.
Camera ready copies due: April 30, 1999

WORKSHOP CHAIRS: Marilyn Walker, Morena Danieli, Johanna D. Moore, Barbara Di Eugenio.

************************************************************************

SIGLEX99
Standardizing Lexical Resources
June 21, 22, 1999
University of Maryland

===========================================================================
FIRST CALL FOR PAPERS
==========================================================================

As our national interests become increasingly global, timely access to information becomes more and more necessary. Many promising strategies for information provision rely heavily on lexical resources, including ontologies. Our next major challenge is providing a standardized lexical resource: an inventory of word meanings, or senses, associated with criteria for distinguishing them.

Currently there are several different on-line lexical resources that are being used for English: WordNet, Longman's, the Oxford English Dictionary (OED), CIDE from Cambridge University Press (CUP), Collins, and Webster's, to name just a few, and they each use very different approaches to making sense distinctions. Various computational lexicons and related resources such as ontologies are under development, including the European PAROLE/SIMPLE lexicons, the Generative Lexicon, the SENSUS ontology, Mikrokosmos, WordNet, Framenet, and the theory of Lexical Conceptual Structures. Each takes a very different approach and makes reference to different underlying theories of semantics.
This divergence of resources has motivated the efforts of the EAGLES Lexical Semantics Group, which is defining a common format for lexical semantic representation for 12 languages. http://www.ilc.pi.cnr.it/EAGLES96/rep2/ In a recent evaluation of word sense disambiguation systems, SIGLEX98-SENSEVAL, (also supported by Euralex, Elsenet, ECRAN and SPARKLE) "http://www.itri.brighton.ac.uk/events/senseval", the training data and test data were prepared using a set of Oxford University Press (OUP) senses. This made it difficult to evaluate the performance of pre-existing systems that had been built using other lexical resources. A mapping was made from the OUP senses to WordNet senses, so that WordNet systems could be included, but this was somewhat problematic as there were far fewer WordNet senses, and frequently no direct mapping was possible. As do most dictionaries, OUP and WordNet often make different decisions about how to structure entries for the same words which are all equally valid, but simply not compatible. Therefore, it becomes especially difficult to include pre-existing systems in the evaluation that rely on a pre-existing lexical resource other than the one used as the Gold Standard. The question that arises here is the likelihood of making performance preserving mappings between lexical resources. Is it even possible to treat one lexical resource as a standard that other resources can be mapped to? (This is true even when focusing on just one language - the problem simply becomes more explosive when additional languages become involved.) All of the participants in SIGLEX98-SENSEVAL agreed that they would prefer evaluations based on running text rather than corpus instances, but this is only feasible if the Gold Standard sense inventory being used for tagging can be appropriately mapped onto several different lexical resources. 
The purpose of SIGLEX99 is to directly address the issue of standardization of lexical resources, and performance-preserving mappings between existing resources. As a spin-off from SENSEVAL, we are investigating mapping the OUP SENSEVAL senses onto other lexical resources. We will also be tagging running text with these senses, and other senses, and will circulate this ahead of time to workshop participants. There will be several working sessions focussed around the mappings between lexical resources and the tagged samples.

Languages other than English will also be considered, in connection with ROMANSEVAL, the subset of SENSEVAL for Romance languages (but with no restriction to that language family). We will study the relevance of EuroWordNet (EWN) sense distinctions for WSD systems, and the applicability of the Interlingua Language Index (ILI) created within EWN for cross-language sense-standardization. An issue of particular interest is the mapping of existing resources to the ILI, which could be an important step towards the development of a standardized multilingual lexicon for WSD. Such a multilingual gold standard could in turn be used to semantically tag parallel texts and thus create standardized corpora useful for many multilingual applications.

There will also be a session to discuss the future of American involvement in EAGLES, and how the workshop results and conclusions can be incorporated.

We will have invited talks on ontologies and lexical resources, and we welcome submissions on any areas in lexical semantics and computational lexical semantics, but particularly on the acquisition and use of lexical resources and ontologies and on word sense disambiguation. There will be a workshop proceedings, and as we have done with our last two workshops, we will encourage participants to make electronic versions of their papers available on the web prior to the workshop.
Likely invited speakers include Patrick Hanks (Oxford University Press), Chuck Fillmore (Berkeley), and someone speaking on WordNet or EuroWordNet and on SIMPLE (the European project for building harmonized semantic lexicons for 12 European languages).

The schedule for paper submissions (ACL format, 6 pages):

SUBMISSION DEADLINE: March 29, 1999
NOTIFICATION OF ACCEPTANCE: May 7, 1999
CAMERA READY COPIES (and copyrights) DUE: May 28, 1999

Please send submissions, hard copy or electronic (.ps or .doc), to:

Martha Palmer
Institute for Research in Cognitive Science
400A, 3401 Walnut Street/6228
University of Pennsylvania
Philadelphia, PA 19104
Telephone: (215) 898-0361
FAX No.: (215) 573-9247
e-mail: mpalmer@cis.upenn.edu

Program Committee:

Nicoletta Calzolari, Istituto di Linguistica Computazionale, Pisa
Bonnie Dorr, University of Maryland
Chuck Fillmore, University of California, Berkeley
Ralph Grishman, New York University
Patrick Hanks, Oxford University Press
Eduard Hovy, USC Information Sciences Institute
Nancy Ide, Vassar College
Adam Kilgarriff, ITRI, University of Brighton
Marc Light, MITRE Corporation
Martha Palmer, University of Pennsylvania, CHAIR
James Pustejovsky, Brandeis University
Philip Resnik, University of Maryland
Patrick St Dizier, IRIT-CNRS, Université Paul Sabatier
Antonio Sanfilippo, European Commission, DG XIII
Frederique Segond, Xerox Research Centre, Grenoble
Jean Véronis, Université de Provence
Evelyne Viegas, New Mexico State University
Piek Vossen, University of Amsterdam
Yorick Wilks, University of Sheffield
David Yarowsky, Johns Hopkins University
Antonio Zampolli, Istituto di Linguistica Computazionale, Pisa

***********************************************************************
FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FG99 FORMAL GRAMMAR CONFERENCE August 7-8, 1999, Utrecht, The Netherlands FINAL CALL FOR PAPERS FG99 is the 5th conference on Formal Grammar held in conjunction with the European Summer School in Logic, Language and Information, which takes place in 1999 in Utrecht. Previous meetings were held in Barcelona (1995), Prague (1996), Aix-en-Provence (1997), and as part of the Joint Conference on Formal Grammar, Head-Driven Phrase Structure Grammar, and Categorial Grammar (FHCG98) held in Saarbruecken last August. AIMS and SCOPE FG99 provides a forum for the presentation of new and original research on formal grammar, especially with regard to the application of formal methods to natural language analysis. Themes of interest include, but are not limited to, * formal and computational syntax, semantics, pragmatics, and phonology; * model-theoretic and proof-theoretic methods in linguistics; * constraint-based and resource-sensitive approaches to grammar; * foundational, methodological and architectural issues in grammar. Previous conferences in this series have welcomed papers from a wide variety of frameworks. SPECIAL SESSIONS and INVITED SPEAKERS. There will be a SYMPOSIUM on Grammatical Resources and Grammatical Inference David Dowty (Ohio State) Polly Jacobson (Brown) Gerhard Jaeger (Berlin) Reinhard Muskens (Tilburg) Mark Steedman (Edinburgh) commentator: Johan van Benthem (Amsterdam) [tentative] SUBMISSION DETAILS We invite E-MAIL submissions of abstracts for 30-minute papers (including questions, comments, and discussion). A submission should consist of two parts: - an information sheet (in ascii), containing the name of the author(s), affiliation(s), e-mail and postal address(es) and a title; - an abstract, consisting of a description of not more than 5 pages (including figures and references). Abstracts may be either in plain ASCII or in (unix-compatible encoded) postscript, PDF, or DVI. 
Abstracts can be sent to fg@ufal.mff.cuni.cz (Geert-Jan M. Kruijff)

ABSTRACT SUBMISSION DEADLINE March 1, 1999
NOTIFICATION OF ACCEPTANCE April 30, 1999

PROCEEDINGS

A full version of each accepted paper will be included in the conference proceedings, to be distributed at the conference. Full papers are due June 30, 1999.

PROGRAMME COMMITTEE

Anne Abeillé (Paris)
Gosse Bouma (Groningen)
John Coleman (Oxford)
Mary Dalrymple (Xerox Parc)
David Dowty (Ohio State)
Elisabet Engdahl (Göteborg)
Daniele Godard (Lille)
Jack Hoeksema (Groningen)
Polly Jacobson (Brown)
Mark Johnson (Brown)
Ruth Kempson (London)
Shalom Lappin (London)
Anton Nijholt (Twente)
Owen Rambow (Cogentex)
Mark Steedman (Edinburgh)

FURTHER INFORMATION

Web site for ESSLLI XI: http://esslli.let.uu.nl
Web site for FG99 : http://ufal.mff.cuni.cz/fg.html

The organizers:

Geert-Jan Kruijff gj@ufal.mff.cuni.cz
Glyn Morrill glyn@lsi.upc.es
Paola Monachesi Paola.Monachesi@let.uu.nl
Dick Oehrle oehrle@linc.cis.upenn.edu