LINGUIST List 8.122

Tue Jan 28 1997

Calls: Environments for Grammar Development

Editor for this issue: T. Daniel Seely <seely@linguistlist.org>


Please do not use abbreviations or acronyms for your conference unless you explain them in your text. Many people outside your area of specialization will not recognize them. Thank you for your cooperation.

Directory

  1. Alberto Lavelli, Environments for Grammar Development

Message 1: Environments for Grammar Development

Date: Mon, 27 Jan 1997 18:39:54 +0100 (MET)
From: Alberto Lavelli <lavelli@itc.it>
Subject: Environments for Grammar Development

**********************************************************************
Call for Submissions Please Distribute Widely
**********************************************************************

				ENVGRAM
	COMPUTATIONAL ENVIRONMENTS FOR PRACTICAL GRAMMAR DEVELOPMENT,
	 PROCESSING AND INTEGRATION WITH OTHER NLP MODULES


 Madrid, Spain, July 11 or 12, 1997
 (in conjunction with ACL-97/EACL-97)


WORKSHOP DESCRIPTION


With a growing number of NLP applications going beyond the status of simple
research systems, there is also a more evident need for better methods,
tools and environments to support the development and reuse of large scale
linguistic resources and efficient processors. This new area of research,
often referred to as Linguistic Engineering, is rapidly gaining interest
alongside the more traditional ones concerned with formalisms or algorithm
studies and development.

Aspects of linguistic engineering range from grammar development
environments, through the construction and maintenance of large scale
linguistic resources, to methodologies for quality assurance and
evaluation. Some of the most prominent examples of sophisticated development
platforms comprising tracers, debuggers and a range of important
visualization tools are ALEP (funded by the European Union), GATE (a common
infrastructure for building LE architectures using pre-existing components),
GWB (an LFG workbench developed at Xerox PARC), PAGE (a typed feature
logic-based grammar development platform developed at DFKI), and many
others. There have been a
number of projects on the development of large-scale computational lexicons
(e.g. Acquilex), as well as projects concerned with the development of
standards and reference data for diagnostics and evaluation (e.g. TSNLP).

However, while these platforms and components typically provide fairly clean
formalisms, processing components and data, it is not yet clear to what
extent current results and approaches fit the requirements for large-scale
development and deployment of real NLP applications.

In this connection, a number of pending issues need to be addressed, the
relevance of which becomes particularly clear when the focus is shifted from
linguistic formalism to usability and user/application requirements. The
following points are examples of relevant topics:


- What is the state of the art in Grammar Development Environments?

There are a number of systems on the market already. Given the enormous cost
of developing such environments, it is unlikely that many others will be
developed from scratch. Up to what point do the existing systems meet
actual user requirements? What experiences are there in tailoring such
systems to specific applications?


- How can we meet the demands arising from distributed grammar development?

Even if in the past the biggest systems have been based on the work of one
individual, it is unwise and impractical to have one large grammar developed
by a single writer. Thus, the development and maintenance of large grammars
tends to be more and more a joint effort involving many computational
linguists. What specific requirements and prerequisites have to be met in a
development environment to ensure smooth cooperation between different
authors, leading to the necessary modularity, consistency and integrability
of grammar fragments?


- How can we meet the demands of multi-lingual grammar development?

For many applications (even outside machine translation itself)
multi-linguality is becoming an indispensable standard feature. The parallel
development of several grammars in different languages will require some
synchronization of linguistic knowledge bases and sharing of processing
components. Can different language specific grammars share a common core
grammar? Is it useful to build on modern formalisms which allow an
object-oriented design (such as typed feature logics) or even on theories of
a putative "universal grammar"?


- What is the appropriate division of labour in a large scale development
environment?

Sophisticated applications may require a whole range of knowledge sources
and processors, addressing, e.g. computational morphology, syntax,
semantics, lexicography, corpus analysis, parsing and generation to name but
a few. What approaches and methods can be devised and which tools and
facilities should be employed to facilitate and support the integration of
different levels of linguistic abstraction, of different processing modules
and the cooperation between grammar writing and processor design?


- How can we facilitate the shift from reusability to usability?

Grammar development in academic and research oriented environments has often
concentrated on the maximum generality and reusability of the linguistic
resources developed. However, for building actual applications and for
applying systems to specific domains, this generality can turn out to be a
drawback rather than an asset. Thus, the question is how one can support the
specialization and customization to more constrained domains without
sacrificing the advantages of a more general and reusable design.


- What are the necessary ingredients for quality assurance in grammar
development?

The incremental construction of large grammars, in particular in a
distributed environment, makes it necessary to maintain sufficient control
over different versions. Coverage and speed are expected to increase over
the development cycles. Quality assurance, testing and diagnostics cannot be
carried out properly if they are based on the odd collection of test items
or some arbitrarily chosen corpus fragment. Evaluation of a system, which
goes even further, will require a minimum degree of standardization of
reference material. What are then the appropriate methods and data to be
applied for these purposes? How can they be constructed, collected and
customized to specific applications and domains?


The workshop will be the occasion to discuss the results achieved and the
most promising directions and to highlight pending problems. Contributions
are solicited from institutions (both research-oriented and industrial)
involved in the production of NLP applications.



Invited Speaker

Hans Uszkoreit (DFKI) "Reference Data and Grammar Development Environments"


ORGANIZING COMMITTEE

Fabio Pianesi (Primary Contact), IRST, Italy (pianesi@irst.itc.it)
Dominique Estival, University of Melbourne, Australia
 (D.Estival@linguistics.unimelb.edu.au)
Alberto Lavelli, IRST, Italy (lavelli@irst.itc.it)
Klaus Netter, DFKI, Germany (netter@dfki.uni-sb.de)


PROGRAMME COMMITTEE

Harry Bunt, Tilburg University, The Netherlands
Bob Carpenter, Lucent Technologies Bell Labs, USA
Jochen Dorre, University of Stuttgart, Germany
Dominique Estival, University of Melbourne, Australia
Dan Flickinger, CSLI Stanford, USA
Klaus Netter, DFKI, Germany
Fabio Pianesi, IRST, Italy
Steven Pulman, SRI Cambridge, UK
Antonio Sanfilippo, Sharp, UK


PROGRAMME CHAIRS

Klaus Netter, DFKI, Germany
Fabio Pianesi, IRST, Italy


SUBMISSIONS

Authors are asked to submit previously unpublished papers; ALL SUBMISSIONS
SHOULD BE SENT TO FABIO PIANESI. A limited number of position papers may
also be considered. Each submission will undergo multiple reviews. The
papers should be full length (not exceeding 3200 words, exclusive of
references), also including a descriptive abstract of about 200 words.
Electronic submissions are strongly preferred, either in self-contained
LaTeX format (using the ACL-97 submission style; see:
ftp://ftp.cs.columbia.edu/acl-l/, as well as the submission guidelines for
the main conference, at http://www.ieec.uned.es/cl97/), or as a PostScript
file. In exceptional circumstances, Microsoft Word files will also be
accepted as electronic submissions, provided they follow the same formatting
guidelines. Hard copy submissions should include eight copies of the paper.
A separate title page should include the title of the paper, names,
addresses (postal and e-mail), telephone and fax numbers of all authors. Any
correspondence will be addressed to the first author (unless otherwise
specified). Authors will be responsible for preparation of camera-ready
copies of final versions of accepted papers, conforming to a uniform format,
with guidelines and a style file to be supplied by the organisers.


REQUIREMENTS

A paper accepted for presentation cannot be presented or have been presented
at any other meeting. Please indicate in your submission if you have
submitted your paper to another conference.


ORGANISATION OF SESSIONS

Presentations will be allocated 25-minute slots each, plus an extra five
minutes for discussion, distributed over morning and afternoon sessions,
including an invited talk and a (closing) general discussion.


WORKSHOP PARTICIPATION

Workshop attendance will be limited to a maximum of 40 people; persons without
a submission should contact the organizers as soon as possible. According to
the ACL/EACL workshop guidelines, all workshop participants must register
for the ACL/EACL main conference.


DEMOS

Depending on the availability of time and appropriate computing facilities,
a demo session will be organised.


SCHEDULE

Submission deadline: 10 March 1997
Notification of acceptance: 4 April 1997
Camera-ready versions of accepted papers due: 27 April 1997
Workshop: 11 or 12 July 1997


ADDRESS FOR SUBMISSIONS AND FURTHER INFORMATION

Fabio Pianesi
IRST - Istituto per la Ricerca Scientifica e Tecnologica
38050, Povo Trento, Italy
tel: +461-314327
fax: +461-302040
e-mail: pianesi@irst.itc.it