LINGUIST List 13.58

Fri Jan 11 2002

Calls: Ling Knowledge Acquisition, Wordnet Structures

Editor for this issue: Dina Kapetangianni <dinalinguistlist.org>


As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text.

Directory

  1. Alessandro Lenci, LREC 2002 Workshop on LINGUISTIC KNOWLEDGE ACQUISITION
  2. Claudia Kunze, LREC 2002 Workshop on Wordnet Structures and Standardization

Message 1: LREC 2002 Workshop on LINGUISTIC KNOWLEDGE ACQUISITION

Date: Wed, 09 Jan 2002 18:49:50 +0100
From: Alessandro Lenci <lenciilc.pi.cnr.it>
Subject: LREC 2002 Workshop on LINGUISTIC KNOWLEDGE ACQUISITION

 LREC 2002 Workshop on

 LINGUISTIC KNOWLEDGE ACQUISITION AND REPRESENTATION:
 BOOTSTRAPPING ANNOTATED LANGUAGE DATA

 Las Palmas, Canary Islands, Spain

 2nd June 2002

 _____________________________

MOTIVATION AND AIMS

Provision of large-scale labelled language resources, such as tagged
corpora or repositories of pre-classified text documents, is crucial
to steady progress in an extremely wide spectrum of research,
technological and business areas in the Human Language Technology (HLT)
sector. The continuously
changing demands for language-specific and application-dependent
annotated data (e.g. at the syntactic or at the semantic level),
indispensable for design validation and efficient software prototyping,
however, are daily confronted by the labelled-data bottleneck.
Hand-crafted resources are often too costly and time-consuming to be
produced at a sustainable pace, and, in some cases, they even exceed the
limits of human conscious awareness and descriptive capability.

Possible ways to circumvent, or at least minimise, this problem come
from the literature on automatic knowledge acquisition and, more
generally, from the machine-learning community. Annotated data are
bootstrapped by training a machine-learning classifier with a small
sample of pre-annotated data and by using the induced classifier to
annotate more data. Co-learning provides an alternative methodology,
which essentially consists in iterative cooperation of two or more
independent learning systems. Another promising route consists in
automatically tracking down recurrent knowledge patterns in unstructured
or implicit information sources (such as free texts or machine readable
dictionaries) for this information to be moulded into explicit
representation structures (e.g. subcategorisation frames,
syntactic-semantic templates, ontology hierarchies etc.).
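
As a minimal illustration of the first strategy (bootstrapping from a
small pre-annotated sample), the loop can be sketched in Python as
follows. This is only a sketch under stated assumptions, not a method
proposed by the organisers: the function names, the two toy features
(previous word and two-letter suffix), the vote-counting classifier and
the 0.8 confidence threshold are all hypothetical choices made for the
example.

```python
from collections import Counter, defaultdict


def features(item):
    """Toy features (an assumption): previous word and 2-letter suffix."""
    prev, word = item
    return ["prev=" + prev, "suf=" + word[-2:]]


def train(labelled):
    """Count, for every feature, how often it co-occurs with each tag."""
    counts = defaultdict(Counter)
    for item, tag in labelled:
        for feat in features(item):
            counts[feat][tag] += 1
    return counts


def predict(model, item):
    """Return (best tag, share of votes), or (None, 0.0) without evidence."""
    votes = Counter()
    for feat in features(item):
        votes.update(model.get(feat, {}))
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    tag, n = votes.most_common(1)[0]
    return tag, n / total


def bootstrap(seed, unlabelled, threshold=0.8, max_rounds=10):
    """Self-training: retrain, label confident items, repeat."""
    labelled = list(seed)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        model = train(labelled)
        accepted, remaining = [], []
        for item in pool:
            tag, conf = predict(model, item)
            if tag is not None and conf >= threshold:
                accepted.append((item, tag))
            else:
                remaining.append(item)
        if not accepted:  # nothing new was learned, so stop
            break
        labelled.extend(accepted)
        pool = remaining
    return labelled, pool


# Hypothetical seed annotations and unlabelled (previous word, word) pairs.
seed = [(("the", "dog"), "N"), (("to", "run"), "V")]
raw = [("the", "cat"), ("to", "walk"), ("a", "cat"), ("will", "walk")]
labelled, pool = bootstrap(seed, raw)
for item, tag in labelled:
    print(item, tag)
```

In this toy setting, items labelled in one round contribute new
features that let further items be labelled in the next round; items
for which no evidence ever accumulates stay in the unlabelled pool.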

We believe that all these attempts at bootstrapping labelled data are
not only of practical interest (for continuous updating, management and
validation of dynamic resources), but also point to a number of germane
theoretical issues. In particular, the workshop intends to focus on the
issue of interaction between techniques for inducing structured
knowledge from raw data and formal methods of linguistic knowledge
representation. Gaining insights into this issue is an essential
requirement for explaining the effective use of linguistic knowledge by
cognitive agents. Although the cognitive and engineering views of the
form and acquisition of linguistic knowledge need not be related, data
from neuroscience and psychology are indeed relevant when evaluating
different ways of representing information in artificial systems, and
different models for linguistic knowledge acquisition.

We encourage in-depth analysis of underlying assumptions of the proposed
bootstrapping methods and discussion of possible relevant connections
with existing annotation and representation schemes. This investigation
is likely to have significant repercussions on the way linguistic
resources will be designed, developed and used for applications in the
years to come. As the two aspects of knowledge representation and
acquisition are profoundly interrelated, progress on both fronts can
only be achieved, in our view, through a full appreciation of
this deep interdependency.


TOPICS OF INTEREST

Possible themes for contributions are:
* development of 'data-driven' annotation/representation schemes
* dynamic update, customisation and tuning of labelled resources through
acquired data
* 'hybrid models' of linguistic knowledge extraction, whereby machine
learning methods are integrated with formal structures of knowledge
representation
* incremental linguistic knowledge-bases
* formal representation and structuring of information flow
automatically acquired from texts
* knowledge acquisition and linguistic resources lifecycle
* linguistic knowledge acquisition and representation in cognitive tasks



IMPORTANT DATES

Deadline for workshop abstract submission:
15th of February 2002

Notification of acceptance:
15th of March 2002

Final version of paper for workshop proceedings:
15th of April 2002

Workshop:
2nd June 2002 (afternoon session)


SUBMISSIONS

The organizers welcome contributions describing existing research
related to the topics of the workshop. Each presentation will be 25
minutes long (20 minutes for presentation and 5 minutes for questions
and discussion). Submissions should include: title; author(s);
affiliation(s); and contact author's e-mail address, postal address,
telephone and fax numbers. Abstracts (maximum 500 words, plain-text
format) must be sent to: simoilc.pi.cnr.it

The final version of the accepted papers should not be longer than 4,000
words or 10 A4 pages. Instructions for formatting and presentation of
the final version will be sent to authors upon notification of
acceptance.


ORGANISING COMMITTEE

Alessandro Lenci (Università di Pisa, Italy)
Simonetta Montemagni (Istituto di Linguistica Computazionale - CNR,
Italy)
Vito Pirrelli (Istituto di Linguistica Computazionale - CNR, Italy)


PROGRAM COMMITTEE

Harald Baayen (Max Planck Institute for Psycholinguistics - Nijmegen,
The Netherlands)
Rens Bod (University of Amsterdam, Holland)
Michael R. Brent (Washington University, USA)
Nicoletta Calzolari (Istituto di Linguistica Computazionale - CNR,
Italy)
Jean-Pierre Chanod (Xerox Research Centre Europe, Grenoble, France)
Walter Daelemans (University of Antwerp, Belgium)
Dekang Lin (University of Alberta, Edmonton, Canada)
Horacio Rodriguez (Universidad Politecnica de Catalunya)
Fabrizio Sebastiani (Istituto per l'Elaborazione dell'Informazione -
CNR, Italy)
Lucy Vanderwende (Microsoft Research, Redmond, USA)
François Yvon (Ecole Nationale Superieure des Telecommunications, Paris,
France)
Menno van Zaanen (University of Amsterdam, The Netherlands)


CONTACT PERSON

Simonetta Montemagni
Istituto di Linguistica Computazionale (ILC) - CNR
Area della Ricerca di Pisa
Via Moruzzi 1, 56124 Pisa, ITALY
e-mail: simoilc.pi.cnr.it

Message 2: LREC 2002 Workshop on Wordnet Structures and Standardization

Date: Thu, 10 Jan 2002 20:15:26 +0100 (MET)
From: Claudia Kunze <kunzesfs.nphil.uni-tuebingen.de>
Subject: LREC 2002 Workshop on Wordnet Structures and Standardization



Workshop on Wordnet Structures and Standardization and how
these affect Wordnet Applications and Evaluation


 Workshop held in conjunction with the
Third Language Resources and Evaluation Conference (LREC 2002)
 in Las Palmas, Spain

 May 28, 2002


 CALL FOR PAPERS

Wordnets, which are structured along the lines of the Princeton
WordNet, have become popular lexical-semantic resources in the
field of language technology. Various initiatives for monolingual
and multilingual wordnet construction have been launched
(EuroWordNet, BalkaNet, Portuguese Wordnet etc.), and numerous
language processing tasks rely on wordnet resources and their
implicit knowledge structures.

Existing wordnets vary with respect to their stage of
development, coverage of concepts, encoding principles of
linguistic contents and semantic relations, and thus their
applicability in different NLP tasks.
Furthermore, language-specific peculiarities of wordnets
have to be considered in the field of cross-lingual applications.
Recently, attempts have been made towards the construction of
wordnets for the less-studied languages, which are in need of
reliable standards, yielding at the same time new perspectives
on wordnet construction.

This one-day workshop emphasizes two major topics: wordnet
structures for less-studied languages on the one hand, and wordnet
standardization, evaluation and application on the other hand.
The workshop aims at bringing together wordnet builders and wordnet
users from academia and industry in order to integrate the
efforts being made at different sites.

One major topic focuses on wordnets for less-studied languages,
i.e. Eastern European and Scandinavian languages, for which the
development of semantic networks has only recently started. The aim
is to exchange new approaches to the linguistic structures and
architectures of semantic networks and to communicate preliminary
results to a wider research community.

The other major topic discusses standardization issues for wordnets
and wordnet-related tools, as well as evaluation of wordnet
resources and the information encoded in them, and experiences
with wordnet applications in the area of information retrieval
and sense tagging.

Conference topics:

- guidelines and methodologies for building wordnets;
- new approaches to wordnet construction;
- building of wordnets for less-studied languages;
- architecture of semantic networks and its relationship
to the language type;
- semantic relations of less-studied languages and
their representations;
- structure as a language-independent module;
- applicability of WordNet assumptions to other language types;
- standardization of wordnet specifications including the
Interlingual Index as a universal index of meaning;
- standardization of wordnet representations with respect to
metalanguages (XML, etc.);
- compatibility issues with regard to different formal representations;
- criteria and methods for verifying the content encoded in wordnets;
- consistency checking, comparison and evaluation of wordnet modules;
- evaluation of the value being added by integrating wordnets in
natural language processing tasks;
- experiences from sense-tagging with wordnets.


Submissions

Papers are invited that will describe existing research
connected to the topics of the workshop. Each presentation
will be 20 minutes long (15 minutes for presentation and 5 minutes for discussion).
Each submission should indicate: title; author(s); affiliation(s);
and contact author's e-mail address, postal address, telephone
and fax numbers. Abstracts (maximum 1,500 words, plain-text
format) should be sent to the respective contact persons:

Papers related to Wordnet Structures and Applications
for the Less-Studied Languages should be submitted to:
mathiouceid.upatras.gr

Papers related to Wordnet Applications, Standardization &
Evaluation should be submitted to: kunzesfs.nphil.uni-tuebingen.de

All submissions will be reviewed by an international programme
committee. Accepted papers will be published in the Workshop
Proceedings.

The final version of the accepted papers should be no
longer than 4,000 words or 10 A4 pages. Instructions for
formatting and presentation of the final version will
be sent to authors upon notification of acceptance.


Important Dates

Deadline for abstract submission: 10th of February 2002
Notification of acceptance: 10th of March 2002
Final version of paper: 5th of April 2002

Pre-conference Workshop: 28th of May 2002


Organizing Committee

Dimitris N. Christodoulakis (Patras University, Greece)
Claudia Kunze/ Lothar Lemnitzer (University of Tuebingen, Germany)
Karel Pala (Masaryk University Brno, Czech Republic)


Contact Persons

Prof. Dimitris N. Christodoulakis
Databases Laboratory of Computer Engineering & Informatics Department
Patras University
GR 26500 Greece
Phone: +30 61 960 385
Fax: +30 61 960 438
Email: dxricti.gr

Claudia Kunze
Seminar fuer Sprachwissenschaft
Universitaet Tuebingen
Wilhelmstr. 113
D-72074 Tuebingen
Germany
Phone: +49 7071 29 77474
Fax: +49 7071 551335
Email: kunzesfs.uni-tuebingen.de


Programme Committee

Christiane Fellbaum (Princeton University, USA)
Piek Vossen (Irion Technology Delft, The Netherlands)
Kemal Oflazer (Sabanci University Istanbul, Turkey)
Sofia Stamou (CTI Patras, Greece)
Jeroen Hoppenbrouwers (Tilburg University, The Netherlands)
Randee Tengi (Princeton University, USA)
Wim Peters (Sheffield University, GB)
Kadri Vider (University of Tartu, Estonia)
Julio Gonzales (UNED Madrid, Spain)
Palmira Marrafa (University of Lisboa, Portugal)
Paul Buitelaar (DFKI Saarbruecken, Germany)
Andreas Wagner (University of Tuebingen, Germany)
Erhard Hinrichs (University of Tuebingen, Germany)
Simonetta Montemagni (University of Pisa, Italy)
R.J.H.M Ermers (Almaty, Kazakhstan)

Workshop Fee

for Conference participants: 90 EURO
for others: 140 EURO

To obtain further information about the workshop please visit
http://www.lrec-conf.org/lrec2002/index.html or
http://www.cti.gr/nlp/