LINGUIST List 16.658

Sat Mar 05 2005

Calls: Computational Ling/USA; Computational Ling/USA

Editor for this issue: Amy Wronkowicz

As a matter of policy, LINGUIST discourages the use of abbreviations or acronyms in conference announcements unless they are explained in the text. To post to LINGUIST, use our convenient web form.


        1.    Anna Korhonen, ACL 2005 Workshop on Deep Lexical Acquisition
        2.    Eric Ringger, ACL 2005 Workshop on Feature Engineering for Machine Learning in Natural Language Processing

Message 1: ACL 2005 Workshop on Deep Lexical Acquisition

Date: 04-Mar-2005
From: Anna Korhonen
Subject: ACL 2005 Workshop on Deep Lexical Acquisition

Full Title: ACL 2005 Workshop on Deep Lexical Acquisition

Date: 30-Jun-2005 - 30-Jun-2005
Location: Ann Arbor, Michigan, United States of America
Contact Person: Anna Korhonen

Linguistic Field(s): Computational Linguistics

Call Deadline: 11-Apr-2005

Meeting Description:

Sponsored by the ACL Special Interest Group on the Lexicon (SIGLEX)
30 June, 2005
Ann Arbor, USA

Submission deadline: 11 April, 2005


In natural language processing (NLP), there is a pressing need to develop deep
lexical resources (e.g. lexicons for linguistically-precise grammars, template
sets for information extraction systems, ontologies for word sense
disambiguation). Such resources are critical for enhancing the performance of
systems and for improving their portability between domains. For example, to
perform reliably, an information extraction system needs access to high-quality
lexicons or templates specific to the task at hand.

Most deep lexical resources have been developed manually by lexicographers.
Manual work is costly, and the resulting resources have limited coverage and
require labour-intensive porting to new tasks. Automatic lexical acquisition is
a more promising and cost-effective approach to take, and is increasingly viable
given recent advances in NLP and machine learning technology, and corpus

While advances have recently been made in some areas of automatic deep lexical
acquisition, a number of important challenges need addressing before benefits
can be reaped in practical language engineering:

* Acquisition of deep lexical information from corpora

While corpus data has been successfully applied in learning certain types of
deep lexical information (e.g. semantic relations, subcategorization,
selectional preferences), there remains a broad range of lexical relations to
which corpus-based techniques have yet to be applied.

* Accurate, large-scale, portable acquisition techniques

One of the biggest current research challenges is how to improve the accuracy
of existing acquisition techniques further, at the same time as improving both
scalability and robustness.

* Use of deep lexical acquisition in recognised applications

Although lexical acquisition has the potential to boost performance in many NLP
application tasks, this has yet to be demonstrated for many important applications.

* Multilingual deep lexical acquisition

For theoretical and practical reasons it is important to test whether
techniques developed for one language (typically English) can be used to benefit
research on other languages.


The workshop will be of interest to anyone interested in automatically acquired
deep lexical information, e.g. in the areas of computational grammars,
computational lexicography, machine translation, information retrieval,
question-answering, and text mining.

Areas of Interest

* Automatic acquisition of deep lexical information:
o subcategorization
o diathesis alternations
o selectional preferences
o lexical / semantic classes
o qualia structure
o lexical ontologies
o semantic roles
o word senses

* Methods for supervised, unsupervised and weakly supervised deep lexical
acquisition (machine learning, statistical, example- or rule-based, hybrid etc.)

* Large-scale, cross-domain, domain-specific and portable deep lexical acquisition

* Extending and refining existing lexical resources with automatically acquired
information

* Evaluation of deep lexical acquisition

* Application of deep lexical acquisition to NLP applications (e.g. machine
translation, information extraction, language generation, question-answering)

* Multilingual deep lexical acquisition


Paper submission deadline: 11 April, 2005
Notification date: 2 May, 2005
Camera-ready submission deadline: 16 May, 2005
Workshop date: 30 June, 2005



Papers should describe original work; they should emphasize completed work
rather than intended work, and should indicate clearly the state of completion
of the reported results. Wherever appropriate, concrete evaluation results
should be included. Submissions will be judged on correctness, originality,
technical strength, significance and relevance to the conference, and interest
to the attendees.

A paper accepted for presentation at the workshop cannot be presented or have
been presented at any other meeting with publicly available published
proceedings. Papers that are being submitted to other conferences or workshops
must indicate this on the title page, as must papers that contain significant
overlap with previously published work.

Reviewing

The reviewing of the papers will be blind. Each submission will be reviewed by
at least three programme committee members.

Submission Information

Submissions should follow the two-column format of ACL proceedings and should
not exceed eight (8) pages, including references. We strongly recommend the use
of ACL-05 LaTeX style files or Microsoft Word Style files. A description of the
format is also available in case you are unable to use these style files
directly. Papers must
conform to the official ACL-05 style guidelines, and we reserve the right to
reject submissions that do not conform to these styles including font size

As reviewing will be blind, the paper should not include the authors' names and
affiliations. Furthermore, self-references that reveal the author's identity,
e.g., ''We previously showed (Smith, 1991) ...'', should be avoided. Instead,
use citations such as ''Smith previously showed (Smith, 1991) ...''. Papers that
do not conform to these requirements will be rejected without review.

Papers should be submitted electronically in BOTH Postscript and PDF format to:

The following identification information should be sent in a separate email with
the subject line ''ACL2005 WORKSHOP ID PAGE'':

Title: title of paper
Authors: list of all authors
Keywords: up to five topic keywords
Contact author: email address of author of record (for correspondence)
Abstract: abstract of paper (not more than 10 lines)

Notification of receipt will be emailed to the contact author.


Organisers

Timothy Baldwin
University of Melbourne, Australia

Anna Korhonen
University of Cambridge, UK
NII, Japan

Aline Villavicencio
University of Essex, UK


Programme Committee

Collin Baker (University of California Berkeley, USA)
Roberto Basili (University of Rome Tor Vergata, Italy)
Francis Bond (NTT, Japan)
Chris Brew (Ohio State University, USA)
Ted Briscoe (University of Cambridge, UK)
John Carroll (University of Sussex, UK)
Stephen Clark (University of Oxford, UK)
Sonja Eisenbeiss (University of Essex, UK)
Christiane Fellbaum (Princeton University, USA)
Frederick Fouvry (University of Saarland, Germany)
Sadao Kurohashi (University of Tokyo, Japan)
Diana McCarthy (University of Sussex, UK)
Rada Mihalcea (University of North Texas, USA)
Tom O'Hara (University of Maryland, Baltimore County, USA)
Martha Palmer (University of Pennsylvania, USA)
Massimo Poesio (University of Essex, UK)
Philip Resnik (University of Maryland, USA)
Patrick Saint-Dizier (IRIT-CNRS, France)
Sabine Schulte im Walde (University of Saarland, Germany)
Mark Steedman (University of Edinburgh, Scotland, UK)
Mark Stevenson (University of Sheffield, UK)
Suzanne Stevenson (University of Toronto, Canada)
Dominic Widdows (MAYA Design, Inc., USA)
Yorick Wilks (University of Sheffield, UK)
Dekai Wu (Hong Kong University of Science and Technology)

Message 2: ACL 2005 Workshop on Feature Engineering for Machine Learning in Natural Language Processing

Date: 04-Mar-2005
From: Eric Ringger
Subject: ACL 2005 Workshop on Feature Engineering for Machine Learning in Natural Language Processing

Full Title: ACL 2005 Workshop on Feature Engineering for Machine Learning in
Natural Language Processing
Short Title: Feat. Eng. for ML in NLP

Date: 29-Jun-2005 - 29-Jun-2005
Location: Ann Arbor, Michigan, United States of America
Contact Person: Eric Ringger

Linguistic Field(s): Computational Linguistics

Call Deadline: 20-Apr-2005

Meeting Description:


Feature Engineering for Machine Learning
in Natural Language Processing

Workshop at the Annual Meeting of
the Association for Computational Linguistics (ACL 2005)

Submission Deadline: April 20, 2005

Ann Arbor, Michigan
June 29, 2005

As experience with machine learning for solving natural language processing
tasks accumulates in the field, practitioners are finding that feature
engineering is as critical as the choice of machine learning algorithm, if not
more so. Feature design, feature selection, and feature impact (through
ablation studies and the like) significantly affect the performance of systems
and deserve greater attention. In the wake of the shift away from knowledge
engineering and of the successes of data-driven and statistical methods,
researchers in the field are likely to make further progress by incorporating
additional, sometimes familiar, sources of knowledge as features. Although some
experience in the area of feature engineering is to be found in the theoretical
machine learning community, the particular demands of natural language
processing leave much to be discovered.

This workshop aims to bring together practitioners of NLP, machine learning,
information extraction, speech processing, and related fields with the intention
of sharing experimental evidence for successful approaches to feature
engineering, including feature design and feature selection. We welcome papers
that address these goals. We also seek to distill best practices and to
discover new sources of knowledge and features previously untapped.

The workshop will include an invited talk by Andrew McCallum of the University
of Massachusetts at Amherst.


Submitted papers should be prepared in PDF format (all fonts included) or
Microsoft Word .doc format and not longer than 8 pages following the ACL style.
More detailed information about the format of submissions can be found here:

The language of the workshop is English. Submissions should be sent as an
attachment to the following email address: ringger AT microsoft DOT com . All
accepted papers will be presented in oral sessions of the workshop and collected
in the printed proceedings.

Submissions are invited on all aspects of feature engineering for machine
learning in NLP. Topics may include, but are not necessarily limited to:

- Novel methods for discovering or inducing features, such as mining the web for
closed classes, useful for indicator features.

- Comparative studies of different feature selection algorithms for NLP tasks.

- Interactive tools that help researchers to identify ambiguous cases that could
be disambiguated by the addition of features.

- Error analysis of various aspects of feature induction, selection, representation.

- Issues with representation, e.g., strategies for handling hierarchical
representations, including decomposing to atomic features or by employing
statistical relational learning.

- Techniques used in fields outside NLP that prove useful in NLP.

- The impact of feature selection and feature design on such practical
considerations as training time, experimental design, domain independence, and

- Analysis of feature engineering and its interaction with specific machine
learning methods commonly used in NLP.

- Combining classifiers that employ diverse types of features.

- Studies of methods for defining a feature set, for example by iteratively
expanding a base feature set.

- Issues with representing and combining real-valued and categorical features
for NLP tasks.


- Paper submission deadline: April 20, 2005; Noon, PST (GMT-8)

- Notification of acceptance: May 10, 2005

- Submission of camera-ready copy: May 17, 2005

- Workshop: June 29, 2005


Chair and contact person:

Eric Ringger
Microsoft Research
One Microsoft Way
Redmond, WA 98052 USA
ringger AT microsoft DOT com

Program Committee:

- Simon Corston-Oliver, Microsoft Research, USA
- Kevin Duh, University of Washington, USA
- Matthew Richardson, Microsoft Research, USA
- Oren Etzioni, University of Washington, USA
- Andrew McCallum, University of Massachusetts at Amherst, USA
- Dan Bikel, IBM Research, USA
- Olac Fuentes, INAOE, Mexico
- Chris Manning, Stanford University, USA
- Kristina Toutanova, Stanford University, USA
- Hideki Isozaki, NTT Communication Science Laboratories, Japan
- Caroline Sporleder, University of Edinburgh, UK
