LINGUIST List 18.2136

Sat Jul 14 2007

Calls: Computational Ling/India

Editor for this issue: Ania Kubisz <anialinguistlist.org>

        1.    Anil Kumar Singh, IJCNLP Workshop on Named Entity Recognition

Message 1: IJCNLP Workshop on Named Entity Recognition
Date: 14-Jul-2007
From: Anil Kumar Singh <anilresearch.iiit.ac.in>
Subject: IJCNLP Workshop on Named Entity Recognition
Full Title: IJCNLP Workshop on Named Entity Recognition
Short Title: NER-SSEAL-08

Date: 12-Jan-2008 - 12-Jan-2008
Location: Hyderabad, A.P., India
Contact Person: Anil Kumar Singh
Meeting Email: anilresearch.iiit.ac.in
Web Site: http://ltrc.iiit.ac.in/ner-ssea-07

Linguistic Field(s): Computational Linguistics

Call Deadline: 21-Sep-2007

Meeting Description

Papers are invited on substantial, original, and unpublished research on
all aspects of Named Entity Recognition (NER) for South and South East
Asian (SSEA) languages. At least one of the languages considered should be
an SSEA language. We also invite researchers to be contestants in a shared
task (the second track of the workshop) on NER for SSEA languages.

Call for Papers

IJCNLP 2008 Workshop on Named Entity Recognition for South and South East
Asian Languages


Background and Motivation

Most SSEA languages are scarce in resources and tools, and NER systems are
no exception. Good NER systems are important because many problems in
information extraction and machine translation (among others) depend on
accurate NER. However, the issues involved for these languages differ
significantly from those for European or even East Asian languages. For
example, these languages do not have capitalization, which is a major
feature for NER systems for European languages. Another similarity among
these languages is that most of them use scripts of Brahmi origin. For some
languages, there are additional issues like word segmentation (e.g. for
Thai). Large gazetteers are not available for most of these languages.
There is also the problem of lack of standardization and spelling
variation. In many of these languages, a very large number of frequently
used words can also serve as names, whereas in European languages a larger
proportion of first names are not used as common words. Most importantly,
there is a serious lack of labeled data for machine learning.


This workshop will be the second stage of an annual event, the NLPAI
Machine Learning Contest, which focuses on the application of machine
learning techniques to one major NLP problem every year. This year the
problem was NER. However, unlike that event, this workshop will have one
track for regular research papers on NER for SSEA languages and a second
track along the lines of a shared task.

Shared Task

In the shared task, contestants with their own NER systems will be given
some annotated test data. The participating systems will be ranked
according to their performance on the test data. There may or may not be
training data for a particular language. In either case, the contestants
will be free to use any technique for NER, e.g. a purely rule-based
technique or a purely statistical technique.

At present some data is available for Hindi, Bengali and Telugu for the
shared task. Other languages can be included in the contest provided data
for them becomes available. The data released for the shared task will be
made accessible to all researchers, not just the participants.

If the language you are interested in has not been included in the shared
task, you can also prepare the annotated test data and submit it to us. We
will then include that language in the shared task.

The task in this contest will differ in one important way: the NER systems
also have to identify nested named entities. For example, in the sentence
'The Lal Bahadur Shastri National Academy of Administration is located in
Mussoorie', 'Lal Bahadur Shastri' is a Person, but 'Lal Bahadur Shastri
National Academy of Administration' is an Organization. In this case, the
NER systems will have to identify both 'Person' and 'Organization' in the
given sentence.
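
As a rough illustration of what nested entities look like as data, the
sketch below stores each entity as a token span with a label and lists the
entities that fall inside another one. The span representation, the Entity
class, and the helper names are assumptions made for this example only;
the call does not describe the annotation format used in the shared task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int   # index of the first token (inclusive)
    end: int     # index one past the last token (exclusive)
    label: str   # entity type, e.g. "Person" or "Organization"


def span_text(tokens, entity):
    """Return the surface string covered by an entity span."""
    return " ".join(tokens[entity.start:entity.end])


def nested_pairs(entities):
    """Return (outer, inner) pairs where inner lies within outer's span."""
    pairs = []
    for outer in entities:
        for inner in entities:
            if outer == inner:
                continue
            if outer.start <= inner.start and inner.end <= outer.end:
                pairs.append((outer, inner))
    return pairs


# Example sentence from the call, split on whitespace (an assumed
# tokenization, used here only for illustration).
TOKENS = ("The Lal Bahadur Shastri National Academy of Administration "
          "is located in Mussoorie").split()
PERSON = Entity(1, 4, "Person")
ORGANIZATION = Entity(1, 8, "Organization")

for outer, inner in nested_pairs([PERSON, ORGANIZATION]):
    print(f"{inner.label} '{span_text(TOKENS, inner)}' is nested inside "
          f"{outer.label} '{span_text(TOKENS, outer)}'")
```

A flat one-label-per-token scheme cannot express both entities at once,
because the tokens 'Lal Bahadur Shastri' would need two labels. Storing
spans independently, as above, is one possible way around that.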


Paper submission is through the centralized workshop submission page at
https://www.softconf.com/ijcnlp/NERSSEAL. Papers have to be written in
English. Note that shared task contestants also have to submit a paper
describing their method and results. Long or short papers can be submitted
to either track. Long papers can be up to 8 pages, while the maximum length
for short papers is 5 pages (including references, figures, tables etc.).
All selected papers will be published in the workshop proceedings.

The papers should be formatted using the LaTeX styles or MS Word templates
recommended for the main IJCNLP conference. These documents are available
at http://www.ijcnlp2008.org/callforpapers.htm. Reviewing will be blind.
As far as possible, draft papers should not contain any information that
could identify the authors.

Important Dates

- Paper Submission Deadline: Sept 21, 2007
- Notification of Paper Acceptance: Oct 26, 2007
- Camera Ready Submission Deadline: Nov 16, 2007

Program Committee

Rajeev Sangal, IIIT, Hyderabad, India
Dekai Wu, The Hong Kong University of Science & Technology, Hong Kong
Ted Pedersen, University of Minnesota, USA
Dipti Misra Sharma, IIIT, Hyderabad, India
Virach Sornlertlamvanich, TCL, NICT, Thailand
M. Sasikumar, CDAC, Mumbai, India
Sudeshna Sarkar, Indian Institute of Technology, Kharagpur, India
Thierry Poibeau, CNRS, France
Sobha L., AU-KBC, Chennai, India
Tzong-Han Tsai, National Taiwan University, Taiwan
Prasad Pingali, IIIT, India
Canasai Kreungkrai, NICT, Japan
Manabu Sassano, Yahoo Japan Corporation, Japan
Anil Kumar Singh, IIIT, Hyderabad, India
Doaa Samy, Universidad Autonoma de Madrid, Spain
Ratna Sanyal, Indian Inst. of Inf. Tech., Allahabad, India
V. Sriram, IIIT, Hyderabad, India
Anagh Kulkarni, Carnegie Mellon University, USA
Soma Paul, IIIT, Hyderabad, India

Contact Persons

Dipti Misra Sharma, Rajeev Sangal, Anil Kumar Singh
Language Technologies Research Centre
International Institute of Information Technology
Gachibowli, Hyderabad, India

Phone: 91-9391008624
Fax: 91-40-23001413
Email: diptiiiit.ac.in, sangaliiit.ac.in, anilresearch.iiit.ac.in

