LINGUIST List 18.3644

Wed Dec 05 2007

Confs: Computational Linguistics/India

Editor for this issue: Stephanie Morse <morselinguistlist.org>

        1.    Anil Kumar Singh, IJCNLP Workshop on Named Entity Recognition

Message 1: IJCNLP Workshop on Named Entity Recognition
Date: 05-Dec-2007
From: Anil Kumar Singh <anilresearch.iiit.ac.in>
Subject: IJCNLP Workshop on Named Entity Recognition
Full Title: IJCNLP Workshop on Named Entity Recognition
Short Title: NER-SSEAL-08

Date: 12-Jan-2008 - 12-Jan-2008
Location: Hyderabad, India
Contact: Anil Kumar Singh
Contact Email: anilresearch.iiit.ac.in
Meeting URL: http://ltrc.iiit.ac.in/ner-ssea-07

Linguistic Field(s): Computational Linguistics

Meeting Description:

The workshop is being held in conjunction with the Third
International Joint Conference on Natural Language Processing
(January 7-12, 2008), which is one of the major conferences
in NLP/CL. The workshop program is given below.

The papers to be presented at the workshop include both
regular research papers and papers from the Shared Task.
There will also be two invited talks by senior researchers who
have worked on the NER problem for South Asian languages.

IJCNLP 2008 Workshop on
Named Entity Recognition (NER) for South and South East Asian Languages

Friday, 12 January 2008
Hyderabad, India

Most of the South and South East Asian (SSEA) languages are
scarce in resources and tools and Named Entity Recognition (NER)
systems are no exception. It is very important that good systems
for NER be available, because many problems in information
extraction and machine translation (among others) are dependent
on accurate NER. However, the issues involved for these languages
differ significantly from those for European or even East Asian
languages. For example, these languages do not have capitalization,
which is a major feature for NER systems for European languages.

Another similarity among these languages is that many of them use
scripts of Brahmi origin. For some languages, there are additional
issues such as word segmentation (e.g. for Thai). Large gazetteers
are not available for most of these languages. Lack of
standardization and spelling variation add further problems.
The number of frequently used common nouns that can also be used
as names is very large for many of these languages, unlike European
languages, where a larger proportion of first names are not
used as common words. Lastly, and most importantly, there is
a serious lack of labeled data for machine learning.

Workshop Programme

Named Entity Recognition for South and South East Asian Languages:
Taking Stock

Anil Kumar Singh

Session 1

Invited Talk: Named Entity Recognition: Different Approaches

Sobha L

A Hybrid Approach for Named Entity Recognition in Indian Languages

Sujan Kumar Saha, Sanjay Chatterji, Sandipan Dandapat, Sudeshna Sarkar
and Pabitra Mitra

Session 2

Invited Talk: Multilingual Named Entity Recognition

Sivaji Bandyopadhyay

Aggregating Machine Learning and Rule Based Heuristics for Named
Entity Recognition

Karthik Gali, Harshit Surana, Ashwini Vaidya, Praneeth Shishtla
and Dipti Misra Sharma

Language Independent Named Entity Recognition in Indian Languages

Asif Ekbal, Rejwanul Haque, Amitava Das, Venkateswarlu Poka and
Sivaji Bandyopadhyay

Session 3

Named Entity Recognition for Telugu

Srikanth P and Narayana Murthy Kavi

Poster Display and Discussion

An experiment on automatic detection of Named Entity in Bangla

Bidyut Baran Chaudhuri and Suvankar Bhattacharya

A Hybrid Named Entity Recognition System for South Asian Languages

Praveen P and Ravi Kiran V

Named Entity Recognition for South Asian Languages

Amit Goyal

Named Entity Recognition for Indian Languages

Animesh Nayan, B. Ravi Kiran Rao, Pawandeep Singh, Sudip Sanyal
and Ratna Sanyal

Experiments in Telugu NER: A Conditional Random Field Approach

Praneeth Shishtla, Prasad Pingali, Vasudeva Varma and Karthik Gali

Session 4

Bengali Named Entity Recognition using Support Vector Machine

Asif Ekbal and Sivaji Bandyopadhyay

Domain focused Named Entity Recognizer for Tamil using Conditional
Random Fields

Vijayakrishna R and Sobha L

A Character n-gram Based Approach for Improved Recall in Indian
Language NER

Praneeth Shishtla, Prasad Pingali and Vasudeva Varma

Closing Discussion

For workshop-specific inquiries, please contact:

Anil Kumar Singh
Language Technologies Research Centre
IIIT, Hyderabad, India
Email: anilresearch.iiit.ac.in

For general inquiries (accommodations, etc.), please contact:

IJCNLP-08 Secretariat
International Institute of Information Technology
Gachibowli, Hyderabad 500 032, Andhra Pradesh, India
Tel: +91-40-2300 0646; Fax: +91-40-2300 0044
Email: ijcnlp08iiit.ac.in

