LINGUIST List 20.1591

Fri Apr 24 2009

Calls: Computational Linguistics/Thailand

Editor for this issue: Elyssa Winzeler <elyssalinguistlist.org>


Directory
        1.    Thepchai Supnithi, Workshop on InterBEST 2009 Thai Word Segmentation

Message 1: Workshop on InterBEST 2009 Thai Word Segmentation
Date: 23-Apr-2009
From: Thepchai Supnithi <thepchainectec.or.th>
Subject: Workshop on InterBEST 2009 Thai Word Segmentation

Full Title: Workshop on InterBEST 2009 - Thai Word Segmentation
Short Title: InterBEST 2009

Date: 19-Oct-2009 - 19-Oct-2009
Location: Bangkok, Thailand
Contact Person: Thepchai Supnithi
Meeting Email: thepchainectec.or.th
Web Site: http://www.hlt.nectec.or.th/best/index.php

Linguistic Field(s): Computational Linguistics

Call Deadline: 30-Jun-2009

Meeting Description:

This workshop is the second event in the series of BEST (Benchmark for Enhancing
the Standard of Thai Language Processing), a series of contests on Thai language
processing, which is expected to help accelerate the progress of this
technology. The topic of the first contest, held in February 2009 as a special
topic in the 11th National Software Contest in Bangkok, Thailand, was Thai word
segmentation. The results of that contest set quite a high standard for
Thai word segmentation algorithms.

This workshop still focuses on the topic of Thai word segmentation as we believe
that it is possible to improve beyond the current level of accuracy. Workshop
participants will have an opportunity to work on the same task, developing a
Thai word segmentation algorithm using the provided training data. In addition
to the 5-million word training corpus released since the first contest, a
2-million word corpus in more diverse genres will be released. The test set will
consist of a different set of genres to evaluate the generality of a word
segmentation algorithm on a variety of text domains and styles. The submitted
algorithms will be evaluated with the same test set and the same scoring
program, thus allowing comparisons among various word segmentation algorithms.
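
As an illustration of how such a shared scoring program might work, the
following is a minimal sketch of word-level precision, recall, and F1 for
segmentation output. It is not the official InterBEST scorer, which this call
does not describe. The "|" delimiter and the function names word_spans and
score_segmentation are assumptions made for this example only.

```python
def word_spans(segmented, delimiter="|"):
    """Return the set of (start, end) character spans of the words in a
    delimited string. Offsets are counted on the text without delimiters."""
    spans = set()
    start = 0
    for word in segmented.split(delimiter):
        if not word:
            continue  # ignore empty tokens from leading, trailing, or doubled delimiters
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score_segmentation(reference, hypothesis, delimiter="|"):
    """Compare a system segmentation against a reference segmentation.

    A hypothesis word counts as correct only if both of its boundaries
    match a reference word. Returns (precision, recall, f1).
    """
    # Both strings must segment the same underlying text.
    if reference.replace(delimiter, "") != hypothesis.replace(delimiter, ""):
        raise ValueError("reference and hypothesis must cover the same text")

    ref = word_spans(reference, delimiter)
    hyp = word_spans(hypothesis, delimiter)
    correct = len(ref & hyp)

    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, with the reference "ฉัน|กิน|ข้าว" and the hypothesis "ฉัน|กินข้าว",
only the first word matches on both boundaries, so precision is 1/2, recall is
1/3, and F1 is about 0.4. Scoring every submission with one such function on
one test set is what makes the resulting numbers comparable.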

Another goal of this workshop is to provide a venue for researchers to share
their experience and discuss current obstacles and future directions of Thai
word segmentation and Thai language processing in general. Since the InterBEST
2009 workshop is co-located with SNLP2009, the Eighth International Symposium
on Natural Language Processing, participants will have an opportunity
to submit a paper and present their word segmentation algorithm and its
performance at the workshop. All contest procedures and guidelines will be
provided in English to reach more researchers in the international
community who may be interested in Thai language processing.

Call for Papers

Step 1
To participate in this workshop, please submit an extended abstract (1,000 words
maximum, references excluded) describing your word segmentation algorithm and
its result on the 100-thousand word initial test data. Acceptance is based on
both the validity of the algorithm and its performance. Please see the
download page
(http://www.hlt.nectec.or.th/best/index.php?option=com_content&task=viewid=13&Itemid=27)
for further information on how to download a 5-million word training corpus and
submit your result for evaluation. A corpus description, as well as the
guidelines established for word segmentation criteria, are also available for
downloading. Registration is required to download training/testing data and
upload test results for evaluation. Please register only one account for each
system submitted, and use this account when submitting both your test result
and your paper. Also note that permission to use the data is granted only for
non-commercial R&D.


Step 2
Authors of accepted abstracts will be notified and provided with an additional
2-million word training corpus to improve their systems. After the final test
set is released, participants will have one week to submit their final result
for evaluation, and an additional three weeks to submit a full paper describing
their algorithm and result.

Important Dates:
Release of 5-million word training data and 100-thousand word initial test data:
18 Mar 2009

Abstract submission deadline:
30 Jun 2009

Notification of acceptance and release of additional 2-million word training
data for accepted abstracts:
15 Jul 2009

Release of final test data:
17 Aug 2009

Test result submission deadline:
24 Aug 2009

Camera-ready submission deadline:
11 Sep 2009

Workshop day (co-located with SNLP):
19 Oct 2009