LINGUIST List 17.807

Thu Mar 16 2006

FYI: Third SIGHAN Chinese Language Processing Bakeoff

Editor for this issue: Svetlana Aksenova <svetlanalinguistlist.org>

        1.    Gina-Anne Levow, Third SIGHAN Chinese Language Processing Bakeoff

Message 1: Third SIGHAN Chinese Language Processing Bakeoff
Date: 15-Mar-2006
From: Gina-Anne Levow <levowcs.uchicago.edu>
Subject: Third SIGHAN Chinese Language Processing Bakeoff

Call for Participation

The Third International Chinese Language Processing Bakeoff
Description and Important Dates

1. Introduction

This is the official announcement for the Third International Chinese
Language Processing Bakeoff, sponsored by the Special Interest Group for
Chinese Language Processing (SIGHAN) of the Association for Computational
Linguistics. The bakeoff will occur over the late spring of 2006 and the
results will be presented at the 5th SIGHAN Workshop, to be held at
ACL-COLING 2006 in Sydney, Australia, July 22-23, 2006.

The first bakeoff, held in 2003 and presented at the 2nd SIGHAN Workshop at
ACL 2003 in Sapporo, has become the pre-eminent measure for Chinese word
segmentation evaluation and has been cited in numerous papers. The second
bakeoff, held in 2005 and presented at the 4th SIGHAN Workshop at IJCNLP-05
on Jeju Island, Korea, demonstrated further progress in this task. In a
change from the first two evaluations, the third bakeoff will augment the
classic Word Segmentation task with a new Named Entity Recognition task.
Corpora from the following organizations will be available for use:

- Beijing University, China
- CKIP, Academia Sinica, Taiwan
- City University of Hong Kong, Hong Kong SAR
- Linguistic Data Consortium, United States
- Microsoft Research, China
- University of Pennsylvania and University of Colorado, Boulder, United States

The full details of the segmentation and named entity tagging tasks will be
made available through the registration site, which opens March 15, 2006.

Participants are required to submit a short paper describing their system
and analyzing their performance, and present a summary at the workshop. The
reports will be published in the SIGHAN workshop proceedings.

The language of the workshop is English. Papers must be submitted and
presented in English. Note that unlike the workshop proper, there will not
be a peer review process on the bakeoff reports.

2. Important Dates

2006-03-15 Registration Open
2006-04-17 Training data made available
2006-05-15 Testing data made available
2006-05-17 Test results due back to organizers
2006-05-19 Results privately reported to participants
2006-06-02 Final reports due from participants

3. Contact Information

The bakeoff is being organized by Gina-Anne Levow of the University of Chicago
and Olivia Oi Yee Kwong of City University of Hong Kong.

The web page for the competition is:


Questions on the bakeoff should be addressed to Gina-Anne Levow,

Linguistic Field(s): Computational Linguistics; Morphology; Text/Corpus Linguistics
