Date: 03-Oct-2011
From: Shu-Chuan Tseng <tsengscgate.sinica.edu.tw>
Subject: Mandarin Conversational Corpus Wordlist
The Mandarin Conversational Corpus Wordlist is generated from the transcripts of 30 free conversations between strangers, 29 topic-specific conversations between friends or family members, and 26 map task dialogues between friends or family members, recorded in Taiwan. The wordlist contains automatically segmented words together with their frequency, part of speech, and length in syllables. In total, it covers 405K word tokens from approximately 42 hours of recordings. You can download the wordlist at http://mmc.sinica.edu.tw/home_c.htm