LINGUIST List 13.2812

Thu Oct 31 2002

FYI: Cantonese corpora

Editor for this issue: James Yuells


  1. Sam Po Law, Cantonese corpora

Message 1: Cantonese corpora

Date: Wed, 30 Oct 2002 00:50:16 +0000
From: Sam Po Law
Subject: Cantonese corpora

Dear colleagues,

	We are happy to announce that two corpora, the Hong Kong
Cantonese Adult Corpus (HKCAC) and the Hong Kong Corpus of Primary
School Chinese (HKCPSC), are now available on the web. The HKCAC was
supported by an RGC grant (#HKU 5190/98H) to Sam-Po Law, Suk-Yee Fung,
and Man-Tak Leung. It contains orthographic and phonetic
transcriptions of 8 hours of spoken Cantonese. The HKCPSC was
supported by the QEF grant (Project # 1999/1825) to Man-Tak Leung and
Wing-Yee Lee. The corpus contains linguistic analysis of 186,022
characters used by grade 1 to grade 5 students in Hong Kong. These
databases are intended to be useful research tools for linguists and
psycholinguists. To search for information in the corpora, please
	Thank you for your attention.

HKCAC is currently unavailable until further notice. 

Sam Po Law
Division of Speech and Hearing Sciences
University of Hong Kong 

Subject-Language: Hong Kong Cantonese; Code: YUH 