Editor for this issue: Marie Klopfenstein <marie
linguistlist.org>
Hello.

Some weeks ago, I posted a request for information on the frequency of Hangul characters in Korean text. This is a summary of the responses I received. In general, the consensus was that such research is rare or nonexistent, as it has less value in academia than in the commercial sector, where I work. However, here are some responses that helped me:

From Byong-seon Yang,

There is a published book on Hangul frequency, "Hangul Sayong Bindo-uy Bunsuk" (An Analysis of Korean Frequency: 한글사용빈도의 분석), which is written in Korean and published by the Korea Cultural Research Center, Korea University Press, Seoul. The book analyzes frequency by consonant and vowel (onset, coda), syllable, etc. Unfortunately, it is written in Korean. If you read Korean, it will be useful to you, since it is a kind of table organized by frequency count. The publisher's phone # is 82-2-3290-1610~8 (fax: 82-2-926-8385). If you need more help, please contact me.

- Byong-seon Yang, Ph.D.
Professor of English, Chair of Korean Studies
Jeonju University
Chonju, Korea 560-759
Tel) 82-63-220-2213 (Office) 82-63-226-3294 (H)
Fax) 82-63-224-9920
E-mail) bsyang www.jeonju.ac.kr

From Sean M. Witty,

Off the top of my head, I don't think any such documentation exists. If it does, the numbers must be staggering. Korean phonology is not as dynamic as that of English. Thus, there are fewer possible syllables, overall, available to the language (I have compiled a catalog). The total is further reduced because, although some syllables are possible according to the phonology, they simply aren't used by the language. Of those that are phonologically possible and used meaningfully, the pronunciation may vary depending on the phonetic environment (reducing the total possible number of syllables even further). The end result is a 5000+ year old language that uses a vocabulary based on a relatively small number of syllables.
This leads to each syllable having more than one meaning, sometimes as many as ten (thereby increasing the frequency of each). Take a common syllable like 가 (ka), which has several meanings and is also a case marker. The frequency of usage for this one syllable, either in terms of meaningfulness or daily usage, would be an extremely high number. This would also probably be true of almost every other syllable in the language.

From Hyeri Joo,

If you're interested in frequencies of Korean words or morphemes, go to the Web site <kibs.kaist.ac.kr>. The site is still under development, but it will be very helpful for you since you're a computational linguist.

And the most informative response was from Ivan A. Derzhanski, who sent me data from his own research on the subject:

My corpus consisted of 1 024 424 syllables' worth of newspaper text, mostly from the Daily Hankyoreh. There were 1526 different syllables found in the text, of the 2350 the KSC code caters for.

Derzhanski's data includes counts of how many times each Hangul syllable appeared in his corpus, as well as counts for onset, nucleus, and coda jamo. I include his signature information here in case anyone wishes to contact him about the data:

- <fa-al-_haylu wa-al-laylu wa-al-baydA'u ta`rifunI wa-as-sayfu wa-ar-rum.hu wa-al-qir.tAsu wa-al-qalamu>
(Abu t-Tayyib Ahmad Ibn Hussayn al-Mutanabbi)
Ivan A Derzhanski
http://www.math.bas.bg/~iad/
H: cplx Iztok bl 91, 1113 Sofia, Bulgaria <iad math.bas.bg>
W: Dept for Math Lx, Inst for Maths & CompSci, Bulg Acad of Sciences

Thanks to everyone who responded to my posting, and thanks especially to Ivan Derzhanski for sharing his data.

- Tim Mills
- Zi Corporation
- --------------------------------------------
Tim Mills, Computational Linguist
Zi Corporation
Suite 300, 500 - 4 Avenue SW
Calgary, Alberta
Canada T2P 2V6
Main: (403) 233.8875
Direct: (403) 231.4591
Fax: (403) 231.4595
E-mail: tmills zicorp.com
Website: www.zicorp.com
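
For readers who want to produce counts of the kind Derzhanski describes (per-syllable frequencies plus onset, nucleus, and coda jamo frequencies), here is a minimal sketch in Python. It is not Derzhanski's method, which the responses do not describe. It assumes the text is Unicode with precomposed Hangul syllables (U+AC00 to U+D7A3) and relies on the arithmetic layout of that block in the Unicode Standard (19 onsets x 21 nuclei x 28 codas, where coda index 0 means "no coda"). The function names and the sample string are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Unicode Hangul Syllables block, laid out as
# 19 onsets x 21 nuclei x 28 codas (coda index 0 = no coda).
HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3
N_NUCLEI = 21
N_CODAS = 28

# First code points of the conjoining jamo ranges.
ONSET_BASE = 0x1100    # choseong U+1100..U+1112
NUCLEUS_BASE = 0x1161  # jungseong U+1161..U+1175
CODA_BASE = 0x11A7     # jongseong start at U+11A8; index 0 is "no coda"


def decompose(syllable):
    """Split one precomposed Hangul syllable into (onset, nucleus, coda).

    Returns None for any character outside the Hangul Syllables block.
    The coda is the empty string when the syllable has no final consonant.
    """
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        return None
    index = code - HANGUL_BASE
    onset_i, rest = divmod(index, N_NUCLEI * N_CODAS)
    nucleus_i, coda_i = divmod(rest, N_CODAS)
    onset = chr(ONSET_BASE + onset_i)
    nucleus = chr(NUCLEUS_BASE + nucleus_i)
    coda = chr(CODA_BASE + coda_i) if coda_i else ""
    return onset, nucleus, coda


def count_frequencies(text):
    """Count syllables and their onset, nucleus, and coda jamo in text.

    Non-Hangul characters (spaces, Latin letters, punctuation) are skipped.
    Syllables without a coda are counted under the empty-string key.
    """
    syllables, onsets, nuclei, codas = Counter(), Counter(), Counter(), Counter()
    for ch in text:
        parts = decompose(ch)
        if parts is None:
            continue
        onset, nucleus, coda = parts
        syllables[ch] += 1
        onsets[onset] += 1
        nuclei[nucleus] += 1
        codas[coda] += 1
    return syllables, onsets, nuclei, codas


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any corpus.
    sample = "한글 한글 가 abc"
    syllables, onsets, nuclei, codas = count_frequencies(sample)
    print("distinct syllables:", len(syllables))
    print("most common syllables:", syllables.most_common(3))
```

On a real corpus, `len(syllables)` would give the number of distinct syllables (the figure Derzhanski reports as 1526), and `sum(syllables.values())` the total syllable count. Text in a legacy encoding such as the KSC code mentioned above would need to be decoded to Unicode first.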