LINGUIST List 11.2637

Tue Dec 5 2000

Qs: Korean Corpus, Cyrillic OCR

Editor for this issue: Karen Milligan

We'd like to remind readers that the responses to queries are usually best posted to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST; so we would appreciate your cooperating with it whenever it seems appropriate.


  1. Elena Rudnitskaya, Korean Corpus
  2. JeffPower, Cyrillic OCR

Message 1: Korean Corpus

Date: Mon, 4 Dec 2000 19:25:46 +0300
From: Elena Rudnitskaya
Subject: Korean Corpus

Dear Linguists,

I am looking for a Korean corpus on the Internet. If you happen to know of any,
please let me know its web address.

Elena Rudnitskaya

Message 2: Cyrillic OCR

Date: Mon, 4 Dec 2000 15:19:03 EST
From: JeffPower
Subject: Cyrillic OCR

Ladies and gentlemen:

I am involved in a large project to develop, modify, or obtain an OCR
tool that can recognize strings of handwritten Cyrillic characters
recorded in columns on ledgers or logs. Each CD would hold the data for a
given year, arranged chronologically by type of event (i.e., birth, death,
divorce, marriage). The data would include given names, surnames, family
information, dates of birth and death, towns, etc. Such a tool would enable
users to search for their family surnames and towns for family research
purposes. The software would have to be capable of recognizing most
reasonably well-formed handwritten Cyrillic characters.

The data will be stored on CD-ROM after being transferred from microfilms of
paper records. I may have some influence over the preparation of the CDs,
so any suggestions for formatting requirements would be most helpful.

I would be interested in hearing from anyone who knows about these issues. What
can you tell me about the software used for these purposes, any of your own 
experiences with or knowledge of such projects, and any advice that you can 
give on how I should proceed?

Jeff Miller
Maryland, US