LINGUIST List 8.1135

Mon Aug 4 1997

Sum: OCR Software

Editor for this issue: Martin Jacobsen


  1. David Beck, Sum: OCR software

Message 1: Sum: OCR software

Date: Mon, 4 Aug 1997 16:15:02 -0400
From: David Beck
Subject: Sum: OCR software

A couple of weeks back I posted a query about OCR software for the Mac
that is trainable enough to be useful to a linguist scanning Latin or
IPA-based non-English texts. Thanks to

 Jakob Dempsey
 Sarah Rilling
 Michael Betsch
 Andrew Arefiev
 Marc Fryd
and Daniel Loehr

for their responses.

In the Mac world, it appears that the front-runner in this area is the
widely-available OmniPage programme from Caere Corporation
( for info). It is apparently trainable although
one respondent expressed some doubts about being able to train it to
handle more than a single special font. I should also mention that the
first sales rep I talked to about OmniPage seemed to think that it
might have trouble with the combinations of letters and diacritics
typical of IPA-based alphabets. However, the publicity
literature on the Web site seems to imply that it can be trained to
recognize combinations of separate characters and the last sales rep I
talked to seemed to think that there was no doubt that OmniPage could
do the job.

Jakob Dempsey also mentioned an "expensive Kurzweil product" for the
Mac, but I haven't heard anything further about this.

I also got two responses that mentioned Windows-based applications
that are highly trainable. One is a product called OPTOPUS, made by a
German company called Makrolog in Wiesbaden, which is "exclusively
trainable"--that is, it needs to be trained from scratch and so can be
configured to any alphabet you like. The other is by a Russian company
called Bit Software (; their programme is called
FineReader and in addition to having a wide range of set alphabets for
languages using both Latin and Cyrillic, they report having
successfully trained it to recognize Icelandic and Tibetan fonts).

David Beck
Department of Linguistics
Sixth Floor, Robarts Library
130 St. George St.
University of Toronto
Toronto, Ontario M5S 3H1
phone: (416) 978-4029
 (416) 923-2394 (home)
FAX: (416) 971-2688