Summary Details


Query:   Sum: OCR software
Author:  David Beck
Linguistic Field(s):   Computational Linguistics

Summary:   A couple of weeks back I posted a query about OCR software for the Mac
that is trainable enough to be useful to a linguist scanning Latin- or
IPA-based non-English texts. Thanks to

Jakob Dempsey
Sarah Rilling
Michael Betsch
Andrew Arefiev
Marc Fryd
and Daniel Loehr

for their responses.

In the Mac world, it appears that the front-runner in this area is the
widely available OmniPage programme from Caere Corporation
(http://www.caere.com for info). It is apparently trainable, although
one respondent expressed some doubts about being able to train it to
handle more than a single special font. I should also mention that the
first sales rep I talked to about OmniPage seemed to think that it
might have trouble with the combinations of letters and diacritics
typical of IPA-based alphabets. However, the publicity literature on
the Web site seems to imply that it can be trained to recognize
combinations of separate characters, and the last sales rep I talked
to seemed confident that OmniPage could do the job.

Jakob Dempsey also mentioned an "expensive Kurzweil product" for the
Mac, but I haven't heard anything further about this.

I also got two responses that mentioned Windows-based applications
that are highly trainable. One is OPTOPUS, made by the German company
Makrolog in Wiesbaden; it is "exclusively trainable"--that is, it
needs to be trained from scratch and so can be configured to any
alphabet you like. The other is FineReader, from a Russian company
called Bit Software (www.bitsoft.ru). In addition to its wide range of
set alphabets for languages using both Latin and Cyrillic, the company
reports having successfully trained it to recognize Icelandic and
Tibetan fonts.

David Beck

======================================================================
David Beck
Department of Linguistics
Sixth Floor, Robarts Library
130 St. George St.
University of Toronto
Toronto, Ontario M5S 3H1
Canada
e-mail: dbeck@chass.utoronto.ca
phone: (416) 978-4029
(416) 923-2394 (home)
FAX: (416) 971-2688

LL Issue: 8.1135
Date Posted: 04-Aug-1997