LINGUIST List 3.642

Fri 21 Aug 1992

Qs: Arabic, Object-Oriented Programming, OCR

Editor for this issue: <>


Directory

  1. , Arabic
  2. Alex Franz, Chinese Morphology and Object Oriented Programming
  3. stan kulikowski ii, proposal estimates for OCR scanning?

Message 1: Arabic

Date: Wed, 19 Aug 92 15:59 N
From: <DEROOSlew.rug.ac.be>
Subject: Arabic

Does anyone know of a corpus of ARABIC texts?
Also an ARABIC lexicon.
I am preparing a thesis involving the parsing of arabic,
and would be interested to hear of anyone having
experience in this field.

derooslew.rug.ac.be.bitnet

Message 2: Chinese Morphology and Object Oriented Programming

Date: Thu, 20 Aug 92 20:53:52 CE
From: Alex Franz <FRANZze8.rz.uni-duesseldorf.de>
Subject: Chinese Morphology and Object Oriented Programming

Hi!

I'm a student of linguistics working on a specific topic in Chinese
morphology. For that, I want to use some basic ideas of Object-Oriented
Programming / Smalltalk, which originally came from PARC (Xerox's Palo
Alto Research Center) in the 60s (that's what I heard about it).

So what I need is some information about publications of this group
at PARC, especially theoretical papers.

Thanks a lot for ANY help,

Alexander

Alexander Franz
Seminar fuer Allg. Sprachwissenschaft
Heinrich-Heine-Universitaet
Duesseldorf
e-mail: franzze8.rz.uni-duesseldorf.de

"Phonology is the study of telephone etiquette." (in: Fromkin/Rodman)

Message 3: proposal estimates for OCR scanning?

Date: Thu, 20 Aug 92 16:19:56 CD
From: stan kulikowski ii <STANKULIUWF.BITNET>
Subject: proposal estimates for OCR scanning?


 i am preparing a proposal which needs to include substantial optical
character recognition (OCR scanning) to capture printed textbook materials into
a machine-readable format. i need some expertise on what expenses to plan for.
the project will collect all reading assignments for students in a class and
then scan in as much as we can per month for statistical analysis of the
material.

 as i understand the process, we scan text in a fixed font and then have to
hand-correct errors in the range of 5-10 characters per 1000. this then needs
a third pass to verify the corrections. the result is an error rate of about
1 character per 100K. the throughput rate is about the same as hiring a
skilled secretary (55 wpm) to type in the text, reading from the hardcopy
(the secretary's typing is without the dual reading needed for 1/100K
verification). from this, can i estimate that 2 graduate students (20 hr per
week) could process about 200M per 9 month academic year?
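
The estimate above can be checked with a short back-of-the-envelope calculation. The sketch below takes the 55 wpm rate, the 2 students, the 20 hours per week, and the 5-10 errors per 1000 characters from the posting. The 6 characters per word (5 letters plus a space) and the 39 working weeks in a 9 month year are assumptions, as is reading "20 hr per week" as hours per student.

```python
# Back-of-the-envelope throughput check for the figures in the posting.
WPM = 55               # from the posting: skilled-typist rate
CHARS_PER_WORD = 6     # assumption: 5 letters plus one space
STUDENTS = 2           # from the posting
HOURS_PER_WEEK = 20    # from the posting, read here as hours per student
WEEKS_PER_YEAR = 39    # assumption: 9 months at about 4.33 weeks each


def chars_per_year(wpm=WPM, chars_per_word=CHARS_PER_WORD,
                   students=STUDENTS, hours_per_week=HOURS_PER_WEEK,
                   weeks=WEEKS_PER_YEAR):
    """Total characters processed in one academic year."""
    chars_per_hour = wpm * chars_per_word * 60
    return chars_per_hour * students * hours_per_week * weeks


def corrections_needed(total_chars, errors_per_1000):
    """Expected number of hand corrections at a given raw OCR error rate."""
    return total_chars * errors_per_1000 / 1000


if __name__ == "__main__":
    total = chars_per_year()
    print(f"{total:,} characters per year (about {total / 1e6:.1f}M)")
    for rate in (5, 10):
        fixes = corrections_needed(total, rate)
        print(f"at {rate} errors per 1000: {fixes:,.0f} corrections")
```

Under these assumptions the yearly total comes to roughly 31M characters, so 200M would need about six times the throughput (more staff, more hours, or a faster process than typing speed). Changing the assumed characters per word or weeks per year shifts the total proportionally.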

 can i assume that a 486 cpu with 10M ram is adequate engine for OCR? what
scanner and software expenses should i request? hand-held or flatbed? i have
heard that OCR is prone to mechanical downtime. would 2 sets of OCR hardware
per cpu be adequate to keep the process moving?

 now, how much of a wrinkle will it be to do this in russian (or other printed
indo-european languages) rather than english?
 stan

 stankuliUWF.bitnet
 .
 === we all help each other get a little further down the road,
 : : or be damned for the fools that we are.
 --- -- the motorcycle modificationalist's moto