Editor for this issue: <>
Does anyone know of a corpus of Arabic texts, or of an Arabic lexicon? I am preparing a thesis involving the parsing of Arabic, and would be interested to hear from anyone with experience in this field.

deroos@lew.rug.ac.be.bitnet
Hi! I'm a linguistics student working on a specific topic in Chinese morphology. For that, I want to use some basic ideas from object-oriented programming / Smalltalk, which originally came from the PARC research center at Xerox (that's what I heard about it) in the 60s. So what I need is some information about publications of this group at PARC, especially theoretical papers.

Thanks a lot for ANY help,
Alexander

Alexander Franz
Seminar fuer Allg. Sprachwissenschaft
Heinrich-Heine-Universitaet Duesseldorf
e-mail: franz@ze8.rz.uni-duesseldorf.de

"Phonology is the study of telephone etiquette." (in: Fromkin/Rodman)
i am preparing a proposal which needs to include substantial optical character recognition (OCR) scanning to capture printed textbook materials in a machine-readable format. i need some expertise on what expenses to plan for. the project will collect all reading assignments for students in a class and then scan in as much as we can per month for statistical analysis of the material.

as i understand the process, we scan text in a fixed font and then have to hand-correct errors in the range of 5-10 characters per 1000. this then needs a third pass to verify the corrections. the result is an error rate of about 1 character per 100K. the throughput rate is about the same as hiring a skilled secretary (55 wpm) to type in the text, reading from the hardcopy. (the secretary figure is without the dual reading needed for 1/100K verification.)

from this, can i estimate that 2 graduate students (20 hr per week each) could process about 200M per 9-month academic year? can i assume that a 486 cpu with 10M ram is an adequate engine for OCR? what scanner and software expenses should i request? hand-held or flatbed? i have heard that OCR is prone to mechanical downtime. would 2 sets of OCR hardware per cpu be adequate to keep the process moving?

now, how much of a wrinkle will it be to do this in russian (or other printed indo-european languages) rather than english?

stan
stankuli@UWF.bitnet

=== we all help each other get a little further down the road,
    or be damned for the fools that we are.
    -- the motorcycle modificationalist's motto
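
The throughput question in the OCR query above can be checked with a little arithmetic. What follows is only a back-of-envelope sketch, not a measured result: the 55 wpm rate, 2 students, and 20 hours per week come from the post, while the 6 characters per word (five letters plus a space), the 39 working weeks in a 9-month year, and the function name `chars_per_year` are assumptions introduced here for illustration. It also assumes the 55 wpm figure covers the whole scan-and-correct pipeline, as the post suggests.

```python
# Back-of-envelope estimate of yearly OCR throughput.
# Figures marked "assumption" are NOT from the original post.

WORDS_PER_MINUTE = 55   # from the post: comparable to a skilled typist
CHARS_PER_WORD = 6      # assumption: five letters plus one space
STUDENTS = 2            # from the post
HOURS_PER_WEEK = 20     # from the post, taken as hours per student
WEEKS_PER_YEAR = 39     # assumption: roughly 9 months of working weeks


def chars_per_year(wpm=WORDS_PER_MINUTE,
                   chars_per_word=CHARS_PER_WORD,
                   students=STUDENTS,
                   hours_per_week=HOURS_PER_WEEK,
                   weeks=WEEKS_PER_YEAR):
    """Return the estimated number of characters processed per year."""
    chars_per_hour = wpm * chars_per_word * 60
    total_hours = students * hours_per_week * weeks
    return chars_per_hour * total_hours


if __name__ == "__main__":
    total = chars_per_year()
    print(f"{total / 1e6:.1f} million characters per year")
```

Under these assumptions the estimate comes to about 31 million characters per academic year. If "200M" is read as 200 million characters, that is several times more than two half-time students could process at typist speed, so the staffing or the target would need adjusting.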