Editor for this issue: Ann Dizdar <ann@linguistlist.org>

An acquaintance of my daughter's writes:

===================================
Identify this language please?

"Idolem urodo iatu a wi rot
Ukufu kush onuoy nehawuoch
Etia di ukoik ura nakurah
Enadu yoimi nnesar urugem
Eteako ich atak
Ureatu tso oodah
Amia wibo koro yonneie"

I think I have a pretty good idea of what languages this is *not* (not a Romance language, not Germanic, not Slavic, not Chinese, Japanese, Vietnamese...). Also, if it translates to something really corny, lemme know so I can stop embarrassing myself every time I sing it.
===================================

Please respond to me. I will forward replies to the inquirer and summarize to the list. Thank you for any help.

Mark A. Mandel : Senior Linguist : mark@dragonsys.com
Dragon Systems, Inc. : speech recognition : +1 617 965-5200
320 Nevada St., Newton, MA 02160, USA : http://www.dragonsys.com/
Personal home page: http://world.std.com/~mam/

Has anyone on the List compiled or worked with STUDENT corpora?

I am in the process of putting together a corpus of Chinese college students' unedited writings in English. The purpose is to analyze this corpus afterwards, with a concordancer and other programs, and obtain quantitative information about the extent of some characteristic errors and other non-native-speaker word usage in their writing. This information can be very valuable in determining syllabuses and directions in secondary school English instruction.

The corpus is planned to be about 300,000 words, consisting of 800-1000 written assignments, each between 150 and 400 words long, typed and saved as text files. About one third of these assignments have already been typed (entered).

So far I haven't used any other STUDENT corpus, from any country. So my question is: are there any STANDARDS, that is, generally used or accepted electronic formats, in which these corpora are compiled, saved, and prepared for use by others?

Below I briefly describe how the corpus is being compiled. I will be very grateful for suggestions or comments on whether this approach is acceptable or whether changes should be made to comply with accepted formats.

- Each piece is typed in Word 6.0 (under Windows 3.1), using a fixed-space font, making each line about 70 characters long, typing the unedited, uncorrected text (only obvious spelling and punctuation mistakes made by the students are corrected).
- An 8-12 character code (number) is typed in the first line. Then one line is skipped, and the heading (headline) of the piece, as written by the student, is typed.
- Paragraphing follows the original, with blank lines between the paragraphs.
- Before saving the text, possible spelling and other errors made in the typing process are checked and corrected using Word's spell checker.
- Then each piece is saved as a "text only with line breaks" file and given a file name (number).
- All these files are placed in one directory and backed up to prevent accidental erasure.
- Using a simple merger application, the files are merged.

So far, I have tried loading a consolidated long file, comprising about 350 pieces of writing and about 120,000 words, into a concordancer (WordSmith Tools), and there seem to be no problems. Would files compiled this way ALSO be USABLE in other concordancers or text processing and analysis programs?

Please send your comments either to the List or to me. I will summarize any replies sent to me and post the summary to the List.

I would also be very happy to eventually make this corpus available to anyone interested in using it, or to exchange it for similar learner corpora on file, based on the writings of other Chinese or Japanese students, or of English-learning college students in any country.

Best to all,
Colman Bernath

-----------------
Colman Bernath
c/o Department of English
Soochow University, Taipei, TAIWAN
colber@mbm1.scu.edu.tw
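
For readers wondering whether files laid out as in the query above can be read by tools other than a concordancer, here is a minimal Python sketch of a reader for one piece. It is an illustration, not the inquirer's merger application or any accepted standard. The names `Piece`, `parse_piece`, and `word_count` are hypothetical, and the sketch assumes a blank line also follows the heading, which the description does not state explicitly.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Piece:
    code: str              # the 8-12 character code typed on the first line
    heading: str           # the headline as written by the student
    paragraphs: List[str]  # body paragraphs, hard line breaks folded to spaces


def parse_piece(text: str) -> Piece:
    """Parse one piece: code line, blank line, heading, blank-line-separated paragraphs.

    Assumption (not stated in the query): the heading is followed by a blank line.
    """
    # Normalize Windows line endings, then split on runs of blank lines.
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    # Fold the hard line breaks inside each block and drop empty blocks.
    blocks = [" ".join(b.split()) for b in blocks if b.strip()]
    if len(blocks) < 2:
        raise ValueError("expected at least a code line and a heading")
    return Piece(code=blocks[0], heading=blocks[1], paragraphs=blocks[2:])


def word_count(piece: Piece) -> int:
    """Count whitespace-separated tokens in the body (heading excluded)."""
    return sum(len(p.split()) for p in piece.paragraphs)
```

A reader like this could be used to check that each piece falls within the planned 150-400 word range before the files are merged. Counting whitespace-separated tokens is a simplification, and a concordancer may count words differently.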

Does anyone have (a published version of) the following two papers, both of which are cited in L. Rizzi's (1990) _Relativized Minimality_?

Carstens, V., and Kinyalolo. 1989. Agr, Tense, Aspect and the IP Structure: Evidence from Bantu. Paper presented at GLOW Conference, Utrecht.

Schneider-Zioga, P. 1987. Syntax Screening. Paper, USC, Los Angeles.

Please contact me at the address below.

Hiroyuki Tanaka
Department of English Linguistics,
Faculty of Letters, Osaka University.
e-mail: htanaka@osk.threewebnet.or.jp