PC-KIMMO News
=============
May 20, 1991

This announcement describes recent developments related to PC-KIMMO (an
implementation for personal computers of Kimmo Koskenniemi's two-level
model of word production and recognition).

  (1) PC-KIMMO version 1.0.5 update
  (2) KGEN - a rule compiler (table generator) for PC-KIMMO
  (3) KTEXT - a text-processing application using the PC-KIMMO parser
  (4) recent articles related to PC-KIMMO

The software described below is made freely available to the academic
community for non-commercial use and redistribution. We invite your
feedback on these programs.

Please note that the software is packaged in compressed archives: Zip
files for MS-DOS and Stuffit files for Macintosh. In addition, if you
obtain the files by e-mail, they will arrive in encoded form: uu-encoding
for MS-DOS and Binhex format for Macintosh. Utility programs for handling
archives and encoded files are available from computer bulletin boards or
from your university computing center. (Hint for MS-DOS users: when you
unzip a file, use the -d option to preserve the subdirectories.)

Finally, it is possible that the files may not yet be available in some of
the places listed below. Just wait a few days and try again.

(1) PC-KIMMO 1.0.5 update

PC-KIMMO version 1.0.5 has been available since the end of February. It
fixes a problem with loading very large lexicons (more than 100
sublexicons). Thanks to Elizabeth Hinkelman and her colleagues for finding
this bug. This version also fixes a couple things that caused crashes on
the Macintosh. There are no functional changes in version 1.0.5.

If you want to upgrade to version 1.0.5, you can obtain it as follows:

  1. Obtain it via anonymous FTP from the following sources. (I am advised
     that it is best to use the symbolic names rather than the numeric
     addresses. Also, the directory structure is subject to change.)
       MS-DOS version:
         msdos.archive.umich.edu [141.211.165.34]
         msdos/linguistics/pckim105.zip

       Macintosh version:
         mac.archive.umich.edu [141.211.165.34]
         mac/etc/linguistics/pckim105.sit

  2. Request it from us via e-mail. Be *sure* to specify which version you
     want (DOS, Mac, UNIX).

  3. Send a diskette and a self-addressed, stamped diskette mailer to the
     address below. Be *sure* to specify which version you want (DOS, Mac,
     UNIX) and the disk format.

(2) KGEN

KGEN, a rule compiler for PC-KIMMO, is now available for beta testing.
KGEN was written by Nathan Miles of Ohio State University. All rights and
responsibilities pertaining to the program presently belong to Nathan
Miles (not to the Summer Institute of Linguistics). He can be reached by
e-mail at miles@cis.ohio-state.edu. Nathan has done a great job at
developing this program and he deserves our thanks.

KGEN takes a two-level rule like this:

     y:i => @:C___+:0

and translates it into a finite state table like this:

        @ y + @
        C i 0 @
     1: 2 0 1 1
     2: 2 3 2 1
     3. 0 0 1 0

KGEN accepts as input a file of two-level rules and produces as output a
file of state tables that is identical in format to PC-KIMMO's rules file.
Anything that KGEN does not correctly handle can be easily fixed by hand
in its output file.

Everyone who uses PC-KIMMO (or who doesn't use it because they don't want
to write tables by hand) is welcome to try out KGEN. But what we really
need are some beta testers who can compare KGEN's output to tables they
have written by hand. Let us know if you are willing to beta test KGEN for
us.

Presently KGEN runs only under MS-DOS and UNIX, but we hope to get it
compiled for the Macintosh soon (any Think C experts out there?). You can
obtain KGEN as follows.

  1. The MS-DOS version of KGEN is available via anonymous FTP from
     SIMTEL20:

       wsmr-simtel20.army.mil [192.88.110.20]
       pd1:<msdos.linguistics>kgen02.zip

     SIMTEL20 can also be accessed using LISTSERV commands from BITNET via
     LISTSERV@NDSUVM1, LISTSERV@RPIECS and in Europe from EARN TRICKLE
     servers (for example, FRMOP11 in France).

     You can also obtain files from SIMTEL20 by e-mail. Send this line as
     the only message to listserv@vm1.nodak.edu (1 = one) (this may not
     work outside the U.S.):

       /PDGET MAIL PD1:<MSDOS.LINGUISTICS>KGEN02.ZIP UUENCODE

     The MS-DOS version of KGEN is also available by anonymous FTP from:

       msdos.archive.umich.edu [141.211.165.34] (symbolic name recommended)
       msdos/linguistics/kgen02.zip

  2. The UNIX version (consisting of the source files which you must
     compile on your own machine) is available by anonymous FTP from the
     machine TUT:

       cis.ohio-state.edu [128.146.8.60]
       pub/kgen/kgen03.tar.Z

  3. Request KGEN from us via e-mail. Be *sure* to specify which version
     you want (DOS, UNIX).

  4. If all else fails, send a diskette and a self-addressed, stamped
     diskette mailer to the address below. Be *sure* to specify which
     version you want (DOS, UNIX) and the disk format.

(3) KTEXT

KTEXT is a new text-processing application that uses the PC-KIMMO parser.
It accepts as input a text in orthographic form, tokenizes it into words,
strips off and saves punctuation, capitalization, white space, and
formatting codes, parses each word, and outputs the result to a
quasi-database file with a record for each word. Its output data
structures are suitable for further processing by other programs, such as
a text interlinearizer, a syntactic parser, or a machine translation
system.

KTEXT is a beta test release that is distributed and supported by the
Summer Institute of Linguistics. It is available for MS-DOS, Macintosh,
and UNIX. You can obtain it as follows.

  1. The MS-DOS version of KTEXT is available from SIMTEL20 as (see above
     on how to access SIMTEL20 by FTP or e-mail):

       pd1:<msdos.linguistics>ktext093.zip

     It is also available via anonymous FTP from:

       msdos.archive.umich.edu [141.211.165.34] (symbolic name recommended)
       msdos/linguistics/kgen02.zip

  2.
     The Macintosh version of KTEXT is available via anonymous FTP from:

       mac.archive.umich.edu [141.211.165.34] (symbolic name recommended)
       mac/etc/linguistics/ktext094.sit

     It is also available via anonymous FTP from:

       sumex-aim.stanford.edu [36.44.0.6]
       /info-mac/app/ktext094.hqx

     You can also obtain files from SUMEX-AIM by e-mail. Send this line as
     the only message to listserv@ricevm1.rice.edu (1 = one) (this may not
     work outside the U.S.):

       $MACARCH GET /info-mac/app/ktext094.hqx

  3. Request KTEXT from us via e-mail. Be *sure* to specify which version
     you want (DOS, UNIX).

  4. If all else fails, send a diskette and a self-addressed, stamped
     diskette mailer to the address below. Be *sure* to specify which
     version you want (DOS, UNIX) and the disk format.

  5. To obtain the UNIX sources, please contact us at the address below.

(4) Recent articles related to PC-KIMMO:

Antworth, Evan L. 1991. Introduction to two-level phonology. Notes on
  Linguistics, 53:4-18. Dallas, TX: Summer Institute of Linguistics.

Antworth, Evan L. 1991. Glossing text with the PC-KIMMO morphological
  parser. (Manuscript submitted for publication)

Simons, Gary F. 1991. A two-level processor for morphological analysis.
  Notes on Linguistics, 53:19-27. Dallas, TX: Summer Institute of
  Linguistics.

Vanni, Michelle. 1990. Abstract of "PC-KIMMO: a two-level processor for
  morphological analysis." Georgetown Journal of Languages & Linguistics
  1.4:498-500.

Special requests for any of the software or articles described above
and/or requests for more information should be sent to:

  Evan Antworth
  Academic Computing Department
  Summer Institute of Linguistics
  7500 W. Camp Wisdom Road
  Dallas, TX 75236
  U.S.A.

  Internet: evan@txsil.sil.org    <-------- new address as of May 1991
  UUCP:     ...!uunet!convex!txsil!evan
  phone:    214/709-2418
  fax:      214/709-3387

From: Evan Antworth
To: linguist
Subject: new linguistics directory on SIMTEL20
Date: Mon, 20 May 91 8:55:15 CDT

There is a new directory on SIMTEL20 called PD1:<MSDOS.LINGUISTICS>. Two
programs that previously were in the education subdirectory have now been
moved to this new linguistics subdirectory; these are fonol400.zip and
pckimmo.zip. The directory also contains a couple new programs related to
PC-KIMMO. I hope that others will submit programs useful to linguists to
this new directory. (Files can be downloaded from SIMTEL20 by anonymous
FTP from wsmr-simtel20.army.mil [192.88.110.20].)

Evan Antworth
evan@txsil.sil.org    <------- new address as of May 1991