I have uploaded today, 25 May 1994, at 14:25 Melbourne local time, file
GLOTTO02.ZIP into the /pc/incoming directory at garbo.uwasa.fi. By the
time you read this message it will probably have already been moved to
directory /pc/linguistics, where it will replace GLOTTO01.ZIP.

File: glotto02.zip
One line description: Language classification and simulation
Suggested Garbo directory: /pc/linguistics
Replaces: /pc/linguistics/glotto01.zip
Uploader name & email: Jacques B.M. Guy -- j.guy@trl.oz.au
Author or company: Jacques B.M. Guy
Email address: j.guy@trl.oz.au
Surface address: Telecom Research Laboratories
                 PO Box 249
                 Clayton 3168
                 Australia
Special requirement: nil
Shareware payment from private users: no
Shareware payment required from corporate users: no
Distribution limitations: nil
Demo: no
Nagware: no
Self-documenting: no
External documentation included: yes (220K)
Source included: no
Size: 173k compressed, 394k expanded

10-line description:

This package, consisting of six programs, twenty sample data files,
and two documentation files, lets you:

1. Classify languages from sample wordlists, with or without
   identifying cognates.
2. Classify languages from existing tables of cognate percentages.
3. Generate whole language families to test the validity and accuracy
   of any classification method relying on sample vocabulary lists or
   on proportions of shared cognates.

Detailed description:

Name            Size  Contents

GLOTTO.DOC    111028  Documentation file.
GLOTED.EXE     32832  Program for typing existing tables of cognate
                      percentages into computer files.
GLOTTREE.EXE   24144  Program for reconstructing the genealogical
                      trees of language families from tables of
                      cognate percentages.
GLOTLPP.EXE    22544  Program for computing tables of
                      lexicophonological percentages directly from
                      wordlists. Those tables can be used instead of
                      cognate percentages.
GLOTPC.EXE     22384  Program for computing cognate percentages from
                      files of identified cognate groups.
GLOTMRG.EXE    12768  Program for merging wordlists, listing them not
                      by language, but by list item. Useful for
                      identifying cognates by hand.
GLOTSIM.EXE    35934  Program for simulating the evolution and
                      diversification of the vocabularies of language
                      families. Useful for testing the validity and
                      accuracy of various reconstruction methods.
GLOTTO.TXT    104202  "On Glottochronology and Lexicostatistics"
                      XVth Pacific Science Congress, Dunedin,
                      New Zealand, 1983.
VANUATU.PC       239  Percentages of cognates shared by eight
                      languages of Vanuatu, formerly New Hebrides.
VANUATU.SIM      442  Description of the evolution and diversification
                      of a language family. Running GLOTSIM with
                      VANUATU.SIM as input generates a language family
                      with lexicostatistical properties closely
                      mimicking those of the real languages in
                      VANUATU.PC.
UTOAZTEC.PC     2487  Percentages of cognates shared by 32 Uto-Aztecan
                      languages (from W.R. Miller's "The
                      Classification of the Uto-Aztecan Languages
                      Based on Lexical Evidence", IJAL vol.50, no.1,
                      January 1984, pp.1-24).
UTOAZTEC.SIM    2193  Description of the evolution and diversification
                      of a language family mimicking the
                      lexicostatistical properties of the languages
                      of UTOAZTEC.PC.
DUTCH.VOC       1324
ENGLISH.VOC     1321
GERMAN.VOC      1451
YIDDISH.VOC     1442
DANISH.VOC      1291  Sixteen languages each represented by a 200-item
NORWEGIA.VOC    1277  wordlist, for testing and experimenting.
SWEDISH.VOC     1309  Selected from Peter Bergman's "The Concise
FRENCH.VOC      1412  Dictionary of 26 Languages in Simultaneous
ITALIAN.VOC     1507  Translation", Signet Books, 1968.
SPANISH.VOC     1486
PORTUGUE.VOC    1468
RUMANIAN.VOC    1539
POLISH.VOC      1607
CZECH.VOC       1486
RUSSIAN.VOC     1501
SERBO-CR.VOC    1405

What is New in this Version
===========================

1. I have finally located a PC with a monochrome monitor and found
   that program GLOTED did not work at all on such a PC. I have
   corrected the error and it now works.

2. GLOTED has commands such as Alt-X for "exit", but if you are using
   a foreign-language keyboard you might well find that pressing Alt-X
   does nothing of the kind. I have added a command (F9) which causes
   GLOTED to ask you a few questions so that it may adapt to your
   keyboard.

3. You no longer have to identify cognates to classify languages from
   wordlists. Program GLOTLPP computes similarity measures directly
   from wordlists, and these can be used instead of cognate
   percentages. The process is much faster than cognate recognition.
   On a 386DX-33 without a math co-processor GLOTLPP took 20 seconds
   to process the sixteen 200-item wordlists provided as examples.
   On a 486DX-50 it took just under 9 seconds. Computing time is, very
   roughly, proportional to the number of items in the sample wordlist
   and to the square of the number of languages. Since the values
   computed by GLOTLPP are measures of phonological as well as lexical
   similarity, I would argue that, theoretically, they ought to give
   better classifications than cognate percentages proper.

4. In the first version of GLOTTO, if you wanted to classify languages
   from wordlists in the traditional way, you had to type those
   wordlists into computer files, merge them by item (using program
   GLOTMRG), insert your identification of cognates by hand, and
   finally run GLOTPC on that file to produce cognate percentages.
   This had two great disadvantages:

   a. Since GLOTTO allows handling up to 180 wordlists of up to 2000
      items each, the resulting file could be too large for many
      editors to handle. GLOTMRG now produces not one single file, but
      as many as necessary so that none is larger than 64K. GLOTPC has
      been modified to accept the new output from GLOTMRG.

   b. Typing wordlists into computer files is very time-consuming, and
      you might have preferred to type in only the cognate groups,
      working directly from printed or handwritten wordlists. You can
      now do so. GLOTPC has been modified to accept this type of input
      data as well.

5. I have added an option to GLOTSIM which lets you specify unequal
   retention rates for lexical items, so that you may investigate
   their effects.

6. GLOTSIM now records the vocabularies of the languages it creates in
   formats compatible with the other programs in the package (GLOTMRG,
   GLOTLPP and GLOTPC).

7. I have added to this package, in file GLOTTO.TXT, the text of my
   paper "On Glottochronology and Lexicostatistics", which was
   presented in 1983 at the XVth Pacific Science Congress. It
   critically discusses the main contributions to the topic from
   Swadesh 1950 to Blust 1981.
   Amongst other things it shows how Lees (1953) misinterpreted his
   data as evidence for a universal constant of lexical retention rate
   -- when it was evidence to the contrary -- and how Blust's findings
   on the retention rates of Austronesian languages (1981) were
   corroborated by Dyen's independent observations presented on the
   same occasion (Third International Conference on Austronesian
   Linguistics, Denpasar, Indonesia). What prompted me to include it
   in this package was its conclusion:

      In which light we can only conclude that the present study is
      unlikely to have much impact, and that misuses of
      lexicostatistical data will continue as in the past for many
      years to come, perhaps even increasing with the easier and
      easier availability of cheap, high-speed computational
      facilities.

   Now, ten years later, not only have I indeed observed a resurgence
   of interest in glottochronology, but the model, with all its false
   assumptions, is even being reinvented in biology: viz. the late
   Allan Wilson's "biological clock", which is nothing but the notion
   of a universal, constant rate of change translated into genetics,
   even though it has long been observed by geneticists to be contrary
   to fact.

8. GLOTPC has been enhanced to let you compute pseudo-cognate
   percentages from biological data, should you want to try
   reconstructing genetic trees without resorting to the "biological
   clock".

9. I have corrected an error in GLOTTREE, which sometimes caused a
   branch showing no replacements to be malformed.
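
Illustration: cognate percentages from cognate groups
=====================================================

For readers unfamiliar with the cognate percentages mentioned
throughout this announcement, the following is a minimal sketch in
Python of how such a table might be computed once cognate groups have
been identified. It is not GLOTPC's algorithm or file format, neither
of which is described here. The function name, the data layout (one
cognate-group label per list item, None for a missing item), and the
three sample languages are all hypothetical.

```python
from itertools import combinations


def cognate_percentages(cognate_sets):
    """Return {(lang_a, lang_b): percentage of shared cognates}.

    cognate_sets maps a language name to a list of cognate-group
    labels, one per wordlist item. Two languages share a cognate for
    an item when their labels are equal. Items missing (None) in
    either language are left out of the comparison.
    (Hypothetical layout, assumed for this illustration only.)
    """
    table = {}
    for a, b in combinations(sorted(cognate_sets), 2):
        # Keep only the items attested in both languages.
        pairs = [
            (x, y)
            for x, y in zip(cognate_sets[a], cognate_sets[b])
            if x is not None and y is not None
        ]
        shared = sum(1 for x, y in pairs if x == y)
        table[(a, b)] = 100.0 * shared / len(pairs) if pairs else None
    return table


# Invented data: three languages, five items, labels are group numbers.
sample = {
    "LangA": [1, 1, 1, 1, None],
    "LangB": [1, 2, 1, 1, 1],
    "LangC": [2, 2, 1, 3, 1],
}

for pair, pct in cognate_percentages(sample).items():
    print(pair, pct)
```

The loop visits every pair of languages, so the work grows with the
square of the number of languages and linearly with the number of
items, the same rough proportionality reported for GLOTLPP in item 3.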