LINGUIST List 4.863

Tue 19 Oct 1993

FYI: Software available: language classification

Editor for this issue: <>


  1. Jacques Guy, Software for PCs

Message 1: Software for PCs

Date: Tue, 19 Oct 1993 13:04:59
From: Jacques Guy <>
Subject: Software for PCs

I have uploaded today 19 October at into their
directory pc/incoming a software package for classifying and
reconstructing language families from lexicostatistical data
and for simulating lexical evolution and borrowing.
(It will be moved to their linguistics subdirectory in time.)

For those of you who have tried lexicostatistics before and
found it wanting, I must emphasize now that the tree reconstruction
method implemented in this package has nothing in common with
glottochronology or lexicostatistics as you remember it. On
the contrary, it expects vocabulary retention to vary
wildly from language to language and from time to time.

The package consists of five program files, three sample data files,
and a documentation file, zipped into GLOTTO01.ZIP (76780 bytes):

Name          Size   Contents
GLOTMRG.EXE   11088  Program. Merges separate sample wordlists
                     into one, in a format that makes cognate
                     identification easier.
GLOTPC.EXE     9392  Program. Input is a file containing identified
                     cognate groups; output is a file containing
                     a table of percentages of shared cognates.
GLOTTREE.EXE  23200  Program. Input is a file containing a table
                     of percentages of shared cognates; output is
                     files containing reconstructed tree and table
                     of theoretical cognate percentages. The proportion
                     of vocabulary retained since the previous split
                     is shown on every branch of the reconstructed
                     tree. The table of theoretical percentages
                     gives a means of estimating the reliability
                     of the reconstruction.
GLOTED.EXE    24976  An editor for browsing, modifying, and formatting
                     tables of percentages of shared cognates for
GLOTSIM.EXE   29904  Program. Input is a file containing the
                     description of the evolution and diversification
                     of a language family or families; output is files
                     containing the log of splits, innovations,
                     borrowings, and percentages of shared cognates.
GLOTTO.DOC    51800  Instructions for use.
VANUATU.PC      239  Percentages of cognates shared by eight languages
                     of Vanuatu, formerly New Hebrides.
VANUATU.SIM     425  Description of the evolution and diversification
                     of a language family. Running GLOTSIM with
                     VANUATU.SIM as input generates a language family
                     with lexicostatistical properties mimicking
                     those of the real languages in VANUATU.PC.
UTOAZTEC.PC    2487  Percentages of cognates shared by 32 Uto-Aztecan
                     languages (from W.R. Miller 1984)
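
To make the step from identified cognate groups to a table of shared
percentages concrete, here is a minimal sketch in Python. It is not the
GLOTPC program and does not read the package's file formats. The data
layout (each meaning mapped to a cognate-class label per language), the
function name shared_cognate_percentages, and the toy data are all
assumptions made for illustration. Meanings missing from either language
are skipped for that pair.

```python
from itertools import combinations

def shared_cognate_percentages(cognates):
    """Return {(lang_a, lang_b): percent} for every pair of languages.

    `cognates` maps each meaning to {language: cognate-class label}.
    This layout is hypothetical, not the GLOTPC input format.
    """
    languages = sorted({lang for row in cognates.values() for lang in row})
    table = {}
    for a, b in combinations(languages, 2):
        compared = shared = 0
        for row in cognates.values():
            if a in row and b in row:      # skip meanings missing in either list
                compared += 1
                shared += row[a] == row[b]
        table[(a, b)] = 100.0 * shared / compared if compared else None
    return table

# Invented toy data: three languages, four meanings.
toy = {
    "hand":  {"A": 1, "B": 1, "C": 2},
    "water": {"A": 1, "B": 1, "C": 1},
    "stone": {"A": 1, "B": 2, "C": 2},
    "fire":  {"A": 1, "B": 1},          # no entry recorded for C
}
percentages = shared_cognate_percentages(toy)
```

With this toy data, A and B share 3 of the 4 meanings compared (75%),
while A and C share 1 of the 3 meanings for which both have an entry.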

This package is freeware.

There are three main uses to which you can put it.

1. Classifying languages from sample wordlists.

2. Classifying languages from existing tables of cognate percentages.

3. Testing the validity and accuracy of any classification method
 relying on proportions of shared cognates, including the method
 implemented in program GLOTTREE.
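
The third use can be pictured with a toy simulation: generate a family
whose history is known, compute its shared-cognate percentages, and see
whether a classification method recovers the known splits. The Python
sketch below is only an illustration under stated assumptions. It is not
the algorithm used by GLOTSIM, it models no borrowing, and the function
names, retention rates, wordlist length, and two-split history are all
invented for the example.

```python
import random

def evolve(wordlist, retention, rng, next_id):
    """Return a daughter wordlist: each item is kept with probability
    `retention`, otherwise replaced by a brand-new cognate class."""
    daughter = []
    for cls in wordlist:
        if rng.random() < retention:
            daughter.append(cls)
        else:
            daughter.append(next_id[0])   # innovation: a class no one else has
            next_id[0] += 1
    return daughter

def percent_shared(x, y):
    """Percentage of meanings for which two wordlists share a cognate class."""
    return 100.0 * sum(a == b for a, b in zip(x, y)) / len(x)

rng = random.Random(1993)          # fixed seed so runs are repeatable
size = 200                         # hypothetical wordlist length
proto = list(range(size))
next_id = [size]                   # counter for fresh cognate classes

# Known history: proto splits into X and C; X later splits into A and B.
# Retention rates differ from branch to branch on purpose.
x = evolve(proto, 0.9, rng, next_id)
c = evolve(proto, 0.5, rng, next_id)
a = evolve(x, 0.8, rng, next_id)
b = evolve(x, 0.6, rng, next_id)

for name, (p, q) in {"A-B": (a, b), "A-C": (a, c), "B-C": (b, c)}.items():
    print(name, round(percent_shared(p, q), 1))
```

In this toy model the expected share between two languages is the product
of the retention rates along the path joining them (0.8 x 0.6, about 48%,
for A and B), so unequal retention rates alone can make the raw
percentages a poor guide to the order of splits.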

Jacques Guy, Telecom Research Laboratories, 770 Blackburn Road,
 Clayton 3168, Australia