LINGUIST List 5.1002

Sat 17 Sep 1994

FYI: Greenberg software

Editor for this issue: <>


Directory

  1. Jacques Guy, "Greenberg" software available

Message 1: "Greenberg" software available

Date: Wed, 7 Sep 1994 16:20:35
From: Jacques Guy <j.guy@trl.oz.au>
Subject: "Greenberg" software available


Well, yes, I thought that would catch your attention. Seriously, now.

CHANCE is a Monte-Carlo simulation which lets you investigate the
effects of chance resemblances between up to 40 unrelated languages each
represented by up to 500 words or features (grammatical, phonological,
or syntactic), and the effects of allowing for semantic shifts when looking
for resemblances.

A complete discussion of the algorithm implemented in CHANCE and of the
estimated probabilities of chance resemblances, together with the results
of several thousand iterations mimicking the data presented by Greenberg
and Ruhlen, is due to be published in the first 1995 issue of Anthropos
(March 1995) under the title "The Incidence of Chance Resemblances on
Language Comparison".

CHANCE is freely available as the file chance01.zip in the directory
pc/linguistics of the anonymous ftp site garbo.uwasa.fi (University of
Vaasa, Finland).

Author: Jacques B.M. Guy
Email address: j.guy@trl.oz.au
Surface address: Telecom Research Laboratories, PO Box 249,
                 Clayton 3168, Australia

Special requirement: nil
Shareware payment from private users: no
Shareware payment required from corporate users: no
Distribution limitations: nil
Size: 10k compressed, 18k expanded

Long Description:

In their article entitled "Linguistic Origins of Native Americans"
(Scientific American, November 1992, pp.60-65) Joseph Greenberg and
Merritt Ruhlen claim that resemblances they find between Amerindian
languages and Indo-European, Semitic, and Dravidian languages stand
infinitesimal chances of being due to chance, and therefore must
reflect a common origin, or borrowing. The mathematical formula for
computing the probabilities of chance resemblances becomes intractable
almost as soon as one attempts to allow for semantic shifts. Simulating
accidental matches, however, is a simple task, even though it can be
computationally expensive. But in these days when personal computers
sell for a song and outperform the mainframes of my student days, this
is of little consideration indeed.
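
For the case without semantic shifts the formula is still easy to write
down. The sketch below is only an illustration, not the formula from the
forthcoming article: it assumes that every list item in every language
takes one of k equally likely forms, independently of everything else
(k being the "one chance in how many" figure). The function name and its
parameters are invented for this example.

```python
from math import comb


def expected_matches(n_languages, n_words, one_in, n):
    """Expected number of (list item, form) pairs found in exactly n languages.

    Assumption: each language picks one of `one_in` equally likely forms
    for each list item, independently. No semantic shifts are allowed.
    """
    p = 1.0 / one_in
    # Probability that one given form turns up in exactly n of the languages.
    per_form = comb(n_languages, n) * p**n * (1 - p) ** (n_languages - n)
    # Sum over all possible forms and all list items.
    return n_words * one_in * per_form


if __name__ == "__main__":
    for n in range(3, 7):
        print(n, expected_matches(10, 200, 200, n))
```

Once a form is allowed to match across neighbouring meanings the counts
are no longer independent from one list item to the next, which is where
simulation becomes the easier route.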

CHANCE lets you investigate the effects of chance resemblances between
up to 40 unrelated languages each represented by a word list or feature
list of up to 500 items, and the effects of allowing for semantic
shifts.

HOW TO RUN CHANCE

Just type CHANCE at the DOS prompt. You will be asked four questions:

How many languages? (min 2 max 40)
How many words? (min 5 max 500)
Accidental match: one chance in how many? (min 2 max 500)
Size of semantic domains? (min 1 max 20)

1. How many languages: this is the number of unrelated languages
 that will be generated.

2. How many words: this is the size of the sample wordlist used
 for comparing those languages. This also covers the case where
 languages are classified not from wordlists, but from a handful of
 features (grammatical, syntactic, or phonological). In that case,
 give the number of features used.

3. Accidental match: this is your estimate of the chance of an
 accidental match. For instance, Greenberg and Ruhlen estimate the
 chance of an accidental match at 1 in 250 (this is an underestimate
 because they allow for metathesis, so that the actual chance is at
 least 1 in 125, and possibly up to 1 in 42). If you are investigating
 the chances of accidental match of features, the probability is
 generally much higher. For instance, there are only six possible
 Subject-Verb-Object orders (SVO, SOV, VSO, VOS, OSV, OVS) so that the
 minimum chance of an accidental match is 1 in 6.

4. Size of semantic domains: your answer controls whether you allow for
 semantic shifts, and if so, how wide. An answer of 2, for instance,
 means that semantic shifts covering up to 2 list items, but no more,
 are allowed. An answer of 1 means that no semantic shifts are
 allowed. Greenberg and Ruhlen, for instance, seem to have allowed for
 shifts over semantic domains covering 8 list items or more.
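
To make the four parameters concrete, here is a rough sketch of a
simulation of this kind. It is not the algorithm implemented in CHANCE
(that is described in the Anthropos article) and need not reproduce the
figures CHANCE prints. It assumes that every list item in every language
gets one of "one chance in how many" equally likely forms, and that a
semantic domain is a block of consecutive list items within which a form
may sit on any item and still count as a match. All names in it are
invented for this example.

```python
import random
from collections import Counter

# Ranges taken from the four questions CHANCE asks.
LIMITS = {
    "languages": (2, 40),
    "words": (5, 500),
    "one_in": (2, 500),
    "domain": (1, 20),
}


def check(name, value):
    """Reject a parameter that falls outside the documented range."""
    lo, hi = LIMITS[name]
    if not lo <= value <= hi:
        raise ValueError(f"{name} must be between {lo} and {hi}, got {value}")
    return value


def simulate_once(languages, words, one_in, domain, rng):
    """One run: count forms attested in exactly N languages (N >= 3, 10+ pooled)."""
    # Assumed model: every list item of every language gets a random form.
    lexicon = [[rng.randrange(one_in) for _ in range(words)]
               for _ in range(languages)]
    tally = Counter()
    # Assumed model: domains are consecutive blocks of `domain` list items.
    for start in range(0, words, domain):
        attested = Counter()
        for language in lexicon:
            # A form counts once per language, whichever item of the block it is on.
            attested.update(set(language[start:start + domain]))
        for n_langs in attested.values():
            if n_langs >= 3:
                tally[min(n_langs, 10)] += 1
    return tally


def run(languages, words, one_in, domain, runs, seed=None):
    """Return the mean count per run for N = 3..10 (10 meaning 10 or more)."""
    check("languages", languages)
    check("words", words)
    check("one_in", one_in)
    check("domain", domain)
    rng = random.Random(seed)
    total = Counter()
    for _ in range(runs):
        total.update(simulate_once(languages, words, one_in, domain, rng))
    return {n: total[n] / runs for n in range(3, 11)}


if __name__ == "__main__":
    print(run(10, 200, 200, 6, runs=50, seed=1))
```

With a domain size of 1 each block is a single list item, so the sketch
falls back to plain item-by-item comparison with no semantic shifts.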

Once you have answered all four questions the simulation will start.
Its results are displayed continuously on screen, like this:

10 languages
200 words each
One chance in 200 of accidental match

 Semantic Shifts Allowed. Domain Size: 6

 Number of Reconstructions Attested by N Languages

       N:        3        4        5        6        7        8        9      10+
     Sum:    31291     2501       91        1        0        0        0        0
 Current:       52        4        0        0        0        0        0        0
    Mean:  58.8214  4.70113  0.17105  0.00188  0.00000  0.00000  0.00000  0.00000

 Simulation #532

 Press Esc to stop simulation, Space bar to pause.

The top three lines of the screen recapitulate the parameters you
specified, here 10 languages represented by 200 words each with a 1 in
200 chance of accidental match. Next, a line reminds you that you have
allowed for semantic shifts covering up to six list items (word meanings).
Next come the results of the simulation so far. "Sum" is the total number
of cases encountered so far where the same word has been found in
exactly 3, 4, 5, 6, 7, 8, or 9 languages, or in 10 or more. For instance,
in this simulation, 31291 cases of the same word in 3 languages have been
observed so far, 2501 cases of the same word in 4 languages, 91 cases
of the same word in 5 languages, one case in 6 languages, and none in
7 or more. The line below ("Current") shows how many cases have been
observed in the current simulation: here, 52 cases of the same word in
3 languages, 4 cases of the same word in 4 languages, and none in 5 or
more. The next line ("Mean") gives the average number of cases per
simulation; it is the total ("Sum") divided by the number of simulations
run so far. The last line tells you how many simulations have been run:
here, 532.
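
As a quick check of how "Mean" relates to "Sum", the few lines below
redo the division for three columns of the sample screen. The variable
names are invented for this example. The N = 3 column is left out because
the figures as printed (31291 and 58.8214) do not divide out exactly to
532 simulations: 31291 / 532 is about 58.8177.

```python
runs = 532                      # "Simulation #532" on the sample screen
sums = {4: 2501, 5: 91, 6: 1}   # "Sum" line, columns N = 4, 5, 6

# "Mean" is simply "Sum" divided by the number of simulations so far.
means = {n: total / runs for n, total in sums.items()}

for n, mean in means.items():
    print(f"N={n}: {mean:.5f}")
# N=4: 4.70113
# N=5: 0.17105
# N=6: 0.00188
```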

The bottom line tells you which commands you may use. Here you
may either press the Esc key to stop the simulation, or press
the space bar to pause it (if, for instance, you want to write down
intermediate results). Whichever you choose, the prompt line
will change accordingly. If you stop the simulation (Esc), it
will ask you if you want to run another simulation (with different
parameters). If you only pause it (space bar), it will tell you
to press Esc to stop it, or any other key to continue.