LINGUIST List 5.679

Sat 11 Jun 1994

Disc: Protolanguage

Editor for this issue: <>


Directory

  1. Jacques Guy, Protolanguage undecidability and retention rates
  2. Pat Crowe, Re: 5.677 Protolanguage
  3. Trey Jones, T

Message 1: Protolanguage undecidability and retention rates

Date: Sat, 11 Jun 1994 16:22:45
From: Jacques Guy <j.guytrl.oz.au>
Subject: Protolanguage undecidability and retention rates


Mark Durie <mark_duriemuwayf.unimelb.edu.au> writes:

>By Jacques Guy's method, even one of the daughter languages could be a proto
>language, with 100% retention of vocabulary.

No. The daughter language is evidently not the protolanguage. However,
it is lexically _indistinguishable_ from the protolanguage. Therefore,
it is as if it were the protolanguage. Thus, the protolanguage, the
root of the tree, is the terminal node occupied by that hypothetical
100% retentive language.

>Guy's 'proof' that the root can
>be placed in infinitely many places only works on the assumption of
>infinitely arbitrary variations in vocabulary replacement rates.
 ^^^^^^^^^^

 No, definitely not. This is a misuse, again, of "infinite". It is
 also a misuse of "arbitrary". I doubt very much that lexical
 innovations are arbitrary. But, whatever their causes, we can be
 pretty sure that they vary, and that we cannot predict them with
 any useful degree of certainty. The outcome of the roll of a die
 is neither arbitrary nor, strictly speaking, random. We might,
 possibly, predict it if we were in possession of all the necessary
 information. But we are not. So it appears random. Ditto lexical
 innovations. And the variation cannot be infinitely arbitrary since
 it is necessarily confined within the range 0 to 1, or if you
 prefer, 0% to 100%.

>It is bizarre to suggest
>that a reconstruction that assumes retention rates ranging from 100% to 20%
                                    ^^^^^^^^^^^^^^^
>in the one family is as equally plausible as one which assumes a range of
>55%-65% retention rates across the family.

The proof makes no mention of retention rates but of _retentions_. A
retention rate is, pardon this Lapalissade, a rate of retention. There
is nothing particularly strange about a retention rate of, say, 95%
per generation. If it persists for a thousand years or so, more than
30 generations, the retention, 1000 years later, is a paltry 20%.
At the other end of the scale, Bergsland and Vogt (1962, in Current
Anthropology, look it up) observed the following retention
rates per 1000 years:

                    200-item list    100-item list
Icelandic
  rural dialect         97.6%            99%
  urban dialect         96.2%            98%
Georgian                89.9%            96.5%
Armenian                94%              97.8%

David Lithgow (pers. comm. circa 1970) has observed a replacement
of some 20% of the basic vocabulary in Muyuw (Woodlark island) in
one generation. Raise 0.8 to the 33rd power, and that gives you
the retention rate of Muyuw per 1000 years should it continue to
evolve at that rate: 0.06%. So there is nothing bizarre, then,
in expecting even such apparently unbelievable figures: from 0 to
100% retention per thousand years!
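
The arithmetic above can be checked with a few lines of Python. This is
only a sketch: it assumes a constant retention rate per generation and
roughly 30 to 33 generations per 1000 years, and the function name is
invented for this example.

```python
# Sketch only: assumes the per-generation retention rate stays constant
# and that losses in successive generations compound multiplicatively.

def retention_after(rate_per_generation: float, generations: int) -> float:
    """Fraction of the word list still retained after `generations` generations."""
    return rate_per_generation ** generations


if __name__ == "__main__":
    # 95% retained per generation, 30 generations (about 1000 years)
    print(f"{retention_after(0.95, 30):.1%}")   # roughly 21%
    # 80% retained per generation (the Muyuw figure), 33 generations
    print(f"{retention_after(0.80, 33):.2%}")   # roughly 0.06%
```

Both results agree with the figures quoted in the text: about 20% for the
95% case and about 0.06% for Muyuw.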

All this, in my view, explains why the cradle of Indo-European
has been shuffled about so widely, from the Baltic to the
Middle East and I know not where else, so that the poor baby
must be feeling thoroughly sick by now.

Message 2: Re: 5.677 Protolanguage

Date: Fri, 10 Jun 1994 11:32:14
From: <V187EF4Yubvms.cc.buffalo.edu>
Subject: Re: 5.677 Protolanguage

Mark Durie <mark_duriemuwayf.unimelb.edu.au> writes:

>By Jacques Guy's method, even one of the daughter languages could be a proto
>language, with 100% retention of vocabulary. Guy's 'proof' that the root can
>be placed in infinitely many places only works on the assumption of
>infinitely arbitrary variations in vocabulary replacement rates. Few
>advocates of lexico-statistics (I am not one) would share the covert
>assumption of Jacques' proof that replacement rates are maximally and
>arbitrarily variable. Even if one does not hold the controversial opposite
>assumption that vocab replacement rates are universally constant across time
>and space, most of us would not wish to go so far the other way as to assume
>that they vary completely freely and arbitrarily. It is bizarre to suggest
>that a reconstruction that assumes retention rates ranging from 100% to 20%
>in the one family is as equally plausible as one which assumes a range of
>55%-65% retention rates across the family.

The proof was not that we could never figure out a root to the tree, but
merely that cognate proportions alone are insufficient for locating the
root. His proof is mathematically valid. There are, however, other ways
to locate it (he mentions some in GLOTTO.DOC, though I don't recall if
he did so in the post).

Kind of ironic, someone interested in glottochronology like me coming to
Jacques' defence.

-Pat Crowe, SUNY at Buffalo

Message 3: T

Date: Fri, 10 Jun 94 14:46:03 EDT
From: Trey Jones <treyBRS.Com>
Subject: T

Reply to: 5.677 Protolanguage proof

Mark Durie, writing about Jacques Guy's proof of protolanguage uncertainty
in lexico-statistics, makes a lot of assumptions about other people's
assumptions himself. It strikes me as ironic, given the recent flurry of
postings concerning the popular view and misunderstandings of linguistics,
that Jacques Guy seems to have to fight so hard against linguists'
misunderstandings of mathematics and statistics. When I was first
developing an interest in linguistics, I came across a book entitled
_Everything_Linguists_Ever_Wanted_To_Know_About_Logic_But_Were_Ashamed_
_To_Ask_, which I thought was foolish, considering the predominating
"formal" bent of linguistics. I was woefully wrong (and slightly biased by
a formal training in mathematics and computer science..), and my depression
deepens every time I see "refutations" of basic mathematical reasoning. But
I digress.

To the point, the main false assumption that Mark Durie is making about
Jacques Guy's proof is that Jacques has any assumptions at all. The proof
concerns what you can mathematically and statistically determine (that
means FOR SURE) from lexico-statistical data. The answer, concerning the
position of the protolanguage, is NOTHING!

Mark Durie states:
>By Jacques Guy's method, even one of the daughter languages could be a proto
>language, with 100% retention of vocabulary.

That is entirely correct... and not a flaw at all, as we shall see.. Durie
continues:

> Guy's 'proof' that the root can
>be placed in infinitely many places only works on the assumption of
>infinitely arbitrary variations in vocabulary replacement rates.

Also true, and still not a flaw! Let us consider an example. Suppose you
had a data set for languages A, B, and C, and you constructed a relational
tree such as this:

A-----:--------:-- <..and you were tempted to put the protolanguage here.
B-----;        |   Well, I hate to say it, but that could be rather
               |   foolish of you, particularly if the languages in
C--------------;   question were Spanish(A), Portuguese(B) and Latin(C).
                   In this case one of the "daughter" languages is in
fact the protolanguage, Latin! (minor quibbling about Classical vs Vulgar
Latin aside.. this is for illustrative purposes only, do not attempt this
reconstruction at home, I am a trained professional. --Sorry, I am
occasionally possessed by Dave Barry.)

The point is that Durie has already made a HUGE (GIGANTIC) assumption that
all the data in question comes from the same time period, which is by no
means the case. Jacques tackles the more general case of all sorts of data
from various time periods. The fact is, YOU JUST CAN'T TELL. In fact, in
the example above, the protolanguage is actually most likely

Spanish        -----:--------:
Portuguese     -----;        |
                             |
Classical Latin--------------;
               ^^-here, at the place where Classical and Vulgar Latin
split.. very close to one of the "daughter" languages.. but that comes in
part from all sorts of extra- and para-linguistic evidence, like age of
written records, the fact that all the Romans are dead, general knowledge
of world history, stuff like that.
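
A toy numerical check of the "you just can't tell" point, not Guy's proof
itself. It assumes a simple multiplicative model in which the expected
proportion of shared cognates between two languages is the product of the
retentions along the path joining them in the tree. The node names "X"
(the Spanish/Portuguese split) and "R" (a hypothetical root), and all
retention figures, are invented for illustration.

```python
import math
from itertools import combinations

LANGUAGES = ["Spanish", "Portuguese", "Latin"]

# Each scenario is a list of edges (node, node, retention along that edge).
# The three scenarios differ only in where the root sits on the X..Latin path.
SCENARIOS = {
    "root at the Spanish/Portuguese split": [
        ("X", "Spanish", 0.80), ("X", "Portuguese", 0.75), ("X", "Latin", 0.60),
    ],
    "root partway towards Latin": [
        ("X", "Spanish", 0.80), ("X", "Portuguese", 0.75),
        ("R", "X", 0.75), ("R", "Latin", 0.80),
    ],
    "root at Latin itself (100% retention)": [
        ("X", "Spanish", 0.80), ("X", "Portuguese", 0.75),
        ("R", "X", 0.60), ("R", "Latin", 1.00),
    ],
}


def path_retention(edges, start, goal):
    """Product of edge retentions along the unique tree path start..goal."""
    adjacency = {}
    for u, v, r in edges:
        adjacency.setdefault(u, []).append((v, r))
        adjacency.setdefault(v, []).append((u, r))
    stack = [(start, None, 1.0)]
    while stack:
        node, parent, product = stack.pop()
        if node == goal:
            return product
        for neighbour, r in adjacency[node]:
            if neighbour != parent:  # a tree has no cycles, so this suffices
                stack.append((neighbour, node, product * r))
    raise ValueError(f"no path from {start} to {goal}")


def cognate_matrix(edges, languages):
    """Expected shared-cognate proportion for every pair of languages."""
    return {pair: path_retention(edges, *pair)
            for pair in combinations(languages, 2)}


if __name__ == "__main__":
    matrices = {name: cognate_matrix(edges, LANGUAGES)
                for name, edges in SCENARIOS.items()}
    for name, matrix in matrices.items():
        print(name, {pair: round(p, 3) for pair, p in matrix.items()})
    reference = next(iter(matrices.values()))
    same = all(math.isclose(m[pair], reference[pair])
               for m in matrices.values() for pair in reference)
    print("identical cognate proportions:", same)
```

Under these assumptions all three rootings yield the same pairwise
proportions, so the proportions alone cannot say which rooting is right.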

Durie writes:
> It is bizarre to suggest
>that a reconstruction that assumes retention rates ranging from 100% to 20%
>in the one family is as equally plausible as one which assumes a range of
>55%-65% retention rates across the family.

To me, it is bizarre to introduce such real world knowledge as likely
retention rates into a mathematical discussion. (As anyone who has studied
college algebra can attest, mathematics has nothing to do with the real
world [if it can help it].) We are talking about what the math can tell
you, not what likely guess you can make based on real world knowledge that
doesn't factor into the equations.

I keep reiterating the main point here in hopes that it will sink in,
somewhere, for someone who doesn't get it yet (but they are small hopes, as
Jacques understands..): You cannot DETERMINE the position of a
protolanguage in a lexico-statistically derived relation tree. You can use
outside evidence to help you narrow down the range of likely possibilities,
but that is outside the realm of what lexico-statistics can do for you.

Okay.. take your best shot..
-Trey Jones,
part-time Math Geek, part-time Ling Geek, full-time Computer Nerd.