LINGUIST List 9.769

Thu May 21 1998

Disc: Time Depth

Editor for this issue: Martin Jacobsen <>


  1. Marc Hamann, Re: 9.762, Disc: Time Depth
  2. Larry Trask, Re: 9.762, Disc: Time Depth
  3. Waruno Mahdi, Re: 9.762, Disc: Time Depth

Message 1: Re: 9.762, Disc: Time Depth

Date: Thu, 21 May 1998 11:14:55 -0400
From: Marc Hamann <>
Subject: Re: 9.762, Disc: Time Depth

I would like to weigh in on the time depth by proposing an analysis
which is not dependent on any particular method but which follows from
more general mathematical and scientific principles.

First, let me say that I reject outright any a priori time limit on
reconstruction, but assert that for any given situation there is some
limit beyond which reconstruction ceases to be meaningful.

My reasoning is thus:

Let p be the probability that any given element in a particular
reconstruction is a representation of an actual historical state.
(For the sake of simplicity, I will assume that each such "element" is
independent of all others in the language.)

For a living language, p would be a function of the certainty of our
linguistic analysis of it. Given the divergences of opinion regarding
the structure of LIVING languages, it is reasonable to assert that p <
1 even for them. So ignoring other sources of error, the p for a
given reconstruction will be the product of the p for all the
"parents" of the reconstruction. E.g., let us say that we did not have
any extant record of Late Vulgar Latin (or proto-Romance if you will)
and we had French, Spanish and Italian, each with a generous p of
0.99. The absolute BEST possible p for proto-Romance in this case
would be 0.99 x 0.99 x 0.99 = 0.97. In practice not only would the first
generation values for p be lower, but I suspect there would be a
"degradation factor" which would enter into it as well.

One can see that if one then took this proto-Romance and, using say
a similarly obtained proto-Germanic, tried to reconstruct a posited
proto-Western European, the p would degrade even further. It wouldn't
take too many generations to degrade the p of the reconstruction to
p=0.5, i.e. where any given feature reconstructed had an equal chance
of representing the historical situation or of being pure fiction.
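
The arithmetic of this argument can be sketched in a few lines of
Python. This is only an illustration under simplifying assumptions
that go beyond the posting: every reconstruction is given the same
number of "parents", all parents share the same p, and no extra
"degradation factor" is applied. The function names and the
`branching` parameter are hypothetical.

```python
from math import prod


def reconstruction_p(parent_ps):
    """Best-case p for a reconstruction: the product of its parents' p values."""
    return prod(parent_ps)


def generations_until(p0, branching, threshold=0.5):
    """Count reconstruction generations until p falls to `threshold` or below.

    Assumption (not from the posting): each reconstruction has exactly
    `branching` parents, all with the same p, and no other error enters.
    """
    if not 0 < p0 < 1 or branching < 2:
        raise ValueError("need 0 < p0 < 1 and branching >= 2")
    p, generations = p0, 0
    while p > threshold:
        p = reconstruction_p([p] * branching)
        generations += 1
    return generations


if __name__ == "__main__":
    # The proto-Romance example: three daughters at p = 0.99 each.
    print(round(reconstruction_p([0.99, 0.99, 0.99]), 2))
    # Generations needed to reach p <= 0.5 with two or three parents per step.
    for k in (2, 3):
        print(k, generations_until(0.99, k))
```

Because each generation raises p to the power of the number of
parents, the exponent grows geometrically, which is why so few steps
are needed even from a starting value as generous as 0.99.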

Given that no such result is even VERIFIABLE, barring time travel or
the discovery of ancient writings, it is pretty clear that at some
time depth a reconstruction becomes purely an academic exercise with
little hope of representing historical truth.

- ---
Marc Hamann

Message 2: Re: 9.762, Disc: Time Depth

Date: Thu, 21 May 1998 17:20:36 +0100 (BST)
From: Larry Trask <>
Subject: Re: 9.762, Disc: Time Depth

This is a reply to Alexis Manaster Ramer's two postings on time depth.

Alexis complains about the tossing around of hard numbers for the time
depths of particular families, and I am sympathetic. Only in a very
few special cases (notably PIE) do we have anything in the way of hard
evidence for time depths. Most time depths I have seen proposed for
particular families are derived either from glottochronology (a very
dubious procedure) or from a seat-of-the-pants approach: "Well, this
family looks to me to be a little more divergent than Germanic, so
let's say 3000-4000 years." Clearly neither of these approaches can
lead to accurate dates, just as Alexis complains, though that is not
to say that the proposed dates are totally worthless: they may still
have a measure of utility as ballpark figures.

Quite separate is the issue of putting general time limits on
reconstruction, on the identification of genetic families, or on any
other historical procedure. A number of people have suggested (or
even asserted) that a definite and identifiable time limit holds for
one or another of these activities -- usually 6000-8000 years, in my
experience, occasionally less or more. Now Alexis is certainly right
to protest that *some* of these people are not very explicit about
what exactly the declared time depth is supposed to apply to, and that
some of them seem to confuse different enterprises. But not everyone
does so. Johanna Nichols, for example, has indeed maintained in print
that about 6000 years is probably the greatest time depth at which we
can hope to perform substantial reconstruction, but she also maintains
that genetic families can be securely identified at much greater time
depths. Her favorite example is Afro-Asiatic, which she regards as
secure but as much older than 6000 years and therefore beyond

As far as I can judge, most of the estimates of reconstructible time
depth lean heavily on PIE, which is almost universally regarded as
dating to about 6000 BP. Clearly it would be perverse to maintain
that we can't reconstruct as far back as PIE. But the real question
is whether we can substantially reconstruct to a time much earlier
than 6000 BP.

I think it's worth bearing in mind that IE is a very special case
among language families -- probably unique. European linguists struck
it lucky with IE, which exhibits a number of characteristics that made
life much easier here than in most other cases.

First, IE exhibits a sizeable number of living and recorded languages
- well over 100.

Second, it does not exhibit so many languages that we are drowning in
data. Compare, say, Niger-Congo, with over 1000 languages.

Third, the languages fall neatly into ten or twelve well-defined
branches, with most branches represented by multiple languages. Any
well-known IE language can be assigned unhesitatingly to one of these
branches, and hardly any problems of high-level subgrouping exist
(apart from the seeming weirdness of having ten or twelve more or less
coordinate branches in a single family).

Fourth, we have significant written records for some branches dating
from the second millennium BC, substantial written records for more
branches dating from the first millennium BC, and significant written
records for yet more branches dating from the first millennium AD. I
know of no other family so well supplied with substantial early
records in multiple branches.

IE therefore presents us with a large number of powerful advantages in
identifying the family and in reconstructing its ancestor. If we
didn't have such an agreeable number of languages, if we didn't have
this helpful branching structure, and above all if we didn't have that
mass of early written records, then the identification of IE might
have been only slightly more difficult, but the reconstruction of PIE
would have been vastly more difficult, and perhaps actually

So: in something close to the most favorable case imaginable, we have
been able to reconstruct about 6000 years back in time -- or so we
think. Even so, our reconstruction is only partial and is plagued by
difficulties, uncertainties, and downright mysteries. We have not,
*sensu stricto*, reconstructed PIE, but only some sizeable chunks of
it, and those chunks are often pretty fuzzy.

Now, what I *suspect* most of the people who are throwing out figures
like "6000 BP" are trying to say is this: in the most highly favorable
case, we can just about make it back to 6000 BP or so. But other
cases are not so favorable, and most of the cases (or potential cases)
are *very* much less favorable. Hence it is unreasonable to suppose
that the unfavorable cases can be substantially reconstructed -- or
even perhaps securely identified as families -- significantly further
back in time than this.

Of course, that's not what most of them *do* say, but I always assume
that's what they mean.

So, I think that these numbers, unfortunate as they may be, are to be
taken as reasonable ballpark figures, and not as representing an
impenetrable steel curtain.

The fundamental point, as another commentator has observed, is that
the evidence necessary for reconstruction, or even for identifying
families, just keeps fading out as we work back in time. Moreover, it
fades out *fast*. We don't seem to find any languages or families in
which nothing much has happened for the last 5000 years.
Consequently, it is unrealistic to suppose that we can reconstruct, or
even identify secure families, at a time depth beyond -- well, I'm not
going to do it. Let's just say, beyond a few thousand years.
Comparisons based on reconstructions offer a small ray of hope, but
even our best reconstructions are so fuzzy and so incomplete that it
requires a good deal of optimism to hope that we can get much further
back in this way -- even when there are genuine remote links lurking
out there awaiting our attention, which is not always going to be the

Larry Trask
University of Sussex
Brighton BN1 9QH

Message 3: Re: 9.762, Disc: Time Depth

Date: Thu, 21 May 1998 19:25:40 +0200
From: Waruno Mahdi <mahdiFHI-Berlin.MPG.DE>
Subject: Re: 9.762, Disc: Time Depth

> One, if glottochronology is wrong (even only sometimes wrong) that
> is if some languages evolve less fast than assumed, then the
> calculations he cites lose all meaning. Yet this is a rather
> well-established fact I would have thought, as in the case of
> Icelandic whose rate of vocabulary loss per millennium is 1% or less
> on the Swadesh list.

The principal error in Swadesh's method seems to be the assumption
that the retention rates for all 100 elements of the list (or 200 in
the expanded list) are equal. This has been shown not to be the
case. Obviously, when two languages differ in a high retention-rate
element, this has a different significance than if they differ in a
low retention-rate one. In fact, I once suggested a method of using
the table of word retention rates for the Swadesh list for evaluating
whether common vocabulary of two languages is predominantly inherited
or borrowed. After all, whether a 4% or even a 14% common vocabulary
testifies to a genetic relationship depends upon whether one can
exclude contact origin. At such low levels of shared vocabulary, one
has little chance of establishing sound laws. Even if one managed to,
regularity of sound correspondences does not exclude borrowing, when
all common vocabulary was borrowed at roughly the same time.
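
To illustrate how much hangs on the assumed retention rate, here is a
rough sketch using the standard glottochronological dating formula,
t = ln(c) / (2 ln(r)), where c is the share of common vocabulary and r
a single retention rate per millennium. The formula and the value
r = 0.86 (the constant commonly cited for the 100-item list) are not
given in this posting and are brought in as assumptions; r = 0.99
corresponds to the Icelandic figure quoted above, and c = 0.14 to the
14% mentioned in the text. The function name is hypothetical.

```python
from math import log


def separation_millennia(shared, retention):
    """Date of separation in millennia under one uniform retention rate.

    Implements t = ln(c) / (2 ln(r)); the uniform rate is exactly the
    assumption criticised in the text, so this is illustrative only.
    """
    if not (0 < shared <= 1 and 0 < retention < 1):
        raise ValueError("need 0 < shared <= 1 and 0 < retention < 1")
    return log(shared) / (2 * log(retention))


if __name__ == "__main__":
    # Same 14% shared vocabulary, two different assumed retention rates.
    for r in (0.86, 0.99):
        print(r, round(separation_millennia(0.14, r), 1))
```

With the same share of common vocabulary, the two rates give dates
that differ by more than an order of magnitude, so a date computed
from one uniform rate says little unless that rate is known to hold
for the languages and list items concerned.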

The elementary retention rates were calculated in:

Dyen, Isidore, A.T. James, & J.W.T. Cole, 1967, "Language Divergence
 and Estimated Word Retention Rate", _Language_ 43:150-171.

See also on this topic:

Kruskal, Joseph B., Isidore Dyen, & Paul Black, 1971, "The Vocabulary
Method of Reconstructing Language Trees: Innovations and Large-Scale
Applications", pp. 361-380 in F.R. Hodson, D.G. Kendall, P Tautu
(eds.), _Mathematics in the Archaeological and Historical Sciences_,
The Hague - Paris.

Merwe, Nikolaas J. van der, 1966, "New Mathematics for
Glottochronology", _Current Anthropology_ 7:485-500.

Dyen, Isidore, 1964, "On the Validity of Comparative Lexicostatistics",
pp. 238-252 in _Proceedings of the Ninth International Congress of
Linguistics (Cambridge, Mass., 1962)_, London - The Hague - Paris.

On distinguishing borrowed from inherited common vocabulary:

Mahdi, W., 1988, _Morphophonologische Besonderheiten und historische
Phonologie des Malagasy_, Berlin - Hamburg: Reimer Verlag (see there
pp. 400-403).

Another factor having a bearing on the results of glottochronological
computations may be socio-linguistic. The rate of change in pre- or
early neolithic communities seems to be exceptionally high, but this
may also be caused by the small size of the language communities. It
may explain high degrees of diversity and high apparent number of
phyla in New Guinea or in Amazonia. The moment the size of the
community exceeds that of a few settlements that are in constant
contact with each other, only a limited number of local innovations
get to become features of the language of the whole community, and
this sets a clamp on the rate of change.

This probably means that in general one may hardly expect to succeed
in tracing genetic relationships back to very far before the
neolithic, the beginnings of which don't go back more than 10,000
years anywhere. Could this be the reason why it is so difficult to
prove the common descent even only of all Amerindian languages?

The anomalously low rate of change in Icelandic can perhaps also be
explained by socio-linguistic reasons. Formerly, the population was
divided over a number of settlements which were isolated from each
other almost the year round. Only once every year did the population
of all the N settlements meet at a gathering place. Thus, changes were
always limited at first to one locality, and then periodically
confronted with the "united conservative front" of all the N-1
communities, which of course had a tremendous "disciplining"
effect. There was no chance of an innovation first "infecting" two or
three closer localities, then gradually spreading to the rest of the
island through free competition with the more archaic feature it

Regards to all, Waruno

- ---------------------------------------------------------------------
Waruno Mahdi tel: +49 30 8413-5404
Faradayweg 4-6 fax: +49 30 8413-3155
14195 Berlin email:
Germany WWW:
- ---------------------------------------------------------------------