LINGUIST List 6.66

Tue 17 Jan 1995

Disc: Comparative Method

Editor for this issue: <>


  1. David Solnit, Re: 5.1482 Comparative Method
  2. William Poser, masses of data in mass comparison?
  3. Alexis MR, Re: 6.10 Comparative method
  4. Alexis MR, Comparative Linguistics

Message 1: Re: 5.1482 Comparative Method

Date: Fri, 23 Dec 1994 16:22:50
From: David Solnit <>
Subject: Re: 5.1482 Comparative Method

Concerning John Cowan's query about Tai belonging to Sino-Tibetan:
First off, it is not quite true that nobody believes it now; as far as I
know it is still the official view in China that Tai (as part of
Zhuang-Dong or Kadai) as well as Miao-Yao belong in Sino-Tibetan. The
private views of Chinese linguists are another matter: many of them are
skeptical of Tai's affiliation with S-T. By the way, I use 'Kadai' to
mean the larger family including Tai as well as Kam-Sui, Hlai (Li), Gelao
and a number of others; some prefer 'Tai-Kadai' for this group.

Why did views change? Largely, I think, it was a matter of recognizing
that typological features like tone and monosyllabicity were less
resistant to diffusion than had been thought; in particular, they are
less resistant to replacement than is core vocabulary. The shift in
views about the affiliation of Vietnamese shows this in miniature:
Vietnamese is typologically very close to Chinese, Tai and Miao-Yao, and
it has large numbers of lexical items with obvious Chinese affiliations, but
its lexicon also includes many items, and among them many core items, that
have good phonological correspondences to Mon-Khmer. The situation with
Tai and Miao-Yao is similar, except that there is no obvious competing
genetic linkage to make once you discard Chinese. You are left with many
core vocabulary items that are unrelatable to Chinese.

Much of the credit for shifting opinion about the affiliation of Tai
should probably go to Paul Benedict, since he made more or less the above
arguments as part of his proposal for hooking Tai up with Austronesian, in
the first version of Austro-Tai. It is, I think, instructive to notice
what happened: hardly anybody went along with Austro-Tai, but it quickly
became difficult to find anybody explicitly advocating the old
Sino-Tibetan affiliation. I would not like to say that Benedict had
*disproved* the ST-Tai relation; I would prefer to say that he made a
persuasive argument that a hypothesis of ST-Tai affiliation was not a
useful or promising one.

I'm of the camp that thinks that you can't disprove a genetic relation in
linguistics. You *can* show that a given claim for a given genetic
relation is unconvincing, but that is not the same thing as showing that the
claimed relation is invalid.

And I can't resist putting in my 2 cents on the matter of writing grammars
of proto-languages. A bit of the disagreement between Karl Teeter and
Alexis Manaster-Ramer seems due to misunderstanding: Alexis has mostly
been using 'grammar' to mean 'morphology, especially inflectional
morphology like in Indo-European', while Karl seems rather to use
'grammar' in the larger, generative sense, as (a description of) the sum
total of a speaker's linguistic knowledge. In the latter sense, grammar
of course includes phonology, so presumably when we describe the phonology
of a proto-language we are going some distance towards meeting Karl's
demand for a grammar of that proto-language. The question is how much
farther we can expect to go. When I contemplate writing the grammar of
proto-Kadai, I am sure it would include plenty of phonology, and a fair
amount on derivational morphology and compounding (which in many of these
languages are practically the same thing). But no inflectional
morphology, and I must confess I have trouble conceiving of describing the
proto-syntax, except by characterising it as SVO since all the modern
languages are of that type. I'm not sure what sort of evidence I could
use to decide whether classifiers existed as a distinct grammatical
category in the proto-language, or which aspects were marked by verbs and
which by particles, or any number of other things that would take up many
pages in a grammar of any of the modern languages. It may well turn out
that many useful facts about the proto-syntax can be inferred, including
even some of those I just listed, but I really doubt we'll ever find an
analog to the est/sunt:ist/sind pattern.

) Well, what is known about how various hypotheses of relationship were
) rejected in the past? At one time, it was believed that Tai was part of
) Sino-Tibetan; nobody believes this now. On what basis did those learned
) in the art shift their paradigms (to mix a few metaphors)?
) I know very little about either language family, but the resemblances between
) them (tones, monosyllabicity, the Great Tone Split) are seductive. I think
) it would be instructive to hear, from someone who knows the history, just
) how these faux amis came to be disregarded.
) John Cowan sharing account ( for now
) e'osai ko sarji la lojban.
) --------------------------------------------------------------------------
) LINGUIST List: Vol-5-1482.

Message 2: masses of data in mass comparison?

Date: Fri, 23 Dec 94 15:40 PST
From: William Poser <>
Subject: masses of data in mass comparison?

In the discussion of Greenberg's "mass comparison", the
possibility is raised (e.g. in Benjy Wald's interesting posting on
the African classification) that errors may be swamped by the
large number of forms. Perhaps so, in some cases, but I think
that it is important to note that in some cases at least
Greenberg's claims are based on minute numbers of equations. For
example, according to my count of forms in LIA, his inclusion of
Waicuri in Hokan and thence Amerind is based on a total of SIX
(6) forms, his inclusion of Maratino in the same groups on a
total of THREE (3) forms. Where I come from this isn't what we
call an overwhelming mass of data.

Bill Poser

Bill Poser, First Nations Studies, University of Northern British Columbia,
3333 University Way, Prince George, British Columbia, V2N 4Z9, Canada

Message 3: Re: 6.10 Comparative method

Date: Thu, 12 Jan 95 16:22:34 EST
From: <>
Subject: Re: 6.10 Comparative method

Re: Jacques Guy's reply to something I said much earlier:
Do we really have to throw about words like 'false'? I may
have been wrong when I said that I knew of no published
examples of languages that had clearly lost significantly
more than the magic figure of 14% of the Swadesh 100-word
list per millennium (that is, retained less than 86%), and I
did ask for any counterexamples.
The Eastern Greenlandic example cited by Jacques based on
Bergsland and Vogt's 1962 paper turns out to be a case
where, as Swadesh pointed out in his response, and as I think
the authors admitted, it is possible that the rate was not
really higher after all. Jacques also referred to some other
language, starting with M, but never made it clear whether
he was talking about the vocabulary at large or about the
100-word list, nor did he cite any references. If this
example is documented and holds up, I will have been wrong,
but 'false' seems a bit harsh.

I should add that I personally am not a believer in glottochronology
and that today even most of those who are believers apparently no
longer believe in the constant rate of loss. My point was rather that it was
the people who keep talking about a supposed limit on how far
back the comparative method can go who are either explicitly
(as in the case of Bender) or implicitly assuming something like
a constant rate of loss.

However--and this is important--what Bergsland and Vogt did demonstrate
quite clearly in 1962 is that there are, even if not often, examples
of languages which have lost words from the 100-word list much more
slowly than 14% per millennium. Icelandic in particular shows almost no
loss (something like maybe 2%) over the last millennium.

This alone means that the calculations about how vocabulary is lost,
and how there is supposedly nothing left to compare after x thousand
years, are irrelevant to anything in the real world. Even if there
are languages which have lost vocabulary faster, in cases where we
know nothing and are just starting (e.g. Amerind or Nostratic), we
have no way of knowing what the rate might have been. It could have
been fast or slow. Hence, no a priori argument can be made that
the comparative method cannot reach beyond x millennia, and therefore
there is no basis for telling people that theories like Amerind or
Nostratic should be dismissed a priori. Whether such theories are
right or wrong can only be determined by examining the data, not
by playing games with mathematics.
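
To make the arithmetic behind this argument concrete, here is a minimal
sketch of the constant-rate calculation being criticized. It assumes the
textbook glottochronological model, in which two sister languages each
retain list items independently at the same fixed rate per millennium, so
that the expected shared fraction after t millennia is the rate raised to
the power 2t. The function names are hypothetical, and the 0.98 figure is
only the rough "maybe 2%" loss mentioned above for Icelandic, not a
measured value.

```python
def shared_fraction(retention_per_millennium: float, millennia: float) -> float:
    """Expected fraction of a word list still shared by two sister languages.

    Assumption (the one under dispute): each language independently keeps
    an item with the same constant probability per millennium, so both
    lines of descent must retain it for the item to remain shared.
    """
    return retention_per_millennium ** (2 * millennia)


def expected_shared_items(retention_per_millennium: float,
                          millennia: float,
                          list_size: int = 100) -> float:
    """Expected number of shared items on a list of the given size."""
    return list_size * shared_fraction(retention_per_millennium, millennia)


if __name__ == "__main__":
    # Compare the "magic" 86% retention rate with a much slower,
    # Icelandic-like rate of loss over the same time depths.
    for label, rate in [("86% retention", 0.86), ("98% retention", 0.98)]:
        for t in (5, 10):
            items = expected_shared_items(rate, t)
            print(f"{label}, {t} millennia: about {items:.1f} of 100 items shared")
```

Under these assumptions, the 86% rate leaves roughly 5 of 100 items
shared after 10 millennia, while the 98% rate leaves roughly two thirds
of them. The supposed time limit therefore depends entirely on a rate
that is unknown for the cases at issue.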

(This is getting awfully long. I will therefore defer to another
occasion a discussion of what ELSE is wrong with these mathematical
arguments, and why, pace Jacques, n-ary comparison is better at
avoiding chance resemblances than binary IF it is done right).

Alexis MR

Message 4: Comparative Linguistics

Date: Sat, 14 Jan 95 15:30:37 EST
From: <>
Subject: Comparative Linguistics

Since the discussion has been going on for a while, and lots of
issues have been explored, it occurred to me (if it is not too
presumptuous) to try to summarize some of the progress that I think
we have made and then launch a new set of questions:

Thus, I think that we have established

(a) that language relationships can be established in the absence
of morphology (although it is much easier to do it if there is
morphology and it cooperates),

(b) that language relationships can be established before a complete
comparative grammar of the resulting proto-language is written (although
the more grammar one can point to the more compelling the case for the
relationship in question),

(c) that there is such a thing as comparative syntax (although there is
not much of it, at least as yet, and it is a lot easier to do
syntactic reconstruction with the help of morphology),

(d) that there is no such thing as a constant rate at which languages
lose "basic" vocabulary (in particular the words in the Swadesh list)
and that consequently

(e) any argument based on a constant rate of basic voc. loss which
purports to show that after x millennia any related languages will
no longer have any shared vocabulary left is not going to work,
at least not in general,

(f) that there is apparently no OTHER argument at all for the claim
in (e) to the effect that the comparative method ceases to identify
related languages after 5-10 millennia (as has widely been claimed).

Am I right that this much is pretty much agreed to?

If so, I would like to throw out the following for further discussion:

Some biologists who are concerned with the rise of complexity (e.g.
multicellularity) argue that since life started out with single cells
and since you cannot get any simpler than that, that means that the
rise of complexity could simply be due to chance. Namely, since you
cannot get any simpler, the only way to move is towards more complex,
and so once in a while that's what you will get.

The reason I mention this is because the question when we deal with
a controversial theory in comparative linguistics, like Nostratic,
is whether there is any chance that evidence supporting such a
theory will be found. Now, it seems to me that ALL the arguments
against such theories (except in the case of Altaic) have involved
people either talking about a priori methodological points or else
looking at just one of the language groups which such theories seek
to unite. So, there have been people criticizing some of the
Indo-European implications of Nostratic, for example, but the point
is that that way you cannot in principle ever discover any supporting
evidence, even if it exists. Instead, if such evidence is to be found
one MUST be willing to suspend disbelief to the extent of looking at
two or more of the language groups claimed to be related. For the
only kind of evidence that is supporting would have to come from such
comparisons. Now, I happen to have found several sets of words with
appropriate semantics which fit the sound laws proposed for Nostratic
by Illich-Svitych by looking, for example, at IE, Uralic, and Altaic
materials. Naturally, this makes me more and more positively inclined
towards the theory. I have also found (and been publishing) all
kinds of problems with the theory, but my point here is that only
by looking at sets of language families can we hope to discover
the positive instances if they exist. This is why I keep saying
that we need discussion of substantive factual issues rather than
methodological ones (although the latter are fun, too).

Alexis MR