Editor for this issue: <>
I am forwarding the following for posting at the request of Robert Rankin (rankin@ukanvm.cc.ukans.edu).

I do not subscribe to this list and have no wish to join the fray at present, but when my name is mentioned sometimes the file is forwarded to me via e-mail. Thus the following:

Andy Anderson cites me on three points in a series of recent postings. I have known Andy upwards of 30 years and do not feel that he would intentionally misrepresent my views, but I also feel a couple of things need clarification.

First, I am uncomfortable about being formally cited as a (secondary) source of information on Lyle Campbell's paper from the Boulder (Greenberg) Conference of ca. 1990. If Andy wishes to distribute an attack on the paper or its author in written form, he should first obtain an actual copy of it or, alternatively, await its publication. I guess I shouldn't have brought it up in our conversation at the SSILA/AAA meetings.

Second, I am said to have reported that the geneticists who have studied the mitochondrial DNA (mtDNA) of sundry Native American and Siberian peoples claim that there were/are "two subgroups within Amerind (aside from the Eskimo and Athabaskans)." This is not what I (or the paper's authors Wallace, Torroni and Schurr, et al.) said. The authors did not address themselves to the linguistic problems and most certainly didn't talk about "subgroups". Nor did I, since I do not regard the historicity of anything like "Amerind" as even remotely established.

The authors of the paper did posit at least four "migrations". They do not discuss the most recent, Eskimo-Aleut, in their abstract, but I THINK they gave a time depth figure of about 6000 years BP (Before Present) for it orally--don't quote me. FROM THEIR ABSTRACT: For what they call Na-Dene their figure is 7000-10000 years BP. Then they say they have evidence for at least two "migrations" preceding that.
One comes between 12000-15000 BP and the earliest between 26000-34000 BP. Figures as high as 40000 BP were mentioned orally, as I recall. They did not attempt to correlate their figures with our knowledge of periods of glaciation or the periodic existence of the land bridge in Beringia.

I leave it to readers to decide what this portends for the Amerind hypothesis or its proposed (Glotto)chronology, but a warning is in order in any event. Note that I have written "migration" in quotes above. This is not because I wish to pejorate the term; it is because geneticists use it in a very special way. For them it has to do solely with the appearance of specific genetic material in American populations. They then assume a common ancestor and calculate the number of millennia by positing a uniform mutation rate for mtDNA. The material and theories they work with force this definition of migration on them. All this says nothing about the situation "on the ground."

In reality though, each of these genetic migrations can have included many distinct movements of people across Beringia over a great many years--perhaps centuries or even millennia. And they may have represented many linguistic groups. All that is required in order for entire clusters of migrations "on the ground" to get read as a single mtDNA "migration" is a relatively homogeneous gene pool in Eastern Siberia over the particular time span when the "genetic mutation" occurred.

The evidence does indeed suggest four GENETIC migrations, but it really says little or nothing about how many "real" migrations there were within each of the four clusters, nor does it say anything about linguistic diversity--much less "subgroups of Amerind." We may wish it did, but it doesn't. I do note with interest however the rough correlation between the geneticists' oldest figures and the calculations of Nichols (1990 in Language 66.3) based on linguistic diversity in the Western Hemisphere.
The more recent sets of mtDNA dates fall within the established archaeological ballpark for Clovis believers, although the earliest set certainly does not.

One very short contribution of my own here--mostly my wife's actually, since she is a molecular geneticist and we talk about these things over breakfast. The yardstick used by mtDNA geneticists in these calculations may not be appreciably better than that used in glottochronology, i.e., genetic mutation takes place at a rate which is only RELATIVELY constant. It can be speeded up by various singular events from cosmic ray bombardment to ingesting certain fungi infecting the grain from your cache pit. Biologists try to allow for this sort of thing, but as you can see from the plus/minus dates for each cluster, we are not talking about something as precise as dendrochronology or even radiocarbon dating. The mtDNA studies are very interesting but we must bear in mind their limitations and special use of the term "migration".

Lastly, in an earlier posting Andy mentions that I had examined Greenberg's notebooks and determined how he had mislabeled so much of his Siouan data in LIA. Andy's description of the way the notebooks are laid out is correct, but I have only actually seen xeroxes of the pages of Siouan entries, not the notebooks themselves. I might add that the Siouan entries in the notebook are hard by the Iroquoian, Caddoan, and Yuchi entries, demonstrating once again that Greenberg had decided on the final classification of these families when he laid out his notebook design and before the vocabularies from the languages were entered.

My thanks to John Koontz for posting this.

Sincerely,
Bob Rankin (University of Kansas) (rankin@ukanvm.cc.ukans.edu)
I wish to comment on an issue that the recent discussion of Nostratic and the problem of "demonstrating" distant genetic relationships has skirted around, and that I believe underlies some of the issues that various people have been directly addressing. An assumption that seems to underlie much of the discussion is that hypotheses regarding genetic relationships are not interesting unless they can be proven to be true. I find this a rather odd assumption, and one that does not seem to be made about any other kinds of hypotheses in linguistics (or anywhere else in science as far as I know). And let us set aside for the sake of argument the oft-noted point that the notion of proof is not really applicable to empirical hypotheses, and assume that the term is to be used loosely for some arbitrary high level of certainty.

It seems fair to say that there is a fairly widespread lack of interest in hypotheses like the Nostratic hypothesis because it is widely believed (and I will assume it is true here for the sake of argument) that the available evidence for Nostratic falls short of this imaginary level of certainty which deserves the label "proven". A common type of reaction to unproven hypotheses is that it has not been demonstrated that the observed similarities might not be due to chance and/or borrowing.

But suppose that someone were to take the same attitude towards comparative reconstruction of protolanguages. Suppose that someone were to object to comparative reconstruction of anything but very shallow groups on the grounds that one can never prove that the reconstructions are correct. Just as one can object to certain claims of genetic relationships on the grounds that one cannot conclusively eliminate the possibility that the observed similarities might be due to accident and/or borrowing, one could equally well object to virtually ALL hypotheses surrounding comparative reconstruction on the grounds that one cannot conclusively eliminate alternative possibilities.
The comparative method is a way to come up with the best guess one can make about a protolanguage; it never provides proof that the reconstruction is in fact correct. So why bother doing it? The answer should be obvious: hypotheses which represent our best guesses at any point in time are what much of science is about.

But why do so many linguists seem to object to applying the same way of thinking to hypotheses about genetic relationships? Why is it that many historical linguists find hypotheses like the Nostratic hypothesis either laughable or upsetting? Why don't they react the same way to comparative reconstructions, since they also are "unproven"? Why don't they rush out and read everything they can find on Nostratic and conclude "The evidence is tantalizing but not conclusive; it's a really exciting hypothesis"? Why is there such a double standard?

I want to suggest an answer to this question, an answer which, if right, provides insight into the nature of many debates surrounding controversial hypotheses of genetic relationship. Namely, some people find questions of genetic classification intrinsically interesting, quite apart from any detailed historical work that plays a role in supporting hypotheses. Other people, however, are primarily interested in the detailed historical work itself, and do not find questions of genetic classification intrinsically interesting, but only interesting in so far as they are an inevitable consequence of historical work. People of the first sort are more likely to find recent work reclassifying Penutian languages exciting, while people of the latter sort are unlikely to react that way, unless they are Penutian specialists.

As one moves back in time, applying the comparative method becomes increasingly difficult, and detailed historical work becomes increasingly speculative (and to many historical linguists, dissatisfying).
But at any time depth, we can always be much more confident of the genetic classification than we can of any comparative reconstructions. Our confidence in Indo-European as a language family is surely greater than our confidence in ANY specific claims about Proto-Indo-European. But as we move further back in time, we should expect there to be hypotheses that we cannot be entirely confident of, but for which there is at least some promising evidence, where any comparative reconstruction is going to be sufficiently speculative as to not be satisfying to linguists interested in traditional comparative work. And since these linguists are not interested in genetic classification except as a byproduct of detailed historical work, such linguists are likely to find the hypotheses uninteresting.

On the other hand, for linguists who find questions of genetic classification inherently interesting, the fact that detailed historical work may not be possible is irrelevant, and the fact that the hypothesis is unproven or unprovable may be no more a source of concern than the fact that comparative reconstructions are always unproven and unprovable.

If this view is correct, much of the debate surrounding controversial hypotheses in genetic classification is based, not on substantive questions, but simply on what sorts of questions different people find interesting.

Matthew Dryer
) 3)
) Date: Thu, 22 Dec 1994 21:02 -0500 (EST)
) From: Mike_Maxwell@sil.org
) Subject: Evidence against Greenberg?
)
) Perhaps the best evidence against Greenberg's hypothesis would be to show
) that his methods, when applied *in the same way* to randomly chosen samples
) of languages of the Earth (including some Amerindian languages), group them
) in the same way and with the same degree of (un)certainty as those methods
) group Amerindian languages (less the Athabaskan languages) together. (I
) put the stars around "in the same way" because one can easily distort
) someone else's methods.) As I understand it, some people have tried
) applying Greenberg's method to one Amerindian language and one other
) language (Finnish was one such, I believe), but I have never heard of a
) large-scale comparison being done in this way. (And I believe Greenberg
) says his method is best used for mass comparison, not one-on-one.)
)

Here we go again. Some bean counter some day will tot up the number of times "Greenberg" occurs here and will rate the corresponding work as "highly influential". Never mind.

There is no difference between mass comparison and pair comparison. When you engage in mass comparison you carry out a large number of pair comparisons. The greater the number of comparisons, the more chances you have of finding cognates... and chance resemblances. Take two dice and roll them. How often will they show the same score? Take a bagful of them and empty it onto the floor. Matches galore.

But that does not matter. We've recently had a long, long exchange on the comparative method, in which Alexis Manaster Ramer made a point -- which he seemed to believe was important -- that no language had been found to retain less than 86% of some sample wordlist (Swadesh's 100? Doesn't matter, as you shall soon see) per thousand years. The claim is false, but never mind, I'll grant it as true.
I'll even grant you 90% retention. America, they say, was populated 18,000 years ago. Well, not so, evidence from Brazil now seems to push it back to 50,000 BP. But I'll grant you 18,000 BP. And that everybody since the Great Crossing was careful not to be linguistically overly innovative, so that there exist at least two maximally distant languages which have retained 90% of their vocabulary, millennium in, millennium out. Today you could expect to see between them 0.9^(18*2) = 0.0225, i.e. 2.25% of words in common. On that famous 100-item highly stable "basic" vocabulary. So that's your Proto-Amerind reconstituted.

Now, of course, we have not taken chance resemblances into account. If you remember Greenberg's Sci. Am. article and his calculations, he estimates the probability of chance resemblances at 1 in 250. But he forgets that he allows a bit of metathesis. In fact, if you read Ruhlen's "On the Origin of Languages" carefully, he allows complete anagramming, since he lists Irish "bligim" as cognate with his *malk'a. There are six ways in which you can combine 3 consonants, so that is really one chance of resemblance in 42 (250/6 = a tad under 42). Using their figure, then, how many chance resemblances should you expect to find in a 100-item wordlist? 100/42 = 2.38. Bingo! More than real cognates after 18,000 years with very conservative languages.

Now, *if* America was really populated 50,000 years ago we should see 0.9^(50*2) = 0.002656% of your 100-item list preserved. That's one word in 37,649. So out of every pair of 100-item lists you will find, on the average, 1/37649*100 = 0.0027 words in common. Meaning that you can look forward to examining some 376 pairs before you find one single cognate. But thanks to mass comparison, you are sure to find it. Only compare 50 seemingly *unrelated* languages (because you want to pick maximally distant languages). That gives you 50*(50-1)/2 = 1225 pairwise comparisons.
With a bit of luck, that will give 3 or 4 cognates, each attested by 2 or 3 languages... and stacks of spurious resemblances, each attested by far more languages than your true cognates.

Perhaps America was not populated 50,000 years ago. But Australia was, at least 40,000 BP. That does not prevent some from reconstructing Proto-Australian. And trying to link it to Indo-European.

Enough fun with figures. Why don't you try to *simulate* a paltry 30,000 years' worth of evolution of 30 languages, each represented by 100 words, with a one-in-250 (see how generous I am) chance of resemblances? (Warning: advertisement follows.) Download glotto02.zip from directory /pc/linguistics at garbo.uwasa.fi, unzip it, read the documentation about the programs GLOTSIM and GLOTTREE, do it, and see.

(In any case, it is like blowing into a violin. It is so much more fun to go on imagining that one can untangle a past lost in the mists of time.)

j.guy@trl.oz.au
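
The arithmetic in the message above is easy to re-run. What follows is a minimal sketch in Python, not part of the original posting and unrelated to the GLOTSIM or GLOTTREE programs. It assumes only the figures quoted in the text (90% retention per millennium in each language, a 100-item wordlist, a 1-in-250 chance-resemblance rate multiplied by the six orderings of three consonants); the function and variable names are hypothetical.

```python
# Re-run the back-of-envelope figures from the message above.
# Assumptions, all taken from the text rather than from data: 90% retention
# per millennium in each language, a 100-item wordlist, and a 1-in-250
# chance-resemblance rate multiplied by the 6 orderings of 3 consonants.
# The names below are illustrative only.

WORDLIST_SIZE = 100
RETENTION = 0.9
CHANCE_RATE = 6 / 250  # the message rounds 250/6 up to 42


def shared_fraction(millennia, retention=RETENTION):
    """Fraction of the list still shared by two languages that split
    `millennia` thousand years ago, each losing words independently."""
    return retention ** (2 * millennia)


def pair_count(languages):
    """Number of distinct language pairs in a mass comparison."""
    return languages * (languages - 1) // 2


cognates_18k = WORDLIST_SIZE * shared_fraction(18)
cognates_50k = WORDLIST_SIZE * shared_fraction(50)
chance_per_pair = WORDLIST_SIZE * CHANCE_RATE
pairs = pair_count(50)

print(f"true cognates per pair after 18,000 years: {cognates_18k:.2f}")
print(f"true cognates per pair after 50,000 years: {cognates_50k:.4f}")
print(f"chance resemblances per pair:              {chance_per_pair:.2f}")
print(f"pairs among 50 languages:                  {pairs}")
print(f"true cognates over all pairs (50,000 yr):  {pairs * cognates_50k:.2f}")
print(f"chance resemblances over all pairs:        {pairs * chance_per_pair:.0f}")
```

With exact division the chance figure comes out at 2.4 per pair rather than 2.38, because the message rounds 250/6 to 42; the comparison with the expected number of true cognates is unaffected.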