LINGUIST List 5.1462

Sat 17 Dec 1994

Disc: Comparative Method

Editor for this issue: <>


  • Geoffrey K. Pullum, Re: 5.1448 Comparative Method
  • , Comparative Method
  • Ecological Linguistics,Anderson,PRT, Typology of Historical Change

    Message 1: Re: 5.1448 Comparative Method

    Date: Wed, 14 Dec 1994 18:17:42
    From: Geoffrey K. Pullum <>
    Subject: Re: 5.1448 Comparative Method

    The ongoing discussions about the comparative method do not seem to be getting anywhere near a real consensus between the Greenberg and anti-Greenberg camps on the question of what would count as valid evidence that certain language families ARE related at a large time depth. I wonder if it would not be a good idea to hear something -- from the defenders of wide-ranging and large-time-depth comparison, preferably -- concerning what would count as evidence AGAINST a genetic relationship? As a concrete example, take the fact, recently cited by Poser, that ALL the Muskogean evidence in Greenberg's book has been found to be tainted by data errors (Geoffrey D. Kimball, A Critique of Muskogean, "Gulf", and Yukian Material in _Language in the Americas_, IJAL 58 (1992), 447-501). I can imagine how one might want to maintain that even this total collapse of the case on Muskogean merely puts us back in a state of being neutral, a priori, on whether Muskogean languages are related to other Amerindian languages, or to Nostratic for that matter. Anti-Greenberg Amerindianists are perfectly prepared to agree that the Amerind languages MIGHT have descended from a common source now lost. That's neutrality.

    But suppose we move from that neutrality to the position that we will assume as a default that Muskogean IS Amerind, and so are all the languages of South America, and indeed, that Amerind is related to Sino-Tibetan and both to Indo-European and thus Nostratic and all of the above to Khoisan... Let us assume for the sake of argument that the world's languages are all genetically related; but let us take this to be an empirical assumption -- not just a willingness to reject the closet racism that Poser says Ruhlen once alleged in his critics, or a yearning to find universal brotherhood, but an assumption against which evidence can in principle count. Now, what sort of linguistic evidence would count, for Greenberg and Ruhlen and Illich-Svitych, as DISPROVING the inclusion of Muskogean or any other family in the conjectural (though tentatively assumed) Proto-Gaeic?

    That is, what sort of data pattern or configuration of phonological and grammatical properties could suffice to make the macrocomparativists throw in the towel and go outside to meet the press and concede defeat? There ought to be some imaginable scenario that would end up with Ruhlen telling a group of reporters from the Stanford Daily and American Scientist and other supermarket tabloids, "Well, we thought we could sustain the whole Proto-Gaeic thing, but that set of paradigms on Haida has us beat; we've had to concede the Haida case; according to our tests, Haida is unrelated to the other human languages." (Much scope for new press attention here: "HAIDA INDIANS ARE ALIENS FROM SPACE, TOP EXPERT ADMITS.") But what sort of scenario would it have to be, to get the Greenberg camp to admit that it was in grave trouble on some relatedness claim?

    To be fair, orthodox comparativists might well say that if you put it like this, no answer should be expected. One can argue that a certain methodology applied to a certain set of data yields no evidence for relatedness between Burushaski and Bushman, but not that it refutes such a relatedness. A positivist view of historical linguistics would see it as maintaining hypotheses about verifiable relatednesses in a very particular form: when I say that German "pfennig" comes from an earlier Germanic form with initial "p" that will be seen in languages like English with no history of a High German sound shift, I am counted as having been supported by the observation that English speakers say "penny"; if the form turned out to be "twenny" I would be in trouble; given German "Pfund" I am committed to something like "pund" in English, and (given the Great English Vowel Shift) the discovery of "pound" is more good news for me; and so on. The predictions I am making are about an indefinitely extensible set of pairs (Ger:pfxxx, Eng:pxxx).
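
    The form of that prediction can be made concrete with a small sketch. It is only an illustration, not a procedure anyone in this discussion has proposed: the names `fits_pf_p` and `tally` are hypothetical, the word pairs are the ones cited in the message, and comparing raw spellings is a crude stand-in for real phonological comparison.

```python
from typing import Iterable, Tuple


def fits_pf_p(german: str, english: str) -> bool:
    """Return True if a German pf- word is matched by an English p- word.

    Spelling is used as a rough proxy for the initial consonant (an assumption).
    """
    g, e = german.lower(), english.lower()
    if not g.startswith("pf"):
        raise ValueError(f"{german!r} is not a pf- word; the prediction does not apply")
    return e.startswith("p")


def tally(pairs: Iterable[Tuple[str, str]]) -> Tuple[int, int]:
    """Count (confirming, disconfirming) pairs for the pf- : p- prediction."""
    confirming = disconfirming = 0
    for german, english in pairs:
        if fits_pf_p(german, english):
            confirming += 1
        else:
            disconfirming += 1
    return confirming, disconfirming


# Pairs taken from the message itself.
ATTESTED = [("Pfennig", "penny"), ("Pfund", "pound")]

if __name__ == "__main__":
    print(tally(ATTESTED))
    # The hypothetical counterexample from the message:
    print(fits_pf_p("Pfennig", "twenny"))
```

    Each new pair either adds to the confirming count or to the disconfirming one, which is the sense in which the hypothesis makes checkable predictions about an open-ended set of pairs.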

    Now, the falsity of one of these could conceivably be taken to refute brittle forms of the hypothesis that English cognates of German pf-words always begin with p-, but it isn't nearly enough to be counterevidence to the whole English/German relatedness claim, of course. That claim would not be given up unless there was a complete collapse of all the evidence: if "pound" was established textually to have been a coinage by a novelist who had never heard German, if "penny" was shown to be borrowed from Italian "penne" during a period when pasta had been used for small change, and then all the other sound correspondences started collapsing as well.

    I'm asking this: if the 100% collapse of Greenberg's Muskogean evidence, as alleged by Kimball, does NOT count as a complete collapse of the case that Muskogean is included in Amerind (hence, a fortiori, of the case that it is in Proto-Gaeic), then I think I need some help in understanding what COULD be evidence against that inclusion. There had better be something.

    +---------------------------------------------------------------------------+
    | G e o f f r e y K. P u l l u m *                                          |
    | Stevenson College, University of California, Santa Cruz, California 95064 |
    |      (408)459-4705 * Messages (408)459-2555/2905 * Fax (408)459-3334      |
    +---------------------------------------------------------------------------+

    Message 2: Comparative Method

    Date: Wed, 14 Dec 94 21:19:58 EST
    From: <>
    Subject: Comparative Method

    In response to Poser, Nichols does on p. 6 of her book claim that there is no way for the comparative method to distinguish between Nostratic and "a much larger grouping of most lineages of the Old World and New World", that is, as she herself says, between hypothetical groupings of c. 12000 and c. 40000 years ago, and she does say that this is "Because the cut-off point is so shallow", the cut-off point being the ceiling of 6000-10000 years which she imposes (without any basis, as noted in my earlier messages) on the comparative method. Since this amounts to a rejection of the Nostratic hypothesis (not as false perhaps but as unverifiable/unfalsifiable, I guess), this means that I am right and Poser is wrong about whether there have been people who have rejected particular theories of linguistic relationship on the basis of this mythical ceiling idea.

    In response to Teeter: I think (and I hope Karl will endorse this) that our disagreements are really quite minor, but they are real as far as they go. For example, while Karl is obviously 100% right about Meillet's position in the Scientia article (where Meillet says that lexical comparisons can never prove a relationship, and only morphological ones can), in his 1925 book Meillet repeatedly states that you CAN establish a linguistic relationship purely on the basis of lexical correspondences, makes the same point that I have been making over and over again here on LINGUIST that for some language families this is the ONLY way of showing relationship since they lack morphology, and even makes the same point that I did about how certain things can only be done ONCE you have established, at least tentatively, that the languages you are dealing with are related. As a matter of fact, he even shows how you could demonstrate the relatedness of the Romance languages PURELY on the basis of a lexical comparison, using the numerals 1-10, and then shows how you could do that for the older Indo-European languages too (although there he begins to slip in a little morphology).

    I would also like to add that I think it is a serious mistake to pretend that there are no models for comparative linguistics besides Indo-European, because it is so utterly atypical of the language families of the world. There ARE plenty of equally well established families, several of which are OLDER in the only sense that matters, that is, not in years before the present but in years before the earliest written records and many of which are more useful models for those working on families not yet established (Afroasiatic, Austronesian, Austroasiatic, Uto-Aztecan, Altaic, etc.). Which is not to say that there is anything wrong with knowing as much as possible about IE, but rather that there is much wrong with knowing naught BUT Indo-European. I am not sure but I think that this is what Eric Hamp had in mind in a recent paper in the Davis/Iverson volume when he complained about how the teaching of historical linguistics is hampered by textbooks which largely draw their material from IE (or indeed from some favored parts of it, such as Romance).

    And I am happy to have Sally Thomason point out that morphological elements can be borrowed. Meillet, we must remember, was greatly troubled by the possibility of such a thing and of the existence of mixed languages. He tried to debunk every example around, and thought (wrongly I think) that if such languages exist, then they cannot be handled by the comparative method. The fact that such languages do exist (e.g., Mitchif) and yet pose no problem (so that we have no trouble tracing certain parts of Mitchif to French and others to Cree) means that Meillet was worried for naught. But it also means that language classification on the basis of morphology is no more infallible than that on the basis of lexical material. You work with what you have available, which in some cases may be largely morphology and only a few obvious lexical parallels (that's how Afro-Asiatic was first established), morphology AND lexicon (Indo-European), lexicon and A SINGLE morphological parallel (Algic, as Victor Golla reminded me just the other day), lexicon only (Vietnamese and the rest of Mon-Khmer), and so on and so forth.

    Message 3: Typology of Historical Change

    Date: 15 Dec 94 22:34 GMT
    From: Ecological Linguistics,Anderson,PRT <>
    Subject: Typology of Historical Change

    Typology of Historical Change

    This note tries to make explicit what I take for granted, and have discussed with others on occasion, but which perhaps needs a more explicit statement.

    One of the most fruitful avenues of research in distant language comparison, I believe, is the growth of the field I call

    Typology of Historical Change.

    Under this rubric I include for example the work of Johanna Nichols (whether or not I agree with any data, findings or particulars of method, is not relevant to my point; I still think it helps our thinking along).

    I also include, and this is a challenge I want to issue, the

    ***mode of discourse***

    in which Mr. Vovin asked recently for help in finding typological parallels to a hypothesis he was interested in that a phrase meaning "water falls" could fossilize (?) into a basic word for "rain". In response to his query, he got back some positive answers, examples which people claimed fit this description.

    As a method of reasoning, this is what we need more of. That is, more accumulations of attested examples of particular changes, to educate our intuitions of what we naively think are "possible" semantic shifts by ever more experience with what actual semantic shifts are known or suspected. It will help us to improve our methods of guesstimating possible language relationships, because it will at least say that a given hypothesized semantic shift is frequently attested, so it is not straining to compare lexical items whose meanings differ in such and such a way. By contrast, another hypothesized semantic shift may not be firmly attested. Such an unattested semantic shift should probably not be used in those distant language comparisons which are themselves the most difficult to do, because over large time spans the number of context-sensitive conditioning environments is as great as the number of lexical items available to compare, and thus there are few or no ***recurring*** sound correspondences.

    In other words, as we move towards deeper comparisons, we must more and more rely on ways of measuring "distance" of semantic shift and "distance" of phonological change, rather than measuring repeating sound correspondences and semantic identities. We do not yet have our tools for doing this very well sharpened, but we can proceed gradually to sharpen them. A study of the known attested cases is the best start.

    In other words, if someone really wants to see how our methods fare with gradually more distant language comparisons, and to see how some new methods may fare, they should tabulate, for all known language relationships,

    (a) the proportion of sound-correspondence repetition in the comparable vocabulary (and what "comparable" means is itself a variable, not exclusively defined by (b) and (c))

    (b) the "semantic distance" along attested paths of semantic change of lexical items being compared. Where multiple such shifts have been attested, the estimated "distance" counts as closer, smaller. Where few such shifts have been attested, the estimated "distance" counts as greater. We of course do not have enough such information in database form to use at present, but whatever we do have can be used provisionally, as explained in (d)

    (c) The "phonetic distance" along attested paths of phonetic change of lexical items being compared. There is relatively more of this knowledge available than for phonological change.

    (d) Exploring how the three measures above vary as we go to greater time depths. That is, using first the more assured cases, then the less assured ones,

    How does a weighted average of "closeness" of compared lexical items vary as we go to increasing time depths?

    How does the proportion of regular and often recurring sound correspondences to unique or rarely recurring sound correspondences vary as we go to increasing time depths?
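
    As a rough illustration of what tabulating measures (a) and (b) might look like, here is a minimal sketch. Everything in it is an assumption made for the example: the function names, the threshold of two occurrences for "recurring", the formula 1 / (1 + n) for semantic distance, and the toy hand-made alignments. None of it comes from an existing database or published procedure.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Correspondence = Tuple[str, str]  # (segment in language A, segment in language B)


def recurrence_proportion(
    alignments: Sequence[Sequence[Correspondence]], min_count: int = 2
) -> float:
    """Measure (a): share of correspondence tokens whose type recurs.

    A correspondence type counts as recurring if it occurs at least
    `min_count` times across all compared word pairs (threshold is an assumption).
    """
    counts = Counter(c for word_pair in alignments for c in word_pair)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    recurring = sum(n for n in counts.values() if n >= min_count)
    return recurring / total


def semantic_distance(attested_shifts: int) -> float:
    """Measure (b): more attested parallels for a shift means a smaller distance.

    The formula 1 / (1 + n) is a placeholder, chosen only because it
    decreases as the number of attested parallels grows.
    """
    if attested_shifts < 0:
        raise ValueError("attested_shifts must be non-negative")
    return 1.0 / (1 + attested_shifts)


# Toy, partial, hand-made alignments for two word pairs (hypothetical data).
TOY_ALIGNMENTS: List[List[Correspondence]] = [
    [("pf", "p"), ("n", "n")],
    [("pf", "p"), ("u", "ou"), ("n", "n"), ("d", "d")],
]

if __name__ == "__main__":
    print(recurrence_proportion(TOY_ALIGNMENTS))
    print(semantic_distance(0), semantic_distance(3))
```

    Running the same two functions over language pairs of known and increasing time depth would give the kind of table asked for in (d); the interesting question is how quickly the first number falls.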

    It is the development of the tools in (b,c) which will most advance our abilities to compare at greater time depths, improve our methods.

    I will be very grateful to anyone who points me to studies which approximate to parts of the program outlined just above.

    "The Comparative Method" currently does not have the benefit of fully developed tools of this kind. To that degree, the current comparative methods can be considered less rigorous than they ought to be, and for that reason not as powerful at distant language comparisons as they will sometime come to be. A future comparative method can use these tools more and more precisely.

    The real challenge today to existing comparativists is to avoid artificially fossilizing the term "The Comparative Method", to avoid treating its methods as fixed and not subject to improvement and supplementing with newer and more powerful methods, as are the methods in any other science. It would be healthy if the word "the" were dropped from the term and it were made a mass or plural term "comparative methods". That implies no lessening of rigor. Indeed, as I have been at pains to point out above, I firmly believe some of the limitations of the present state of comparative methods result from a ***lack of rigor*** in the area of the typology of possible changes (phonetic, semantic, grammatical).

    Work discussed by Bill Croft in the topic of syntactic reconstruction and typology is certainly relevant to the concerns raised here. I think we are seeing the beginnings of a new paradigm in the focus on paths of change in language, and comparative-historical linguists will be left behind if they do not add these techniques to their box of tools (while keeping all the good techniques they already have).

    Lloyd Anderson