Editor for this issue: <>
(*sigh*)

> From: AVVOVIN@MIAMIU.ACS.MUOHIO.EDU
>
> Chance similarities are always a possibility, but to the best of my
> knowledge I have never seen a proof of the widely circulating idea that
> one can take any two languages at random and prove that they are
> genetically related on the basis of chance similarities between them.
> It will be absolutely impossible to find any REGULAR phonetic
> correspondences between any look-alikes due to chance. Besides, how
> many chance look-alikes is it possible to find between two unrelated
> languages? Very few indeed.

Only if you do not allow for semantic shift. Download the file chance01.zip from the directory pc/linguistics of the anonymous ftp site garbo.uwasa.fi, unzip it, read the doc file, run the language simulation program chance.exe with whatever parameters you please, and see. And when the next issue of Anthropos comes out, in March next year, read the explanation, entitled "The incidence of chance resemblances on language comparison".

> From: amr@zeus.cs.wayne.edu
>
> But we would need to be told what rate of loss Jacques
> is assuming and what kind of branching (this is absolutely crucial).

The figure of 40,000 years assumes a retention rate of 98% per 1000 years. Thus, 40,000 years later, 80,000 years' worth of evolution separate any two maximally distant languages: 0.98 to the 80th power is 0.1986, i.e. 19.86% of the vocabulary still in common. The branching assumed is that of an n-wise maximal tree.

> Also, it is vital to know what vocabulary we are talking about. Does
> anybody know of any counterexamples to the claim that the old
> Swadesh list has a rate of loss of no more than 14% per millennium?

Of course. Once again, Bergsland and Vogt's 1962 article in Current Anthropology. Then Blust's 1981 paper at the Third International Conference on Austronesian Linguistics.

> My colleagues and I have been doing calculations using that rate
> as well as the 27% rate claimed by M. L. Bender for a different
> 100-word list. And the results are that for any reasonable-size
> family (i.e., not Basque or Sumerian or some other complete
> isolate), we should expect to be able to recover enough for
> comparative work for much longer than 10,000 years, but it all
> depends on how many languages and, even more, how they branch.

Let me see. 27% replacement is 73% retention. So, after 10,000 years, two maximally distant languages will have 0.73^(2*10) = 0.73^20 = 0.0018, i.e. 0.18%, left in common, and on a 100-item wordlist that is ONE cognate at best, more probably zero.

> I cannot see how Jacques arrived at 100 years or whatever it was

Elementary, my dear Alexis. Let me grab my pipe first as an aid to thinking. Now, (puff puff), I had mentioned Muyuw, which had this distinction, worthy of the Guinness Book of Records, of having innovated 20% of its everyday vocabulary in one single generation (I had written 30%, but I seem to remember it was more like 20%. It's 25 years ago, you know.
You'd have to ask David Lithgow, who was with SIL at the time, for the correct figure). Let us say 20%. That is 80% retention, but, this time, per generation. There are 3 or 4 generations per 100 years. So, if two languages have been so eccentric as to evolve at the rate of Muyuw after they split, they will have undergone 6 to 8 generations' worth of lexical evolution one century later: from 0.8^6 = 0.2621, i.e. 26%, to 0.8^8 = 0.1678, i.e. 17%. And two centuries after their split: from 0.8^12 = 0.0687, i.e. 7%, to 0.8^16 = 0.0281, i.e. 3% of the vocabulary left in common, at which stage it cannot be distinguished from chance resemblances.

I had mentioned 30% replacement per generation instead of 20%, so:

One century after the split: from 0.7^6 = 11.8% to 0.7^8 = 5.8%
Two centuries: from 0.7^12 = 1.4% to 0.7^16 = 0.33%

Good Lord, Holmes!
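
The retention arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration, not the chance.exe program mentioned earlier: the name shared_fraction and its parameters are invented for this note, and it assumes, as the figures above appear to, that each of the two languages independently keeps a fixed share of its vocabulary per period, so that the periods elapsed on both branches simply add up.

```python
def shared_fraction(retention, periods_since_split):
    """Fraction of vocabulary two sister languages still have in common.

    Assumption (hypothetical model for illustration): each language keeps
    `retention` of its vocabulary per period, independently of the other,
    so the total separation is twice the time since the split.
    """
    return retention ** (2 * periods_since_split)


if __name__ == "__main__":
    # 98% retention per millennium, 40 millennia after the split: 0.98^80
    print(f"{shared_fraction(0.98, 40):.4f}")

    # 73% retention per millennium (27% replacement), 10 millennia: 0.73^20
    print(f"{shared_fraction(0.73, 10):.4f}")

    # Muyuw-style rates per generation, 3 or 4 generations per century
    for retention in (0.8, 0.7):
        for centuries in (1, 2):
            slow = shared_fraction(retention, 3 * centuries)
            fast = shared_fraction(retention, 4 * centuries)
            print(f"retention {retention}, {centuries} century(ies): "
                  f"{slow:.2%} down to {fast:.2%}")
```

Multiplying the result by the length of a wordlist (100 items, say) gives the expected number of surviving shared items under the same assumption.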
With apologies for the untimeliness of some of these comments (I received a large accumulated backlog of postings in one intense batch), here are some factors that need to be taken into account in evaluating/using the comparative method.

The first of these is semantic shift. It's been alluded to in the discussion. Without a full model of semantic relatedness and possible shifts, it's probably impossible to contemplate automatic, algorithmic application of CM, as advocated by Stephen Spackman. Aside from "standard" examples, like English HOUND being cognate with German HUND 'dog', I have culled the following from my (alas STILL) unpublished collection of c. 300 Proto-Semitic and West Semitic roots containing sibilants; since the point here involves semantics, I give only glosses. Anyone who wants specific citations, attestations, etc. need only holler electronically...

(1) In four different branches of Semitic, we have cognates meaning 'cut/piece of roast meat', 'decorate/engrave', 'pierce an abscess', 'adorn, tattoo'.

(2) A root with Semitic cognates meaning 'sheep', 'head of small cattle', 'sheep', 'sheep/goat', which outside of Semitic means 'pig' (Egyptian) or 'cow' (Egyptian, Proto-Cushitic, Proto-Chadic).

(3) A form meaning 'shoe' or 'sandal' throughout Semitic, except Ugaritic, where it means 'hemline', is apparently cognate with reconstructed E Cushitic 'footprint'.

(4) A root that throughout Semitic means 'cucumber' means 'yoghurt' in Jibbali (Modern South Arabian).

The greater the time depth, the greater the likelihood of semantic shift, and the greater the shift. These changes provide at least a partial answer to the question posed by Mike Maxwell about vocabulary loss.

Again with regard to vocabulary loss, Alexis Manaster Ramer observes that most of the Proto vocabulary will surely be preserved in at least two descendants. First of all, I'm not sure that this is true at the time depths that we're discussing.
However, for purposes of discussion, I'll grant the possibility. Even within a well established language family like Semitic, attested (and attestable!) vocabulary sizes vary enormously due to accidents of preservation (for ancient languages) and to more recent language attrition and death, due to population movement and other less benign factors (for modern languages). Thus, the Arabic vocabulary available for comparison is much larger than the Epigraphic South Arabian or the Old Aramaic vocabularies. Clearly some of the lexical items unique to Arabic are inheritances, from Central Semitic, from West Semitic, from Proto-Semitic, or from Proto-Afroasiatic. Some of that vocabulary was borrowed, from any of a large number of languages during the period of Islamic expansion. And some of it just happened. There's no guaranteed way of telling which is the case for a particular word.

As a result, I find it very disturbing to see isolated Arabic forms that I know not to have Semitic cognates used in attempts to demonstrate an Afroasiatic affiliation with other language families. Given Proto-Whatever, with well worked out, recurrent correspondences and lawful series of changes in its descendants, it would be possible to tell whether an isolated Arabic form "fits" the big picture. But when the goal is to establish Proto-Whatever, using the isolated Arabic form constitutes begging the question.

With regard to a narrower issue, raised by R. M. Blench, I think it's a natural tendency to think 1000 years one way or the other doesn't matter when we're talking about the distant past, though of course it matters enormously closer to the present. 10,000 years is simply a nice round number.

It's worth pointing out also that Omotic is a relatively new grouping, probably c. 20-25 years old, at most. In Greenberg's original classification of African languages, it was West Cushitic.
Once Omotic was recognized as a separate phylum, some people immediately started to doubt its inclusion in Afroasiatic. That was around when I started being more of a phonetician than a Semitist. Obviously, if more recent work has increased the time depth of Omotic, those of us who believe it is still Afroasiatic will have to revise our chronologies accordingly.

This is probably enough to fill up people's mailboxes for now. It's nice to have a Linguist discussion that takes me away from what I really should be doing.

Alice Faber
Despite the fun everybody is having with this topic, a certain narrow-mindedness I perceive in the discussion has been bothering me. It may be my misunderstanding, but it seems that underlying the debate remains the image of the genetic tree. Lurking in the background is the notion that all the world's languages hang from the tree and thus go back to a single root. I realise that is not immediately relevant to the issue of a cut-off point for possible reconstruction, but we know there are some linguists who are motivated by this idea(l) -- an idea which makes a very strong claim about language and its users, and might even trivialise attempts to connect linguistic universals (whatever they are) with innate properties of the human mind (however that's interpreted).

Some of the issues which I haven't seen raised in the context of the discussion so far are the following:

Does the discussion have a tacit assumption that the most common form of linguistic evolution (in the world, not in linguistic textbooks) is tree-like internal split? Is that assumption justified? If so, why is it that for languages we know (for sure) are genetically related, we can't figure out their intermediate branchings, like whether Germanic is closer to Latin or to Greek, etc. etc. -- not to mention what kind of British English American English split off from (I of course don't accept this formulation of the issue at all, but I think y'all understand what I mean).

Next there's the apparent paradox between the uniformitarian hypothesis of language change (in reference to the processes of change) and the FACT that the historical record shows a steady decline in the NUMBER of unrelated (as far as we know) language families existing at the same time in the world (bye bye Sumerian, Etruscan etc).
It's pretty obvious why that logically has to be the case, but it does change the nature of the world as far as language diversity goes (or does it? -- sure it does, as far as traditional genetic concepts of language diversity go). And then we see the same "impoverishment" with regard to branches of known language families: what's left of Italic besides Latin, etc. etc. But here do we get new branches to preserve the entropy? No! Because we don't quite know what they're branches of -- recall the problem of intermediate branching (a consequence of the tree).

My point is this. It's not surprising that we should have problems understanding whether or not there is a ceiling (or is it a floor) to the ability of the comparative method to establish genetic relationships between language families, because of its apparent assumptions about the nature of linguistic diversification. It is not even adequate for intermediate classification when we CAN demonstrate genetic relationship.

So the rest of my point is that the comparative method, with its ideological baggage in the form of trees and splits, is only PART of an adequate theory of linguistic evolution and diversification. It is indisputably an essential and intellectually admirable part, but I repeat the question: how common is the form of linguistic evolution and diversification that allows the comparative method to work as it was intended to work, as opposed to other forms of linguistic evolution and diversification? The answer obviously has consequences for how to approach the relationships among distinct language families.

Benji