Editor for this issue: <>
I, too, had noted that there seemed to be two notions of comparison here, but unlike Scott DeLancey I did not assume that we should distinguish between comparison for the sake of building a reconstruction and comparison simply for the sake of determining possible relationship. To my mind, the numbers that Jacques Guy posted demonstrate not the weakness of n-ary comparison but its strength: if we are looking at a grouping of languages whose relationships are uncertain, and the number of potential n-way cognates is as low as random chance would dictate, then the likelihood is against their being closely enough related to pursue reconstruction.

I think this answers, by the bye, David Powers' perhaps rhetorical question regarding the assumptions under which Janhunen's claims could be considered a fallacy.

I'm not quite sure how to address Powers' conclusions, however. The methods of comparison most of us accept have a checking mechanism built into them, such that acceptance of some set of matches as "true" (in Powers' terms) constrains the set of further matches we can accept: what he considers to be "false" matches may not, under these constraints, be treated as matches at all. Under this methodology, n-way comparison *does* increase the ratio of signal to noise in the data.

I do not have access to Janhunen's original statements. If Alexis Manaster-Ramer has summarized them accurately, I have to conclude either that Janhunen is unfamiliar with the actual workings of the comparative method, or that the conclusion summarized by AMR is disingenuous in the extreme. Either way, it is indeed a fallacy.

Rich Alderson
Critics of Manaster Ramer miss the main point: Janhunen applies the term "similarities" to parallels between Japanese and other Altaic languages BASED ON REGULAR PHONETIC CORRESPONDENCES, without bothering to prove that these are really "look-alikes". Parallels based on regular correspondences are not CHANCE or RANDOM parallels, and therefore the proposed statistical games do not apply to this case. If you do not believe it, take any pair of a priori unrelated languages, for example Mandarin and Eskimo, and try to establish regular phonetic correspondences. Needless to say, adding Zulu, Basque, and Nivx to this company is not going to "improve" the picture.

Alexander Vovin
avvovin@miamiu.acs.muohio.edu
Gotcha! There are two separate fallacies in the argument against n-ary comparison which I discussed recently and which Powers, DeLancey, and Guy are now apparently seeking to defend.

(1) Janhunen says that the probability of a match occurring purely by chance when you compare Japanese with four languages is four times what it is when you compare it with one language. This simply cannot be true, because probabilities are values between 0 and 1. If the probability in the case of a binary comparison were, say, .5, then he would be predicting that it would be 2 in the case of n-ary comparison, which is impossible, because 2 is not between 0 and 1.

(2) The other fallacy is not purely mathematical, although I suspect that it involves elements of confusion. In any case, no one who argues for n-ary comparison EVER talks about getting a match in 2 out of n languages. Now, if we look at Guy's numbers, in his scenario of a 100-word list with no shifted meanings, he came up with 14.5 probable spurious matches in a binary comparison, but only 5.8 when you are looking for a match between 3 out of 5 languages, and 0.13 when you look for one between 4 out of 5; he does not give the still smaller number for 5 out of 5. I am not sure how Jacques defines "spurious", and so I have not verified the numbers, but they are certainly of the right order of magnitude.

As you consider more and more languages (and also as the initial probability of a match declines, which usually happens as you go from toy models to real data), a smaller and smaller share of the n languages being compared needs to agree. Thus, in Guy's example a match between n - 2 (i.e., 3) languages out of 5 was less likely to occur by chance than one between 2 out of 2. But if n were 100, i.e., if you were comparing 100 languages, then you would not need n - 2 (i.e., 98) languages to agree in order to do better than with a binary comparison.
It would be many, many fewer (although I do not know how many, since I do not know what formula Jacques is using or what he is assuming about the initial probability of a match). Maybe he could kindly supply the numbers.

And in light of all this, let us add another argument for rejecting Indo-European: Bopp never offered a mathematical demonstration that the relationships he proposed were unlikely to be due to chance, much less by doing a binary comparison of every pair of Indo-European languages. Which I think just goes to show how unrealistic the whole idea of doing such comparisons is. But if you do want to do them, then at least let us be clear about how to do them so as to minimize false positives (i.e., matches due to chance and not really reflective of common origin) as well as false negatives (i.e., failures to find genuine historical connections). On the second point, there are arguments that n-ary is better.

Alexis MR
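
The arithmetic behind points (1) and (2) can be illustrated with a minimal sketch. It rests on assumptions that are not in the postings: one language is compared with m others, each comparison matches by accident independently with the same probability p, and the function name p_at_least is invented here. It is not Guy's formula, which the posting does not give, and the per-word rate of 0.145 is only a reading of his figure of 14.5 spurious matches in a 100-word list.

```python
from math import comb


def p_at_least(k: int, m: int, p: float) -> float:
    """Chance that at least k of m independent comparisons match by accident,
    when each single comparison matches with probability p (binomial tail)."""
    return sum(comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(k, m + 1))


# Point (1): with p = 0.5 and four comparison languages, "four times p"
# gives 2.0, which is not a probability; the binomial model gives 0.9375.
p, m = 0.5, 4
print(m * p)                # 2.0
print(p_at_least(1, m, p))  # 0.9375

# Point (2): demanding agreement in more languages lowers the chance rate.
# p_word = 0.145 is an ASSUMED per-word rate (14.5 matches per 100 words).
p_word = 0.145
for k in range(1, 5):
    print(k, p_at_least(k, 4, p_word))
```

Under these assumptions the chance rate falls each time one more language is required to agree. The printed values should not be expected to reproduce Guy's 5.8 or 0.13, since his definition of a spurious match may differ from this model.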