"Part I", for, leisure allowing, there will we follow-ups, in which I propose to discuss Sforza's methodology, which I consider worthless: in breach of the scientific method, and mathematically incorrect. GENES, PEOPLES AND LANGUAGES? An Examination of a Hypothesis by Cavalli-Sforza (Scientific American, November 1991, pp.72-78) To summarize the author's findings in his own words: Our [genetic] reconstruction finds striking parallels in a recent classification of languages. Genes, people and languages have thus diverged in tandem. The results are illustrated in two facing trees, on pages 76 and 77 of his article in Scientific American, November 1991, which I reproduce here. The tree lists 38 populations, classified genetically. In the linguistic part of the graph 19 language families are listed [note 1]. The chart is headed by this caption: "Correlation of Peoples and Languages", by which we may feel justified in believing that it presents the evidence for the claimed correlation. I shall grant the author's reconstruction, and taking it as correct and true, show that, contrary to what is claimed in the article and repeated in the caption, the evidence presented therein shows, on the very contrary, that there is no correlation between peoples and languages. Here is the chart in question, reproduced as well as I could. I have made every effort in trying to reproduce the lengths of the branches of the genetic tree faithfully, without completely succeeding. At its widest, this ASCII tree is 94 columns wide and may be incorrectly displayed on your terminal. With these warnings: Correlation of Peoples and Languages Genetics Populations Linguistic Families ______.---------- Mbuti Pygmy ----(Original language unknown) | |____.----- W.African --.__ Niger-Khordodanian ___________| |__.-- Bantu ---' | | `-- Nilotic ------ Nilo-Saharian | |__.-------------- San(Bushman) ------ Khoisan | `-------------- Ethiopian --. | .------ Berber ---|-- Afro-Asiatic--. 
| .--| _.-- SW Asian ---' | | __| `-| `-- Iranian --. | | | | `---- European ---|__ Indo-Europ.____| | .--| `--------- Sardinian ---| ====|============. | ____| |_____.------ Indian ---' | | | | | `------ SE Indian ------ Dravidian -----| | | | `--------------- Lapp --.__ Uralic ________| | | | _____.-- Samoyed ---' ========|============| | | __| `-- Mongol --. |-Nostratic | | | | | .----- Tibetan ** | ** Sino-Tibetan | |==Eurasiatic | _| .--| `--|__.-- Korean ---| | | | | | | | `-- Japanese ---|__ Altaic ________| | | | | .--| `----------- Ainu ---| ========:============| | | | | | .----------- Siberian ---' : | | | | | `--|__.-------- Eskimo ------ Eskimo-Aleut ==:============| | | `--| `-------- Chukchi ------ Chukchi-Kam. ==:============' | | | __.-------- S.Amerind --. : | | | .--| `-------- C.Amerind ---|-- Amerind .......: | | `--| `----------- N.Amerind ---' |______| `-------------- NW Amerind ------ Na-Dene ------- | .--------- S.Chinese ****** Sino-Tibetan | |______.-- Mon-Khmer ------ Austro-Asiatic. | _____| `-- Thai ------ Daic-----------| | | |--------- Indonesian --. |-Austric | _____| |--------- Malaysian ---| | || | `--------- Philippine ---|-- Austronesian --' || |.-------------- Polynesian ---| || `|___.---------- Micronesian ---' | `---------- Melanesian --.__ Indo-Pacific -- |___.----------------- New Guinean ---' `----------------- Australian ------ Australian ---- (Linguistic classification from Merritt Ruhlen, A GUIDE TO THE WORLD'S LANGUAGES) Let us examine the linguistic tree. First, we must note that the caption "Linguistic Families" above the linguistic tree is misleading. The leftmost branchings do not correspond to any linguistic classification, but are merely lines connecting a language family (e.g. Indo-European) to the populations that speak it (e.g. Iranians, Europeans, Sardinians, Indians). Indeed I need not point out that there is no such thing as a Sardinian or a European branch of Indo-European. 
The final branchings of the linguistic tree, then, represent not linguistic but demographic data. Having presumably found it impossible to connect Sino-Tibetan to both the Tibetans and the Chinese without crossing other lines, the author has "Sino-Tibetan" appear in two different places in the chart.

Second, those lines linking language families to their speakers are selective and misleading. I see no line connecting Uralic to Europe (Finland, Estonia, Hungary), nor Altaic to Southwest Asia (Turkey). I see no line connecting the Austronesian language family to Melanesia (Vanuatu, Solomon Islands, New Caledonia, all Austronesian speakers) or to New Guinea (perhaps half of which speaks Austronesian). I deplore the absence of genetic information on the westernmost Austronesian speakers, those of Madagascar, off the eastern coast of Africa.

This said, let us set those objections aside, grant that the evidence is true and correctly presented, and examine it for correlations between language and genes.

1. Sino-Tibetan. Sino-Tibetan is shown as spoken by Tibetans and Southern Chinese [note 2]. Tibetans, however, are shown in the genetic tree to be related to (in order of decreasing relatedness): Koreans and Japanese (Altaic); Samoyeds (Uralic) and Mongols (Altaic); then to Ainus (Altaic); next to Siberians (Altaic), Eskimos (Eskimo-Aleut) and Chukchis (Chukchi-Kamchatkan); then to speakers of the great families of American Indian languages (Na-Dene and the rest, lumped here under "Amerind"); and finally to the Chinese. In other words, the only populations more distantly related to the Tibetans than their fellow Sino-Tibetan speakers are those found in Africa: Mbuti, West African, Bantu, Nilotic, Bushmen, Ethiopian.

Traversing the genetic tree in the same manner as for the Tibetans, this time to connect the Southern Chinese to the Tibetans, one finds that the closest relatives of the Sino-Tibetan-speaking Southern Chinese are, in order of increasing genetic distance: the Mon-Khmer, Thai and Malay populations, speakers of three distinct language families (Mon-Khmer, Thai and Austronesian [note 3]); then the Polynesians, Micronesians and Melanesians, speakers of Austronesian again and of "Indo-Pacific" (properly Non-Austronesian); next the New Guineans (Non-Austronesian) and Australian aborigines (Australian); and only after all of these do we reach the Tibetans. The correlation between Sino-Tibetan and genetics is thus strongly negative, if anything.

2. Afro-Asiatic. I imagine that Afro-Asiatic corresponds more or less to the language family called, in my younger days, Hamito-Semitic. Afro-Asiatic is spoken by the Ethiopians, Berbers, North Africans, and Southwest Asians (read: the populations of the Middle East). The closest relatives, genetically, of the Ethiopians are the San Bushmen, sole speakers of Khoisan; then, again in order of decreasing relatedness: Mbuti Pygmies, speakers of an isolate; West Africans and Bantus, speakers of Niger-Kordofanian; and Nilotic speakers of Nilo-Saharan. Next, to connect Ethiopians to their fellow Afro-Asiatic speakers of North Africa and the Middle East, we have to pass through the origin of the tree. Thus the Ethiopians are maximally distant genetically from their fellow Afro-Asiatic speakers. The correlation here between genes and language is maximally negative.

Consider now another Afro-Asiatic-speaking population: the Southwest Asians. Their closest genetic relatives are the Iranians, speakers, of course, of Indo-European. Their next closest relatives, the "Europeans", are again Indo-European speakers. Only then do they meet their Berber and North African fellow Afro-Asiatic speakers. Thus the genetic evidence presented shows Middle Eastern populations as closer relatives of Indo-European speakers than of their own. A negative correlation again.

3. Indo-European. Only four populations are listed as Indo-European speakers: Iranians, Europeans, Sardinians, and Indians. The Iranians, we have seen, are most closely related to the Afro-Asiatic speakers of the Middle East; the Europeans (presumably Romance, Germanic and Slavic speakers) are more closely related to the Iranians (I-E), Middle Easterners, Berbers and North Africans (all three Afro-Asiatic speakers) than they are to the Romance-speaking Sardinians. The Indo-European-speaking Indians themselves have for closest relatives the Dravidian speakers of South India, and are no more closely related to other Indo-European speakers than they are to Afro-Asiatic speakers. Thus, out of four Indo-European populations, none has for closest relative another speaker of Indo-European.

4. Uralic. Only two member populations here: Lapps, Caucasoids related to the Hamito-Semitic, Indo-European and Dravidian speakers of North Africa, the Middle East, Europe and the Indian subcontinent; and Samoyeds, relatives of the Asian and American speakers of Sino-Tibetan, Altaic, Eskimo-Aleut, Chukchi-Kamchatkan, Amerind and Na-Dene -- six different great language families, no less.

5. Altaic. Five member populations: Mongols, Koreans, Japanese, Ainus, and Siberians. As already seen, the Mongols' closest relatives are the Uralic-speaking Samoyeds. Among these five, the only Altaic speakers more closely related to each other than to a linguistic outsider are the Koreans and the Japanese; but they are not more closely related to the remaining Altaic speakers than to the Tibetans (Sino-Tibetan) and Samoyeds (Uralic).
The Siberians are closer relatives of the Eskimos and Chukchis (Eskimo-Aleut and Chukchi-Kamchatkan) than of any Altaic speakers; the Ainus are no more closely related to the Koreans, Japanese and Mongols than they are to the Tibetans and Samoyeds. Once again, no correlation.

6. Amerind. The three populations listed are indeed all more closely related to one another than to any linguistic outsider.

7. Austronesian. We have here five Austronesian-speaking populations: Indonesians, Malays, Filipinos, Polynesians and Micronesians. Indonesians, Malays and Filipinos are shown in the chart as no more closely related to one another than to the Sino-Tibetan speakers of South China, the Austroasiatic-speaking Mon-Khmer, and the Daic-speaking Thai. The Austronesian-speaking Micronesians have for closest relatives not the Polynesians (also Austronesian speakers) but the Melanesians, who are given as speakers of Indo-Pacific. Again, no correlation.

8. Indo-Pacific. Only two populations here. The first are the New Guineans, whose closest relatives are the Australian aborigines, members of an isolate language family (Australian); then the Southern Chinese (Sino-Tibetan), Mon-Khmer (Austroasiatic), Thai (Daic), the five Austronesian-speaking populations listed, and finally the Melanesians (Indo-Pacific). The other Indo-Pacific population is the Melanesians, whose closest relatives are the Austronesian-speaking Micronesians, and next Sino-Tibetan, Austroasiatic, and Daic speakers. The correlation here between language and genes is again nil, if not negative.

9. Niger-Kordofanian. Two member populations: the West Africans and the Bantu. The Bantu's closest relatives are not the West Africans, but Nilotic populations, speakers of Nilo-Saharan, an isolated language family. Once again, no correlation.

Ten language families remain to be examined, namely:

    The Mbuti Pygmies' unnamed isolate.
    Nilo-Saharan.
    Khoisan.
    Dravidian.
    Eskimo-Aleut.
    Chukchi-Kamchatkan.
    Na-Dene.
    Austroasiatic.
    Daic.
    Australian.

Each of those ten language families being represented by only one population, there is nothing there to correlate: one cannot correlate a single observation with anything.

Thus, in 10 language families out of the 19 used by the author, there is nothing to correlate with the genetic data. Of the 9 remaining language families we have observed only one case where language and genetics agree: the American Indians (Na-Dene speakers excluded). In the other 8 language families we have observed either a total absence of correlation or, in two cases, a strongly negative correlation: Afro-Asiatic and Sino-Tibetan.

FOOTNOTES

[note 1] The author had to repeat Sino-Tibetan in two different positions in his tree, because it is spoken by two genetically widely divergent populations: Tibetans and Southern Chinese.

[note 2] One may wonder at the absence of the rest of the Chinese population.

[note 3] I have reverted momentarily here to calling these language families by the more transparent names of my student days.