Fri Jul 08 2016

Review: Comp Ling; Historical Ling: Lewis, Pereltsvaig (2015)

Date: 02-Jan-2016
From: David Stifter
Subject: The Indo-European Controversy
AUTHOR: Asya Pereltsvaig
AUTHOR: Martin W. Lewis
TITLE: The Indo-European Controversy
SUBTITLE: Facts and Fallacies in Historical Linguistics
PUBLISHER: Cambridge University Press
YEAR: 2015

REVIEWER: David Stifter

“The Indo-European Controversy” by Asya Pereltsvaig and Martin Lewis takes an article from the journal Science (Bouckaert et al. 2012; hereafter referred to as Gray & Atkinson 2012) as the starting point for a wide-ranging evaluation of non-linguistic contributions to the debate about Indo-European origins. Chapter 1, “Ideology and interpretation from the 1700s to the 1970s”, contains a concise history of the reception of Indo-European studies since their beginning in the late 18th century, their (mis)use by scholars from extralinguistic disciplines, and their impact, or lack thereof, on political ideologies in the 19th and 20th centuries. In Chapter 2, “Anatolia vs. the Steppes”, Colin Renfrew’s influential contribution of 1987 is evaluated. His so-called ‘Anatolian hypothesis’ about the spread of the language family via demic diffusion, accompanying the spread of agriculture in the Neolithic, has been fervently rejected by basically every linguistic expert conversant with the actual facts of Indo-European Studies, but it continues to exert its appeal on non-specialists. The driving force behind this debate is not so much a linguistic one as one informed by historical and archaeological ideologies that are not rooted in the hard facts of linguistics.

In the central section of the book, “Part II. The failings of the Bayesian phylogenetic research program”, the authors take apart the methodological fallacies, fundamental misapprehensions, and practical errors that riddle Gray & Atkinson’s work. In Chapter 3, “What theory we want and what theory we get”, the authors demonstrate that creating trees of language relationships on the basis of lexical comparisons alone is an approach that is bound to lead to erroneous results. Diachronic phonology offers more reliable criteria to model language relationships.

Chapter 4, “Linguistic fallacies of the Bayesian phylogenetic model”, meticulously lists errors and misunderstandings that beset the fundamental assumptions of Gray & Atkinson’s work: the failure to distinguish innovations from retentions, including the failure to understand that, because of the nature of linguistic data and especially because of universal phonological tendencies, there is in many instances an intrinsic directionality in change. Phonological changes are not unlike more traditional physical events in that they are governed by the arrow of time, and their results may, mutatis mutandis, be viewed as an increase in entropy. For this reason, what counts as a linguistic innovation can be self-evident to the expert, whereas a statistical calculation based only on frequencies will unavoidably lead to a wrong picture. Other potential pitfalls are an over-reliance on lexical data, which is intrinsically imprecise because its significance can be blurred by factors such as unidentified or only inadequately identified borrowing or divergent lexical usage, and a failure to award sufficient importance to phonology and grammar, which offer more reliable and consistent information about linguistic cognacy. Another stumbling block is the uncritical use of Swadesh lists of core vocabulary for language comparison, given the very subjective, if not random, character of their word selections.

Chapter 5, “Dating problems of the Bayesian phylogenetic model”, addresses the inherent difficulty of assigning clear-cut dates to linguistic developments. While glottochronology in the traditional sense has long fallen out of favour, the Bayesian approach advocated by Gray & Atkinson offers little progress because distorting factors (such as undetected borrowings) can easily skew the statistics. If the amount of data is not large enough, small changes in the input will have massive repercussions on the output. The authors of the present book point out that, while Gray & Atkinson make use of ostensibly objective calibration points, i.e. historically secured events, the significance of such dates is misunderstood. While historical events may be the trigger for sociolinguistic developments that will eventually result in linguistic change, they cannot be identified with the dates of those changes as such. The inherent fuzziness of the data leads to distortions in the calculations that add up to a 3000-year timing difference between the computational model and the consensus or near-consensus view with which scholars in the field actually operate. In Chapter 6, “The historical-geographical failure of the Bayesian phylogenetic model”, Pereltsvaig & Lewis demonstrate how the static and dynamic maps produced by Gray & Atkinson (printed and online) ignore simple historical facts about the distribution of languages and their speech communities, and how they end up with historically and politically bizarre scenarios.

Chapter 7, “Unwarranted assumptions”, questions the many fundamental concepts that underlie Gray & Atkinson’s theories of language spread, i.e. their ideas about migration vs. diffusion, how language spread interacts with population spread, topographical factors (such as sea coasts), and a simplistic view of the contiguity of languages over specific regions. Although images of mass population movements in the old sense of ‘Völkerwanderungen’ are surely simplistic, historical sources abound in evidence for migrations and transplantations of speech communities. The migration model thus receives factual support from countless real-life examples from across the globe, against the theory-driven diffusion model; the crucial point is that even diffusion requires the operation of ‘agents’ to actually take place (Vogl 2012). If the agents are human beings or groups of humans, contingencies come into play that will disrupt any neat numerical model. The final section in this chapter is devoted to those “fallacious and unexamined assumptions” (155) by which biological evolution tends to be used as an analogy for linguistic evolution. Arguing that the similarities are superficial and only pertain to the use of cladistic trees, albeit for quite different purposes in each of the two disciplines, the authors state that the fundamental procedural principles of linguistic and genetic differentiation could in fact not be further apart.

After all the deconstruction, the third section of the book, “Searching for Indo-European origins”, is devoted to the assembling of facts that allow linguists to make constructive statements about the where and the when of the Proto-Indo-European language and its break-up. In Chapter 8, “Why linguists don’t do dates? Or do they?”, Pereltsvaig & Lewis give an introduction to linguistic paleontology and then roll out the major arguments associated with the problem of wheeled transport as an example of this approach. They arrive at the conclusion that Proto-Indo-European could not have started to break up before the origins of wheeled transport ca. 3500 B.C. Another equally important dating criterion of material culture is not mentioned here even though it pins Proto-Indo-European effectively to the same chronological horizon: the presence of the words for ‘sheep’, *h2ou̯i-, and ‘wool’, *h2u̯l̥h1/2no/eh2-, and in particular the fact that the latter word seems to be derived from the former (the initial sequence h2u̯ is identical with the consonantal skeleton of *h2ou̯i- ‘sheep’), give evidence for the acquaintance of the Proto-Indo-European people with woolly sheep. Sheep with sufficient wool for economic exploitation were not known before the 4th millennium B.C. (Mallory & Adams 2006: 238).

Chapter 9, “Triangulating the Indo-European homeland”, exploits linguistic archaeology and language contact data to find clues about the localisation of the homeland. The authors argue that, while not all of the evidence is fully conclusive, its cumulative drift is in favour of the traditional theory that Proto-Indo-European was situated in the Pontic Steppes in the 4th millennium B.C. In Chapter 10, “The non-mystery of Indo-European expansion”, the authors sketch a sociolinguistic and anthropological scenario to explain the apparent ‘success’ of Indo-European languages. Chapter 11, “Whither historical linguistics?”, is concerned with possible alternative methods for establishing and computing language relatedness, e.g. the Parametric Comparison Method (PCM), developed in the ‘Language and Gene Lineages’ project headed by Giuseppe Longobardi (University of York). This method is based on generative syntax, compares fundamental parameters of languages, and holds a promise for exciting future results. In the book’s conclusion, “What is at stake in the Indo-European debate”, the authors find very harsh words against Gray & Atkinson and the journal “Science” (esp. p. 230), the central accusation being that their theory is unempirical.


A lot of professional frustration permeates the introductory pages, but it is the sting of this frustration that motivates Pereltsvaig & Lewis to engage critically with the methodological premises of the allegedly scientific approach. The authors employ a very polemical, pointedly formulated style which at times reminds the reader of lawyers pleading their cause in a criminal court. Right from the ‘Introduction’ (1–15), Pereltsvaig & Lewis leave no doubt about their own position and their intention to refute every bit of the article of contention. Speaking of “incorrect and … incoherent linguistic information” (2), they accuse Gray & Atkinson of building on “erroneous and unexamined suppositions about language differentiation, distribution, and expansion” (3). They view extralinguistic attempts at solving central questions of Indo-European antiquity in a very critical light. Pereltsvaig’s and Lewis’s basic tenet is that the distribution of human languages is “only vaguely analogous to organic evolution”, “has nothing in common with the spread of viruses” (3), and should therefore be described with different models altogether.

On a positive note, the authors draw from an admirable knowledge of sources, not infrequently making reference to fairly obscure publications, and they employ a broad range of facts from diverse scientific fields. They convert their ‘ammunition’ into a forceful attack not only against Gray & Atkinson, but against the nonchalant reception and treatment of languages, and of historical linguistics in particular, by the public and big media. Their own position, on the basis of which they critically assess alternative hypotheses, reflects the broad consensus of scholars in Indo-European Studies. Therefore, the book is unlikely to change the opinion of anyone who already works in the field, simply because, judging by my own professional experience, most colleagues in the field hold this view anyway. Within the field, the authors would be preaching to the faithful. In fact, much of what is said in the book is commonplace in historical linguistics and in Indo-European Studies in particular. A big achievement of Pereltsvaig & Lewis, though, is to argue the case from so many diverse angles, and for this reason the book can be useful even to specialists.

Bayesian methods have become a popular and powerful tool in many disciplines where large amounts of data have to be analysed. The critique which Pereltsvaig & Lewis direct against the ‘Bayesian phylogenetic model’ does not actually address the mathematical principles of the Bayesian approach as such at all, but is rather concerned with the erroneous application and the non-expert handling of the linguistic input data by Gray & Atkinson. It would, in fact, be interesting to see what results can be achieved if experts in historical linguistics, who have a well-grounded understanding of the data, work together with experts in Bayesian methods.

I want to finish on a more positive note. New facts such as advances in scientific disciplines outside of historical linguistics (e.g. Haak et al. 2015) lend unexpected, but very welcome support to the basic tenets of Pereltsvaig & Lewis. The very fact that a book like this had to be written, even though all the facts have been on the table for decades, makes a disillusioning statement about the lack of public impact of the work of historical linguists. It is therefore to be hoped that the book’s main audience will be scholars in neighbouring disciplines, or even further away, who may easily be blinded and led astray by the outwardly shining ‘scientistic’ arguments used by Gray & Atkinson. If it achieves this objective, it will be a very important contribution to the scholarly debate about the origins and the expansion of the Indo-European languages, even though the style in which it is written may at times come across as too polemical.

Occasionally, minor errors catch the eye, e.g.: in the satem-group of Indo-European, there is no uniform outcome *s of the Proto-Indo-European palatal *k’ (p. 65); rather, the precise nature of the sibilant differs from language to language, ranging from plain [s] in Iranian (providing the source for the modern term ‘satem’) to a variety of palatal sounds that remained separate from s in most of the other languages in the group. The self-designation of the Spanish language is not “espagnol” (p. 111), but español. Also, the ruling house of the Austro-Hungarian Empire was the Habsburgs, not the “Hapsburgs”. The (pre-)PIE word *Hok̑tō(u̯) ‘tetrad’, the putative basis of the Proto-Kartvelian loan *otχo- ‘four’, was by no means lost in later Indo-European (p. 192), but survives across the board as the numeral ‘eight’. Finally, on p. 198, the presentation of the facts is misleading: the reader gets the impression that Indo-European *pork̑os ‘piglet’ with a palatalised *k̑ had been borrowed into Proto-Finno-Ugric; however, the Proto-Finno-Ugric word is *pɔ̄rš́ɔs, borrowed from already satemised Proto-Indo-Iranian.


Bouckaert, R., Lemey, Ph., Dunn, M., Greenhill, S.J., Alekseyenko, A.V., Drummond, A.J., Gray, R.D., Suchard, M.A. and Atkinson, Q.D. 2012. ‘Mapping the Origins and Expansion of the Indo-European Language Family’, Science 337 no. 6097 (24.8.2012), 957–960 [DOI: 10.1126/science.1219669].

Haak, W. et al. 2015. ‘Massive migration from the steppe was a source for Indo-European languages in Europe’, Nature 522 issue 7555 (11.6.2015, publ. online 2.3.2015), 207–211 [DOI: 10.1038/nature14317]. Precis (12.2.2015) at:

Mallory, J.P. and Adams, D.Q. 2006. The Oxford Introduction to Proto-Indo-European and the Proto-Indo-European World, Oxford – New York: Oxford University Press.

Renfrew, C. 1987. Archaeology and Language: The Puzzle of Indo-European Origins, London.

Ringe, D. et al. 2002. ‘Indo-European and computational cladistics’, Transactions of the Philological Society 100, 59–129.

Vogl, G. 2012. ‘Fundamentals of diffusion and spread in the natural sciences and beyond’, in: Migrations. Interdisciplinary Perspectives. Eds. M. Messer, R. Schroeder, R. Wodak, Wien: Springer 2012, 261–266.


David Stifter is Professor of Old Irish at the Department of Early Irish at Maynooth University, Ireland. This review was written as part of the research project Chronologicon Hibernicum, which has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 647351). Part of the project will be to use Bayesian methods for the dating of Early Irish language developments.

