LINGUIST List 21.3150

Tue Aug 03 2010

Review: Genetic Classification: Sidwell (2009)

Editor for this issue: Joseph Salmons

        1.    Anish Koshy, Classifying the Austroasiatic languages

Message 1: Classifying the Austroasiatic languages
Date: 03-Aug-2010
From: Anish Koshy
Subject: Classifying the Austroasiatic languages

AUTHOR: Sidwell, Paul James
TITLE: Classifying the Austroasiatic languages
SUBTITLE: History and state of the art
SERIES TITLE: Studies in Asian Linguistics 76
PUBLISHER: Lincom
YEAR: 2009

Anish Koshy, Department of ELT, Linguistics & Phonetics, The English & Foreign Languages University, Hyderabad, India


Originally intended as a companion to Parkin (1991), 'Classifying the Austroasiatic languages' is a reference work and a history of the field, including 55 figures, 17 tables and 43 plates. This survey developed from a 50-page report prepared for the 'Multitree Project' (hosted by LINGUIST List). Sidwell introduces the complexities of classification and conflicting views of scholars who have attempted it, especially the relationship holding between various Austroasiatic branches and the chronology and extent of diversity within Austroasiatic, and ways to remedy those problems. Unresolved areas include: (a) a lack of proto-Austroasiatic reconstruction, limiting what we know about retention or innovation, (b) a lack of consensus about how Austroasiatic should be split between Munda and other languages, (c) variation within branches, (d) lack of adequate resources, survey works, compilations, etc.

Sidwell sees an 1850 footnote by J.R. Logan as the first recognition of Austroasiatic. Logan later posited a Mon-Annam formation which, apart from a few odd inclusions, contained the entire Austroasiatic phylum as we know it today. Sidwell credits Logan with recognizing what came to be known as the conservative nature of Munda. Sidwell also reviews Austroasiatic scholarship in the context of the racial theory of Turanian languages, a cover term for languages of 'uncivilized' races as opposed to the more 'civilized' Europeans, the Aryan race! Other significant developments and debates in pre-20th-century Austroasiatic scholarship include the debate on the inclusion/exclusion of Dravidian from the Austroasiatic phylum (Müller and Grierson), the status of Munda, Khasi and Nicobarese as separate sub-groups or families (Cust and Kuhn), the relationship between Cambodian and Vietnamese (Cust), the role of chance and diffusion in the sharing of common features (Forbes and Blagden), the status of Khasi vis-à-vis the other Mon-Khmer languages (Kuhn), the unfortunate correlations drawn between languages on the basis of racial considerations like skin-tone (Keane and Blagden) and the contribution of the Linguistic Survey of India (LSI) in providing data on many languages for the first time.

The first half of the 20th century saw some major developments in Austroasiatic scholarship, though some unresolved controversies carried on from the previous century. The overall conception of Austroasiatic held by various scholars during this period was largely motivated by their own orientation -- either neogrammarian or diffusionist. Schmidt was one of the major proponents of the neogrammarian approach while Blagden championed the diffusionist cause (strongly supported by the state of affairs in Vietnamese, such as the presence of tone, the influence of Tai, etc.). Schmidt has come to be known for his grand 'Austric' hypothesis consisting of both Austroasiatic and Austronesian. Schmidt's works suffered from the exclusion of Vietic from his proposed phylum, largely due to the lack of data on minor Vietic varieties, which, unlike Vietnamese, had not undergone drastic areal influence. His maps, also reproduced in the book, however, show that he never doubted that Vietnamese was Austroasiatic. Although Grierson places Mon-Khmer and Munda into different, much larger ethnolinguistic groupings in the LSI, one can see a transformation in his ideas, from claiming that "... it was not a matter of doubt that Munda and Mon-Khmer families had no common parentage ..." (1904, 2), to making a dramatic turn-around, stating "... the Munda and the Mon-Khmer languages are derived from one and the same base ..." (1906, 14), influenced initially by Kuhn and later by Schmidt. For those who accepted the Austroasiatic phylum, the view that the Munda structure was conservative and hence closer to proto-Austroasiatic was axiomatic for years. The debate on the inclusion of Vietnamese raged on till the 1920s, when Vietic was firmly placed within Austroasiatic with stronger evidence from minor tongues like Muong.

Sebeok (1942), which has strangely become one of the most quoted papers on Austroasiatic classification, was willing only to concede a narrow Mon-Khmer group and had reservations about accepting Munda, Aslian and Vietic within Austroasiatic along with Mon-Khmer. More convincing arguments in favour of an Austroasiatic phylum came later from the studies of Haudricourt and Pinnow.

The second half of the 20th century saw Swadesh's lexicostatistical methods being employed to gauge the distance and relationships between various Mon-Khmer languages. A noted exclusion from most of these studies was that of Munda languages and at times that of Nicobarese and Khasian languages, which was clearly a geographical exclusion. Pinnow, during this period, made major contributions to Kharia studies, and offered insights into proto-Munda reconstruction through his etymological dictionary. While Pinnow's approach persisted with the tradition of excluding Vietic from Mon-Khmer (in spite of recent advances made on its inclusion into the Mon-Khmer fold by Haudricourt), Shafer's works on the classification of Austroasiatic suffered from the exclusion of Munda from it. Among the practitioners of lexicostatistics, Thomas and Headley's four-way classification of the Austroasiatic phylum into Munda, Mon-Khmer, Malacca (Aslian) and Nicobarese, and the 12-way classification of Mon-Khmer are noteworthy contributions, and went on to influence the works of CNRS, including an ethnolinguistic atlas, and that of Diffloth, especially his presentation of the Austroasiatic languages in the much-quoted Encyclopaedia Britannica article. Another noteworthy contribution of this period was the work on 'A Mon-Khmer Comparative Dictionary' by Shorto (published posthumously). One of the major developments of this period was Diffloth's argument against placing Munda as a distinct sister family to the entire set of Mon-Khmer, preferring rather to posit it as a sister to multiple sub-families of the Austroasiatic phylum. Sidwell's own studies have been non-lexicostatistical in their approach and propose 9 clades within Mon-Khmer, which is proposed as a sister to the Munda family within the Austroasiatic phylum.

On the issue of the Urheimat (homeland) of the Austroasiatic people, three major proposals are discussed: (a) the Austroasiatic speakers came from the northern regions of India, (b) the origin was near the Yangtze river in China and (c) the origin was in Southeast Asia. Sidwell supports the third proposal, a Southeast Asian origin, taking into account the widespread practice of rice cultivation. This, he feels, is a stronger argument than that advanced for the first proposal, namely, the conservative nature of Munda morphology.

The rest of the book is a discussion on the history and state of the art of the Austroasiatic phylum in terms of the 12 branches: Aslian, Bahnaric, Katuic, Khasian, Khmeric, Khmuic, Monic, Munda, Nicobaric, Palaungic, Pearic and Vietic. Sidwell does this by detailed listings and summaries of the contributions of the major scholars associated with each of the branches.

The 'Aslian' branch (earlier called Malaccan), with a small number of speakers, has been a witness to a multitude of confusing language/sub-group/dialect nomenclatures. Schmidt's neogrammarian approach and analysis of Aslian is noted to have prevailed to this date. Contemporary scholars ascribe three sub-branches to this branch: North, Central and South (or Jehaic, Senoic and Semelaic). The number of languages in the branch is around a score.

'Bahnaric' is one of the most internally diverse branches of Austroasiatic, and its sub-grouping is far from settled. Even the status of Bahnar is far from clear. The spread of the languages of this branch across three different countries and the non-contiguous nature of the settlements where the languages are spoken have only complicated attempts to study the branch. Scholars have at times even listed the multiple languages belonging to Bahnaric as distinct constituents of the Mon-Khmer family. One of the most influential contributions has been Thomas and Headley's lexicostatistical work recognizing three neat groupings: South, North and West Bahnaric. The language Bahnar, which is phonologically similar to South Bahnar, is geographically closer to North Bahnar and shares most of its vocabulary with North Bahnar, and hence has posed major problems in classification. The study of a few previously poorly documented languages has led scholars to suggest more sub-branches of the Bahnaric languages. Sidwell's original research has offered a historical-phonological approach, proposing three coordinate divisions. Glottochronological studies have thrown up a still newer classification system.

Two different approaches, one based on lexicostatistics and the other on historical-phonological data, have dominated the debate on the classification of the 'Katuic' languages (spoken in Thailand, Cambodia, Laos, and Vietnam). That they are spoken in territories not always accessible politically has not helped the situation. The lexicostatistical approach has consistently posited Katu as a distinct sub-branch, while there has been no consensus on the rest. Although Schmidt had noted the existence of Katuic languages early in the Austroasiatic tradition, most studies resisted positing a separate Katuic sub-branch within Mon-Khmer until the 1970s, when Thomas and Headley posited 17 Katuic languages without attempting any sub-grouping. Since then, scholars have not reached consensus on how these 17 languages are to be sub-grouped.

The study of the various Mon-Khmer varieties in the 'Khasian' branch, representing languages spoken in Meghalaya, India, has been marred by an absolute dearth of comparable data. There has been no consensus on even the number of varieties to be recognized. The excessive attention paid to the standard variety has led to a lack of study of other varieties. Until the end of the 19th century, the Mon-Khmer connection of the Khasian languages was not known, but by the time the LSI was published, George A. Grierson seems confident of the Mon-Khmer lineage of Khasian. He discusses four varieties in the LSI: standard Khasi, Pnar, Lyngngam and War, without any suggestions on how they relate genealogically and along the language-dialect continuum. The scholarly tradition has variously called the varieties dialects or has sometimes remained non-committal. The state of the art today favours a two-way split between Khasi and War, with War considered more archaic and a closer link between Khasian and the larger Mon-Khmer group than standard Khasi. It is also noted that the Khasian branch is one with relatively moderate internal diversity.

A single-page discussion of the 'Khmeric' branch posits a branch consisting of only a single language, Khmer, the national language of Cambodia and a few minor varieties. Western Khmer is noted as an archaic variety with all extant varieties believed to be descended from Middle Khmer. There are written records for this branch dating from the 7th Century CE.

There have been only sporadic studies on the 'Khmuic' branch (spoken mostly in Laos but also in Thailand, China and Vietnam). Except for the Khmu dialects, smaller varieties have largely been neglected. Most studies only provide listings of the languages with no comments or commitments on the issue of internal sub-grouping. Earlier suggestions of a principled division between Khmu (and its many dialects) and a group comprising every other Khmuic language have been challenged by later studies.

The 'Monic' languages, descendants of the Old Mon language of the first millennium Dvaravati civilization, are represented by two languages -- Mon (earlier called Peguan) and Nyah Kur, a language believed to be moribund. Mon, however, is spoken by close to a million speakers (mostly in Myanmar and also in Thailand) and, like Khmer, has a recorded history dating back more than one and a half millennia, with the modern varieties descending from Middle Mon.

The scholarly tradition in classifying the roughly ten 'Munda' languages (spoken in Eastern India) has been a divided house between a tradition that advocates a four-way split between Eastern, Western, Central and Southern (also supported by lexicostatistical studies), and another, more recent one that prefers a two-way split between Southern and Northern. It was only in the 19th century that the Munda languages came to be recognized as distinct from the languages of the Indo-Aryan and Dravidian stock. Scholars were for long skeptical of Munda being part of Austroasiatic proper. It was Pinnow's canonical work in 1959 that settled the issue in favour of a distinct primary branch within the Austroasiatic phylum.

Spoken in an isolated island cluster administered by India, the 'Nicobaric' languages are probably among the least researched of the Austroasiatic languages. Most studies are dictionaries from the colonial era and there has been only one known detailed study (Braine, 1970). The state of the art recognizes six varieties -- Car-Nicobar, Chowra, Teresa, Central, Southern and Shompen, with no progress on internal groupings.

The 'Palaungic' languages spoken in Myanmar, China and Laos are widely dispersed. The total number of sub-groups within this branch hasn't met with universal agreement. However, Diffloth and Zide's much-cited 1992 suggestion, itself a revision of Diffloth's earlier works and Pinnow's works, recognizes only two main splits: Eastern and Western, although Schmidt favoured four clades in the branch. With emerging work on minor languages, a separate Mangic sub-branch has also been suggested.

'Pearic' is a small branch of highly endangered languages. The number of sub-groups is still debated, but there seems to be consensus on a primary split between Pear and a group of speech forms in a continuum called 'Chang'. Most of what is known about the branch is from collected lexicons and little is known about the structure of these languages.

Contemporary divisions on the exact composition of the 'Vietic' branch reflect an obsession only with Standard Vietnamese at the cost of neglecting other varieties. Contemporary scholars agree on four branches: Viet-Muong, Pong/Toum, Chut and Thavung/Pakatan. The remaining composition is debated.


This book deals with a phylum that has been studied for close to 170 years, with publications in the mainstream and sometimes obscure places, making a historical and state of the art overview of the classification a daunting task. Sidwell achieves that in only 158 pages, a compact and tightly woven book. While one may appreciate the author for the challenge that lies in reading all the available materials to come up with definite/tentative conclusions, the most challenging task before him must have been to acquire all the materials from the diverse places where they rest. He does a very good job of taking us through multiple views, often conflicting, on classification of the various Austroasiatic branches. He regrets the lack of critical questioning about Austroasiatic classification and is to be commended for being neutral in most instances, though he does not withdraw from presenting his own analysis and classification, where required. One would agree with Sidwell that the dearth of available information has also hindered useful interaction between linguists, geneticists and archaeologists with respect to the other families.

Studies in this field have been undertaken in different corners of the world with hardly any forum for interaction among scholars, leading to confusion and disagreement on many matters. Sidwell's acknowledgment that this work is not the final word on Austroasiatic classification is not a reflection of any lack of inspiration on the author's part, but a realistic reflection of the state in which Austroasiatic studies lie today. If there is one message, it is the need for consolidation of ideas in this field. That some of the languages of the family are spoken in terrains which are geographically or politically hostile has not helped the cause either. In the 170 years of scholarship, dating from missionary studies to the present scholars, the field hardly has any native speakers taking up linguistic descriptions and classification studies, reflecting a serious lacuna in the field. With hardly any governmental interest in the study of many tribal languages, and the initial phase of a racial lumping together of all non-European speech-forms on the basis of skin-tone of the speakers, the field has grown slowly. Sidwell notes grimly how race played a significant role in how many scholars attempted the genetic study of languages. That many varieties have recently been studied and some still remain unexplored makes the wait for any definitive conclusions on classification still longer. Sidwell notes that most of the attempts at classification of these languages have been lexicostatistical and typological and not based on robust cladistic studies of phonology and lexicon. This could be a major reason behind the lack of robustness in Austroasiatic classification.

A book with such an expansive coverage will have omissions. Notably lacking here are an index of languages and an index of scholars. In his introductory remarks, Sidwell notes that language plays an important role in understanding the history of the regions inhabited by its speakers. He places the need for a proper classificatory understanding of the Austroasiatic phylum in this context. However, he does not develop this theme any further in the book except for a brief discussion on the Urheimat of the Austroasiatic people (based more on ethnographic than linguistic grounds). Of course, with any book that attempts to take into account a vast range of studies, some readers may find that the scholarly tradition and views on certain branches have been represented quite adequately, while the discussion of some other branches has been sketchy. Sidwell is an expert on Bahnaric and well-read on this branch, which is reflected in the 14 pages on Bahnaric, while branches like Khmeric, Monic and Munda have fewer than 5 pages devoted to them.

The book is marred by glaring typographical and grammatical errors, such as: 'just because it is has a strikingly different typology' (58), 'vernaculars could be prove very important' (101), 'Census if India data for Nicobars' (124) among others. The References section follows no established norms and is internally inconsistent. While the reproduction of plates has been carried out well, the quality of some of the tables and figures is disappointing: Why are there scanned tables and figures when these could have been reset? For example, the table representing Mason's Talaing-Kole comparison (7) is a low-resolution scan, and difficult to read at some places. The figure in (130) on the 'Classification of Waic dialects by Diffloth (1980)' suffers similarly due to its low resolution.

These minor matters aside, one thing is sure: future students of Austroasiatic will be grateful to Sidwell for having brought together these major references and works on Austroasiatic languages. One would also hope that a companion volume with a more detailed analysis of the phylum will be made available in the future.


Braine, Jean C. 1970. Nicobarese Grammar (Car dialect). PhD Dissertation. Berkeley: University of California.

Diffloth, Gérard. 1974. Austro-Asiatic Languages. In Encyclopaedia Britannica. Chicago, London, Toronto, Geneva: Encyclopaedia Britannica Inc.

Diffloth, Gérard, and Norman Zide. 1992. Austro-Asiatic languages. In International Encyclopedia of Linguistics, Vol. I, ed. by W. Bright. New York: OUP.

Grierson, George A. 1904. Mon-Khmer and Siamese-Chinese families. In Linguistic Survey of India, Vol. II. New Delhi: MLBD.

Grierson, George A. 1906. Munda and Dravidian families. In Linguistic Survey of India, Vol. IV. New Delhi: MLBD.

Parkin, Robert. 1991. A guide to Austroasiatic speakers and their languages. Oceanic Linguistics Special Publications No. 23. Honolulu: University of Hawaii Press.

Pinnow, Heinz-Jürgen. 1959. Versuch einer historischen Lautlehre der Kharia-Sprache. Wiesbaden: Otto Harrassowitz.

Sebeok, Thomas A. 1942. An examination of the Austro-Asiatic language family. Language 18: 206-217.

Shorto, Harry L. 2006. A Mon-Khmer Comparative Dictionary. Canberra: Pacific Linguistics 579.

Swadesh, Morris. 1952. Lexico-statistical dating of prehistoric ethnic contacts: With special reference to North American Indians and Eskimos. Proceedings of the American Philosophical Society 96: 452-463.

Thomas, David, and Robert K. Headley Jr. 1970. More on Mon-Khmer subgroupings. Lingua 25: 398-418.


Anish Koshy is an Assistant Professor in Linguistics in the Department of ELT, Linguistics & Phonetics, The English & Foreign Languages University, Hyderabad, India. His research interests lie in working on the lesser-studied languages of India, and working on South Asian languages from a typological perspective. He is currently working on the typological nature of clitics in the Austroasiatic languages of India, namely the Munda and the Khasian branches.
