Review of Classifying the Austroasiatic languages

Reviewer: Anish Koshy
Book Title: Classifying the Austroasiatic languages
Book Author: Paul James Sidwell
Publisher: Lincom GmbH
Linguistic Field(s): Genetic Classification
Language Family(ies): Austro-Asiatic
Issue Number: 21.3150

AUTHOR: Sidwell, Paul James
TITLE: Classifying the Austroasiatic languages
SUBTITLE: History and state of the art
SERIES TITLE: LINCOM Studies in Asian Linguistics 76
YEAR: 2009

Anish Koshy, Department of ELT, Linguistics & Phonetics, The English & Foreign
Languages University, Hyderabad, India


Originally intended as a companion to Parkin (1991), 'Classifying the
Austroasiatic languages' is a reference work and a history of the field,
including 55 figures, 17 tables and 43 plates. This survey developed from a
50-page report prepared for the 'Multitree Project' (hosted by LINGUIST List).
Sidwell introduces the complexities of classification and conflicting views of
scholars who have attempted it, especially the relationship holding between
various Austroasiatic branches and the chronology and extent of diversity within
Austroasiatic, and ways to remedy those problems. Unresolved areas include: (a)
a lack of proto-Austroasiatic reconstruction, limiting what we know about
retention or innovation, (b) a lack of consensus about how Austroasiatic should
be split between Munda and other languages, (c) variation within branches, (d)
lack of adequate resources, survey works, compilations, etc.

Sidwell sees an 1850 footnote by J.R. Logan as the first recognition of
Austroasiatic. Logan later posited a Mon-Annam formation, which, apart from a few
odd inclusions, contained the entire Austroasiatic phylum as we know it today.
Sidwell credits Logan with recognizing what came to be known as the conservative
nature of Munda. Sidwell also reviews Austroasiatic scholarship in the context
of the racial theory of Turanian languages, a cover term for languages of
'uncivilized' races as opposed to the more 'civilized' Europeans, the Aryan
race! Other significant developments and debates in Austroasiatic scholarship in
the pre-20th century years include the debate on the inclusion/exclusion of
Dravidian from the Austroasiatic phylum (Müller and Grierson), the status of
Munda, Khasi and Nicobarese as separate sub-groups or families (Cust and Kuhn),
the relationship between Cambodian and Vietnamese (Cust), the role of chance and
diffusion in the sharing of common features (Forbes and Blagden), the status of
Khasi vis-à-vis the other Mon-Khmer languages (Kuhn), the unfortunate
correlations drawn between languages on the basis of racial considerations like
skin-tone (Keane and Blagden) and the contribution of the Linguistic Survey of
India (LSI) in providing data on many languages for the first time.

The first half of the 20th century saw some major developments in Austroasiatic
scholarship though some unresolved controversies carried on from the previous
century. The overall conception of Austroasiatic by various scholars during this
period was largely motivated by their own orientation -- either neogrammarian or
diffusionist. Schmidt was one of the major proponents of the neogrammarian
approach while Blagden championed the diffusionist cause (strongly supported by
the state of affairs in Vietnamese, such as the presence of tone, the influence
of Tai, etc.). Schmidt has come to be known for his grand 'Austric' hypothesis consisting
of both Austroasiatic and Austronesian. Schmidt’s works suffered from the
exclusion of Vietic from his proposed phylum, largely due to the lack of data on
minor Vietic varieties, which had not undergone drastic areal influence unlike
Vietnamese. His maps, also reproduced in the book, however, show that he never
doubted that Vietnamese was Austroasiatic. Although Grierson places Mon-Khmer
and Munda into different, much larger ethnolinguistic groupings in the LSI, one
can see a transformation in his ideas, from claiming that "... it was not a
matter of doubt that Munda and Mon-Khmer families had no common parentage ..."
(1904, 2), to making a dramatic turn-around stating, "... the Munda and the
Mon-Khmer languages are derived from one and the same base ..." (1906, 14),
influenced initially by Kuhn and later by Schmidt. For those who accepted the
Austroasiatic phylum, the view that the Munda structure was conservative and
hence closer to the proto-Austroasiatic was axiomatic for years. The debate on
the inclusion of Vietnamese raged on till the 1920s when Vietic was firmly
placed within Austroasiatic with stronger evidence from minor tongues like
Muong. Sebeok (1942), which has strangely become one of the most quoted papers
on Austroasiatic classification, was willing only to concede a narrow Mon-Khmer
group and had reservations about accepting Munda, Aslian and Vietic within
Austroasiatic along with Mon-Khmer. More convincing arguments in favour of an
Austroasiatic phylum came later from the studies of Haudricourt and Pinnow.

The second half of the 20th century saw Swadesh's lexicostatistical methods
being employed to gauge the distance and relationships between various Mon-Khmer
languages. A noted exclusion from most of these studies was that of Munda
languages and at times that of Nicobarese and Khasian languages, which was
clearly a geographical exclusion. Pinnow, during this period, made major
contributions to Kharia studies, and offered insights into proto-Munda
reconstruction through his etymological dictionary. While Pinnow’s approach
persisted with the tradition of excluding Vietic from Mon-Khmer (in spite of
recent advances made on its inclusion into the Mon-Khmer fold by Haudricourt),
Shafer’s works on the classification of Austroasiatic suffered from the
exclusion of Munda from it. Among the practitioners of lexicostatistics, Thomas
and Headley’s four-way classification of the Austroasiatic phylum into Munda,
Mon-Khmer, Malacca (Aslian) and Nicobarese, and the 12-way classification of
Mon-Khmer are noteworthy contributions, and went on to influence the work of
CNRS, including an ethnolinguistic atlas, and that of Diffloth, especially his
presentation of the Austroasiatic languages in the much-quoted Encyclopedia
Britannica article. Another noteworthy contribution of this period was the work
on 'A Mon-Khmer Comparative Dictionary' by Shorto (published posthumously). One
of the major developments of this period was Diffloth’s argument against placing
Munda as a distinct sister family to the entire set of Mon-Khmer, preferring
rather to posit it as a sister to multiple sub-families of the Austroasiatic
phylum. Sidwell's own studies have been non-lexicostatistical in their approach
and propose 9 clades within Mon-Khmer, which is proposed as a sister to the
Munda family within the Austroasiatic phylum. On the issue of Urheimat
(homeland) of the Austroasiatic people, three major proposals are discussed: (a)
the Austroasiatic speakers came from the northern regions of India, (b) the
origin was near the Yangtze river in China and (c) the origin was in Southeast
Asia. Sidwell supports the third proposal, a Southeast Asian origin, taking into
account the widespread practice of rice cultivation. This he feels is a stronger
argument than that advanced for the first proposal, namely, the conservative
nature of Munda morphology.

The rest of the book is a discussion on the history and state of the art of the
Austroasiatic phylum in terms of the 12 branches: Aslian, Bahnaric, Katuic,
Khasian, Khmeric, Khmuic, Monic, Munda, Nicobaric, Palaungic, Pearic and Vietic.
Sidwell does this by detailed listings and summaries of the contributions of the
major scholars associated with each of the branches.

The ‘Aslian’ branch (earlier called Malaccan) with a small number of speakers
has been a witness to a multitude of confusing language/sub-group/dialect
nomenclatures. Schmidt's neogrammarian approach and analysis of Aslian is noted
to have prevailed to this date. Contemporary scholars ascribe three sub-branches
to this branch: North, Central and South or Jehaic, Senoic and Semelaic. The
number of languages in the branch is around a score.

‘Bahnaric’ is one of the most internally diverse branches of Austroasiatic, and
its sub-grouping is far from settled. Even the status of Bahnar is far from
clear. The spread of the languages of this branch in three different countries
and the non-contiguous nature of the settlements where the languages are spoken
have only complicated the attempts to study the branch. Scholars have at times
even listed the multiple languages belonging to Bahnaric as distinct
constituents of the Mon-Khmer family. One of the most influential contributions
has been Thomas and Headley’s lexicostatistical work recognizing three neat
groupings: South, North and West Bahnaric. The language Bahnar, which is
phonologically similar to South Bahnar, is geographically closer to North Bahnar
and shares most of its vocabulary with North Bahnar and hence has posed major
problems in classification. The study of a few previously poorly documented
languages has led scholars to suggest more sub-branches of the Bahnaric
languages. Sidwell’s original research has offered a historical-phonological
approach, proposing three coordinate divisions. Glottochronological studies have
thrown up a still newer classification system.

Two different approaches, one based on lexicostatistics and the other on
historical-phonological data, have dominated the debate on the classification of
the ‘Katuic’ languages (spoken in Thailand, Cambodia, Laos, and Vietnam). That they are
spoken in territories not always accessible politically has not helped the
situation. The lexicostatistical approach has consistently posited Katu as a
distinct sub-branch, while there has been no consensus on the rest. Although
Schmidt had noted the existence of Katuic languages early in the Austroasiatic
tradition, most studies had resisted positing a separate/distinct Katuic
sub-branch within Mon-Khmer, until the 1970s, when Thomas and Headley posited 17
Katuic languages without attempting any sub-grouping. From then on, how these 17
languages are to be sub-grouped has not met with consensus among scholars.

The study of the various Mon-Khmer varieties in the 'Khasian'
branch, representing languages spoken in Meghalaya, India, has been marred by an
absolute dearth of comparable data. There has been no consensus on even the
number of varieties to be recognized. The excessive attention paid to the
standard variety has led to a lack of study on other varieties. Until the end of
the 19th century, the Mon-Khmer connection of the Khasian languages was not
known, but by the time the LSI was published, George A. Grierson seems confident
of the Mon-Khmer lineage of Khasian. He discusses four varieties in the LSI:
standard Khasi, Pnar, Lyngngam and War, without any suggestions on how they
relate genealogically and along the language-dialect continuum. The scholarly
tradition has variously called the varieties dialects or has sometimes remained
non-committal. The state of the art today favours a two-way split between Khasi
and War, with War considered more archaic and constituting a closer connection of
Khasian with the larger Mon-Khmer group than standard Khasi does. It is also
noted that the Khasian branch is one with relatively moderate internal diversity.

A single-page discussion of the 'Khmeric' branch posits a branch consisting of
a single language, Khmer, the national language of Cambodia, and a few minor
varieties. Western Khmer is noted as an archaic variety with all extant
varieties believed to be descended from Middle Khmer. There are written records
for this branch dating from the 7th Century CE.

There have been only sporadic studies on the 'Khmuic' branch (spoken mostly in
Laos but also in Thailand, China and Vietnam). Except for the Khmu dialects, smaller
varieties have largely been neglected. Most studies only provide listings of the
languages with no comments or commitments on the issue of internal sub-grouping.
Earlier suggestions of a principled division between Khmu (and its many
dialects) and a group that includes every other Khmuic language have been
challenged by later studies.

The 'Monic' languages, descendants of the Old Mon language of the first
millennium Dvaravati civilization, are represented by two languages -- Mon
(earlier called Peguan) and Nyah Kur, a language believed to be moribund. Mon is,
however, spoken by close to a million speakers (mostly in Myanmar and also in
Thailand) and, like Khmer, has a recorded history dating back more than
one and a half millennia, with the modern varieties descending from Middle Mon.

The scholarly tradition in classifying the roughly ten 'Munda' languages (spoken
in Eastern India) has been a divided house between a tradition that advocates a
four-way split between Eastern, Western, Central and Southern (also supported by
lexicostatistical studies), and another (the more recent one) that prefers a
two-way split between Southern and Northern. It was only in the 19th century
that the Munda languages came into recognition as being distinct from the
languages of the Indo-Aryan and Dravidian stock. Scholars were for long
skeptical of Munda being part of Austroasiatic proper. It was Pinnow's canonical
work in 1959 that settled the issue in favour of a distinct primary branch
within the Austroasiatic phylum.

Spoken in an isolated island cluster administered by India, the 'Nicobaric'
languages are probably among the least researched among the Austroasiatic
languages. Most studies are dictionaries from the colonial era and there has been
only one known detailed study (Braine, 1970). The state of the art recognizes
six varieties -- Car-Nicobar, Chowra, Teresa, Central, Southern and Shompen,
with no progress on internal groupings.

The 'Palaungic' languages spoken in Myanmar, China and Laos are widely
dispersed. The total number of sub-groups within this branch has not met with
universal agreement. However, Diffloth and Zide's much-cited 1992 suggestion,
itself a revision of Diffloth's earlier works and Pinnow’s works, recognizes
only two main splits: Eastern and Western, although Schmidt favoured four clades
in the branch. With emerging work on minor languages, a separate Mangic
sub-branch has also been suggested.

'Pearic' is a small branch of highly endangered languages. The number of
sub-groups is still debated, but there seems to be consensus on a primary split
between Pear and a group of speech forms in a continuum called 'Chang'. Most of
what is known about the branch is from collected lexicons and little is known
about the structure of these languages.

Contemporary divisions on the exact composition of the ‘Vietic’ branch reflect
an obsession only with Standard Vietnamese at the cost of neglecting other
varieties. Contemporary scholars agree on four branches: Viet-Muong, Pong/Toum,
Chut and Thavung/Pakatan. The remaining composition is debated.


This book deals with a phylum that has been studied for close to 170 years, with
publications in the mainstream and sometimes obscure places, making a historical
and state of the art overview of the classification a daunting task. Sidwell
achieves that in only 158 pages, a compact and tightly woven book. While one may
appreciate the author for the challenge that lies in reading all the available
materials to come up with definite/tentative conclusions, the most challenging
task before him must have been to acquire all the materials from the diverse
places where they rest. He does a very good job of taking us through multiple
views, often conflicting, on classification of the various Austroasiatic
branches. He regrets the lack of critical questioning about Austroasiatic
classification and is to be commended for being neutral in most instances,
though he does not withdraw from presenting his own analysis and classification,
where required. One would agree with Sidwell that the dearth of available
information has also hindered useful interaction between linguists, geneticists
and archaeologists with respect to the other families.

Studies in this field have been undertaken in different corners of the world
with hardly any forum for interaction among scholars, leading to confusion and
disagreement on many matters. Sidwell's acknowledgment that this work is not the
final word on Austroasiatic classification is not a reflection of any lack of
inspiration on the author's part, but a realistic reflection of the state in
which Austroasiatic studies lie today. If there is one message, it is the need
for consolidation of ideas in this field. That some of the languages of the
family are spoken in terrains which are geographically or politically hostile
has not helped the cause either. In the 170 years of scholarship, dating from
missionary studies to the present scholars, the field hardly has any native
speakers taking up linguistic descriptions and classification studies,
reflecting a serious lacuna in the field. With hardly any governmental interest
in the study of many tribal languages, and the initial phase of a racial lumping
together of all non-European speech-forms on the basis of skin-tone of the
speakers, the field has grown slowly. Sidwell notes grimly how race played a
significant role in how many scholars attempted the genetic study of languages.
That many varieties have only recently been studied and some still remain unexplored
makes the wait for any definitive conclusions on classification still longer.
Sidwell notes that most of the attempts at classification of these languages
have been lexicostatistical and typological and not based on robust cladistic
studies of phonology and lexicon. This could be a major reason behind the lack
of robustness in Austroasiatic classification.

A book with such an expansive coverage will have omissions. Notably lacking here
are an index of languages and an index of scholars. In his introductory
remarks, Sidwell notes that language plays an important role in understanding
the history of the regions inhabited by its speakers. He places the need for a
proper classificatory understanding of the Austroasiatic phylum in this context.
However, he does not develop this theme any further in the book except for a
brief discussion on the Urheimat of the Austroasiatic people (based more on
ethnographic than linguistic grounds). Of course, with any book that attempts to
take into account a vast range of studies, some readers may find that the scholarly
tradition and views on certain branches have been represented quite adequately,
while the discussion on some other branches has been sketchy. Sidwell is an
expert on Bahnaric and well-read on this branch, which is reflected in the 14 pages
on Bahnaric, while branches like Khmeric, Monic and Munda have fewer than 5 pages
devoted to them.

The book is marred by glaring typographical and grammatical errors, such as:
'just because it is has a strikingly different typology' (58), 'vernaculars
could be prove very important' (101), 'Census if India data for Nicobars' (124)
among others. The References section follows no established norms and is
internally inconsistent. While the reproduction of plates has been carried out
well, the quality of some of the tables and figures is disappointing: Why are
there scanned tables and figures when these could have been reset? For example,
the table representing Mason's Talaing-Kole comparison (7) is a low-resolution
scan, and difficult to read at some places. The figure in (130) on the
'Classification of Waic dialects by Diffloth (1980)' suffers similarly due to
its low resolution.

These minor matters aside, one thing is sure: future students of Austroasiatic
will be grateful to Sidwell for having brought together these major references
and works on Austroasiatic languages. One would also hope that a companion
volume to this will be made available in the future with more detailed
analysis of the phylum.


Anish Koshy is an Assistant Professor in Linguistics in the Department of ELT, Linguistics & Phonetics, The English & Foreign Languages University, Hyderabad, India. His research interests lie in working on the lesser-studied languages of India, and working on South Asian languages from a typological perspective. He is currently working on the typological nature of clitics in the Austroasiatic languages of India, namely the Munda and the Khasian branches.

