LINGUIST List 16.2637

Mon Sep 12 2005

Review: General Linguistics: Gordon (2005)

Editor for this issue: Megan Zdrojkowski


        1.    Harald Hammarström, Ethnologue: Languages of the World, 15th Edition

Message 1: Ethnologue: Languages of the World, 15th Edition
Date: 05-Sep-2005
From: Harald Hammarström
Subject: Ethnologue: Languages of the World, 15th Edition

EDITOR: Gordon, Raymond J.
TITLE: Ethnologue
SUBTITLE: Languages of the World
PUBLISHER: SIL International
YEAR: 2005
Announced at

Harald Hammarström, Department of Computing Science, Chalmers University
of Technology


The Ethnologue (2005) is the 15th edition of the SIL International effort
to gather a catalogue of all the living languages of the world. The
hardbound 1272-page volume is organized as follows:
Introduction 7-14
Statistical Summaries 15-36
Languages of the World 37-648
References 649-672
Language Maps 673-888
Indexes 889-1272

I will concentrate on the bulk of the work, i.e. the language entries and
information about them in the introduction. Inasmuch as they are correct
there is not much for a linguist to say about the statistical summaries,
maps and indexes (except that the maps, in colour, look great and will be
very useful). The first edition of the Ethnologue came out in 1951 and had
information on 46 languages. This 15th edition sports 7299 language
entries and the system of (lowercase) three-letter identifiers for each
language entry is now a draft ISO/DIS 639-3 standard. All the information
in the book version is also available free of charge on the web, which
greatly facilitates access and searchability. SIL deserve a huge thank you
for posting the web edition, which will doubtless also increase the
outreach of Ethnologue.


Out of the 7299 entries, 6912 represent living languages. "Living"
means "definitely having native speakers" so e.g. Latin is not counted as
living and there are another 27 'second language only'/'no data' entries.
Most of the remaining 360 extinct languages represent languages which have
died relatively recently (say within the last 100 years). Ethnologue does
not aim to catalogue all dead languages, so even well-attested ones like
Timucua (Granberry 1993) or Akkadian (Ungnad 1964) are missing. However, a
selection of ancient extinct languages is still listed (such as Ge'ez and
Coptic), perhaps those which have a Bible translation. Likewise, there is
no aim towards completeness as to relatively recently extinct languages
either, whether poorly attested or well-attested. Consequently one can
find literally hundreds of extinct New World languages and language
families in the lists of (Campbell 1997; Landar 1996; Adelaar 2004;
Kaufman 1994; Fabre 2005; Garza Cuarón and Lastra 1991) that are not in
Ethnologue. As plenty of extinct languages are not listed, their
respective family trees silently appear without these branches (e.g.
Ethnologue's Semitic is listed without its East Semitic branch consisting
of the long dead Eblaite and Akkadian (Faber 1997)).

The 6912 living languages include 124 living sign languages, 1 living
artificial language and 5 living pidgins (namely hmo, chn, nef, lir, cpi),
despite the standard definition of pidgins as having no native speakers
(Bakker 2002, p. 7). Since Ethnologue admits (p. 13) that the inventory of
pidgins-jargons-special languages, e.g. sorcerers' languages, is not
complete, there may well exist more of this kind.

Ethnologue commendably includes known unknown languages, i.e. where there
are speakers known to exist who presumably speak something but, since they
are not in contact, we don't know what. Examples of these are Sentinelese
of the Andamans (Abbi 2004; Shashi 1994), Uru-Pa-In (Angenot-de-Lima 2002,
p. 38) of Brazil, Yarí (Adelaar 2004, p. 624) of Colombia. Carabayo seems
to be a case where the group's 3 houses are known from airplane
observations. Another five Brazilian languages I know of only from the
Ethnologue: Himarimã, Iapama, Karahawyana, Kohoroxitari and Papavô.

1.1 UNLISTED LANGUAGES. Four quite solidly extant Brazilian languages are
missing: Máku is still reported to have 1 speaker (Rodrigues 2005; Seki
1999; Migliazza 1985, pp. 37, 280, 52). Kwazá and Aikanã are excellently
discussed in the introduction of van der Voort (2000). Either the isolate
Kanoê is not the Tupi Kanoé [kxo], or the [kxo] entry is quite erroneous;
see p. 23-24 of Bacelar (2004).

Adelaar (2004, p. 164) mentions another three living South American languages
that are missing: Pisamira, Nonuya and Yurí. He also sheds light on a
couple of languages which can be presumed extinct on good grounds but
which do not have entries (as living or dead) in the Ethnologue: Opón-
Carare (5 speakers in 1944) (p. 114-115), Mochica (p. 172), some Tucanoan
e.g. Coretú and Icaguate (p. 621) as well as Culli/Culle (p. 172-173).

In 2005, Roger Blench and associates uncovered data from the Dogon
Plateau of West Africa that prove the existence of several "new"
languages, including one of unknown affiliation; manuscript sources (cited
with permission) are available on his Dogon and other webpages. Without
doubt, there
will be more "discoveries" in the future on the languages in Northern
Nigeria and adjacent regions. Likewise, two Australian mixed languages
have been brought to light (McConvell and Meakins 2005; O'Shannessy 2005)
too recently to make it into Ethnologue.

1.2 SPURIOUS LANGUAGES. In a work of this size it's hard to completely
exclude languages whose existence is really unsupported, such as Mutús of
the 14th edition (which is now removed, see also Adelaar (2004, p. 125)).

Since Ethnologue does not systematically include attested extinct
languages, the extinct unclassified Colombian languages Cagua, Chipiaje,
Coxima and Natagaima look suspicious, especially since they are not
mentioned by Adelaar or sources therein. Likewise, extinct unclassified Monimbo
of Nicaragua is not to be found in Meso-American sourcebooks. These cases
need of course not be spurious but their inclusion is highly arbitrary in
the masses of extinct unclassified, better documented, South American
languages that could have been taken up.

Pankararú [paz] and Pankararé [pax] are treated by most, e.g. Fabre
(2005), as one extinct language isolate whereas Ethnologue has two
entries, one extinct language isolate and one extinct unclassified.

Yauma is given in Ethnologue as an unclassified language of Angola.
Nothing else suggests that this should be anything more exotic than a
regular Bantu language. In fact it is explicitly listed as a Lucazi
dialect in Fleisch (2000, p. 1).

For extinct languages which are really living or vice versa see the
section below on speaker population. For languages that are better treated
as dialects see the section below on languages and dialects.

2.1 IN THEORY. The thoughts behind the Ethnologue language vs. dialect
divisions are so important that I will quote the section (p. 8):

"Not all scholars share the same set of criteria for what constitutes
a 'language' and what features define a 'dialect.' The Ethnologue applies
the following three basic criteria:
* Two related varieties are normally considered varieties of the same
language if speakers of each variety have inherent understanding of the
other variety at a functional level (that is, can understand based on
knowledge of their own variety without needing to learn the other variety).
* Where spoken intelligibility between varieties is marginal, the
existence of a common literature or of a common ethnolinguistic identity
with a central variety that both understand can be a strong indicator that
they should nevertheless be considered varieties of the same language.
* Where there is enough intelligibility between varieties to enable
communication, the existence of well-established distinct ethnolinguistic
identities can be a strong indicator that they should nevertheless be
considered to be different languages."

The problem with the second and third is that we don't know what a "well-
established ethnolinguistic identity" is. What would have been in order
is: a few examples, a systematic indication of when which criteria have
been utilized or else a rough indication of frequency of application.
Unclear cases are noted sporadically in the comments to the individual
entries in question but e.g. the latter criterion has obviously been
applied without indication in cases like the division of Serbian-Croatian-
Bosnian and Gitxsan-Nisga'a. Anyway, the bottom line is that little is won
by trying to break intelligibility ties with criteria that introduce new

The usage of the second criterion has some peculiar implications. If
accepted, the language/dialect status of two languages A and B can no
longer be established solely by inspection of all properties of A and B,
but depends on the existence of a third variety C. For example, two Vulgar
Latin dialects are one language as long as there is a Latin literature,
but if the speakers of one dialect become illiterate or we eradicate all
Latin writing, then they are perhaps two languages. Moreover, the number
of languages of three such varieties depends on the distribution of the
three over people. If they are distributed over two people, speaking A,C
and B,C respectively then the three are one and the same language. If they
are distributed over three people such that they all speak only one each,
then the number of languages is 1-3 (arguably 2). Note that there is no
inconsistency in the latter example. It is perfectly possible for A and B
to understand C, but not produce it, so that A and B could not communicate
with each other alone. As a native Swedish speaker I can understand a lot
of Danish, but I can't produce credible Danish.

Finally, in their present formulation, the application of the first and
second criteria may lead to inconsistencies. Consider four varieties
A,B,C,D such that they all have the same ethnolinguistic identity (e.g.
Kurds), B and C have independent literatures (e.g. Sorani and Kurmanji
Kurdish), B and C are mutually intelligible (according to some, Sorani and
Kurmanji Kurdish are). Finally let A be mutually intelligible to B and
marginally to C, and likewise D to be mutually intelligible to C and
marginally to B. (It should be possible to find such Kurdish dialects.)
Now, A and D are not mutually intelligible at all and therefore, by
application of the first criterion, at least those two are separate
languages. The second criterion does not apply because, between A and D,
intelligibility is not even marginal. However, if we start by using the
second criterion on A,B,C and on B,C,D, all four must be one and the same
language.
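
The four-variety argument can be checked mechanically. What follows is a
minimal sketch, not anything taken from Ethnologue: the intelligibility
levels are the hypothetical ones stipulated in the thought experiment, and
the function same_language is merely one assumed reading of the first and
second criteria as a pairwise test (all four varieties sharing one
ethnolinguistic identity).

```python
from itertools import permutations

# Hypothetical intelligibility levels stipulated in the thought experiment:
# 2 = mutually intelligible, 1 = marginal, 0 = none.
LEVEL = {
    frozenset("AB"): 2, frozenset("BC"): 2, frozenset("CD"): 2,
    frozenset("AC"): 1, frozenset("BD"): 1,
    frozenset("AD"): 0,
}
VARIETIES = "ABCD"


def level(x, y):
    """Look up the (symmetric) intelligibility level of two varieties."""
    return LEVEL[frozenset((x, y))]


def same_language(x, y):
    """Pairwise verdict under criteria 1 and 2 (assumed reading)."""
    if level(x, y) == 2:   # criterion 1: inherent intelligibility
        return True
    if level(x, y) == 1:   # criterion 2: marginal, plus a central variety
        return any(level(x, z) == 2 and level(y, z) == 2
                   for z in VARIETIES if z not in (x, y))
    return False           # no intelligibility at all


# "Being the same language" ought to be transitive; collect the triples
# (x, y, z) where x = y and y = z hold but x = z does not.
violations = [(x, y, z) for x, y, z in permutations(VARIETIES, 3)
              if same_language(x, y) and same_language(y, z)
              and not same_language(x, z)]
print(violations)
```

Every triple the sketch reports has A and D at its ends: each is "the same
language" as B and as C, yet the two are not the same language as each
other, which is the inconsistency described above.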

A popular belief holds that one cannot count the number of languages by
the mutual intelligibility criterion even if one sets an arbitrary
definition for when mutual intelligibility holds (say 85% shared
vocabulary) because of inconsistencies when applied to dialect continua.
This view is premature: it is perfectly possible to do this in an
intuitive way without any inconsistencies (Hammarström 2005).

2.2 IN PRACTICE. Many authors have noted the tendency of Ethnologue to be
extreme 'splitters' i.e. to prefer to split speech varieties into distinct
languages whenever possible. The Middle American expert Kaufman (1994, p.
33) writes condescendingly of the 11th edition of Ethnologue (Grimes 1988),
suggesting the ratio 1:2 between 'reality' and Ethnologue (by argumentum
ad his own auctoritatem). In a more well-argued manner, traversing
handbooks area by area, the Africanist Maho (2004) finds 1441 living
African languages versus 2058 in the 14th edition of Ethnologue (Grimes
2000); this 15th edition has 2092.

However, as Maho notes (p. 12), maybe the discrepancy is due rather to the
handbooks and overviews being lumpers. For instance, Ethnologue splits
into 31 English-based Creoles, 46 Quechua, 69 Mayan languages, 21 Gbe, 35
Arabic (+3 Arabic-based Creoles), 25 Naga, 26 Berber, 21 Manding, 9 Fulani
but only one Hausa language -- whereas we are used to reading about these
in one-liners rather than as full-fledged families.

Since this is quite an important question, I have made a dive into the
specialist literature to compare the Ethnologue judgments. I think it's
fair to say that, most of the time, Ethnologue is consistent with the
specialists even where their sources must be independent. A lot of times
Ethnologue counts more languages than the specialists -- sometimes
wrongly, sometimes out of due caution. Less often, but still often, the
specialists count more mutually unintelligible varieties than Ethnologue.
Examples follow:

2.2.1 Ethnologue Undercounts.
* Lauje [law] should be split in two (Himmelmann 2001, p. 21) ".. both
Lauje and Ampibabo-Lauje speakers do not consider their speech varieties
mutually intelligible".
* Lenca is (or was) two languages and Xinca was more like four languages
(Campbell 1997, p. 166-167).
* Kilii Boni may be split from Boni (Heine 1982, p. 12).
* Kamona may be split from Bijogo (Segerer 2002, p. 7) as Ethnologue
* Befang could be split into Bangui and Modele (Boum 1981, p. 19) on
decent grounds.
* Bade [bde] could be split following Schuh: "Bade is dialectally diverse,
with some dialects differing enough from each other that one is tempted to
call them distinct languages" (Schuh 2005, p. 1).
* Panoan Katukína and Shanenawa are better treated as separate languages
(Vieira Cândido 2004, p. 13).
* Lemiting was distinct from Kiput (Blust 2003, p. 1).
* Mambila could be split into more than 2 (Connell 2000, p. 202).
* Tetun [tet] consists of two ".. virtually mutually unintelligible" (van
Engelenhoven and van Klinken 2005, p. 735) dialects. See also (van Klinken
1999; Williams-van Klinken, Hajek, and Nordlinger 2002, p. 3, 6) and
section 1.1 of (Williams-van Klinken, Hajek, and Nordlinger 2001).

2.2.2 Ethnologue Overcounts
* Kanuri/Kanembu is 1 language rather than 4 (Cyffer 1998, p. 31).
* Turkana is 1 language rather than 4 (Dimmendaal 1983, p. 2).
* Batak should be 2 perhaps 3 languages rather than 7 (Woollams 2005, p.
* Adang and Hamap are the same: "Adang speakers and Hamap speakers always
understand each other, when speaking their languages, though there are a
few differences (mainly phonological) between the two" (Haan 2001, p. 5).
* Many Australian languages, e.g. [piu], [pjt] and [kdd], are mutually intelligible
varieties (Dixon 2002, p. 5).
* Ibani [iby], Okrika [okr] and Kalabari [ijn] Ijo should be one language
(Williamson 1969, p. 2) "These three dialects are ... mutually
intelligible", instead of the confused [okr] and [ijn] as two separate
languages of an [East, Ibani-Okrika-Kalabari] branch, and [iby] of an
[Eastern, Northeastern, Ibani-Okrika-Kalabari] branch.
* Cacua and Nukak as well as Hupda and Yuhup may perhaps be merged
(Andrade Martins 2004, p. 7).
* Perhaps [nyn], [nyo] are the same [ttj] the same (Rubongoya 1999, p.
* The division of Mumuye Proper into 5 languages is not supported by
(Shimizu 1979, p. 11-19) but then Ethnologue lists several varieties that
are not mentioned by Shimizu.
* In MacKay (1999, p. 12) 4 rather than 8 Totonac languages are recognized.
* The Bankon/Barombi split is ok but giving them as one language would
also have been ok (Atindogbé 1996).
* Northern (in Burkina Faso) and Southern (in Ghana) Dagaare are the same
according to (Naden 1988, p. 42) "can understand each other without undue
difficulty" and is not contradicted by the more recent source (Bodomo
* Furthermore in Ghana, "there is mutual intelligibility between ..."
(Dolphyne and Kropp Dakubu 1988, p. 54) Ahanta and Nzema and Nzema and
Anyi. There is (Dolphyne and Kropp Dakubu 1988, p. 77) "considerable
amount of mutual intelligibility" between Nchumbulu and Dwang.
* As for the notoriously difficult !Kung dialect continuum, Maho (1998, p.
113) states that "They all speak ... mutually intelligible forms of
speech". Both Maho's book and the volume (Haacke and Elderkin 1997) in
which the dialect study by Snyman appealed to Maho appear in Ethnologue's
* According to mutual intelligibility, there are only 3 Miao, 5 Bunu and 2
She languages (Bradley and Harlow 1994, p. 166).
* The split of the Makhuwa languages looks to be out of due caution
(Kisseberth 2003).
* The split of Bima-Sumba languages looks to be out of due caution (Klamer
2005, p. 709).
* Lampungic is better analyzed as 3 rather than 9 languages (Anderbeck
* Peripheral and Khalkh Mongolian are intelligible (Svantesson 2003;
Janhunen 2003a). The split into 3 Buryat languages must be due to partly
extralinguistic criteria (Skribnik 2003).

2.2.3 Ethnologue is in Harmony
* Good resolution of vexed Pashai dialect situation [aae, glh, psh, psi]
(Bashir 2003, p. 826).
* Zulgwa-Minew-Gemzek is one and the same [gnd], as Barreteau (1984,
p. 170) judges.
* Mekeo really is three languages, especially when one takes cultural
differences into account, although speakers can understand the
other dialects in less than a week's time (Jones 1998, p. 19).
* The division of Eskimo accords well with (Fortescue 1984; Miyaoka 1996).
* 4 Kham languages is a decent interpretation of (Watters 2002, p. 12-13).
* Nyamwanga-Iwa and Lungu-Mambwe are recognized in accordance with (Walsh
and Swilla 2000).
* The Chinese Mongolian dialect situation is well-handled (Janhunen 2003b).
* The heavy division of Banda receives support from detailed study of
(Cloarec-Heiss 2000).
* Two Araucanian languages is entirely accurate (Smeets 1989, p. 9-10).
* 2 Slave languages is not contradicted by section 2.3 in (Rice 1989).
* 8 Songhai is not a bad idea (Tersis 1972; Zima 1994; Heath 1999, p.
* Ekoti is justly a separate language (Schadeberg and Mucanheia 2000, p.
* Kilivila is fine (Lawton 1993, p. 6).
* Treating Matses [mcf] and Matis [mpg] as two separate languages is good
(Fleck 2003).
* Ngoe languages are consistent with (Hedinger 1987, p. 27).
* 2 Balanta languages is what (Wilson 1961, p. 139) postulates.
* 21 Mano & Dan languages is not contradicted by (Becker-Donner 1965, p.
* 21 Gbe languages may be too much but not impossible (Lefebvre and
Brousseau 2002; Capo 1990, p. 1-3,62).
* 1 May Brat language is optimal (Philomena Hedwig 1999).
* The Moken/Moklen division agrees with (Larish 2005, p. 514).
* ...

Indexable by the three-letter identification code, language entries have
the following main fields: Primary Name, Alternate Names, Speaker
Population, Classification, and Location. I regard them as primary since
they seem to be systematically indicated. The meaning and accuracy of the
data in these fields is scrutinized below.

In addition, but not with complete systematicity, the following pieces of
language information are usually given: dialect names, intelligibility
degree/lexical similarity with some neighbouring language(s), language
function(s) (e.g. official), language domain (e.g. liturgical), script,
typological remarks (e.g. basic word order), publications and use in media
(usually means presence of bible translation), status (e.g. extinct,
second language only, jargon, language of herb doctors) and other remarks.
Moreover, further information about the speakers is also usually supplied,
such as degree of bilingualism, literacy, religion, attitude to language,
means of subsistence (e.g. hunter-gatherers) and geo-ecological
environment (e.g. rain forest).

This additional data is welcome to the reader but will not be reviewed
here because it is not clear what the intended aim of coverage is. For
instance, my computer calculations show that 2675 entries have religion
annotated, 3730 language development, 4108 language use and 1097 basic word
order, as follows:
SOV 558
SVO 322
VSO 133
VOS 24
OSV 12
OVS 10
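
As a quick arithmetic check on the list above, here is a minimal sketch
that uses only the figures quoted in this review (the variable names are
mine). It prints each order's share of the six listed counts and compares
their sum with the 1097 entries reported as having a word order annotation.

```python
# Word-order counts exactly as listed in the review.
WORD_ORDER = {"SOV": 558, "SVO": 322, "VSO": 133,
              "VOS": 24, "OSV": 12, "OVS": 10}
ANNOTATED = 1097  # entries reported as having a basic word order annotation

listed = sum(WORD_ORDER.values())
for order, n in WORD_ORDER.items():
    # Share of each order among the six listed counts.
    print(f"{order} {n:4d} {100 * n / listed:5.1f}%")
print("sum of the six listed orders:", listed)
print("annotated entries outside this list:", ANNOTATED - listed)
```

The six listed orders sum to 1059, so 38 of the 1097 annotated entries are
not covered by the list; the figures given here do not say how those
entries are annotated.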

However, these annotations are frequently partial and/or unsystematic and
have little to do with availability of data. For instance, Tundra Yukaghir
[ykg] is marked 'nontonal', whereas non-tonal Slovak [slk] and the 8-tone
language Iau [tmu] (Bateman 1986) have no information about tone or other
typological data.

3.1 PRIMARY AND ALTERNATE NAMES. Each entry is given a primary name which
is usually an established name from the literature. This hardly ever
coincides with the speakers' own name for the language (for an idea of the
discrepancy, check e.g. Appleyard in Irvine (1994)), but Ethnologue aims to
set the primary name accordingly if there is a strong known desire (p. 10)
from the speakers to be rid of an entrenched foreign or offensive name.
Therefore, the Ethnologue has e.g. Tohono O'odham as primary name instead
of Papago (Zepeda 1983), Shabo for Mikeyir (Teferra 1991, p. 371) and,
more observantly than others, Nivaclé for Chulupí, Ashlushlay etc. But one
has missed e.g. Nuuchahnulth for Nootka (Nakayama 2001, p. 2) and Nivkh for
Gilyak (Panfilov 1965).

The alternate names comprise some genuinely different names (often familiar
from the literature) and, as is well-known to all ethnolinguists, a
multitude of franco-, anglo-, hispanico-, portugo-phone spelling variants
with or
without diacritics. In fact, the 7299 entries yield 39418 names in total,
of which about 45% are spelling variants. (This figure is from a rather
crude computerized statistical analysis.)

In numerous cases, neither the primary name nor any alternate name coincides
with the name used in the most recent/most authoritative piece of literature,
e.g. Sediq vs. Seediq (Tsukida 2005, p. 291), Phun vs. Hpon (Bradley and
Harlow 1994, p. 179), Qwarenya vs. Qwara (Appleyard 1998), Jiwarli vs.
Djiwarli (Austin 2001). When searching, the user should be prepared to
try spelling variants with great persistence and creativity, and I have
tried to exercise extra care that none of the issues raised in this review
are mistakes in this respect.
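
For readers who search the name lists by program, the diacritic part of
the problem can be neutralized with a normalization key. This is a minimal
sketch using Python's standard unicodedata module; the function name_key
and its folding rules are my own assumptions, and it is not the crude
analysis behind the 45% figure above.

```python
import unicodedata


def name_key(name):
    """Crude matching key: no diacritics, no punctuation, no case."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch) and ch.isalnum()).casefold()


# Name pairs taken from this review.
print(name_key("Kanoê") == name_key("Kanoé"))    # differ only in diacritics
print(name_key("Sediq") == name_key("Seediq"))   # differ in the letters used
```

Such a key only catches variants that differ in diacritics, case or
punctuation. Pairs like Sediq/Seediq or Jiwarli/Djiwarli differ in the
letters themselves and would need fuzzier matching, so the advice about
persistence and creativity still stands.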

3.2 SPEAKER POPULATION. Speaker populations are generally given with a
source, which may be a publication, person, organization or governmental
institution, as well as year of source. Some 750 entries do not give a
source-year pair at all, of which 274 are 'Extinct' and 238 (no overlap
with 'Extinct') are marked 'No estimate available' (a statement for which
one arguably does not need a source). Sometimes the year of the source is
that of the publication (1998) rather than the survey (1991) (Maho 1998),
sometimes the year is that of the survey (1995) rather than the
publication (2001) (Berthelette and Berthelette 2001) and sometimes both
are given e.g. eki 5,000 (1988, in Crozier and Blench 1992:36).

For the entries which have source years, the distribution of entries over
years is as follows (average 1993.01):

1922 1
1925 1
1931 1
1934 1
1954 1
1956 1
1959 1
1961 7
1962 6
1963 3
1965 1
1966 1
1967 1
1969 8
1970 9
1971 33
1972 24
1973 53
1974 1
1975 29
1976 20
1977 87
1978 38
1979 23
1980 67
1981 413
1982 162
1983 176
1984 49
1985 48
1986 111
1987 237
1988 70
1989 168
1990 383
1991 384
1992 98
1993 288
1994 172
1995 317
1996 143
1997 204
1998 297
1999 250
2000 1181
2001 243
2002 310
2003 337
2004 84
TOTAL 6543
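
The summary figures can be recomputed from the table. The sketch below is
a minimal one: it uses the (year, entries) pairs as transcribed from the
table above, and the 7299 total entries quoted earlier in this review. The
variable names are mine.

```python
# (year, number of entries) pairs copied from the table above.
SOURCE_YEARS = [
    (1922, 1), (1925, 1), (1931, 1), (1934, 1), (1954, 1), (1956, 1),
    (1959, 1), (1961, 7), (1962, 6), (1963, 3), (1965, 1), (1966, 1),
    (1967, 1), (1969, 8), (1970, 9), (1971, 33), (1972, 24), (1973, 53),
    (1974, 1), (1975, 29), (1976, 20), (1977, 87), (1978, 38), (1979, 23),
    (1980, 67), (1981, 413), (1982, 162), (1983, 176), (1984, 49),
    (1985, 48), (1986, 111), (1987, 237), (1988, 70), (1989, 168),
    (1990, 383), (1991, 384), (1992, 98), (1993, 288), (1994, 172),
    (1995, 317), (1996, 143), (1997, 204), (1998, 297), (1999, 250),
    (2000, 1181), (2001, 243), (2002, 310), (2003, 337), (2004, 84),
]
ALL_ENTRIES = 7299  # total number of language entries in the 15th edition

total = sum(n for _, n in SOURCE_YEARS)
# Mean source year, weighting each year by its number of entries.
mean_year = sum(year * n for year, n in SOURCE_YEARS) / total
up_to_1975 = sum(n for year, n in SOURCE_YEARS if year <= 1975)
without_year = ALL_ENTRIES - total

print(f"entries with a source year: {total}")
print(f"weighted mean year: {mean_year:.2f}")
print(f"entries dated 1975 or earlier: {up_to_1975}")
print(f"entries without a source year: {without_year}")
```

With the table as transcribed, this reproduces the total of 6543, the
average of 1993.01 and the 183 entries dated 1975 or earlier. The
remainder against 7299 is 756, in line with the "some 750" entries said
above to lack a source-year pair.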

Of those 183 entries with sources from 1975 and older only a handful
represent extinct languages. There is a certain persistent antiquity,
which is more revealing when we look at who the sources are. The sources
which account for 100 or more entries are:

SIL 1816
None 1270
Census 733
World Christian Database 545
Wurm and Hattori (1981) 337
United Bible Societies 145
Wurm 120
... ...

None means that only the year is given and 'Census' represents many
different censuses.

The interesting point is that Wurm and Hattori (1981) is the
source for (exactly) 337 entries, and the figures in that volume stem
mostly from surveys in the 1970s (Wurm and Hattori 1981) (no page number
given since this publication does not have page numbers). The poor effort
to update from Wurm and Hattori (1981), landmark publication though it is,
has a particular effect on the number of non-extinct Australian languages. In
Dixon (2002, p. 2) we are told that "more than half of these [240-250
indigenous languages] are no longer spoken or remembered"; see also
McConvell (2001). Ethnologue lists 263 Australian languages of which 224
are listed as not (yet) extinct. This is a gross overestimate and SIL
should have consulted an Australian specialist here. From e.g. Wurm (2003,
pp. 42-43) one can glean a list of now deceased languages that Ethnologue
cites as still having speakers as of Wurm and Hattori (1981): lrg, nrx,
umr, bpt, fln, bym, gdc, gyf, gyy, gwu, kgl, zmk, zmc, wdu, wrg, zmu, nyt,
wkw, wga, wrb, djl, ....

There are too many other cases where a newer, better source for speaker
population exists than the one Ethnologue uses, e.g. the figures on Ket by
Krivogonov, who visited every village in 1991-1995 (Georg 2003, p. 99-103),
so I will just give a selection of the more important ones below. There are
also lots of cases where the Ethnologue figures are up-to-date (although
not extremely up-to-date), e.g. those following Salminen on Saami
languages, and those on Ongota, for which slightly newer figures are given
in Savà (2003, p. 173).

3.2.1 Endangered Languages. Ethnologue marks languages which have a
speaker population of less than 50 or a very small fraction of the actual
ethnic group as 'nearly extinct'. They do not try to take on a more
sophisticated approach so e.g. Masep with 30-40 speakers is classed as
endangered despite the fact that it is used vigorously by all ages
(Clouse, Donohue, and Ma 2002, p. 4), and has been in the same state at
least since 1955.

3.2.2 Wrongly Extinct. To label languages as extinct is a bit sensitive
since it may deter people from searching for remaining speakers. Languages
like Tinigua, Kusunda and Leco have been said to be dead earlier but then
speakers were found.

Itene (Angenot-de-Lima 2002; Crevels 2002, p. 39, 34), Cayuvava (Crevels
2002, p. 34), and Yahuna (Adelaar 2004, p. 621), Senhaja de Srair
(Behnstedt 2002) are not (yet) extinct. Kusunda [kgg] is listed both as
extinct and as having 3 speakers. It's best to list it as not (yet) extinct (Rana

There are a number of languages which are really presumed extinct rather
than definitely extinct, e.g. Jorá (Crevels 2002, p. 55), Tekiraka (Adelaar
2004, p. 456) and perhaps Wappo. I don't know what to say of Yavitero
since Adelaar says it is extinct on p. 162 but has 1 speaker on p. 612.
Canichana is however correctly classified as extinct in spite of Adelaar's
mention of semi-speakers (p. 613) since Crevels (2002, p. 55) clarifies
their nature "Estos hablantes sólo se acuerdan de algunas palabras y una o
dos frases".

3.2.3 Overestimated Populations. Hayu is mentioned as nearly extinct
(Bradley and Harlow 1994, p. 172) so the figure of 1743 speakers is
suspicious and probably refers to the ethnic group. The language So is said to have
5000 speakers (source dated 1972) but there are at most 100 speakers
(Carlin 1993, p. 5). (5000 is a plausible size for the ethnic group.)
Bubburè is claimed to have 500 speakers whereas actually it's more like 10
(Haruna 1998). Luo (also known as Kasabe) died in 1995 (Connell 1998, p.
216). Ona is extinct according to Adelaar (2004, p. 615). Wotapuri-
Katarwalai is probably extinct (Bashir 2003, p. 869), so the Ethnologue
number of 2000 probably refers to the ethnic group. Tyua is extinct
(Batibo 1998, p. 277) so the figure 817, like many others of Cook's 2004
figures, probably refers to the ethnic group.

3.3 CLASSIFICATION. Ethnologue's language family index lists 103 families,
40 isolates, 21 mixed languages, 18 pidgins, 86 creoles, and 78
unclassified languages. From the introduction (p. 14) it is clear that the
intent is genetic classification rather than some convenience grouping.
The basis for the classification is said to be the International
Encyclopedia of Linguistics 2nd ed. (IEL) (Frawley 2003), but that is
really an empty self-reference since IEL follows the 14th edition of
Ethnologue in its classification (Frawley 2003, p. xiv): "These lists [of
language families and their members] were compiled by Barbara Grimes --
not by the authors of the articles -- using the Ethnologue ... There
remain great controversies in the field over which languages belong to
which families, and, indeed, some of the groupings in the lists are at
odds with the positions of the authors of the articles. The goal of
including the lists was not to resolve controversies -- or promote them! --
but to ensure that the user has maximum information."

The IEL adds no substance to the classification and the argument given is
obviously a smokescreen to avoid effort. Surely, one can provide the user
with more 'maximum information' than arbitrariness and contradiction. I am
not asking that SIL embark on a large-scale enterprise of historical
linguistics, only that they report the latest well-argued expert opinions
on the matter.

A good case in point is Khoisan which is listed as a family even here in
the 15th edition of Ethnologue. But Khoisan specialists have denied the
establishment of genetic unity of its six genetically independent units
for ages (Bleek 1927; Westphal 1963; Westphal 1971; Köhler 1975; Winter
1981; Güldemann and Vossen 2000; Güldemann 2003), and other Khoisanists'
beliefs have never amounted to anything more than belief. Note that the
list includes Güldemann in the IEL, which is the newest published family
overview in wait for the ever-forthcoming Khoisan handbook from Routledge.

Although the 15th edition has incorporated some recent findings, there is
still a notable hangover of highly controversial groupings, to name a few:
Altaic (Róna-Tas 1998), Australian (Dixon 2002) see also (Evans 2005) and
references therein, Andamanese (Abbi 2004), Kadugli-Krongo should be a
stand-alone family outside Nilo-Saharan (Reh 1985; Ehret 2001, p. 2, 68),
East Papuan (Dunn, Reesink, and Terrill 2002, p. 31), Arutani-Sape
(Migliazza 1985), Trans New Guinea and Geelvink Bay need update (Foley
2000, p. 362), the North American Na-Dene, Penutian, Hokan, Coahuiltecan
(Tonkawa wrongly included) and Gulf need further splitting following
the well-argued divisions of Mithun (1999) and Campbell (1997), as do
groupings in South America (information scattered in Fabre (2005) and
Adelaar (2004)). The internal subclassification in many families does not follow
the latest well-argued accounts either, e.g. Nilo-Saharan (Ehret 2001) and
Sino-Tibetan (Thurgood and LaPolla 2003); cf. van Driem (2003). Many
groupings, however, are quite satisfactory, such as e.g. Grassfields Bantu
(Watters 2003).

There is no mention of the definition used for 'Mixed Language' but it seems
to follow the discussion in Matras and Bakker (2003) since the category
contains the commonly discussed cases: Ma'a/Mbugu, Media Lengua, Michif,
Callahuaya plus quite a few more (totalling 21), including some poorly
known European travellers' languages. (Cocama-Cocamilla [cod] (Adelaar
2004, p. 432) may belong here but is classified under Tupi.) Similarly,
although it is not directly mentioned, one can infer that the most
important aspects of the definition used for creole are "native speaker"
and "full expressivity".

3.4 ISOLATES AND UNCLASSIFIED LANGUAGES. Although Ethnologue never states
it, the meaning of 'unclassified' vs. 'isolate' ought to be that
unclassified languages have too little data to be classified, whereas
isolate means that there is sufficient data but that all attempts to link
the language to others have failed. There are also many languages, apart
from the 78 stand-alone unclassifieds, which are unclassified within
families. This, I
gather from the entries in question, should be interpreted as
either "sufficient data to classify into family but insufficient for lower-
level assignment" or "full data on language is available but current
research on lower-level assignment inconclusive".

Following the definition of isolate vs. (stand-alone) unclassified, a
number of unclassified languages should be moved to isolate: Beothuk
(Mithun 1999), Kunza/Atacameño (Adelaar 2004, p. 375-385), Puquina,
Yuwana (Migliazza 1985; Fabre 2005), and Yaruro (Adelaar 2004, p. 163).

Luo (aka Kasabe) and Yeni, if at all different from Njerep (Connell 1998,
pp. 214-217), are close relatives of Njerep rather than unclassifieds
(Connell and Zeitlyn 2000). Likewise, as Ethnologue admits, Bung may go
with Ndung-Kwanja.

The unclassified category further includes a number of languages whose
unclassified status is harder to attack: Brazilian Wasu (better known as
Wassú), Amikoana, Arára, Agavotaguerra, Miarrã, Tapeba, Tingui-Boto (sic),
Tremembé, Truká, a couple of Papuan and Nigerian languages and some second-
language special languages like Haitian Vodoun Culture Language and
Traveller Scottish.

The unclassified, extinct, poorly attested Brazilian languages Kaimbé,
Kamba, Kambiwá, Karirí-Xocó, Pankararé, Uamué, Xukurú, Pataxó-Hãhaãi,
Wakoná and Tuxá seem to be listed only because they appear in the SIL
publication (Meader 1978). Otherwise, extinct unclassified languages,
Amazonian or non-Amazonian (e.g. Kenaboi (Hajek 1998)), usually do not get
an entry.

The status of the Indian and Afghan unclassifieds Andh, Bhatola, Majhwar,
Mukha-dora, Aariya, Malakhel and Warduji, as well as Waxianghua of China,
will hopefully be examined in the near future.

3.5 OTHER ISOLATES. A number of individual languages that Ethnologue
classifies into families are better treated as isolates, such as: Masep
(Clouse, Donohue, and Ma 2002, p. 5), Kusunda (Rana 2002), Lenca, Xinca
(Campbell 1997, pp. 166-167), and the African isolates Ongota (Fleming, Yilma,
Mitiku, Hayward, Miyawaki, Mikesh, and Seelig 1993; Savà and Tosco 2000;
Savà 2003), Jaláa (Kleinewillinghöfer 2001), and Shabo (Teferra 1991;
Ehret 2001, p. 68). Kujarge and Laal are two other unclassified languages
which seem to have enough material to be called isolates. Kara is
problematic to place in Central Sudanic (Djarangar 2000, p. 219) so it's
not clear what to do with it.

3.6 LOCATION. I am not competent to scrutinize the location data so I have
no comments.

As a catalogue the Ethnologue is of very high absolute value and by far
the best of its kind. However, it is not a reference book and one should
always double check to get the latest and most authoritative information
on individual entries. The relative number of errors is low but the
Ethnologue is leaking in various places where it should not have to. I
don't think the Ethnologue deserves much beating for its practice of
splitting dialects into languages. My impression is that, at any rate, the
specialist literature (as a whole) is not any better. The language/dialect
implementation, although still relatively eager to split, is now rather
informed and can boast many recent dialect surveys conducted by SIL
themselves. Therefore I look forward to an even sharper 16th edition.

Thanks to all language speakers, fieldworkers and libraries.

The name of the last speaker of Ubykh, given as 'Tevfik Esen', should be
spelled with a 'ç' at the end.

Data which belong to 'remarks on classification' seem to have been
systematically misplaced into the 'dialects' field. For instance, we find
under 'dialects' such comments as:
* "Greenberg places it in Macro-Chibchan" [kuz]
* "It may be distantly related to Altaic or Uralic" [ykg]
* "Ruhlen says it is Andean. Adelaar says it is in the Hibito-Cholon
family" [cht]
* "May be in a Takelma-Kalapuyan subgroup, but not conclusive." [tkm]
* "Mason (1950:246 with disclaimer), Tax (1960:433), and Kaufman (1990:43
tentatively) say this is Witotoan. Tovar (1961:150), Witte (1981:1), and
Aschmann (1993:2) say it is an isolate." [ano]

The introduction (p. 13) claims that there have been 50,000 updates since
the last edition. Spread over 7299 entries, that would amount to roughly 7
updated fields per entry, which clearly has not happened, so this leaves
us with a very diluted notion of an update.

The index says Kolyma Yukaghir (p. 1225) under "Yukaghir, southern [yux]"
has its entry on p. 499 instead of the correct p. 507.

The list of sources has roughly one immediately spottable typo per page:
'Die nordjemenitischen Dialaekte' (p. 650) should be '.. Dialekte .. '
'Northern Ter ritory' (p. 651) should be '.. Territory ..'
'Die Sprach von Wotapur' (p. 652) should be '.. Sprache ..'
'Annales' (p. 652) should be 'Annales de l'Université d'Abidjan, série H,
'Paris: Laroux' (p. 654) should be '.. Leroux' or '.. Ernest Leroux'
'des perlers dardes' (p. 655) should be '.. parlers ..'
'1903-1928. Linguistic Survey of India, 3 vols.' (p. 656) should be '.. 11
'Leningrad.', 'Moscow' on three entries by Grjunberg (p. 657) should be
prefixed 'Izdatel´stvo Akademii Nauk SSSR'
'A dialektologii' (p. 657) should be 'O dialektologii'
'A. Jazyery' (p. 657) should be 'M. Jazayery' (or M. A. for Mohammad Ali)
'Rudiger Koppe' (p. 657) should be 'Rüdiger Köppe'
'Togorestsprachen. Kölner Beiträge zur Afrikanistik, Band )' (p. 658)
should be '.. Band 1'
'Ein neuaramaischen Dielekt aus dem Vilayet Siirt (Ustanatolien). ZDub
121' (p. 659) should be 'Ein neuaramäischer Dialekt aus dem Vilayet Siirt
(Ostanatolien). Zeitschrift der Deutschen Morgenländischen Gesellschaft
'Mesopotamisch-Arabishen' (p. 659) should be 'Mesopotamisch-Arabischen'
'Neuaramaische Dialect' (p. 659) should be 'Neuaramäische Dialekt'
'Karassowitz' (p. 659) should be 'Harrassowitz'
'Beitrage' (p. 659) should be 'Beiträge' (and the publisher is probably
Afro-Pub and Beitr. zur Afrikanistik the series name).
'Kastenholz ... Vol. 2' (p. 659) should be '... Mande Languages and
Linguistics Vol. 2'
'Rudiger, Koppe' (p. 659) should be 'Rüdiger Köppe'
'Anthropological Linguistics 19.8.' (p. 660) could add the pages '378-
401', and there is a newer version of this article in the cited Manelis
Klein and Stark (1985).
'Societe' (p. 660) should be 'Société'
'Mahapatra, B. P. Malto 1979. An Ethnosemantic Study' (p. 661) should
be 'Mahapatra, B. P. 1979. Malto: An Ethnosemantic Study'
'Migliazza 1977 .. ms' (p. 662) was published in the cited Manelis Klein
and Stark 1985
'Heinz-Jurgen' (p. 664) should be 'Heinz-Jürgen'
'Sonsoral' (p. 665) should be 'Sonsorol'
'Saenz-Badillos' (p. 665) should be 'Sáenz-Badillos'
'filologica' (p. 667) should be 'filología'
'Afrika und Ubersee 40:110-112' (p. 668) should be '... Übersee ...' and
the full article is on pp. 73-84 and 93-115 as well as continued in vol
41:27-65, 117-153, 171-196.
'The Tati languages group' (p. 668) should be 'The Tati language group'
'langues parlees' (p. 669) should be 'langues parlées'
'Zhao, Xiangru ..' (p. 672) add 'pp. 260-287'


Abbi, A. (2004). The great experience: A linguistic field trip to the
Andaman Islands 2001-2002. accessed
15 Jan 2005.

Adelaar, W. F. H. (2004). The Languages of the Andes. (Cambridge Language
Surveys) Cambridge University Press.

Anderbeck, K. (2005). Cheaper by the dozen? Reassessing linguistic
diversity in the Lampungic language cluster. Paper Presented at the 6th
Biennial Meeting of the Association for Linguistic Typology, Padang, West
Sumatra, Indonesia.

Andrade Martins, S. (2004). Fonologia e Gramática Dâw. Ph. D. thesis,
Vrije Universiteit Amsterdam.

Angenot-de-Lima, G. (2002). Description Phonologique, Grammaticale et
Lexicale du Moré, Langue Amazonienne de Bolivie et du Brésil. Ph. D.
thesis, Rijksuniversiteit te Leiden.

Appleyard, D. L. (1998). Language death: The case of Qwarenya (Ethiopia).
In M. Brenzinger (Ed.), Endangered Languages in Africa, pp. 143-162.
Rüdiger Köppe Verlag, Köln.

Atindogbé, G. (1996). Bankon (A 40): Éléments de phonologie, morphologie
et tonologie, Volume 7 of Grammatische Analysen Afrikanischer Sprachen.
Rüdiger Köppe Verlag, Köln.

Austin, P. K. (2001). Word order in a free word order language: The case
of Jiwarli. In J. Simpson, D. Nash, M. Laughren, P. Austin, and B. Alpher
(Eds.), Forty Years On: Ken Hale and Australian Languages, pp. 305-324.
Pacific Linguistics, Canberra.

Bacelar, L. N. (2004). Gramática da língua Kanoê. Ph. D. thesis,
Katholieke Universiteit Nijmegen.

Bakker, P. (2002). Pidgin inflectional morphology. In G. E. Booij and J.
van Marle (Eds.), Yearbook of Morphology 2002, pp. 3-33. Kluwer Academic

Barreteau, D. (1984). Les langues. In R. Breton and M. Dieu (Eds.), Le
Nord du Cameroun: Des Hommes, une région, Volume 102 of Mémoires ORSTOM,
pp. 159-180. ORSTOM, Paris.

Bashir, E. (2003). Dardic. In G. Cardona and D. Jain (Eds.), The Indo-
Aryan Languages, Routledge Language Family Series, pp. 818-894. Routledge,
London & New York.

Bateman, J. (1986). Tone morphemes and aspect in Iau/Tone morphemes and
status in Iau. Nusa 26, 1-76.

Batibo, H. M. (1998). The fate of the Khoesan languages of Botswana. In M.
Brenzinger (Ed.), Endangered Languages in Africa, pp. 267-284. Rüdiger
Köppe Verlag, Köln.

Becker-Donner, E. (1965). Die Sprache der Mano, Volume 5 of
Österreichische Akademie der Wissenschaften: Philosophisch-Historische
Klasse, Sitzungsberichte, 245. Kommissionsverlag der Österreichischen
Akademie der Wissenschaften.

Behnstedt, P. (2002). La frontera entre el bereber y el árabe en el rif.
Estudios de dialectología norteafricana y andalusí 6.

Berthelette, J. and C. Berthelette (2001). Sociolinguistic survey report
for the Blé language. Technical report, SIL International, Dallas. SIL
Electronic Survey Reports 2001-001

Bleek, D. F. (1927). The distribution of Bushman languages in South
Africa. In Festschrift Meinhof, pp. 55-64. L. Friederichsen & Co., Hamburg.

Blust, R. (2003). A short morphology, phonology and vocabulary of Kiput,
Sarawak, Volume 546 of Pacific Linguistics. Research School of Pacific and
Asian Studies, Australian National University, Canberra.

Bodomo, A. B. (2000). Dàgàárè, Volume 165 of Languages of the
World/Materials. Lincom GmbH, München.

Boum, M. A. (1981). Le Syntagme Nominal en Modele. Ph. D. thesis,
Rijksuniversiteit te Leiden.

Bradley, D. and S. Harlow (1994). East and South East Asia. In C. Moseley
and R. E. Asher (Eds.), Atlas of the World's Languages, pp. 157-192.
Cambridge University Press.

Campbell, L. (1997). American Indian Languages: The Historical Linguistics
of Native America. Oxford Studies in Anthropological Linguistics. Oxford
University Press.

Capo, H. B. C. (1990). Systèmes numeriques et hétérogénéité ethnique des
commonautes de parlers gbe. Afrikanistische Arbeitspapiere 22, 61-82.

Carlin, E. (1993). The So Language, Volume 2 of Afrikanistische
Monografien (AMO). Institut für Afrikanistik, Universität zu Köln.

Cloarec-Heiss, F. (2000). Mésures dialéctales en 3 diménsions: Application
à une aire dialéctale hétérogène, l'aire banda. In H. E. Wolff and O. D.
Gensler (Eds.), Proceedings of the 2nd World Congress of African
Linguistics: Leipzig 1997, pp. 175-195. Rüdiger Köppe Verlag, Köln.

Clouse, D., M. Donohue, and F. Ma (2002). Survey report of the north coast
of Irian Jaya. Technical report, SIL International, Dallas. SIL Electronic
Survey Reports 2002-078

Connell, B. (1998). Moribund languages of the Nigeria-Cameroon borderland.
In M. Brenzinger (Ed.), Endangered Languages in Africa, pp. 207-225.
Rüdiger Köppe Verlag, Köln.

Connell, B. (2000). The integrity of Mambiloid. In H. E. Wolff and O. D.
Gensler (Eds.), Proceedings of the 2nd World Congress of African
Linguistics: Leipzig 1997, pp. 197-213. Rüdiger Köppe Verlag, Köln.

Connell, B. A. and D. Zeitlyn (2000). Njerep: A postcard from the edge.
Studies in African Linguistics 29 (1), 95-125.

Crevels, M. (2002). Itonama o Sihnipadara, Lengua no Clasificada de la
Amazonía Boliviana. Number 16 in Estudios de Lingüística. Departamento de
Filología Española, Lingüística General y Teoría de Literatura,
Universidad de Alicante.

Cyffer, N. (1998). A Sketch of Kanuri, Volume 9 of Grammatische Analysen
Afrikanischer Sprachen. Rüdiger Köppe Verlag, Köln.

Dimmendaal, G. J. (1983). The Turkana Language. Publications in African
Languages and Linguistics. Foris Publications, Dordrecht.

Dixon, R. M. W. (2002). Australian Languages: Their Nature and
Development. Cambridge Language Surveys. Cambridge University Press.

Djarangar, D. I. (2000). Essai de classification des langues sara. In H.
E. Wolff and O. D. Gensler (Eds.), Proceedings of the 2nd World Congress
of African Linguistics: Leipzig 1997, pp. 215-227. Rüdiger Köppe Verlag,

Dolphyne, F. A. and M. E. Kropp Dakubu (1988). The Volta-Comoé languages.
In The Languages of Ghana, Volume 2 of African Languages: Occasional
Publication, pp. 50-90. Kegan Paul, London.

Dunn, M., G. Reesink, and A. Terrill (2002). The East Papuan languages: A
preliminary typological appraisal. Oceanic Linguistics 41 (1), 28-62.

Ehret, C. (2001). A Historical-Comparative Reconstruction of Nilo-Saharan,
Volume 12 of Sprache und Geschichte in Afrika: Beihefte. Rüdiger Köppe
Verlag, Köln.

Evans, N. (2005). Review article: Australian languages reconsidered: A
review of Dixon (2002). Oceanic Linguistics 44 (1), 242-286.

Faber, A. (1997). Genetic subgrouping of the Semitic languages. In R.
Hetzron (Ed.), The Semitic Languages, pp. 3-15. Routledge, London & New

Fabre, A. (2005). Diccionario etnolingüístico y guía bibliográfica de los
pueblos indígenas sudamericanos. Book in Progress.

Fleck, D. W. (2003). A Grammar of Matses. Ph. D. thesis, Rice University,

Fleisch, A. (2000). Lucazi Grammar, Volume 15 of Grammatische Analysen
Afrikanischer Sprachen. Rüdiger Köppe Verlag, Köln.

Fleming, H. C., A. Yilma, A. Mitiku, R. Hayward, Y. Miyawaki, P. Mikesh,
and J. M. Seelig (1992-1993). Ongota (or) Birale: A moribund language of
Gemu-Gofa (Ethiopia). Journal of Afroasiatic Languages 3 (3), 181-225.

Foley, W. A. (2000). The languages of New Guinea. Annual Review of
Anthropology 29 (1), 357-404.

Fortescue, M. (1984). West Greenlandic. Croom Helm Descriptive Grammars.
Croom Helm, London.

Frawley, W. J. (Ed.) (2003). International Encyclopedia of Linguistics,
2nd ed., Volume 2. Oxford University Press.

Garza Cuarón, B. and Y. Lastra (1991). Endangered languages in Mexico. In
R. H. Robins and E. M. Uhlenbeck (Eds.), Endangered Languages, pp. 93-134.
Berg, New York.

Georg, S. (2003). The gradual disappearance of a Eurasian language family:
The case of Yeniseian. In M. Janse and S. Tol (Eds.), Language Death and
Language Maintenance: Theoretical, Practical and Descriptive Approaches,
Volume 240 of Current Issues in Linguistic Theory, pp. 89-106. John
Benjamins, Amsterdam.

Granberry, J. (1993). A Grammar and Dictionary of the Timucua Language (3
ed.). The University of Alabama Press, Tuscaloosa.

Grimes, B. F. (Ed.) (1988). Ethnologue: Languages of the World (11 ed.).
SIL International, Dallas.

Grimes, B. F. (Ed.) (2000). Ethnologue: Languages of the World (14 ed.).
SIL International, Dallas.

Güldemann, T. (2003). Khoisan languages. In W. J. Frawley (Ed.),
International Encyclopedia of Linguistics (2 ed.), Volume 2, pp. 359-362.
Oxford University Press.

Güldemann, T. and R. Vossen (2000). Khoisan. In B. Heine and D. Nurse
(Eds.), African Languages: An Introduction, pp. 99-122. Cambridge
University Press.

Haacke, W. H. G. and E. D. Elderkin (Eds.) (1997). Namibian Languages:
Reports and Papers, Volume 4 of Namibian African Studies. Rüdiger Köppe
Verlag, Köln.

Haan, J. W. (2001). The Grammar of Adang: A Papuan Language Spoken on the
Island of Alor East Nusa Tenggara - Indonesia. Ph. D. thesis, University
of Sydney.

Hajek, J. (1998). Kenaboi: An extinct unclassified language of the Malay
peninsula. Mon-Khmer Studies 28, 137-149.

Hammarström, H. (2005). Counting languages in dialect continua using the
criterion of mutual intelligibility. Manuscript available at accessed 4 September 2005.

Haruna, A. (1998). Language death; the case of Bubburè in southern Bauchi
area, northern Nigeria. In M. Brenzinger (Ed.), Endangered Languages in
Africa, pp. 227-251. Rüdiger Köppe Verlag, Köln.

Heath, J. (1999). A Grammar of Koyra Chiini: The Songhay of Timbuktu,
Volume 19 of Mouton Grammar Library. Mouton de Gruyter.

Hedinger, R. (1987). The Manenguba Languages (Bantu A.15, Mbo Cluster) of
Cameroon. School of Oriental and African Studies, London.

Heine, B. (1982). Boni Dialects, Volume X of Language and Dialect Atlas of
Kenya. Verlag von Dietrich Reimer, Berlin.

Himmelmann, N. P. (2001). Sourcebook on Tomini-Tolitoi Languages: General
Information and Word Lists, Volume 511 of Pacific Linguistics. Research
School of Pacific and Asian Studies, Australian National University,

Irvine, A. K. (1994). The Middle East and North Africa. In C. Moseley and
R. E. Asher (Eds.), Atlas of the World's Languages, pp. 263-280. Cambridge
University Press.

Janhunen, J. (2003a). Mongol dialects. In J. Janhunen (Ed.), The Mongolic
Languages, Routledge Family Series. Routledge, London & New York.

Janhunen, J. (Ed.) (2003b). The Mongolic Languages. Routledge Family
Series. Routledge, London & New York.

Jones, A. A. (1998). Towards a Lexicogrammar of Mekeo (An Austronesian
Language of West Central Papua), Volume 138 of Pacific Linguistics: Series
C. Research School of Pacific and Asian Studies, Australian National
University, Canberra.

Kaufman, T. (1994). The Americas. In C. Moseley and R. E. Asher (Eds.),
Atlas of the World's Languages, pp. 1-76. Cambridge University Press.

Kisseberth, C. W. (2003). Makhuwa (p30). In D. Nurse and G. Philippson
(Eds.), The Bantu Languages, Routledge Language Family Series, pp. 546-
565. Routledge, London & New York.

Klamer, M. (2005). Kambera. In A. Adelaar and N. Himmelmann (Eds.), The
Austronesian Languages of Asia and Madagascar, Routledge Language Family
Series, pp. 709-734. Routledge, London & New York.

Kleinewillinghöfer, U. (2001). Jalaa - an almost forgotten language of
northeastern Nigeria: A language isolate. In D. Nurse (Ed.), Historical
Language Contact in Africa, Volume 16/17 of Sprache und Geschichte in
Afrika, pp. 239-271. Rüdiger Köppe Verlag, Köln.

Köhler, O. (1975). Geschichte und Probleme der Gliederung der Sprachen
Afrikas. In H. Baumann (Ed.), Die Völker Afrikas und ihre traditionellen
Kulturen, Volume 1, pp. 135-373. Steiner, Wiesbaden.

Landar, H. S. (1996). Sources. In I. Goddard (Ed.), Languages, Volume 17
of Handbook of North American Indians, pp. 721-761. Smithsonian
Institution, Washington.

Larish, M. D. (2005). Moken and Moklen. In A. Adelaar and N. Himmelmann
(Eds.), The Austronesian Languages of Asia and Madagascar, Routledge
Language Family Series, pp. 513-533. Routledge, London & New York.

Lawton, R. (Ed.) (1993). Topics in the Description of Kiriwina, Volume 84
of Pacific Linguistics: Series D. Research School of Pacific and Asian
Studies, Australian National University, Canberra.

Lefebvre, C. and A.-M. Brousseau (2002). A Grammar of Fongbe, Volume 25 of
Mouton Grammar Library. Mouton de Gruyter.

MacKay, C. J. (1999). A Grammar of Misantla Totonac. Studies in Indigenous
Languages of the Americas. University of Utah Press, Salt Lake City.

Maho, J. (2004). How many languages are there in Africa really? In K.
Bromber and B. Smieja (Eds.), Globalisation and African Languages: Risks
and benefits - Festschrift Karsten Legère, Volume 156 of Trends in
Linguistics: Studies and Monographs, pp. 279-296. Mouton de Gruyter.

Maho, J. F. (1998). Few People, Many Tongues: The Languages of Namibia.
Gamsberg Macmillan, Windhoek, Namibia.

Matras, Y. and P. Bakker (2003). The study of mixed languages. In Y.
Matras and P. Bakker (Eds.), The Mixed Language Debate: Theoretical and
Empirical Advances, Volume 145 of Trends in Linguistics: Studies and
Monographs, pp. 1-20. Mouton de Gruyter.

McConvell, P. (2001). State of the indigenous languages in Australia -
2001. Technical report, Australia State of the Environment Second
Technical Paper Series (Natural and Cultural Heritage), Department of the
Environment and Heritage, Canberra.

McConvell, P. and F. Meakins (2005). Gurindji Kriol: A mixed language
emerges from code-switching. Australian Journal of Linguistics 25 (1), 9-

Meader, R. E. (1978). Indios do Nordeste: Levantamento Sobre os
Remanescentes Tribais do Nordeste Brasileiro, Volume 8 of Série
Lingüística. Summer Institute of Linguistics, Brasília.

Migliazza, E. C. (1985). Languages of the Orinoco-Amazon region: Current
status. In H. E. Manelis Klein and L. Stark (Eds.), South American Indian
Languages: Retrospect and Prospect, pp. 17-139. Texas University Press.

Mithun, M. (1999). The Languages of Native North America. Cambridge
Language Surveys. Cambridge University Press.

Miyaoka, O. (1996). Sketch of Central Alaskan Yupik, an Eskimoan language.
In I. Goddard (Ed.), Languages, Volume 17 of Handbook of North American
Indians, pp. 325-363. Smithsonian Institution, Washington.

Naden, T. (1988). The Gur languages. In The Languages of Ghana, Volume 2
of African Languages: Occasional Publication, pp. 12-49. Kegan Paul,

Nakayama, T. (2001). Nuuchahnulth (Nootka) Morphosyntax, Volume 134 of
University of California Publications in Linguistics. University of
California Press, Berkeley and Los Angeles.

O'Shannessy, C. (2005). Light Warlpiri: A new language. Australian Journal
of Linguistics 25 (1), 31-57.

Panfilov, V. Z. (1965). Grammatika Nivxskogo Jazyka. Akademia Nauk SSSR,

Philomena Hedwig, D. (1999). A Grammar of Maybrat: a language of Bird's
Head, Irian Jaya, Indonesia. Ph. D. thesis, Rijksuniversiteit te Leiden.

Rana, B. K. (2002). New materials on the Kusunda language. Presented at
the Fourth Round Table International Conference on Ethnogenesis of South
and Central Asia, Harvard University, Cambridge, MA, USA. May 11-13, 2002.

Reh, M. (1985). Die Krongo-Sprache (Nìinò Mó-dì): Beschreibung, Texte,
Wörterverzeichnis, Volume 12 of Kölner Beiträge zur Afrikanistik. Dietrich
Reimer Verlag, Berlin.

Rice, K. (1989). A Grammar of Slave, Volume 5 of Mouton Grammar Library.
Mouton de Gruyter.

Rodrigues, A. D. (2005). Sobre as línguas indígenas e sua pesquisa no
brasil. Ciência e Cultura 57 (2), 35-38.

Róna-Tas, A. (1998). The reconstruction of Proto-Turkic and the genetic
question. In L. Johanson and E. Csató-Johanson (Eds.), The Turkic
Languages. Routledge, London & New York.

Rubongoya, L. T. (1999). A Modern Runyoro-Rutooro Grammar, Volume 9 of
East African Languages and Dialects. Rüdiger Köppe Verlag, Köln.

Savà, G. (2003). Ongota (Birale), A moribund language of southwest
Ethiopia. In M. Janse and S. Tol (Eds.), Language Death and Language
Maintenance: Theoretical, Practical and Descriptive Approaches, Volume 240
of Current Issues in Linguistic Theory, pp. 171-188. John Benjamins,

Savà, G. and M. Tosco (2000). A sketch of Ongota: A dying language of
southwestern Ethiopia. Studies in African Linguistics 29 (2), 59-135.

Schadeberg, T. C. and F. U. Mucanheia (2000). Ekoti: the Maka or Swahili
language of Angoche, Volume 11 of East African Languages and Dialects.
Rüdiger Köppe Verlag, Köln.

Schuh, R. G. (2005). The nominal and verbal morphology of Western Bade. In
A. S. Kaye (Ed.), Morphologies of Africa and Asia. Eisenbrauns.

Segerer, G. (2002). La langue Bijogo de Bubaque (Guinée-Bissau), Volume 3
of Afrique et Langage. Peeters, Paris.

Seki, L. (1999). A lingüística indígena no brasil. Documentação de Estudos
em Lingüística Teórica e Aplicada 15, 257-290.

Shashi, S. S. (Ed.) (1994). Island Tribes of Andaman & Nicobar, Volume 7
of Encyclopaedia of Indian Tribes. Anmol, New Delhi.

Shimizu, K. (1979). A Comparative Study of the Mumuye Dialects (Nigeria),
Volume 14 of Marburger Studien zur Afrika- und Asienkunde: Serie A:
Afrika. Verlag von Dietrich Reimer, Berlin.

Skribnik, E. (2003). Buryat. In J. Janhunen (Ed.), The Mongolic Languages,
Routledge Family Series. Routledge, London & New York.

Smeets, I. (1989). A Mapuche Grammar. Ph. D. thesis, Rijksuniversiteit te

Svantesson, J.-O. (2003). Khalkha. In J. Janhunen (Ed.), The Mongolic
Languages, Routledge Family Series, pp. 154-176. Routledge, London & New

Teferra, A. (1991). A sketch of Shabo grammar. In M. L. Bender (Ed.),
Proceedings of the Fourth Nilo-Saharan Linguistics Colloquium, Volume 7 of
Nilo-Saharan: Linguistic Analyses and Documentation, pp. 371-387. Helmut
Buske Verlag, Hamburg.

Tersis, N. (1972). Le Zarma (République du Niger): Étude du parler djerma
de Dosso, Volume 33-34 of Société d'Études Linguistiques et
Anthropologiques de France. Centre National de la Recherche Scientifique,

Thurgood, G. and R. LaPolla (Eds.) (2003). The Sino-Tibetan Languages.
Routledge Language Family Series. Routledge, London & New York.

Tsukida, N. (2005). Seediq. In A. Adelaar and N. Himmelmann (Eds.), The
Austronesian Languages of Asia and Madagascar, Routledge Language Family
Series, pp. 291-325. Routledge, London & New York.

Ungnad, A. (1964). Grammatik des Akkadischen. C. H. Beck, München.

van der Voort, H. (2000). A Grammar of Kwaza: A description of an
endangered and unclassified indigenous language of Southern Rondônia. Ph.
D. thesis, Rijksuniversiteit te Leiden.

van Driem, G. (2003). Review of G. Thurgood and R. K. LaPolla (2003).
Bulletin of the School of Oriental and African Studies 66 (2), 282-284.

van Engelenhoven, A. and C. W. van Klinken (2005). Tetun and Leti. In A.
Adelaar and N. Himmelmann (Eds.), The Austronesian Languages of Asia and
Madagascar, Routledge Language Family Series, pp. 735-768. Routledge,
London & New York.

van Klinken, C. L. (1999). A Grammar of the Fehan Dialect of Tetun: An
Austronesian language of West Timor, Volume 155 of Pacific Linguistics:
Series C. Research School of Pacific and Asian Studies, Australian
National University, Canberra.

Vieira Cândido, G. (2004). Descrição Morfossintática da Língua Shanenawa.
Ph. D. thesis, Universidade Estadual de Campinas, São Paulo.

Walsh, M. T. and I. N. Swilla (2000). Linguistics in the corridor: A
review of research on the Bantu languages of south-west Tanzania, north-
east Zambia, and north Malawi. Draft expansion of paper presented at
International Colloquium on Kiswahili in 2000.

Watters, D. E. (2002). A Grammar of Kham. Cambridge Grammatical
Descriptions. Cambridge University Press.

Watters, J. R. (2003). Grassfields Bantu. In D. Nurse and G. Philippson
(Eds.), The Bantu Languages, Routledge Language Family Series, pp. 225-
256. Routledge, London & New York.

Westphal, E. O. J. (1963). The linguistic prehistory of Southern Africa:
Bush, Kwadi, Hottentot and Bantu linguistic relationships. Africa 33, 237-

Westphal, E. O. J. (1971). Click languages of southern and eastern Africa.
In T. A. Sebeok (Ed.), Linguistics in Sub-Saharan Africa, Volume 7 of
Current Trends in Linguistics, pp. 367-420. Mouton de Gruyter.

Williams-van Klinken, C., J. Hajek, and R. Nordlinger (2001). Tetun Dili:
A Grammar of an East Timorese Language, Volume 528 of Pacific Linguistics.
Research School of Pacific and Asian Studies, The Australian National

Williams-van Klinken, C., J. Hajek, and R. Nordlinger (2002). A Short
Grammar of Tetun Dili, Volume 388 of Languages of the World/Materials.
Lincom GmbH, München.

Williamson, K. (1969). A Grammar of the Kolokuma Dialect of Ijo. Cambridge
University Press in association with West African Linguistic Society,
University of Ibadan, Nigeria.

Wilson, W. A. A. (1961). Outline of the Balanta language. African Language
Studies 2, 139-168.

Winter, J. C. (1981). Die Khoisan-Familie. In B. Heine, T. Schadeberg, and
E. Wolff (Eds.), Die Sprachen Afrikas, pp. 329-374. Helmut Buske Verlag,

Woollams, G. (2005). Karo Batak. In A. Adelaar and N. Himmelmann (Eds.),
The Austronesian Languages of Asia and Madagascar, Routledge Language
Family Series, pp. 534-561. Routledge, London & New York.

Wurm, S. and S. Hattori (1981). Language Atlas of the Pacific Area, Volume
66 of Pacific Linguistics: Series C. Research School of Pacific and Asian
Studies, Australian National University, Canberra.

Wurm, S. A. (2003). The language situation and language endangerment in
the greater Pacific area. In M. Janse and S. Tol (Eds.), Language Death
and Language Maintenance: Theoretical, Practical and Descriptive
Approaches, Volume 240 of Current Issues in Linguistic Theory, pp. 15-48.
John Benjamins, Amsterdam.

Zepeda, O. (1983). A Tohono O'odham Grammar. University of Arizona Press,

Zima, P. (1994). Lexique Dendi (Songhay) (Djougou, Bénin) avec un index
Français-Dendi, Volume 4 of Westafrikanische Studien: Frankfurter Beiträge
zur Sprach- und Kulturgeschichte. Rüdiger Köppe Verlag, Köln.


Harald Hammarström is a PhD Student in Computational Linguistics at the
Department of Computing Science at Chalmers University of Technology,
Gothenburg, Sweden. His current research topic is Unsupervised Learning of
Concatenative Morphology but his interests go significantly wider and include
linguistic typology and computational linguistics in general.