LINGUIST List 14.1367

Tue May 13 2003

Sum: Parallel Multilingual Wordlists

Editor for this issue: Steve Moran


  1. Daniel Wedgwood, Sum: Parallel multilingual wordlists

Message 1: Sum: Parallel multilingual wordlists

Date: Tue, 13 May 2003 10:43:38 +0000
From: Daniel Wedgwood
Subject: Sum: Parallel multilingual wordlists

About a month ago I posted an enquiry (Linguist 14.1083) about the
availability of parallel word lists (as opposed to lists of cognates)
in different languages. I got the impression from some of the
responses that there are others out there who would appreciate as much
information on this as possible, so I'm including here not just a
summary of responses but also some sources of data that I found by
other means.

Many thanks to everyone who responded. Here is a summary of the

Robin Thelwall kindly offered offprints and
references relating to work done by himself and others containing
lists for a number of Nilo-Saharan languages and ''various bits of

Radmila Djordjevic <raljusezampro.yu> pointed out the existence of
Berlitz multilingual European dictionaries, which are primarily aimed
at tourists, but could be of use to linguists seeking parallel data
for multiple languages.

Yuri Koryakov suggested looking at the very rich
source of multilingual vocabulary data on Sergei Starostin's Babel
Tower website: 

This turns out to be organised generally around cognates, so Yuri was
kind enough to send a number of 100-word Swadesh lists which he has in
files in .dbf format.

Natalia Slaska sent pointers to a
large number of useful resources. First, the website of the project
that she is involved in, which will interest people who are working
with word lists for historical linguistic purposes: 

Second, an extremely useful set of references to printed works
containing Swadesh lists for various languages. Many are from issues
of IJAL from the 1950s and 60s. Space doesn't allow me to reproduce
all the references here; I'd be happy to forward them to anyone who

To all these very useful suggestions, I would add the following
resources: has a database of Swadesh lists for a wide
variety of languages, but few of them are complete.

Dyen, Isidore, Joseph B Kruskal, and Paul Black (1992) An Indoeuropean
Classification: A Lexicostatistical Experiment, Transactions of the
American Philosophical Society, vol. 82, part 5. Large data set (95
languages/dialects) available online at:

Also at is a collection of lists taken from various
small online dictionaries - these are not identical lists, but many
overlap to a worthwhile extent.
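
For lists like these, which do not share an identical set of meanings,
it can help to check how far they overlap before comparing them. Below
is a minimal sketch in Python, assuming each list has already been
loaded as a mapping from an English gloss to a word form. The sample
data and the function names (shared_meanings, parallel_table) are
purely illustrative and are not taken from any of the resources
mentioned here.

```python
def shared_meanings(wordlists):
    """Return the sorted glosses that occur in every word list.

    wordlists maps a language name to a dict of gloss -> word form.
    """
    gloss_sets = [set(entries) for entries in wordlists.values()]
    if not gloss_sets:
        return []
    return sorted(set.intersection(*gloss_sets))


def parallel_table(wordlists):
    """Build rows of (gloss, form per language) for the shared glosses.

    Languages appear in alphabetical order of their names.
    """
    languages = sorted(wordlists)
    return [
        (gloss, *[wordlists[lang][gloss] for lang in languages])
        for gloss in shared_meanings(wordlists)
    ]


# Illustrative toy data only (hypothetical, not from any cited source).
sample = {
    "German": {"water": "Wasser", "hand": "Hand", "stone": "Stein"},
    "Dutch": {"water": "water", "hand": "hand", "sun": "zon"},
    "Italian": {"water": "acqua", "hand": "mano", "stone": "pietra"},
}

for row in parallel_table(sample):
    print(row)
```

Only the meanings found in every list survive, so the resulting table
is genuinely parallel, at the cost of discarding partial entries.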

Jacques Guy's GLOTTO program comes with 200-word lists for 16
Indoeuropean languages. The meanings are not those of the Swadesh list;
they are mainly nouns taken from Bergman, Peter (1968) The Concise
Dictionary of 26 Languages in Simultaneous Translation, Signet Books.
For online availability, see Guy's posting to the Linguist List
(Linguist 5.630).

1600-word lists for English, Italian, German, Dutch and the invented
languages Esperanto, Novial and Tsolyani can be downloaded from

(Permission is requested by the website owner for use of these data in

200-word lists for 7 languages, used in Kessler, B. (2001) The
Significance of Word Lists, Stanford: Center for the Study of Language
and Information, are available as a well-annotated XML file from
Brett Kessler's website:

Tryon, Darrell (1995) Comparative Austronesian Dictionary, Berlin:
Mouton de Gruyter, is apparently a possible source of parallel word
lists for Austronesian languages (I haven't yet seen a copy myself).

I've found many of these very useful; others should suit the research
aims of other list-hunters.

Dan Wedgwood
Theoretical and Applied Linguistics,
University of Edinburgh 