LINGUIST List 23.2390

Sat May 19 2012

Review: Language Documentation; Ling Theories: Haig et al. (2011)

Editor for this issue: Monica Macaulay

Date: 19-May-2012
From: Daniel Hieber
Subject: Documenting Endangered Languages: Achievements and Perspectives

EDITORS: Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, Claudia Wegener
TITLE: Documenting Endangered Languages
SUBTITLE: Achievements and Perspectives
SERIES TITLE: Trends in Linguistics: Studies and Monographs [TiLSM] 240
PUBLISHER: De Gruyter Mouton
YEAR: 2011

Daniel W. Hieber, Associate Researcher, Rosetta Stone


In recent years, documentary linguistics has established itself as a discipline in its own right, with a unique set of theories, challenges, and methodologies. “Documenting Endangered Languages: Achievements and Perspectives” seeks to advance this discipline, and joins an important category of books which have helped define the field and its foci, including Farfán & Ramallo (2010), Gippert, Himmelmann, & Mosel (2006), Grenoble & Furbee (2010), Harrison, Rood, & Dwyer (2008), and Janse & Tol (2003). According to the back cover, “This volume showcases recent developments in methodology, technology and analysis, drawing on experience gained in a global range of documentation projects.” It consists largely of case studies, but includes several chapters with broader perspectives as well. The book is dedicated to Ulrike Mosel, and includes a laudatory preface highlighting her notable contributions to the field of documentary linguistics.


After the preface, an introductory chapter outlines some of the history of documentary linguistics as a field, particularly in relation to the Volkswagen Foundation’s DoBeS program (Dokumentation bedrohter Sprachen), and sketches what the authors see as the most salient lessons we have learned from the field, namely: 1) Focus on documenting the full range of communicative practices; 2) Concern for long-term storage and preservation of primary data; 3) Close cooperation with, and direct involvement of, the speech community; and 4) The scientific potential in large and diverse amounts of digitally archived data.

From there, the book is organized into four sections: 1) Theoretical issues in language documentation, which focuses on broader perspectives on the field; 2) Documenting language structure, which presents lessons from a series of case studies on specific language documentation projects, each covering a different aspect of language structure; 3) Documenting the lexicon, focused on the creation of rich ethnographic and encyclopedic lexica; and 4) Interaction with speech communities, which treats the impact of fieldwork and the outputs of documentation.

CHAPTER 2, ‘Competing motivations for documenting endangered languages’ (Frank Seifart), offers a four-way typology of the impetus for doing language documentation: for the preservation of human cultural heritage; to enhance the empirical basis of linguistics; for and by the speech community; and in order to study language contact. For each type, the author clarifies its requirements in terms of content and apparatus (in the sense of Himmelmann [2002, 2006]). Also discussed are several cases where these motivations compete. The chapter is very brief, and serves mainly as a quick overview of some of the whys of language documentation.

CHAPTER 3, ‘Evolving challenges in archiving and data infrastructures’ (Daan Broeder, Han Sloetjes, Paul Trilsbeek, Dieter van Uytvanck, Menzo Windhouwer, and Peter Wittenburg), also presents a high-level overview, this time of the manifold issues in data archiving. It covers issues and strategies in data handling, with a brief look at how formats and storage capacities have evolved over time, and highlights the important role that DoBeS has played in establishing metadata formats and archiving standards. The authors then walk the reader through some of the core issues in data archiving, such as meeting the needs of various stakeholders, long-term preservation requirements and what these entail for file formats and the organization of the metadata, access restrictions, and legal and ethical issues. They note the inherent conflict between access restrictions for data and the recent push in academia towards open access to research results and data. They also discuss a range of tools for enhancing data, namely ELAN, ANNEX, and LEXUS, and detail some of the exciting advances in software for searching and browsing through content and metadata. They end with a section on new challenges (and benefits) in the field, such as improvements in recording equipment and connectivity, allowing for easier dissemination of materials, and changes in the preservation/curation of data, such as a recently introduced lossless video format (MJPEG2000), and updating their archive to participate in “externally registered persistent identifiers” (51). One large concern they point out is that “the amount of recorded media streams that is not being touched (annotated in some form to make it ready for analysis) is increasing continuously which means that much of the stored data will effectively not be of much use to anyone other than the person who collected it” (52).

CHAPTER 4, ‘Comparing corpora from endangered language projects: Explorations in language typology based on original texts’ (Geoffrey Haig, Stefan Schnell, and Claudia Wegener), illustrates some promising ways that the massive digital archives which have been accumulated on endangered languages can be utilized for cross-linguistic research. While the potential for such research is enormous, few studies have undertaken this challenge to date. This chapter fills that gap by examining some basic properties of information structure (the distribution of S, A, P, and pronouns) across texts in five languages, using the GRAID (Grammatical Relations and Animacy in Discourse) annotation schema. In doing so, the authors demonstrate that typological investigations utilizing data from language documentation are both valid and feasible, with significant payoffs.

CHAPTER 5, ‘“Words” in Kharia: Phonological, morpho-syntactic and “orthographical” aspects’ (John Peterson), presents a fascinating study of speakers’ intuitions regarding “words” in the Kharia language. After giving a brief overview of the phonological and morphosyntactic criteria for wordhood in Kharia, which might be called an “agglutinating” language even though it relies heavily on clitics rather than affixes, the author presents the judgments of six different speakers who were asked to write down a spoken sentence. The results show that speakers vary widely in their conceptualization of “words”. Particularly interesting is that “the only principle which seems to hold for all is that speakers / writers tend to give priority to phonology over morpho-syntax when these do not coincide […] In fact, the preference for phonological criteria can even be so strong that single morphemes can be divided up into two different written words” (116). One criticism of this chapter is that the author does not explicitly draw out lessons for other documentation projects, such as suggested metrics or procedures, but it contains valuable insights for documentation regardless.

CHAPTER 6 is titled ‘Aspect in Forest Enets and other Siberian indigenous languages: When grammaticography and lexicography meet different metalanguages’ (Florian Siegl). It reiterates the oft-recited lesson that “grammatical categories should be described without interference from grammatical descriptions and traditions of majority or related languages” (145) by means of a detailed case study of aspect in Forest Enets. The system of aspect in Forest Enets has traditionally been described along the same lines as that of Russian, as consisting of ‘aspectual pairs’ of verbs rather than differences in inflection / morphology. The author convincingly demonstrates, however, that this analysis is not appropriate to Forest Enets. The grammatical traditions of Russian have been carried over and applied in often subtle ways, such as borrowing dictionary conventions from Russian dictionaries, which overplays the kind of perfective/imperfective opposition common to Russian, but which is foreign to Forest Enets.

CHAPTER 7, ‘Documentary linguistics and prosodic evidence for the syntax of spoken language’ (Candide Simard and Eva Schultze-Berndt), argues that it is both feasible and necessary to study the prosodic system of a language in the analysis of syntactic constructions, via a case study of prosody in the language Jaminjung. The authors illustrate how “it is possible to distinguish, on the basis of prosodic evidence alone, constructions such as reactivated topics vs. afterthoughts; afterthoughts vs. discontinuous noun phrases, and two subtypes of discontinuous noun phrase” (172). This is clearly a valuable set of tools for language documentation and analysis.

In CHAPTER 8, ‘Diphthongology meets language documentation: The Finnish experience’, Klaus Geyer presents a new method for analyzing diphthongs, using Finnish as a case study. This is a much-needed contribution since, as the author points out, guides to phonological analysis “remain somewhat fuzzy with respect to procedures for working out a potential diphthong inventory” (178). Geyer’s system makes use of both the static features of diphthongs (articulatory origin and end points) and dynamic ones (movement in vertical tongue position, and falling vs. rising sonority) to create a matrix of distinctive features, which can then be used to distinguish a variety of diphthongs -- enough to handle even the remarkably complex set of diphthongs in Finnish. The chapter closes with a brief summary of the “diphthong analysis and description tool”, a tool which future field workers would do well to utilize.

CHAPTER 9, ‘Retelling data: Working on transcription’ (Dagmar Jung and Nikolaus P. Himmelmann), is a highly practical chapter detailing some ubiquitous hurdles to working on transcription. The first half explains why transcription is such an alien and unnatural task to native speakers and why, because of this, speakers undertake the task only with great reluctance. The latter half of the chapter focuses on frequently encountered strategies used by speakers when doing transcription, which arise because linguists and native speakers see the goals of transcription differently. Speakers use methods such as paraphrasing, editing-out, changing, and editing-in to adapt the record as they see appropriate, and the authors caution that it is important to document these changes and their motivations, to provide both a precise transcript and a record of changes applied by the speaker.

CHAPTER 10 details ‘The making of a multimedia encyclopaedic lexicon for and in endangered speech communities’ (Gabriele Cablitz). This chapter showcases the recently-created online lexicon tool LEXUS, developed by the Max Planck Institute for Psycholinguistics, in conjunction with the relational linking tool ViCoS (Visualization of Conceptual Spaces), in the documentation of the Marquesan languages in French Polynesia. The author conveys sound advice for enriching a lexicon with encyclopedic information, and creating interactive visual folk taxonomies and ‘cultural knowledge spaces’ using ViCoS, which allows end-users of the dictionary to better understand the relations and taxonomies that hold between words. In addition, this chapter offers advice for web-based collaboration with speech communities on lexicon projects, including design, pitfalls and benefits, and capacity building. The final section discusses lexicography in documentary linguistics, where the author argues for lexical databases as not just documentation aids, as suggested by Himmelmann (2006: 10), but as “an essential part of a language documentation itself” (252).

CHAPTER 11, ‘What does it take to make an ethnographic dictionary? On the treatment of fish and tree names in dictionaries of Oceanic languages’ (Andrew Pawley), advocates rich semantic descriptions for dictionary entries, rejecting a principled distinction between lexical and encyclopedic knowledge. Instead, the author argues that “it makes more sense to ask ‘Of the many characteristics […] known to English speakers, which are the most salient?’ and ‘For the various users of the dictionary, what is likely to be the most useful information to include?’” (277). This chapter presents important concepts in taxonomic systems and definition types, and highlights some of the challenges involved for anyone intending to compile a first general dictionary of a language, as well as advice for overcoming such hurdles.

CHAPTER 12, ‘Language is power: The impact of fieldwork on community politics’ (Even Hovdhaugen and Åshild Næss), presents an interesting case study from the authors’ own global fieldwork experiences, and particularly their work on the Vaeakau-Taumako and Äiwoo languages of the Solomon Islands, which addresses some of the many complex political and ethical issues involved in fieldwork. It is always instructive to see where other fieldworkers have run into difficulties, how they resolved those problems, and what they believe they could have done better. The authors present the story of their conflict, and provide a sound analysis of the problem in a way that demonstrates the importance of understanding local power structures. Just as importantly, the authors show that existing power structures may not always be adequate to address the particular needs of a documentation project. The very presence of a fieldworker can sometimes force new bodies of authority, power structures, or administrative districts to come into existence. This is particularly true when the boundaries of the language community do not align with existing political boundaries, and so new boundaries must be created.

CHAPTER 13, ‘Sustaining Vurës: Making products of language documentation accessible to multiple audiences’ (Catriona Hyslop Malau), showcases an innovative documentary video project with the goal of fostering revitalization and reaching as broad an audience as possible. To that end, the audio in the documentaries is entirely in the Vurës language, but every aspect of the films and their packaging is also available in either English or Bislama (the national language of Vanuatu). Two features of these films are especially interesting: First, they include background information on the language and issues of language endangerment, with information about how some regional languages have already been lost, and explaining that this is the reason for the production of the documentary. Second, each documentary includes a number of dictionary entries presented on screen, depicted and defined, which supplement some of the key concepts and topics in the film. Finally, the author relates how these films have already been useful in maintaining and spreading cultural practices throughout the region.

Finally, CHAPTER 14 treats ‘Filming with native speaker commentary’ (Anna Margetts), and imparts a novel methodology for collecting commentary. When previous attempts at running commentary fell flat, producing little usable data, the author, by happy accident, came upon the idea of recording live sports commentary. The new data was linguistically rich and provided a new type of communicative event in her corpus, one that was much more engaging to speakers. Also useful is the author’s discussion of an event which did not have running commentary but would have benefited from it, noting that such commentary makes useful metadata and contains other ethnographic information useful in compiling rich lexicons like those outlined in chapters 10 and 11.


This book will be an excellent addition to the library of any documentary linguist. Experienced linguists will find a number of new methodologies to utilize in their work, while younger linguists will find in-depth treatments of a variety of specific topics not covered (or not covered in any depth) in introductory surveys, handbooks, or field guides. The book is perhaps most similar to “Essentials of Language Documentation” (Gippert, Himmelmann, and Mosel 2006), and covers many related and similar topics. But whereas “Essentials” might be seen as the seminal survey of the field and its central topics, the present volume is more of an ‘advanced topics in documentary linguistics’, an excellent sequel to the former. As such, it consists largely of case studies on specific topics, and does not aim for comprehensive scope over the field. So while the book should not be seen as an all-inclusive handbook or survey, it does advance the field significantly in many areas.

One complaint I have with this book is that the title is somewhat misleading, and the mismatch between my expectations and the actual content of the book perhaps hindered me from appreciating it at first. With a subtitle like ‘Achievements and Perspectives’, one would expect more sections on the history of documentary linguistics or its practitioners, successful revitalization models from different communities, or broad perspectives on the field in general. To be sure, several chapters fit this bill well, including the laudatory preface to Ulrike Mosel, Seifart’s chapter on motivations for documenting languages, Broeder et al.’s chapter on evolving challenges in archiving, and Hovdhaugen & Næss’s chapter on the impact of fieldwork. The remaining chapters, however, deal with far more specific topics, almost all of which are case studies.

On the other hand, the specificity of these chapters, and the extent to which they offer new techniques and insights into these topics, is one of the strengths of this book. Each of these chapters offers valuable lessons for documentary linguistics from experienced fieldworkers; given this, a subtitle such as ‘Lessons from the Field’ may have been more appropriate. In fact, I think the best description of what this book offers comes from the statement on the back cover: “This volume showcases recent developments in methodology, technology and analysis, drawing on experience gained in a global range of documentation projects.” Taking this, rather than the title, as the intended goal for the book, it is clear that the editors have met and exceeded their objective. The lessons in this volume are indispensable contributions to the field that make significant advances in the practice of documentary linguistics as a whole. Any documentary linguist, whether a seasoned veteran or just entering the field, would be remiss to neglect the lessons from it.


Farfán, José Antonio Flores, and Fernando Ramallo, eds. 2010. New perspectives on endangered languages: Bridging the gaps between sociolinguistics, documentation and language revitalization. Amsterdam: John Benjamins.

Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel, eds. 2006. Essentials of language documentation. Berlin: Mouton de Gruyter.

Grenoble, Lenore A., and N. Louanna Furbee, eds. 2010. Language documentation: Practice and values. Amsterdam: John Benjamins.

Harrison, K. David, David S. Rood, and Arienne M. Dwyer, eds. 2008. Lessons from documented endangered languages. Amsterdam: John Benjamins.

Himmelmann, Nikolaus P. 2002. “Documentary and descriptive linguistics.” Linguistics 36: 161-195.

Himmelmann, Nikolaus P. 2006. Language documentation: What is it and what is it good for? In Essentials of language documentation, ed. Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1-30. Berlin: Mouton de Gruyter.

Janse, Mark, and Sijmen Tol, eds. 2003. Language death and language maintenance: Theoretical, practical, and descriptive approaches. Amsterdam: John Benjamins.


Danny Hieber is a Linguist at Rosetta Stone, and has helped create language-learning software for the Chitimacha, Navajo, Iñupiaq, and Inuttitut languages. He also writes on language issues in the popular press. His primary interests are language typology, documentary and descriptive linguistics, and the economics and praxeology of language. He holds a B.S. in Linguistics and Philosophy from The College of William & Mary in Virginia. Learn more about his work at
