EDITORS: Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, Claudia Wegener TITLE: Documenting Endangered Languages SUBTITLE: Achievements and Perspectives SERIES TITLE: Trends in Linguistics: Studies and Monographs [TilSM] 240 PUBLISHER: De Gruyter Mouton YEAR: 2011
Daniel W. Hieber, Associate Researcher, Rosetta Stone
In recent years, documentary linguistics has established itself as a discipline in its own right, with a unique set of theories, challenges, and methodologies. “Documenting Endangered Languages: Achievements and Perspectives” seeks to advance this discipline, and joins an important category of books which have helped define the field and its foci, including Farfán & Ramallo (2010), Gippert, Himmelmann, & Mosel (2006), Grenoble & Furbee (2010), Harrison, Rood, & Dwyer (2008), and Janse & Tol (2003). According to the back cover, “This volume showcases recent developments in methodology, technology and analysis, drawing on experience gained in a global range of documentation projects.” It consists largely of case studies, but includes several chapters with broader perspectives as well. The book is dedicated to Ulrike Mosel, and includes a laudatory preface highlighting her notable contributions to the field of documentary linguistics.
After the preface, an introductory chapter outlines some of the history of documentary linguistics as a field, particularly in relation to the Volkswagen Foundation’s DoBeS program (Dokumentation bedrohter Sprachen), and sketches what the authors see as the most salient lessons learned from the field, namely: 1) Focus on documenting the full range of communicative practices; 2) Concern for long-term storage and preservation of primary data; 3) Close cooperation with, and direct involvement of, the speech community; and 4) The scientific potential in large and diverse amounts of digitally archived data.
From there, the book is organized into four sections: 1) Theoretical issues in language documentation, which focuses on broader perspectives on the field; 2) Documenting language structure, which presents lessons from a series of case studies on specific language documentation projects, each covering a different aspect of language structure; 3) Documenting the lexicon, focused on the creation of rich ethnographic and encyclopedic lexica; and 4) Interaction with speech communities, which treats the impact of fieldwork and the outputs of documentation.
CHAPTER 2, ‘Competing motivations for documenting endangered languages’ (Frank Seifart), offers a four-way typology of the impetus for doing language documentation: for the preservation of human cultural heritage; to enhance the empirical basis of linguistics; for and by the speech community; and in order to study language contact. For each type, the author clarifies their respective requirements, in terms of their content and apparatus (in the sense of Himmelmann [2002, 2006]). Also discussed are several cases where these motivations compete. The chapter is very brief, and serves mainly as a quick overview of some of the whys of language documentation.
CHAPTER 3, ‘Evolving challenges in archiving and data infrastructures’ (Daan Broeder, Han Sloetjes, Paul Trilsbeek, Dieter van Uytvanck, Menzo Windhouwer, and Peter Wittenburg) also presents a high-level overview, this time of the manifold issues in data archiving. It covers issues and strategies in data handling, with a brief look at how formats and storage capacities have evolved over time, and highlights the important role that DoBeS has played in establishing metadata formats and archiving standards. The authors then walk the reader through some of the core issues in data archiving, such as meeting the needs of various stakeholders, long-term preservation requirements and what these entail for file formats and the organization of the metadata, access restrictions, and legal and ethical issues. They note the inherent conflict between access restrictions for data and the recent push in academia towards open access to research results and data. They also discuss a range of tools for enhancing data, namely ELAN, ANNEX, and LEXUS, and detail some of the exciting advances in software for searching and browsing through content and metadata. They end with a section on new challenges (and benefits) in the field, such as improvements in recording equipment and connectivity, which allow for easier dissemination of materials, and changes in the preservation and curation of data, such as a recently introduced lossless video format (MJPEG2000) and the updating of their archive to participate in “externally registered persistent identifiers” (51). One large concern they point out is that “the amount of recorded media streams that is not being touched (annotated in some form to make it ready for analysis) is increasing continuously which means that much of the stored data will effectively not be of much use to anyone other than the person who collected it” (52).
CHAPTER 4, ‘Comparing corpora from endangered language projects: Explorations in language typology based on original texts’ (Geoffrey Haig, Stefan Schnell, and Claudia Wegener), illustrates some promising ways that the massive digital archives which have been accumulated on endangered languages can be utilized for cross-linguistic research. While the potential for such research is enormous, few studies have undertaken this challenge to date. This chapter fills that gap through examining some basic properties of information structure (the distribution of S, A, P, and pronouns) across texts in five languages, using the GRAID (Grammatical Relations and Animacy in Discourse) annotation schema. In doing so, they demonstrate that there is indeed validity and feasibility to typological investigations utilizing data from language documentation, with significant payoffs.
CHAPTER 5 examines “Words” in Kharia: Phonological, morpho-syntactic and “orthographical” aspects (John Peterson). It presents a fascinating study of speakers’ intuitions regarding “words” in the Kharia language. After giving a brief overview of the phonological and morphosyntactic criteria for wordhood in Kharia, which might be called an “agglutinating” language even though it relies heavily on clitics rather than affixes, the author presents the judgments of six different speakers when presented with a spoken sentence they were asked to write down. The results show that speakers vary widely in their conceptualization of “words”, and particularly interesting was that “the only principle which seems to hold for all is that speakers / writers tend to give priority to phonology over morpho-syntax when these do not coincide […] In fact, the preference for phonological criteria can even be so strong that single morphemes can be divided up into two different written words” (116). While a criticism of this chapter is that the author did not explicitly pull out lessons for other documentation projects, such as suggested metrics or procedures, it contains valuable insights for documentation regardless.
CHAPTER 6 is titled ‘Aspect in Forest Enets and other Siberian indigenous languages: When grammaticography and lexicography meet different metalanguages’ (Florian Siegl). It reiterates the oft-recited lesson that “grammatical categories should be described without interference from grammatical descriptions and traditions of majority or related languages” (145) by means of a detailed case study of aspect in Forest Enets. The system of aspect in Forest Enets has traditionally been described along the same lines as that of Russian, as consisting of ‘aspectual pairs’ of verbs rather than differences in inflection / morphology. The author convincingly demonstrates, however, that this analysis is not appropriate to Forest Enets. The grammatical traditions of Russian have been carried over and applied in often subtle ways, such as borrowing dictionary conventions from Russian dictionaries, which overplays the kind of perfective/imperfective opposition common to Russian, but which is foreign to Forest Enets.
CHAPTER 7, ‘Documentary linguistics and prosodic evidence for the syntax of spoken language’ (Candide Simard and Eva Schultze-Berndt) argues that it is both feasible and necessary to study the prosodic system of a language when analyzing syntactic constructions, via a case study of prosody in Jaminjung. The authors illustrate how “it is possible to distinguish, on the basis of prosodic evidence alone, constructions such as reactivated topics vs. afterthoughts; afterthoughts vs. discontinuous noun phrases, and two subtypes of discontinuous noun phrase” (172). This is clearly a valuable set of tools for language documentation and analysis.
In CHAPTER 8, ‘Diphthongology meets language documentation: The Finnish experience’, Klaus Geyer presents a new method for analyzing diphthongs, using Finnish as a case study. This is a much-needed contribution since, as the author points out, guides to phonological analysis “remain somewhat fuzzy with respect to procedures for working out a potential diphthong inventory” (178). Geyer’s system makes use of both the static features of diphthongs (articulatory origin and end points) and dynamic ones (movement in vertical tongue position, and falling vs. rising sonority) to create a matrix of distinctive features, which can then be used to distinguish a variety of diphthongs -- enough to handle even the remarkably complex set of diphthongs in Finnish. The chapter closes with a brief summary of the “diphthong analysis and description tool”, a tool which future field workers would do well to utilize.
CHAPTER 9, ‘Retelling data: Working on transcription’ (Dagmar Jung and Nikolaus P. Himmelmann) is a highly practical chapter detailing some ubiquitous hurdles to working on transcription. The first half explains why transcription is such an alien and unnatural task for native speakers, and why speakers therefore undertake it only with great reluctance. The latter half of the chapter focuses on frequently encountered strategies used by speakers when doing transcription, which arise because linguists and native speakers see the goals of transcription differently. Speakers use methods such as paraphrasing, editing-out, changing, and editing-in to adapt the record as they see appropriate, and the authors caution that it is important to document these changes and their motivations, so as to provide both a precise transcript and a record of changes applied by the speaker.
CHAPTER 10 details ‘The making of a multimedia encyclopaedic lexicon for and in endangered speech communities’ (Gabriele Cablitz). This chapter showcases the recently-created online lexicon tool LEXUS, developed by the Max Planck Institute for Psycholinguistics, in conjunction with the relational linking tool ViCoS (Visualization of Conceptual Spaces), in the documentation of the Marquesan languages in French Polynesia. The author conveys sound advice for enriching a lexicon with encyclopedic information, and creating interactive visual folk taxonomies and ‘cultural knowledge spaces’ using ViCoS, which allows end-users of the dictionary to better understand the relations and taxonomies that hold between words. In addition, this chapter offers advice for web-based collaboration with speech communities on lexicon projects, including design, pitfalls and benefits, and capacity building. The final section discusses lexicography in documentary linguistics, where the author argues for lexical databases as not just documentation aids, as suggested by Himmelmann (2006: 10), but as “an essential part of a language documentation itself” (252).
CHAPTER 11, ‘What does it take to make an ethnographic dictionary? On the treatment of fish and tree names in dictionaries of Oceanic languages’ (Andrew Pawley) advocates rich semantic descriptions for dictionary entries, rejecting a principled distinction between lexical and encyclopedic knowledge. Instead, the author argues that “it makes more sense to ask ‘Of the many characteristics […] known to English speakers, which are the most salient?’ and ‘For the various users of the dictionary, what is likely to be the most useful information to include?’” (277). This chapter presents important concepts in taxonomic systems and definition types, and highlights some of the challenges involved for anyone intending to do a first general dictionary of a language, as well as advice for overcoming such hurdles.
CHAPTER 12, ‘Language is power: The impact of fieldwork on community politics’ (Even Hovdhaugen and Åshild Næss) presents an interesting case study from the authors’ own global fieldwork experiences, particularly their work on the Vaeakau-Taumako and Äiwoo languages of the Solomon Islands, and addresses some of the many complex political and ethical issues involved in fieldwork. It is always instructive to see where other fieldworkers have run into difficulties, how they resolved those problems, and what they believe they could have done better. The authors present the story of their conflict, and provide a sound analysis of the problem in a way that demonstrates the importance of understanding local power structures. Just as importantly, the authors show that existing power structures may not always be adequate to address the particular needs of a documentation project. The very presence of a fieldworker can sometimes force new bodies of authority, power structures, or administrative districts to come into existence. This is particularly true when the boundaries of the language community do not align with existing political boundaries, and so new boundaries must be created.
CHAPTER 13, ‘Sustaining Vurës: Making products of language documentation accessible to multiple audiences’ (Catriona Hyslop Malau) showcases an innovative documentary video project with the goal of fostering revitalization and reaching as broad an audience as possible. To that end, the audio in the documentaries is entirely in the Vurës language, but every aspect of the films and their packaging is also available in either English or Bislama (the national language of Vanuatu). Two features of these films are especially interesting: First, they include background information on the language and issues of language endangerment, with information about how some regional languages have already been lost, and explaining that this is the reason for the production of the documentary. Second, each documentary includes a number of dictionary entries presented on screen, depicted and defined, which supplement some of the key concepts and topics in the film. Finally, the author relates how these films have already been useful in maintaining and spreading cultural practices throughout the region.
Finally, CHAPTER 14 treats ‘Filming with native speaker commentary’ (Anna Margetts), and imparts a novel methodology for collecting commentary. When previous attempts at running commentary fell flat, producing little usable data, the author, by happy accident, came upon the idea of recording live sports commentary. The new data was linguistically rich and provided a new type of communicative event in her corpus, one that was much more engaging to speakers. A useful section of this chapter is also the author’s discussion of an event which did not have running commentary but would have benefited from it, noting that such commentary makes useful metadata and contains other ethnographic information useful in compiling rich lexicons like those outlined in chapters 10 and 11.
This book will be an excellent addition to the library of any documentary linguist. Experienced linguists will find a number of new methodologies to utilize in their work, while younger linguists will find in-depth treatments of a variety of specific topics not covered (or not covered with any depth) in introductory surveys, handbooks, or field guides. The book is perhaps most similar to “Essentials of Language Documentation” (Gippert, Himmelmann, and Mosel 2006), and covers many related and similar topics. But whereas “Essentials” might be seen as the seminal survey of the field and its central topics, the present volume is more of an ‘advanced topics in documentary linguistics’, an excellent sequel to the former. As such, it consists largely of case studies on specific topics, and does not aim for comprehensive scope over the field. So while the book should not be seen as an all-inclusive handbook or survey, it does advance the field significantly in many areas.
One complaint I have with this book is that the title is somewhat misleading, and the mismatch between my expectations and the actual content of the book perhaps hindered me from appreciating it at first. With a subtitle like ‘Achievements and Perspectives’, one would expect more sections on the history of documentary linguistics or its practitioners, successful revitalization models from different communities, or broad perspectives on the field in general. To be sure, several chapters fit this bill well, including the laudatory preface honoring Ulrike Mosel, Seifart’s chapter on motivations for documenting languages, Broeder et al.’s chapter on evolving challenges in archiving, and Hovdhaugen & Næss’s chapter on the impact of fieldwork. The remaining chapters, however, deal with far more specific topics, almost all of which are case studies.
On the other hand, the specificity of these chapters, and the extent to which they offer new techniques and insights into these topics, is one of the strengths of this book. Each of these chapters offers valuable lessons for documentary linguistics from experienced fieldworkers; given this, a subtitle such as ‘Lessons from the Field’ may have been more appropriate. In fact, I think the best description of what this book offers comes from the statement on the back cover: “This volume showcases recent developments in methodology, technology and analysis, drawing on experience gained in a global range of documentation projects.” Taking this, rather than the title, as the intended goal for the book, it is clear that the editors have met and exceeded their objective. The lessons in this volume are indispensable contributions to the field that make significant advances in the practice of documentary linguistics as a whole. Any documentary linguist, whether a seasoned veteran or a newcomer to the field, would be remiss to neglect them.
Farfán, José Antonio Flores, and Fernando Ramallo, eds. 2010. New perspectives on endangered languages: Bridging the gaps between sociolinguistics, documentation and language revitalization. Amsterdam: John Benjamins.
Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel, eds. 2006. Essentials of language documentation. Berlin: Mouton de Gruyter.
Grenoble, Lenore A., and N. Louanna Furbee, eds. 2010. Language documentation: Practice and values. Amsterdam: John Benjamins.
Harrison, K. David, David S. Rood, and Arienne M. Dwyer, eds. 2008. Lessons from documented endangered languages. Amsterdam: John Benjamins.
Himmelmann, Nikolaus P. 2002. Documentary and descriptive linguistics. Linguistics 36: 161-195.
Himmelmann, Nikolaus P. 2006. Language documentation: What is it and what is it good for? In Essentials of Language Documentation, ed. Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1-30. Berlin: Mouton de Gruyter.
Janse, Mark, and Sijmen Tol, eds. 2003. Language death and language maintenance: Theoretical, practical, and descriptive approaches. Amsterdam: John Benjamins.
ABOUT THE REVIEWER
Danny Hieber is a Linguist at Rosetta Stone, and has helped create
language-learning software for the Chitimacha, Navajo, Iñupiaq, and
Inuttitut languages. He also writes on language issues in the popular
press. His primary interests are language typology, documentary and
descriptive linguistics, and the economics and praxeology of language. He
holds a B.S. in Linguistics and Philosophy from The College of William &
Mary in Virginia. Learn more about his work at www.danielhieber.com.