Review: Historical Ling.; Text/Corpus Ling.: Kawaguchi, Minegishi & Viereck (2011)

Date: 06-Sep-2012
From: Anna Majek
Subject: Corpus-based Analysis and Diachronic Linguistics
EDITORS: Yuji Kawaguchi, Makoto Minegishi & Wolfgang Viereck
TITLE: Corpus-based Analysis and Diachronic Linguistics
SERIES TITLE: Tokyo University of Foreign Studies, Studies in Linguistics 3
PUBLISHER: John Benjamins
YEAR: 2011

Anna Ewa Majek, School of Linguistic, Speech and Communication Sciences, Trinity College Dublin, Dublin, Ireland

INTRODUCTION
The book under review consists of 14 papers written by a diverse group of scholars who present and discuss a wide range of topics in different languages. The articles are introduced by a message from the President, a description of the Center for Corpus-based Linguistics and Language Education, an explanation of synchronic and diachronic analyses, and short summaries of each article.

SUMMARY
The first paper, 'The Atlas Linguarum Europae: A Diachronic Analysis of Its Data' by Wolfgang Viereck, starts by presenting a short history and description of the 'Atlas Linguarum Europae' (ALE). Work on the ALE began in 1970, and the atlas gives an accurate description of Europe's linguistic situation. It distinguishes six languages or language families: Altaic, Basque, Caucasian, Indo-European, Semitic and Uralic. There are 22 language groups in these language families, some of which include a large number of individual languages. Viereck treats what he sees as the three important aspects of the interpretation of ALE maps: loanword research, etymological research and motivational research.

The next paper is devoted to 'Variationism and Underuse Statistics in the Analysis of the Development of Relative Clauses in German'. Anke Lüdeling, Hagen Hirschmann & Amir Zeldes explore how a multi-layer corpus architecture helps in understanding change. The focus is methodological, based on an investigation of the development of German relative clauses from Old High German to New High German. The paper shows 'how a deeply annotated diachronic corpus can help to detect and study language change' (p. 53).

The third paper deals with 'Variation and Change in the Montferrand Account-books (1259-1367)'. According to Anthony Lodge, the town of Montferrand in central France possesses a great collection of medieval and early-modern archives, written in the local dialect, recording the town's financial affairs and municipal life from the twelfth until the middle of the eighteenth century. What is more, they include, among other documents, a long series of accounting books detailing the town's income and expenditures from 1259 to 1731, with explanations and justifications of how and why the town's money was spent. In Lodge's view, these account-books offer a rich source for historical linguists. He compiles a Montferrand corpus consisting of account-books written in Occitan in the period 1259-1390, divided chronologically into three sections: Tranche I: 1259-1319 (c. 67,000 words) includes primarily thirteenth-century material; Tranche II: 1345-1367 (c. 180,000 words) consists of the middle third of the fourteenth century, the largest of the three sets; and Tranche III: 1372-1385 (c. 165,000 words) covers the period leading up to the language shift from Auvergnat to French in 1390. The paper presents examples of lexical, syntactic, morphological and phonetic changes that can be gleaned from this corpus.

Wolfgang Raible presents 'Cognitive Aspects of Language Evolution and Language Change: The Example of French Historical Texts'. He analyses the two earliest historical texts written in Old French prose, both of which deal with the Fourth Crusade (1202-1204), and presents the following theses: Thesis I: If in such a situation authors try for the first time to write a historical text in prose, they will use already existing generic models. Thesis II: It will still take considerable time until the cognitive and linguistic framework for historical prose proper will develop. Both theses are supported by the two texts.

The next paper concentrates on 'The Importance of Diasystematic Parameters in Studying the History of French', starting with the assumption that hypotheses in diachronic linguistics can be confirmed or dismissed by means of corpora. To illustrate this, Lene Schøsler uses the creation of the 'composed past' from the Latin present form 'habeo litteras scriptas', literally 'I have letters [that have been] written'. The main changes from the Latin present form to modern Romance are well known, but in her opinion they do not answer many other questions, such as: What is the function of the composed past in the old texts: is it a present or a past form? What are the phases of change? How does epic tense switching conform to analyses of the composed past? How may we explain conflicting evidence in the old texts? The case study confirms the hypothesis, 'provided that corpora are composed in such a way that they permit an exploration of relevance for various parameters' (p. 105).

Martin Becker presents 'The Reorganisation of Mood in the Epistemic Subsystem -- The Case of French Belief Predicates in Diachronic Dynamics', aiming to illustrate how theories of modal semantics and corpus-based empirical research can be combined to yield new insight into the processes and mechanisms of language change. The data concern the mood system in the domain of belief predicates from Old to Classical French. Becker focuses on two basic belief predicates, the verbs 'cuid(i)er' and 'croire', tested in two corpora: the New Amsterdam Corpus and Frantext and the Middle French subcorpus. The case study shows that a theory-based analytical framework combined with historical corpora can provide a deeper insight into the principles and mechanisms of language change but cannot uncover the motivations which drive speakers to switch systematically from one verb option ('cuidier') to the other ('croire') at a certain period of time.

The paper 'French Liaison in the 18th Century -- Analysis of Gile Vaudelin's Texts' by Yuji Kawaguchi discusses French liaison and related phenomena in two of Gile Vaudelin's texts, namely 'Nouvelle manière d'écrire comme on parle en France', published in Paris in 1713, and 'Instructions crétiennes, mises en ortografe naturelle, pour faciliter au peuple la lecture de la sience du salut', published in Paris in 1715. Kawaguchi processes the two texts through the concordancer AntConc 3.2.1w to obtain quantitative information on verbs, pronouns, articles, possessive adjectives, prepositions, adjectives, adverbs and numerals, and evaluates the situation of French linking phenomena in the eighteenth century.

Antonio Emiliano, in 'Issues in the Typographic Representation of Medieval Primary Sources', states that a bad transcription of medieval primary sources for linguistic and philological study may ruin a corpus or archive or seriously diminish its value for research. He proposes a set of possible strategies regarding the typographic representation of medieval texts, aspects of corpus encoding, and the character of encoding procedures that in his view should be used by the researcher intending to carry out a study based on medieval texts.

The next paper focuses on 'An Analysis of the Misuse of the Participle in Old Russian Texts'. According to Yoshinori Onda, the Russian language lacks a distinction between the functions of participles and adverbs. The author aims to analyze this misuse of participles in Old Church Slavonic and Old Russian texts and proposes a functional explanation of its causes. Onda presents two hypotheses: Hypothesis I: the similarity of the syntactic structures caused the confusion in participle use; Hypothesis II: the attitude of the copyist toward the original texts influenced the copied texts. Both hypotheses were supported, but for the second Onda was unable to determine the nature of the relationship between the text type and the attitude of the copyists.

Robert Ratcliffe carries out 'A Preliminary Analysis of Arabic Derived Verbs in the Leeds Quran Corpus -- With Special Reference to Stem III (CaaCaC)'. The author analyzes data from the Leeds Quran Corpus to quantify the semi-productivity of the derived stems in Quranic Arabic.

Makoto Minegishi, Jun Takashima & Ganesh Murmu's paper 'On the Narrow and Open 'e' Contrast in Santali' examines whether the contrast between narrow and open 'e' is phonologically distinct in Santali. The analysis is carried out on the BSD corpus (Bodding's Santali data). The authors consider the most frequent syllable patterns and the candidates for minimal pairs that have exactly the same phonemic environment. The paper concludes 'that the vowel contrast between 'e1' and 'e2' is not a full-fledged phonemic one' (p. 221).

Tomoyuki Yamahata, in 'The Classification of Apabhraṃśa -- A Corpus-based Approach of the Study of Middle Indo-Aryan', investigates variation across texts in the Apabhraṃśa language. The language varies greatly from document to document, which has led to numerous classifications of Apabhraṃśa. Yamahata reviews these classifications and applies criteria drawn from a corpus of Apabhraṃśa. The corpus consists of eight texts derived from Eastern, Southern, Western and Kashmiri Apabhraṃśa. Yamahata assumes that variation in Apabhraṃśa can be classified on the grounds of style, and specifically shows a tendency based on the degree of preference for pseudo-archaic forms, but this tendency is insufficient for the classification of Apabhraṃśa.

Ayako Shiba, in 'Changes in the Meaning and Construction of Polysemous Words: The Case of 'mieru' and 'mirareru'', focuses on revealing how verbs have recently extended their evidential meaning. To achieve this, Shiba concentrates on two forms of 'miru' ('to see, to be able to see'), 'mieru' and 'mirareru', and analyses them in the Modern Japanese and Present-day Japanese corpora. Both consist mainly of critical essays on history, science and culture, but only the Modern Japanese corpus includes works of fiction. Shiba shows the difference between 'mieru' and 'mirareru' in their meaning-construction types and demonstrates the distribution of each type in the Modern Japanese corpus and the Present-day Japanese corpus.

Kanetaka Yarimizu's 'Language Change from the Viewpoint of Distribution Patterns of Standard Japanese Forms' treats the standardization process of Japanese in five historical stages using data from dialect research. Two different data sets are used. The first is the 'Grammar Atlas of Japanese Dialects' (GAJ), and the second is the 'Glottogram survey', also referred to as the TH survey. The five stages of his standardization model are as follows: 1. the period until the mid-eighteenth century; 2. from the mid-eighteenth century to the end of the nineteenth century; 3. from the end of the nineteenth century to the mid-twentieth century; 4. from the mid-twentieth century to the present; 5. the present. The study shows that standardization progressed gradually. During the first two stages, the traditional dialect forms were used. In the modern third stage, standardization progressed through education, but traditional forms were still used in the private domains. In the fourth stage, standardization progressed, and in the fifth stage it approached completion. Yarimizu presumes that standardization is strongly affected by the mass media.

EVALUATION
The book is primarily aimed at historical linguists, but it would also be a valuable source of information for those interested in corpus linguistics. A positive attribute is that the book collects papers on a wide range of topics with analyses of different languages, some no longer spoken, such as Apabhraṃśa, and others not widely known, for example Santali. An important quality is that it includes articles with recommendations for further study and offers advice for researchers.

The book also has one drawback: the organization of the articles. There are two types of papers: the ones which present analyses and the ones which give recommendations. It would be better if the articles were divided into two parts, as this would facilitate reading and improve the coherence of the book.

All in all, the book is inspiring and absorbing. It provides significant insight into synchronic and diachronic variation and is a great contribution to corpus-based studies. The editors are to be congratulated for bringing together such a diverse group of scholars and such a wide range of analyses.

ABOUT THE REVIEWER
Anna Ewa Majek is a PhD research student at Trinity College Dublin. Her primary research interests include corpus linguistics, language variation and change, and sociolinguistics.

