Review of Corpus Linguistics in Chinese Contexts

Reviewer: Chunsheng Yang
Book Title: Corpus Linguistics in Chinese Contexts
Book Author: Bin Zou, Simon Smith, Michael Hoey
Publisher: Palgrave Macmillan
Linguistic Field(s): Applied Linguistics; Text/Corpus Linguistics; Language Acquisition
Subject Language(s): Chinese, Mandarin
Issue Number: 27.1500

Corpora have been widely used in linguistic and language acquisition studies. Drawing on large-scale annotated written and spoken linguistic data, corpus-based studies have helped reveal linguistic patterns that traditional linguistics may not be able to show. “Corpus Linguistics in Chinese Contexts”, edited by Bin Zou, Simon Smith, and Michael Hoey, originated from the International Conference on Corpus Technologies and Applied Linguistics (CTAL-2012) held in Suzhou, China, in June 2012, and is a welcome addition to this body of research, especially in Chinese contexts.

The nine chapters included in this volume represent state-of-the-art corpus linguistic research in Chinese contexts, meaning that the papers focus on one of the following: the application of corpus tools to the Chinese language, corpus-based studies of English by Chinese scholars, studies of English as a foreign language (EFL) in China, or the comparison of English usage by Chinese EFL learners and British native speakers. Thus, the chapters are only loosely connected under the umbrella of “Chinese contexts”.

The introductory chapter by Wenzhong Li and Simon Smith provides a brief introduction to the corpus-informed research on Chinese and corpus-based EFL research in China.

Chapter 1, “Lexical priming: The odd case of a psycholinguistic theory that generates corpus-linguistic hypotheses for both English and Chinese” by Michael Hoey and Juan Shao, is the most theory-loaded chapter in this volume. Developed by Hoey in response to the insights derived from corpus linguistics, Lexical Priming theory is an unusual corpus-driven theory “in that it builds both upon corpus linguistics analysis and upon long-standing psycholinguistic research” (p. 16). The theory generates hypotheses that have not been previously explored in a systematic fashion by corpus linguists. This chapter applies Lexical Priming theory to account for collocation, colligation, and semantic preference in Chinese. The applicability of Lexical Priming theory to Chinese shows that its psycholinguistic claims are not culture- or language-specific, for “two typologically different languages [English and Chinese] share properties when looked at from both a lexical and a psycholinguistic perspective” (p. 31).

Richard Xiao’s Chapter 2, “Contrastive corpus linguistics: Cross-linguistic contrast of English and Chinese”, is a contrastive corpus analysis of the distribution of passive voice and classifiers in English and Mandarin Chinese. Interestingly, Xiao shows that although Chinese, unlike English, is usually recognized as a typical classifier language, the two languages show striking similarities in their classifier systems in spite of the different terms used, their quantitative differences, and some language-specific syntactic behavior. A model of contrastive corpus linguistics is also proposed for future studies along this line.

Chapter 3, “Learning Chinese with the sketch engine” by Adam Kilgarriff, Nicole Keng, and Simon Smith, introduces the key features of Sketch Engine, a tool widely used in lexicography and in the teaching and learning of English, and its application to teaching and learning Chinese. Sketch Engine can be used for basic concordancing, character search, learning about the collocations between measure words/classifiers and nouns, and identifying the distribution of words with similar meanings (Thesaurus) and the differences between similar words (Sketch Diff).

In Chapter 4, “Patterned distribution of phraseologies within text: The case of research articles”, Maocheng Liang employs TextSmith, a corpus analysis tool, to examine the lexico-grammatical features of different sections of research papers published in the Journal of Applied Linguistics. The findings confirm that texts in the same genre are often similarly structured. This chapter also exemplifies the integration of corpus analysis and genre analysis.

Chapter 5, “Corpus pedagogic processing of phraseology for EFL teaching: A case of implementation” by Anping He, is a case study that combines a corpus study with EFL pedagogy. College students of corpus linguistics first built an EFL textbook corpus. Findings from the corpus-based analysis were then incorporated into multimedia courseware, which was in turn used in middle school EFL classes. This innovative application of corpora in teaching was found to benefit all parties involved: the college students, the middle school teachers, and the middle school EFL learners.

Wangheng Peng wrote Chapter 6, “A corpus analysis of Chinese students’ (Mis-)use of nouns at XJTLU”, which analyzes Chinese EFL learners’ use and misuse of countable and uncountable nouns in EFL academic writing and compares them to British native speakers’ use in academic and published writing. It is found that EFL learners tend to use uncountable nouns as countable ones. One interesting finding is that some uncountable nouns are also used as countable nouns in the native speaker corpora (such as researches and staffs), although with low frequency. This finding seems to suggest that there may be no absolute distinction between the two grammatical categories.

Bin Zou and Wangheng Peng’s Chapter 7, “A corpus-based analysis of the use of conjunctions in an EAP teaching context at a Sino-British university in China”, compares conjunction use in the writing of students at a Sino-British university in China with two published corpora of English academic writing, one by British native speakers and the other by Chinese university learners. It was determined that the students at the Sino-British university used formal conjunctions more frequently, and thus resembled native English speakers more closely than regular Chinese university students did.

Chapter 8, “Application of corpus analysis methods to the teaching of advanced English reading and students’ textual analysis skills” by Haiping Wang, Yuanyuan Zheng, and Yiyan Cai, reports the effect of learners’ corpus construction on EFL reading instruction. The chapter shows that investigating language use and constructing a textbook-related corpus serve as an extension of language teaching textbooks, stimulate learners’ interest in reading, and eventually help them read more effectively and critically.

The last chapter, Chapter 9, “An appraisal analysis of reports about Chinese military affairs in the New York Times” by Zhaoyang Mei, Ren Zhang, and Baixiang Yu, applies Appraisal Theory (see Martin and Rose, 2003) to the investigation of English reports on Chinese military affairs in the New York Times. Three subcategories of Appraisal Theory (i.e., attitude, engagement, and graduation) are included in the discussion. The analysis shows that while news reports may appear to relate events objectively, there are often latent evaluations which are unintentionally conveyed to readers. This chapter shows the potential use of corpora in sociolinguistic studies.


Zou, Smith, and Hoey have done a great job in compiling this volume of corpus-based and corpus-driven research on Chinese, English, and EFL. Considering the large number of EFL learners in China and the ever-increasing number of Chinese as a Second Language (CSL) learners around the world, this new volume in the New Language Learning and Teaching Environments series will be of great interest to corpus linguists, EFL and CSL practitioners, graduate students, and even advanced undergraduate students.

One major strength of this volume is its wide coverage of topics, which range from Chinese EFL learners to native English speakers, from EFL writing to formal English writing in American and British media, and from corpus building to corpus tool application. The volume not only presents the state of the art in the use of corpora in applied linguistic research by top corpus linguists, such as Michael Hoey and Richard Xiao, but also addresses the practical side of corpora, namely how corpora can be utilized in language teaching and learning. Interested researchers, language practitioners, and graduate students can all draw upon these chapters to formulate their own research questions and conduct independent studies.

Another strength of the volume is the introduction of multiple corpus tools, such as WordSmith, TextSmith, and Sketch Engine. These corpus tools will be of great value to those who are new to corpus linguistics and are interested in exploring the use of corpora in their research and language learning.

This volume also opens many new avenues of corpus-based research. For example, Chapter 2 shows that although English is not usually considered to be a language with classifiers, a corpus-based analysis of the classifiers in English and Mandarin Chinese reveals that the two languages have a lot in common. More contrastive corpus-based studies along this line can be conducted to understand linguistic typology and universals. In Chapter 5, He demonstrates the application of EFL corpora built by students of corpus linguistics. Such innovative research, which combines the training of corpus linguistics students with the practical application of corpora, can be conducted in many similar contexts and will likely benefit all parties involved.

While the introductory chapter provides a historical and current perspective on corpora and corpus-based research in China, a more thorough discussion of corpus-driven and corpus-based research would make the various chapters more reader-friendly and, more importantly, situate the individual chapters in a larger context. In its current state, the different chapters are only loosely linked by the “Chinese contexts” theme, although this is understandable given that all chapters hail from presentations given at a conference.

To summarize, “Corpus Linguistics in Chinese Contexts” will be of great value for corpus linguists, applied linguists, EFL and CSL practitioners, as well as anyone interested in the theoretical and practical issues related to corpora.

Dr. Chunsheng Yang is an assistant professor of Chinese and applied linguistics at the University of Connecticut. His research focuses on the acquisition of second language (L2) phonology, especially the acquisition of L2 prosody (e.g., tones, intonation, and stress), computer-assisted and mobile-assisted language teaching and learning, and Chinese linguistics and Chinese pedagogy in general.

Format: Hardback
ISBN-13: 9781137440020
Prices: U.K. £ 63.00
U.S. $ 90.00