LINGUIST List 32.1821

Tue May 25 2021

Review: Sociolinguistics: Cacoullos, Travis (2020)

Editor for this issue: Jeremy Coburn

Date: 03-Mar-2021
From: Maria Khachaturyan
Subject: Bilingualism in the Community
AUTHOR: Rena Torres Cacoullos
AUTHOR: Catherine E. Travis
TITLE: Bilingualism in the Community
SUBTITLE: Code-switching and Grammars in Contact
PUBLISHER: Cambridge University Press
YEAR: 2020

REVIEWER: Maria Khachaturyan


“Bilingualism in the Community: Code-switching and Grammars in Contact” by Torres Cacoullos and Travis (TC&T) is a study of Spanish and English as spoken by a bilingual population of northern New Mexico, USA. Conducted within the framework of Labovian variationist sociolinguistics, it is a large-scale corpus study relying on advanced quantitative methods. English and Spanish have been spoken side by side in New Mexico for about 150 years, making New Mexican bilinguals a perfect “laboratory” for studying the long-term effects of contact-induced change, or the lack thereof. Today, New Mexican bilinguals use both languages in everyday life, frequently resorting to code-switching as a bilingual discourse mode. The book focuses on a particular morphosyntactic parameter, subject pronoun expression, as a domain of possible contact-induced change. The two languages have distinct overall rates of unexpressed vs. pronominal subjects, with Spanish allowing unexpressed subjects much more frequently than English. The book’s central empirical question, then, is whether New Mexican bilinguals show convergence in the patterns of use of subject pronouns in the two languages.

The main body of the book is divided into 11 chapters. Chapter 1 introduces the main problem: does contact, amplified by code-switching (CS), always lead to change? It also defines the object of study: probabilistic patterns of use of subject pronouns in Spanish and English, conditioned by co-occurrence constraints. The chapter further introduces the design: a study of the distribution of subject pronouns in a bilingual corpus of New Mexican Spanish and English, compared with benchmark monolingual corpora (the corpus of Colombian Spanish and the Santa Barbara Corpus of American English) and with an earlier, presumably less bilingual variety of New Mexican Spanish. The comparison with benchmark varieties makes it possible to argue for the presence of meaningful differences between the corpora, as well as for the direction and causality of change. Furthermore, some of the same probabilistic patterns are studied in the presence and absence of CS, in order to measure the possible role of CS in transient contact-induced change.

Chapter 2 draws a general sociolinguistic portrait of New Mexico based on census data: about half of the New Mexican population is Hispanic, and two-thirds of these Hispanics were born in that very state. In some of the counties in the north of New Mexico, over two-thirds of the population is Hispanic, and most of these residents are US-born. The Spanish-speaking population is in decline, since only half of US-born Hispanics in New Mexico speak Spanish at all; in some of the northern counties, however, about two-thirds of Hispanics speak Spanish in the home. The 40 bilingual individuals whose spontaneous speech constitutes the corpus under study come from these very counties. The speaker sample includes close to equal numbers of men and women of various ages, occupations, and social classes. No correlation between pronominal rates and social variables has been identified.

Chapter 3 describes the bilingual corpus under discussion, which includes 29 hours (300 000 words) of recordings of natural interactions during the course of sociolinguistic interviews. The interviewers are also Hispanics from New Mexico and have close relationships with the participants, who are their extended family members or acquaintances. The corpus is transcribed orthographically following the notation developed by Du Bois et al. (1993): in particular, it divides the speech into Intonation Units (IUs) and marks each unit’s boundary intonation as continuous or discontinuous (final, appeal, or truncation). The same prosodic transcription is used for the three monolingual benchmark corpora: those of Colombian Spanish, American English, and an earlier variety of New Mexican Spanish.

Chapter 4 characterizes the bilingual participants based on (a) a short questionnaire containing questions about language preference in various spheres and self-reported language proficiency, (b) content analysis of the sociolinguistic interviews, revealing aspects of acquisition and use of both languages which are not captured by the questionnaire, and (c) production data. TC&T find that self-reported language preference and language proficiency scores hardly correlate and thus take them as two independent measures of the degree of bilingualism, to which they add a language predominance score, measured as the rate of English-language finite verbs in the production data of a given individual. Spanish-predominant and English-predominant speakers each constitute roughly one-third of the overall speaker population (12 and 13 speakers, respectively), and the remaining 15 speakers use both languages more or less equally. The Spanish subject pronoun rate does not appear to be significantly affected by any of the three measures (preference, proficiency, predominance). English and Spanish finite verbs occur at almost equal rates in the overall corpus, and there are often no rhetorical motivations for CS, which makes CS the bilingual community’s discourse mode.

Chapter 5 explores the linguistic environment constraining the use of subject pronouns in the monolingual benchmark corpus of conversational Colombian Spanish, occasionally using data from Puerto Rican Spanish and the Spanish of Madrid. After observing that contrastive or emphatic contexts, on the one hand, and ambiguity resolution, on the other, account for only a small portion of examples, TC&T turn to the main factors constraining variation. The first is subject continuity (the absence of a non-coreferential human subject between the target subject and the previous mention in subject position), which favors the use of unexpressed subjects. Another is coreferential subject priming, whereby a previous coreferential pronoun favors a subsequent pronoun, while a previous coreferential unexpressed subject favors a subsequent unexpressed subject. The rate of subject pronouns in coreferential contexts is affected by syntactic and prosodic linking. It is lowest in maximally linked clauses, that is, in coordination via the conjunction y ‘and’ when the target clause is prosodically linked to the preceding one. A clause counts as prosodically linked if it occurs in the same IU as the preceding clause or in a different IU connected to the previous one by continuing intonation. Further factors are the aspectual class of the verb, lexically particular constructions, and the person of the subject.

Chapter 6 explores the similarities and differences between the monolingual benchmarks. The comparison investigates the structure of variability by describing the similarities and differences in co-occurrence patterns in the variable contexts, that is, the contexts where speakers have a choice between variants; for subject expression, these are the contexts where both pronominal and unexpressed subjects occur. In both Spanish and English, the central variable context is the position of the subject in coordinated clauses. Indeed, in English, the lowest rate of pronominal subjects occurs in the context of prosodically linked coordination via the conjunction and, although the rate does not reach zero. The graded effect of linking is similar to the one in Spanish, but is stronger. Coreferential subject priming is also at work in English. In Spanish, subject continuity and priming moderate one another, such that a previous pronominal subject, which favors pronominal expression, moderates the effect of subject continuity, which disfavors it. In English, the effect of priming is observed only for unexpressed subjects, not for pronominal subjects, and is absent when there are intervening human subjects. As a result, there are clusters of unexpressed coreferential subjects, an effect that is not observed in Spanish. The coordination context, then, is one where similar tendencies are observed in both languages, but with varying strength. The only context allowing both pronominal and unexpressed subjects in English outside of coordinate clauses is the IU-initial position. Outside the IU-initial position and outside coordination, only pronominal subjects occur. In Spanish, the situation is different, since unexpressed subjects occur in both positions and are even more frequent in the non-IU-initial position. The IU-initial position thus becomes another diagnostic context for change.

Chapter 7 scrutinizes the hypothesis that the rates of subject pronouns in bilingual speech differ from the monolingual benchmark and gives a summary of conflicting predictions as to whether the rates of expressed pronouns are expected to be higher or lower. Observing that the raw rates are hardly informative, TC&T focus on co-occurrence constraints, especially subject continuity and its interactions with other linguistic conditions. They conclude that there are no significant differences in co-occurrence constraints between Spanish from the bilingual New Mexican corpus and the monolingual benchmarks: data from the Spanish of San Juan, Puerto Rico; Mexico City, Mexico; and Madrid, Spain, as well as a corpus of an earlier variety of New Mexican Spanish.

Chapter 8 tests the hypothesis of the influence of English: according to TC&T, if English influences Spanish in the speech of bilingual New Mexicans, the change should manifest itself, in particular, in decreased rates of subject pronouns in the IU-initial position and a stronger effect of syntactic and prosodic linking. No such effect is observed.

Chapter 9 analyzes the effect of CS on feature transfer, an effect that has been put forward in several studies on grammatical convergence. For tokens with CS, TC&T select only multi-word sequences of CS (which cannot be confused with borrowing) and only proximate code-switching, occurring in the same clause as the target verb or in the preceding clause produced by the same speaker. The absence of CS means that CS is absent from the preceding clause, whether produced by the same speaker or by the interlocutor. TC&T observe that, regardless of the presence of CS, the rates of subject pronouns and the linking effects stay roughly the same.

Chapter 10 explores the effects of CS on coreferential subject priming. Cross-language priming (English pronoun to Spanish pronoun) is weaker than intra-language priming (Spanish pronoun to Spanish pronoun) but is still observable, since the rates of Spanish pronouns after an English pronominal prime are higher than after a Spanish unexpressed subject. Crucially, in the absence of CS only 27% of previous coreferential subjects are expressed by pronouns, while in the presence of CS this rate rises to 61%. Because unexpressed subjects in English are very rare, CS with English increases the rate of English pronominal primes. However, since English-to-Spanish priming is relatively weak and in the context of CS there are fewer Spanish pronominal primes, the opportunities for Spanish pronouns to occur under the effect of priming are reduced, and the overall rate of Spanish pronouns in the corpus remains unaffected. (Perhaps related is the fact mentioned in Chapter 4: the rate of subject pronouns in English-predominant individuals is lower than in other categories: 18% against 29% in Spanish-predominant individuals and 28% in those who do not show any clear language predominance.) Thus, although language-particular conditioning (and even the overall rates!) is preserved, contextual distributions of structures are shifted, and this, according to TC&T’s data, is the only effect of CS.

Chapter 11 summarizes the findings: in a community where bilingualism can be traced back 150 years, there is no contact-induced change in patterns of subject expression or in several other morphosyntactic domains, in contrast with phonology, where some convergence is observed.


“Bilingualism in the Community” sets a high standard for variationist studies of contact-induced change and applies a systematic comparative methodology that includes comparison with an earlier, less bilingual variety and with monolingual benchmarks. A similar methodology is already followed in language contact studies, where the study of monolingual benchmarks is replaced by a study of the closest linguistic relatives of the languages in contact, and the study of an earlier variety of the target language is replaced by linguistic reconstruction. The quantitative perspective adopted by TC&T adds new value to the reasoning. The linguistic comparison through the study of co-occurrence constraints is a major contribution to typology (on that, see the review by Di Garbo (2019)). The study of the interaction of syntax with prosody, and in particular of prosodic and syntactic linking, is a promising line of research, as evidenced by recent studies (Mithun 2020). The book is in general very well written and is recommended to scholars interested in language contact, variation, and the interplay of the two, both within and outside the variationist framework.

The study reaches the conclusion that, despite 150 years of co-presence of English and Spanish in New Mexico and frequent CS in the speech of contemporary New Mexicans, there is no effect on the grammar of subject expression. For TC&T, a lack of contact-induced change must be the default assumption: “The burden of proof is rather on the contrary proposition, to demonstrate that convergence elsewhere has indeed occurred, and to account for why it did so” (p. 204). Leaving aside the consideration that contact-induced change is an undoubtedly widespread phenomenon (on that, see the review by Adamou (2018)), a further question is: what is the sufficient condition for change to occur? The types and amounts of data collected by TC&T and the variationist approach seem to be perfectly suited to address this question.

Thus, the fact that the overall rates of pronominal subjects in New Mexican Spanish come out as roughly the same as in monolingual benchmarks is very intriguing. However, in Chapter 2 we learn that subject pronoun expression rates range from 9 percent to 62 percent across individual speakers. The rates do not correlate systematically with any social grouping studied, nor with the language predominance rate, which is a proxy for the rate of CS. If the grammar of co-occurrence constraints is arguably the same, it would be interesting to see what contributes to the different individual rates of subject expression.

Furthermore, TC&T observe that the only effect of CS is an increase in the rate of English primes, but because English-to-Spanish priming is weak, it does not affect the overall rates. It would be interesting to artificially manipulate some parameters in the corpus: in particular, what would happen if the overall corpus rates of English and Spanish finite clauses were not equal, but English occurred more? In the context of language shift, it is not difficult to imagine that the population of English-predominant speakers would grow, such that the rate of English finite clauses in a sample would grow, too. Would it change anything in the overall results? If, in contrast, Spanish occurred more, as in the earlier variety, how would it affect the rate of subject pronouns? And if the strength of English-to-Spanish priming were different, would it affect the overall rates, such that weaker priming would lead to decreased rates of subject pronouns and stronger priming would lead to increased rates? Finally, if, after such artificial manipulation of the strength of priming and of the rate of English clauses, the overall rates of subject pronouns came out different, this would outline a model of the conditions of contact-induced change in probabilistic constraints, which would then need to be evaluated in light of existing predictions of change and would likely have interesting theoretical implications.
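The manipulation suggested here can be illustrated with a toy calculation. The sketch below is not TC&T's model and uses none of their statistical machinery: it simply treats the overall Spanish pronoun rate as a weighted average over three prime types. Only the 27% share of pronominal primes among Spanish primes is taken from the summary of Chapter 10 above; the function name, the two conditional rates (0.40 and 0.20), and the `strength` parameter are hypothetical placeholders chosen for illustration, and all English primes are assumed to be pronominal.

```python
def overall_spanish_pronoun_rate(
    english_share: float,
    strength: float,
    sp_pron_share: float = 0.27,      # share of pronominal primes among Spanish primes (no-CS figure)
    rate_after_sp_pron: float = 0.40, # hypothetical: P(pronoun | Spanish pronominal prime)
    rate_after_sp_null: float = 0.20, # hypothetical: P(pronoun | Spanish unexpressed prime)
) -> float:
    """Toy mixture model of the overall Spanish subject pronoun rate.

    english_share: proportion of primes that are English pronouns (0..1).
    strength: cross-language priming strength, from 0 (an English pronoun
        primes no better than a Spanish unexpressed subject) to 1 (as well
        as a Spanish pronoun). Both parameters are illustrative assumptions.
    """
    # Rate after an English prime is interpolated between the two Spanish rates.
    rate_after_en_pron = rate_after_sp_null + strength * (
        rate_after_sp_pron - rate_after_sp_null
    )
    # Average rate after a Spanish prime, weighting pronominal vs. unexpressed primes.
    rate_after_spanish = (
        sp_pron_share * rate_after_sp_pron
        + (1.0 - sp_pron_share) * rate_after_sp_null
    )
    return (1.0 - english_share) * rate_after_spanish + english_share * rate_after_en_pron


if __name__ == "__main__":
    for strength in (0.1, 0.27, 0.9):
        rates = [overall_spanish_pronoun_rate(s, strength) for s in (0.0, 0.5, 0.9)]
        print(strength, [round(r, 3) for r in rates])
```

Under these assumptions, the overall rate is insensitive to the share of English primes exactly when the priming strength equals the share of pronominal primes among Spanish primes; weaker cross-language priming pushes the overall rate down as English grows, and stronger priming pushes it up. Whether anything like this holds for real data is precisely the empirical question raised above.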


Maria Khachaturyan is a University Researcher at the University of Helsinki. Her research sits at the intersection of descriptive linguistics, interactional linguistics and linguistic anthropology. She specializes in Mande languages of West Africa, especially the Mano language (400 000 speakers in Guinea and Liberia). Her current research project studies language contact, variation and change, with a particular focus on multilingual language socialization in the family and in the community.

