LINGUIST List 28.4755

Thu Nov 09 2017

Review: Morphology; Phonology; Semantics; Syntax: Coloma (2016)

Editor for this issue: Clare Harshey

Date: 21-Jul-2017
From: Laura Dubcovsky
Subject: La complejidad de los idiomas

AUTHOR: Germán Coloma
TITLE: La complejidad de los idiomas
YEAR: 2016

REVIEWER: Laura Dubcovsky, University of California, Davis

REVIEWS EDITOR: Helen Aristar-Dry


“La Complejidad de los Idiomas” (“The Complexity of Languages”), by Germán Coloma, sets out to operationalize the construct of language complexity by means of computational measures and quantifiable variables. The author presents mathematical models, hypotheses, and formulas that may pave the way to a more informed and accurate notion of complexity in the different linguistic categories. He describes languages as well-organized and regulated systems, driven by forces of efficiency and stability. Above all, the author emphasizes the book’s synergetic focus, which integrates linguistic and extralinguistic variables. Notions such as language families, linguistic areas, diverse speaking populations, and number of speakers are therefore integral to the quantitative analysis, adding further layers to the conceptualization of language complexity.

In Chapter One, “Conceptos de Complejidad” (“Complexity Concepts”), Coloma focuses on definitions of complexity. He explains how the meaning of simple/complex languages varies according to whether relative or absolute terms, and local or global assumptions, are used. For example, when described in relative terms, Portuguese seems easy to Spanish speakers because of its proximity to their native language, but difficult to speakers of Chinese. To avoid such fluctuations and discrepancies, most studies adopt absolute terms and define a language’s complexity by its number of components, its grammatical rules, and other stable criteria. Additionally, a local perspective may yield more detailed and precise information, while a global perspective gives the definition more breadth and scope. Coloma also briefly describes the main language categories, which are fully developed in subsequent chapters, and characterizes the opposing theoretical approaches of formalists (Chomsky, 1965) and functionalists (Greenberg, 1966). Finally, he introduces the two sources that provide the data for his study: the World Atlas of Language Structures (hereafter WALS; Dryer and Haspelmath, 2013), and the short fable “El Viento Norte y el Sol” (“The North Wind and the Sun”) mentioned in Martínez et al. (2003).

The second chapter, “Medición de la Complejidad” (“Complexity Measurement”), describes the nature and limitations of two types of measures and explains the formulas and mathematical procedures used throughout the book. On the one hand, theoretical measures rest on general grammatical rules and typologies. They use deductive methods to compare universal paradigms with particular language systems, either across categories or within specific components. For example, scholars can estimate theoretically the phonological complexity of particular vowels by contrasting them with the paradigmatic sound system. On the other hand, empirical measures focus on particular texts and their socio-cultural contexts, and use inductive methods of analysis. For example, scholars can measure lexical complexity empirically by estimating the ratio between lexical and grammatical words and their occurrences in a specific text. Both types of measures have limitations. While general typologies cannot account for the number of components or the categorical classifications used in particular languages, empirical measures struggle to compare different text types, which vary dramatically in length, genre, and vocabulary. In the second part of the chapter Coloma explains two useful analytical tools. The “Kolmogorov complexity” metric (1963) measures the ratio between the size of a compressed text and the size of the original text. The “Menzerath Law” (1954) predicts that the size of a linguistic element is negatively correlated with the size of its components.
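
The compression ratio described above can be illustrated with a short sketch. It assumes Python's standard zlib compressor as a stand-in; the review does not say which compressor Coloma uses, and true Kolmogorov complexity is not computable, so any real compressor gives only an upper-bound approximation. The function name and both sample strings are hypothetical illustrations, not material taken from the book.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return compressed size / original size, both measured in UTF-8 bytes.

    Assumption: zlib (DEFLATE) serves as the compressor; this is a proxy for
    Kolmogorov complexity, not the quantity itself.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical sample strings: one highly repetitive, one ordinary prose.
repetitive = "ba " * 200
varied = (
    "The North Wind and the Sun were disputing which was the stronger, "
    "when a traveller came along wrapped in a warm cloak."
)

ratio_repetitive = compression_ratio(repetitive)
ratio_varied = compression_ratio(varied)

# A lower ratio means the text is more compressible, i.e. less complex
# under this measure.
print(f"repetitive: {ratio_repetitive:.3f}")
print(f"varied:     {ratio_varied:.3f}")
```

Because compressors carry a fixed overhead, ratios for very short texts are inflated; comparing texts of similar length, such as translations of the same fable, keeps the measure more meaningful.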

Each of the following four chapters focuses on one linguistic category. Chapter 3 describes “Complejidad fonológica” (“Phonological Complexity”). Coloma starts by characterizing vowels, consonants, and the general sound system. He then highlights the distinctive role of stress and tone, both of which operate at the sound level. Spanish has a stress system that brings about changes in meaning and function. For example, the different stress positions in “médico” (“doctor”), “medico” (“I medicate”), and “medicó” (“s/he medicated”) indicate the profession (a noun), the verb in the first person of the present tense, and the verb in the third person of the preterit, respectively. Likewise, tonal languages use distinctive tones to differentiate meanings of what is otherwise the same syllable. In Mandarin, for instance, “ma” means “mother” when pronounced with a high tone, “horse” with a low tone, “hemp plant” with a tone rising from low to high, and “to nag” with a tone falling from high to low. There seems to be a negative correlation between stress and tone (“Menzerath Law”): languages with more complex stress systems have simpler tones and, conversely, languages with rich tonal systems do not differentiate meanings by stress. The author completes the phonological repertoire with syllabic structure, showing a broad range of combinations of vowels (V) and consonants (C) across languages. For example, Spanish offers multiple possibilities for forming syllables, such as V, CV, VC, CCV, VCC, CCVC, CCVCC.

Chapter 4 examines “Complejidad Morfológica” (“Morphological Complexity”) at the word level. First Coloma describes synthetic words, which include several components (morphemes) that mark specific functions. For example, the Spanish word “gat-it-a-s” (“female kittens”) involves four morphemes with distinct functions: “gat-” (the meaning “feline”), “-it-” (diminutive), “-a” (feminine gender), and “-s” (plural number). Synthetic words contrast with analytic words, which are formed by a single morpheme that cannot be decomposed, such as the Spanish word “sol” (“sun”). The author also discusses words that carry several functions in a single morpheme, as is typical of Spanish conjugations. For example, the final vowel of “corr-o” (“I run”) is a fusion that condenses the marks for tense (present), person (first), and number (singular). Other morphological mechanisms include compounding, derivation, and inflection. The first takes place when independent lexical morphemes form a new word, such as “sordo-mudo” (“deaf-mute”) in Spanish. In other cases, grammatical morphemes bring a distinct function to the original word. For example, the Spanish suffix “-mente” and the English suffix “-ly,” added to the end of an adjective, transform it into an adverb, as in “rápida-mente” (“quick-ly”). Finally, nominal and verbal structures are highlighted because they contain a large number of markers (gender, number, and case for nouns; tense, aspect, mood, person, number, and voice for verbs), which entail higher levels of morphological complexity.

Chapter 5, “Complejidad Sintáctica” (“Syntactic Complexity”), moves to the sentence level, focusing on four main syntactic criteria. First, word order in a sentence may be flexible or fixed, according to the degree of freedom of its components. Coloma compares the order of the basic constituents (subject-verb-object) within and between languages, observing a freer and more variable word order in Spanish than in English. The second criterion is the distance between the nucleus and the modifiers of a given structure. The author explains that the farther the modifying components are from the core, the more syntactically complex the language is. A third criterion is the morphosyntactic alignment between constituents and their possible functions. For example, in the sentence “Mi mamá habla” (“My mom speaks”), the noun phrase is an intransitive subject and carries no additional marker. However, in the sentence “Hablo a mi mamá” (“I speak to my mom”), the same noun phrase represents the direct object of an accusative verb and is preceded by a preposition, “a” (“to”), that marks the transitive relationship. The last criterion distinguishes simple from complex structures, which establish either equal relations between coordinate clauses or unequal relations between main and subordinate clauses, the latter adding a hierarchical and more complex level of syntactic accessibility.

Chapter 6, “Complejidad Verbal y Léxica” (“Verbal and Lexical Complexity”), covers the two systems announced in the title. Within verbal complexity, Coloma elaborates on tense and aspect, although he also refers to mood (indicative, subjunctive, imperative) and voice (active/passive). Among the different tense markers, the author describes the final vowel used in Spanish, which fuses tense, person, and number, as mentioned in Chapter 4. For example, the “-o” in “cant-o” (“I sing”) signals the first person of the present tense, and the “-é” in “cant-é” (“I sang”) the first person of the preterit. English uses auxiliaries as tense markers, inserting them between the personal pronoun and the main verb. For example, “I will sing” indicates the future tense, and “I did not sing” the negative form of the preterit. Some languages also incorporate more complicated tense systems, like the Spanish pluperfect (“pluscuamperfecto”), which indicates a past event that takes place before another past event. Moreover, verbal complexity may be expressed by aspect, that is, the way in which the action is performed. Some languages combine tense and aspect, as in the Spanish distinction between perfect and imperfect in the preterit; others, like Mandarin, mark only aspect; and others, like English, differentiate only tenses.

For lexical complexity, Coloma differentiates between lexical and grammatical words. The former are independent and carry distinctive and stable meanings, such as nouns, adjectives, and adverbs. The latter are functional and have context-dependent meanings, such as pronouns, conjunctions, and prepositions. The author emphasizes vocabulary as a clear instrument for measuring lexical complexity. Typically, words are counted by the number of entries in the dictionary, and sometimes by the ratio between the number of words and the length of a text. However, the author claims that numbers alone are not sufficient to measure lexical complexity. He draws examples from the WALS data and shows a wide range of linguistic possibilities, which is reflected in the numerical differences among languages’ repertoires. Systems may repeat the same word to convey many different meanings, or have anything from no articles at all to a broad system of articles (definite/indefinite). They may offer large inventories for familiar categories, such as colors and family members, or refine the list of personal pronouns according to more familiar/approachable or more respectful/distant forms of address. Coloma then concludes that, as necessary as computational measures are, other variables also account for lexical complexity, such as phylogenetic, sociocognitive, and cultural factors.
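
Two simple text-based ratios of the kind mentioned above can be sketched as follows: a type-token ratio (distinct words over total words) and the share of lexical words among all words. This is only an illustrative sketch. The tokenizer, the function names, and the very short list of Spanish grammatical words are assumptions made for the example, not Coloma's procedure or a complete inventory.

```python
import re

# Hypothetical, deliberately tiny list of Spanish grammatical (function) words.
# A real study would need a full, linguistically motivated inventory.
FUNCTION_WORDS = {"el", "la", "los", "las", "un", "una", "y", "o", "de", "a", "en", "que", "se"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(text: str) -> float:
    """Number of distinct word forms divided by total number of words."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    return len(set(tokens)) / len(tokens)


def lexical_share(text: str) -> float:
    """Share of tokens that are NOT in the assumed function-word list."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no words")
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)


sample = "el viento y el sol"  # five tokens, four distinct forms
ttr = type_token_ratio(sample)
share = lexical_share(sample)
print(ttr, share)
```

Both ratios depend heavily on text length and genre, which is one reason the review notes that numbers alone do not settle questions of lexical complexity.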

Chapter 7, “Relación entre Medidas de Complejidad” (“Relationship Between Measures of Complexity”), revisits statistical measures previously presented in isolation or in binary opposition, and presents them in dynamic interaction. Coloma shows through step-by-step procedures how to transform categorical variables into numerical values, calculate correlations, and understand multiple regressions. He explains the statistical notions by drawing on linguistic scenarios. Among other cases, the author describes a positive correlation between analytic words and neutral alignment (morphosyntactic category). He also shows that the simple correlation between the tonal system and the number of phonemes (phonological category) can be better explained when a third variable (e.g. syllabic structure) is taken into account through a partial correlation. He clearly demonstrates how languages with few tones and a low number of phonemes are negatively correlated with complex syllabic structure and, conversely, how languages with many tones and phonemes are negatively correlated with simple syllabic structure. Finally, the author highlights the fact that combining measures simultaneously would make it easier to find language tendencies that better explain the complexity construct.
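
The three steps described in this chapter (coding categories as numbers, computing a simple correlation, and computing a partial correlation that controls for a third variable) can be sketched as below. The ordinal codings and the six data rows are invented purely for illustration and are labeled as such in the code; they are not WALS values or results from the book. The partial correlation uses the standard first-order formula based on the three pairwise Pearson coefficients.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z (first-order)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Assumed ordinal codings for categorical variables (hypothetical).
TONE_CODE = {"none": 0, "simple": 1, "complex": 2}
SYLLABLE_CODE = {"simple": 1, "moderate": 2, "complex": 3}

# Invented toy data for six imaginary languages: NOT real measurements.
tone_labels = ["none", "none", "simple", "simple", "complex", "complex"]
syllable_labels = ["complex", "complex", "moderate", "complex", "simple", "moderate"]
phonemes = [22, 30, 25, 34, 28, 40]

tone = [TONE_CODE[label] for label in tone_labels]
syllable = [SYLLABLE_CODE[label] for label in syllable_labels]

simple_r = pearson(tone, phonemes)
partial_r = partial_corr(tone, phonemes, syllable)
print(f"simple correlation (tone, phonemes):       {simple_r:.3f}")
print(f"partial correlation, controlling syllable: {partial_r:.3f}")
```

Comparing the two printed values shows how much of the apparent relationship between two variables changes once the third is held constant, which is the reasoning the chapter applies to tone, phoneme inventory, and syllable structure.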

In the last chapter, “Conclusiones y Comentarios” (“Conclusions and Comments”), Coloma summarizes theoretical notions, such as similarities and differences among languages and within categories, and the main principles that guide the integration of linguistic and extralinguistic factors. The author also reviews the relevant tools, formulas, and procedures used in the analysis. He emphasizes the trade-off effect, which helps interpret negative correlations correctly, and the equal complexity hypothesis, which brings into consideration socio-cultural variables such as the number of speakers, the age of a language, and the phenomenon of languages in contact. Coloma also refers to the origin of language complexities and their development over the years, and comments on the persistent dichotomy between the innate and the acquired nature of language. Finally, he underlines that the dynamic synergy between statistical analyses and extralinguistic interactions would contribute to a more sophisticated understanding of language complexity.


“La Complejidad de los Idiomas” (“The Complexity of Languages”) addresses complex linguistic and statistical notions in a clear and simple style. As stated in the prologue, Coloma attempts to reach both novice and experienced readers interested in examining linguistic aspects quantitatively. The author gives straightforward explanations of mathematical concepts and procedures throughout the chapters, and includes examples that link the statistical analysis to language categories in order to facilitate readers’ comprehension. Being an economist, Coloma situates his book in an interdisciplinary space, offering a reading that intersects the fields of linguistics and mathematics. The book can also be used as a reference text, especially in courses in applied linguistics, second language acquisition, and reading. Moreover, since the book is written in Spanish, it can be extremely helpful in bilingual teacher preparation programs, where there is a great need for academic materials written in Spanish. Faculty, students, and researchers will benefit from Coloma’s complete summaries and well-organized ideas.

In spite of these valuable strengths, the book also has some limitations. First, the tables and figures presented throughout the chapters are neither titled nor numbered, and they do not appear in the table of contents. This absence poses a considerable challenge for readers who want to locate and follow the unlabeled tables and figures across the book. Among the visual representations, it is worth mentioning the contingency tables, such as those showing the negative correlation between tone and accent (p. 78) and between verb tenses and aspects (p. 153). Likewise, Coloma inserts various figures, for example to illustrate linguistic families (p. 43) and to compare the vowel systems of Spanish and English (p. 59). Another limitation is that the computational concepts and statistical procedures presented are quite advanced for anyone but an audience already familiar with mathematical analyses. The average reader would need more guidance, not only to refresh mathematical notions and practical mechanisms, but also to gain a deeper understanding of the reasoning behind the concepts and to be able to apply them in the linguistic field.


Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge: MIT Press.

Dryer, M. and M. Haspelmath (2013). The World Atlas of Language Structures Online.

Greenberg, J. (1966). Language universals. The Hague: Mouton.

Kolmogorov, A. (1963). On Tables of Random Numbers. Sankhya 25: 369-376.

Martínez Celdrán, E., et al. (2003). Illustrations of the IPA: Castilian Spanish. Journal of the International Phonetic Association 33: 255-260.

Menzerath, P. (1954). Die Architektonik des deutschen Wortschatzes. Bonn: Dümmler.


Laura Dubcovsky is a retired lecturer and supervisor from the Teacher Education Program in the School of Education at the University of California, Davis. With a Master’s in Education and a PhD in Spanish linguistics with special emphasis on second language acquisition, her interests center on language and bilingual education. She is currently dedicated to the preparation of in-service bilingual Spanish/English teachers, especially on the use of Spanish for educational purposes. She also volunteers as an interpreter in parent-teacher conferences at schools and translates programs and flyers for the Crocker Art Museum, bilingual school programs, and STEAC. She also collaborates as a reviewer with the LINGUIST List and bilingual associations. For more than ten years she has taught a course for pre-service bilingual teachers that addresses the communicative and academic traits of Spanish needed in a bilingual classroom. She published “Functions of the verb decir (‘to say’) in the incipient academic Spanish writing of bilingual children” in Functions of Language, 15(2), 257-280 (2008) and the chapter “Desde California. Acerca de la narración en ámbitos bilingües” in ¿Cómo aprendemos y cómo enseñamos la narración oral? (2015). Rosario: Homo Sapiens, 127-133.

Page Updated: 09-Nov-2017