Editor for this issue: Joel Jenkins <joellinguistlist.org>
Book announced at https://linguistlist.org/issues/35-897
Title: The Oxford Handbook of Word Classes
Publication Year: 2024
Publisher: Oxford University Press
http://www.oup.com/us
Book URL: https://global.oup.com/academic/product/the-oxford-handbook-of-word-classes-9780198852889?utm_source=linguistlist&utm_medium=listserv&utm_campaign=linguistics
Editor(s): Eva van Lier
Reviewer: Michael B. Maxwell
PRELIMINARIES
I need to begin my review by saying something about how I am reviewing this handbook. Most handbooks have several characteristics that set them apart from other books, including other books that are reviewed here at Linguist List:
1) Handbooks tend to be quite large. The Oxford Handbook of Word Classes reviewed here weighs in at 45 chapters and more than a thousand pages. This precludes my giving the usual chapter-by-chapter summary, since that would take more space than the review is allowed.
2) The sheer number of chapters also precludes my evaluating each article individually. Hence my review will touch on only a selection of the chapters, as well as some overall comments on the book.
3) Handbooks present a wide range of perspectives on issues, and this volume is no exception: there are nine chapters on approaches to word classes from very different theories, for example. Those chapters exist to give an overview of each theory's perspective, and I therefore do not feel it is my task as a reviewer to disagree (or agree) with those authors; that would instead be for someone reviewing a book on the particular theory.
I now turn to the review itself.
SUMMARY
This handbook can be purchased as a hardbound book, or as an ebook (Amazon lists it as a Kindle book, but I have not accessed it in that way). Individual chapters can also be purchased in electronic form (see also comments below).
The introductory chapter by the handbook's editor presents a succinct synopsis of each of the other chapters. The publisher has chosen not to make this chapter freely accessible; more on this decision in the evaluation section of this review.
The remaining chapters fall under five numbered Parts (I'll capitalize this use of the word). As stated, I will not describe each of the remaining chapters, but the five Parts are as follows:
I. "Fundamental Issues": The chapters in this Part respond to questions like whether it even makes sense to compare grammatical categories across languages (Martin Haspelmath argues that it does not---but see below---and William A. Foley and William Croft express similar views in their chapters); whether categories apply to roots, stems and/or words (Bisang says this varies by language); lexical vs. grammatical words (Kasper Boye uses this contrast in a somewhat analogous way to the distinction between content vs. functional words, or between open class vs. closed class words); and processes such as derivational morphology that change categories. While these issues are not couched in terms of particular theories, they are relevant to the theory-based chapters in the following part. For example, if grammatical categories cannot be equated across languages, then the notion of innate categories assumed under some of those theories probably doesn't make sense.
II. "Theoretical Approaches" contains chapters describing the approach to grammatical categories under a number of current theories. The authors of some of these chapters assume innate categories (or category-like features) drawn from Minimalism and Head-Driven Phrase Structure Grammar (HPSG)---although the theories themselves could probably be re-cast with non-innate features or categories. Other theories described here view grammatical categories as emergent and/or gradient (some words being more noun-like than others within a given language, for example---there is an entire chapter devoted to this issue). Gradient categories are presumably a poor fit with theories that postulate innate categories, so those theories generally opt for allowing lexemes to belong to two or more categories (or for lexemes to have two or more alternative feature sets).
III. "Specific Word Classes": Chapters in this Part deal with such traditional categories as noun, verb, adjective and adverb, but there are also chapters on ideophones and interjections.
IV. "Word Classes in Genetic and Areal Language Groups": As the name suggests, the chapters here discuss categories in individual languages, or---more frequently---groups of languages. The selection of groups and languages is broad, including not only the usual suspects (like Indo-European, the chapter on which, unlike the other chapters, looks at historical changes in categories), but also Austronesian and several language families of the Americas. (The Americas may even be over-represented, particularly in comparison to languages of Africa.) There is even a chapter "Word Classes in Sign Languages", by Vadim Kimmelman and Carl Börstell.
V. "Word Classes in Linguistic Subdisciplines": This Part contains chapters on a variety of topics related to sub-disciplines of linguistics, ranging from first and second language acquisition to computational linguistics, and including psycholinguistics, neurolinguistics, grammaticalization, and language contact (particularly borrowing).
The bibliography is at the end of the book (rather than having separate bibliographies at the end of each chapter); it is freely accessible on-line. Thus, if you choose to purchase an individual chapter or two in electronic form, you will have access to the references.
I will add here a few non-evaluative comments on points made throughout this handbook.
As mentioned, Haspelmath argues that it is not possible to directly compare grammatical categories across languages, saying "So if word classes are defined in a language-particular way, with reference to different constructions in different languages..., then there is no way to match classes across languages." One might think that if this argument is correct, it would not make sense to have a chapter in Part III on verbs or nouns, for example, since there is allegedly no such thing as a universal category of verbs or nouns. Several authors in that Part address this conundrum; Alexander Letuchiy, the author of the verbs chapter, meets the objection head-on, listing "several properties that are typical for verbs."
But in fact what Haspelmath's right hand takes away, his left hand gives back, since a few pages later he provides semantic criteria for noun roots, verb roots and adjective roots: he says they are object-denoting, action-denoting, and property-denoting roots, respectively. In other words, while definitions of grammatical categories in terms of their grammatical properties cannot be used across languages, semantic-like definitions can, at least for some properties and their corresponding categories. It's not clear whether adpositions can be identified across languages by their semantic properties; the Mayan languages chapter provides insight on this, since Mayan languages typically have only one or two adpositions, with further distinctions made by relational nouns; the English "in the house" is for example paraphrasable in Tzeltal (ISO 639-3 tzh) as "<preposition> the house's interior".
This issue of comparability of word classes surfaces over and over (except of course for those Part II chapters on theories which presuppose innate categories). For instance, in Part IV, whose chapters describe word categories in different languages or language groups, many authors point out the issue, and nearly everyone discusses their language-particular grammatical criteria for membership in each class.
The related issue of variability of behavior (some verbs, for example, may be more verb-like than others within a particular language) is also frequently discussed. Occasionally the issue discussed in Bisang's "Levels of Analysis" (e.g. whether words in isolation can be said to belong to a particular category, or whether it is only words in a morphological and/or syntactic context that have a definite category) appears---in Ulrike Mosel's chapter "Word classes in Austronesian languages", for example.
Adjectives are (as usual) defined as words that can modify nouns (or noun phrases), while adverbs are words that can modify anything else. But it is unclear why adjectives are singled out from all modifiers; why are words that modify just verbs, say, not as distinctive a class as words that modify just nouns? I did not see this come up as a discussion point anywhere.
EVALUATION
As mentioned above, there are 45 chapters in this handbook, and the space available to Linguist List reviews precludes my giving an evaluation or even an overview of each chapter. However, the book's table of contents listing each chapter is available at https://academic.oup.com/edited-volume/55353. Even without logging in to Oxford University Press, you can click on each chapter and read its abstract; or you can purchase access to individual chapters.
The editor, Eva van Lier, summarizes each chapter in her introduction. Although this chapter is not freely accessible on the OUP website, it---as well as much of the first 130-some pages, with one omitted page out of every five or ten---can currently (as of November 2024) be read on Amazon as part of the "Sample". I would encourage the publisher to make this introductory chapter freely available on their website, as I believe it might increase interest in obtaining the entire book. (The table of contents and the bibliography are also freely available on the publisher's website.)
Moving to an evaluation of this handbook: it would appear that at least some authors were given the opportunity to review other chapters before finalizing theirs. There are therefore significant cross-references among many of the chapters---for example, between the theory chapters and the chapters about language families. That this was accomplished while still bringing the book to print in a reasonable amount of time is a testimony to the editor.
Nevertheless, some authors fail to reference other relevant chapters. For example, Sabine Stoll's chapter "Word Classes in First Language Acquisition" states that the Mayan languages "have only a small number of verbs and rely on light verbs". Although Valentina Vapnarsky's chapter on the Mayan language family mentions light verbs a couple of times, it does so only with respect to a subset of the branches of Mayan, and nowhere does she state that even the languages of those branches _rely_ on light verbs, much less that any Mayan languages have few verbs. I looked at a few dictionaries of Mayan languages for further evidence. The Tseltal-Spanish multidialectal dictionary (Polian 2020) lists 8109 entries, of which about 30% are verbs; I suspect a similar ratio of verbs to non-verbs would be found for many languages. Dictionaries of Mam [mam] and Kʼicheʼ [quc] (of the eastern branch of Mayan) appear to have similar proportions of verbs, although these dictionaries were in PDF format, so that it is difficult to make quantifiable judgments.
Stoll further states that the Strait [sic, Straits] Salish language only distinguishes "between contentive morphemes and functional morphemes" (p.870--871). The chapter on Salishan languages mentions that some authors have claimed this for some Salishan languages (Straits Salish is not specifically mentioned), but argues at length for at least a distinction between nouns and verbs, and less strenuously for finer grained category distinctions in the languages of this family as well. (Note: Straits Salish, or Coast Salish, is a group of languages, including North Straits Salish, ISO code str.)
Similarly, David Beck's chapter on adjectives refers to Haspelmath's (2012) argument that grammatical categories are not comparable across languages. But the same points are made in Haspelmath's chapter in this handbook, which might have been usefully referenced. William Foley's section 6.6 discusses categories in Northern Iroquois languages, and while it does reference the chapter on Iroquoian languages, it goes over much the same material as Walter Bisang's section 3.5.2 without referring to that discussion (Bisang's section on Iroquois mentions the discussion in Foley's chapter).
The book includes an index of languages discussed, but some languages seem to have slipped by. Kharia (a Munda language of India, ISO khr) is mentioned in several articles, but is missing from the index (I checked several possible spellings). The heading of the index lists a number of reasons some languages are not indexed (e.g. if they only appear in footnotes), but these reasons do not appear to apply to Kharia, for which data is shown in at least one place (example (19) p.59).
Language groups (which in some cases may be more familiar to readers, like Southern Wakashan of the Pacific Northwest) are not indexed. The e-version of the book (which Oxford University Press kindly granted me access to) makes it easy to look up both languages (including Kharia) and language groups. However, the spelling of language names is not consistent in either the e-book or the print book; the Mayan language Tzeltal (ISO 639-3 tzh) is spelled that way twice in the text, and as Tseltal 11 times. The index includes both spellings as the single index entry 'Tzeltal/Tseltal', alphabetized with other languages starting with 'Tz', while the related language Tzotzil (tzo) is indexed as 'Tsotsil/Tzotzil' and alphabetized among languages starting with 'Ts'. The language Tz'utujil is spelled with and without the glottal (apostrophe) in the Mayan chapter; these are simply alternative spellings for a single Mayan language (tzj), but are indexed as if they were two distinct languages. Why am I so concerned about the spelling of language names? Inconsistent spelling of language names is exactly one of the issues ISO 639-3 codes were created to solve nearly twenty years ago, and they have been widely adopted elsewhere; but they are absent from this book, making it harder than it needs to be to find information about a particular language.
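To illustrate how ISO 639-3 codes sidestep the spelling problem, here is a minimal Python sketch of an index keyed by code rather than by name. It is only an illustration under stated assumptions: the function names (normalize, iso_code, build_index) and the variant table are hypothetical, and the table contains just the spellings and codes cited in this review. The page numbers in the usage note are invented for the example, apart from page 59 for Kharia.

```python
import unicodedata

# Hypothetical variant table: only the spellings and ISO 639-3 codes
# mentioned in this review. A real index would draw on the full code set.
VARIANT_TO_ISO = {
    "tzeltal": "tzh",
    "tseltal": "tzh",
    "tzotzil": "tzo",
    "tsotsil": "tzo",
    "tz'utujil": "tzj",
    "tzutujil": "tzj",
    "kharia": "khr",
}


def normalize(name):
    """Casefold a language name and unify apostrophe-like characters."""
    name = unicodedata.normalize("NFC", name).casefold().strip()
    for mark in ("\u2019", "\u2018", "\u02bc"):
        name = name.replace(mark, "'")
    return name


def iso_code(name):
    """Return the ISO 639-3 code for a spelling, or None if it is unknown."""
    return VARIANT_TO_ISO.get(normalize(name))


def build_index(mentions):
    """Group (language name, page) pairs under one key per language.

    Known spellings are keyed by ISO code; unknown names are kept under
    a '?'-prefixed key so that they are flagged rather than silently lost.
    """
    index = {}
    for name, page in mentions:
        code = iso_code(name)
        key = code if code is not None else "?" + normalize(name)
        index.setdefault(key, set()).add(page)
    return {key: sorted(pages) for key, pages in index.items()}
```

With this sketch, build_index([("Tzeltal", 10), ("Tseltal", 12), ("Kharia", 59)]) puts both spellings of Tzeltal under the single key "tzh", so the two variants can no longer be alphabetized or counted as separate languages.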
Occasionally cross references are broken. There is a reference on page 945 to a non-existent section 45.4.4.1, and a reference in section 9.4 (page 191) to "section 9.4", apparently meaning section 9.3.2.2. Cross references between the text and example sentences are also sometimes broken. The text on page 785 refers to example numbers 12 and 13, which do not exist; the references are apparently to examples 14 and 15. Similar mismatches appear on pages 800--803. There are occasional differences between citations in the text and the bibliography. For example, Françoise Rose's article "Word Classes in Maweti-Guarani Languages" cites "Hengeveld (1992)", while the bibliography lists two publications by that author for 1992, labeled 1992a and 1992b. In the e-version of the book, some cross-references are hyperlinks, others are plain text; broken references in the print version are also broken in the e-version. I did not conduct an intensive search for discrepancies such as these, but they seem fortunately to be rare.
Most tables and figures are readable (however, figure 20.3 is missing the labels on the x-axis, making it impossible to interpret). One minor disadvantage of the e-version is that some figures appear much larger than the text, and depending on your browser window you may need to scroll to see the entire image. This was especially apparent for photos of signers in the sign language chapter. On the other hand some e-version photos and charts are in color, unlike the print version---and occasionally this makes a difference, as in the reference in the caption of figure 44.2 to the "red plot", which plot is unfortunately a shade of gray in the print version (the e-version plot is in color, and far easier to read not only because of that, but also because of its size).
Apart from the above relatively minor issues, there are a few decisions that in my opinion might have made this handbook better. First, the decision to devote Part IV to "Word Classes in Genetic and Areal Language Groups" led to some of the chapters in this section covering so many languages within a group that few generalizations can be drawn. The chapter on Australian languages covers perhaps 25 language families, and while there may be similarities among unrelated languages due to areal influence, the diversity is huge. Even the Mayan chapter left me with the feeling that there were very few generalizations among these related languages, apart from head-marked ergativity and the limitation to a very few adpositions in each language. To what extent this apparent lack of generalizations is due to inadequate or inconsistent descriptions of individual languages is not clear. It might have been better to describe word classes as exemplified by a single well-studied Mayan language, with perhaps the occasional comment that another Mayan language behaves differently.
The use of language groups rather than individual languages also means that details needed to understand generalizations are often omitted in favor of discussing differences among languages of the group. This is not totally unexpected, since handbooks in some sense serve as a guide to the larger literature, rather than providing detailed argumentation about certain issues; but handbooks are also not just annotated bibliographies. In sum, I feel a bit more depth (on individual languages) and less breadth (on language families and especially areal groupings) would have been helpful.
Another issue in the chapters on language groups is that while most example sentences are tagged for the individual language, some are not; since the chapters discuss multiple languages, tagging all examples for the individual language would have been helpful.
Secondly, the use of extinct languages in some chapters of Part IV makes it difficult to know how strong the generalizations are. There are really only a few chapters where this is an issue; the chapter on Egyptian, Semitic and Cushitic languages (where Ancient Egyptian [egy] is extinct, as are some of the ancient Semitic languages described) is one. But the problem is particularly noticeable in the chapter on Classical Chinese. (The ISO code ozh for "Old Chinese" includes Classical Chinese, as well as older varieties; ISO code lzh "Literary Chinese" includes both Classical Chinese and later Chinese written in a similar style to Classical Chinese.) I assume that this language was chosen because it appears to be at an extreme in terms of word class distinctions---the contention is that many, perhaps most, words were "flexible", with their actual categories determined by syntactic context and semantics, supplemented by "pragmatic implicatures (based on stereotypes), metonymy, metaphor and general aspects of world knowledge" (p.611). But as an outside observer (I know next to nothing about modern Chinese, much less the classical language), I was wondering how certain these observations were---and whether some of the claims might be based on our incomplete knowledge of a language spoken thousands of years ago, written in a non-alphabetic script where morphology can only be reconstructed, and with the corpus possibly being poetic or at least stylized, and where present day scholars reportedly disagree to some extent. An expert on classical Chinese might scoff at my skepticism, but that misses the point: the chapter is not written for experts on this language, but for linguists interested in word classes in other languages. To use a language that only specialized scholars can make much sense of means that non-experts have a difficult time evaluating the arguments. 
Furthermore, if it is true that classical Chinese in some sense lacked syntactic (non-semantic) word classes, then there must be other languages---modern languages---that behave the same way. (Modern Chinese is apparently not such a language.) In my view, a discussion of word classes in one of those modern languages would have been more convincing.
Most of the authors take a broad view, presenting both sides of arguments even though the authors clearly have a preference for one side; the chapters on specific theories are of course understandable exceptions to this. But a few authors of ostensibly theory-agnostic chapters clearly come down on one side without presenting much in the way of conflicting evidence. Stoll's chapter on "Word Classes in First Language Acquisition", for instance, is clearly dismissive of linguists who advocate for innate categories. She may be right, but it would have been helpful to hear more about language acquisition from their point of view.
Finally, it is unavoidable that a handbook will be slightly out of date by the time it reaches print, and in this case this is most noticeable with respect to the discussion of computational linguistics. There is a chapter entitled "Word Classes in Computational Linguistics and AI" by Meladel Mistica, Ekaterina Vylomova and Francis Bond, and another entitled "Word Classes in Corpus Linguistics", by Natalia Levshina. Both are devoted in part to difficulties in establishing cross-language tag sets (word classes) so that tokens (mostly words) in text corpora can be tagged by humans to support machine learning. If there was anything the previous chapters established, it was that it would be difficult to come up with universal word classes, so this fact about tag sets hardly comes as a surprise.
A more interesting question might have been whether it is possible for the machine to come up with the "right" word classes in particular languages on its own. These chapters were of course written before Large Language Models (LLMs) came to the fore, and the ability of LLMs to construct coherent sentences might be telling us something interesting about word categories. But there was already a considerable literature going back to the early 1990s showing that unsupervised machine learning of grammatical categories (clustering of words in unannotated texts into categories) can be done (see e.g. Clark and Lappin (2010), particularly section 3.1, and Muralidaran, Spasić and Knight (2021) for summaries of some approaches that preceded LLMs). These programs could serve as proofs of concept that humans could infer grammatical categories without innate knowledge---and to the extent that the internal workings of these programs can be understood (it is notoriously hard to examine the internals of LLMs), it might even show how humans learn these things. Or if on the contrary it turns out that unsupervised machine learning cannot reliably infer grammatical categories under reasonable constraints (e.g. without access to quantities of data beyond what children might be expected to hear), this might be an argument for innate categories. The question is briefly discussed in Levshina's chapter (section 38.2.3) and in Mistica et al.'s chapter (section 45.4.3); but if there is some day a new edition of this handbook, I would expect this topic to be much more prominent, both in the computational chapters and in chapters devoted to theory.
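For readers unfamiliar with what "clustering of words in unannotated texts into categories" involves, the following minimal Python sketch may help. It is a toy illustration only, not a reimplementation of any system surveyed in the works cited above: it assumes a tiny invented corpus, represents each word by counts of its immediate left and right neighbors, and links words whose context vectors have a cosine similarity of at least 0.5 (simple single-link grouping). The corpus, the threshold and the function names are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt

# Invented toy corpus; no word-class labels are supplied anywhere.
CORPUS = [
    "the dog sees the cat",
    "the cat sees the dog",
    "the dog chases the bird",
    "the bird chases the cat",
    "a cat sees a bird",
    "a dog chases a cat",
]


def context_vectors(sentences):
    """Count each word's immediate left and right neighbors."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vectors[tokens[i]][("L", tokens[i - 1])] += 1
            vectors[tokens[i]][("R", tokens[i + 1])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[feature] for feature, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(sentences, threshold=0.5):
    """Single-link grouping: join words whose contexts are similar enough."""
    vectors = context_vectors(sentences)
    words = sorted(vectors)
    parent = {word: word for word in words}

    def find(word):
        # Follow parent links to the representative of the word's group.
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for i, first in enumerate(words):
        for second in words[i + 1:]:
            if cosine(vectors[first], vectors[second]) >= threshold:
                parent[find(first)] = find(second)

    groups = defaultdict(set)
    for word in words:
        groups[find(word)].add(word)
    return sorted(sorted(group) for group in groups.values())
```

On this corpus, cluster(CORPUS) returns [['a', 'the'], ['bird', 'cat', 'dog'], ['chases', 'sees']], i.e. groups that a linguist would label determiner, noun and verb, although the program has no such labels. Real texts are of course far noisier than this toy corpus, which is precisely why the question of how much data such methods need is of interest.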
Another question which is not addressed (at least not in any depth) is morphosyntactic features, such as tense, aspect, number, case and gender---and perhaps even person. This absence is perfectly understandable---the book might have wound up at twice its current length! But if nouns and verbs are not directly comparable across language groups, the features they carry are presumably even less so. Perhaps another book will address those questions.
Finally, I mentioned that the Part on language groups includes a welcome chapter devoted to sign languages. Sign languages are however ignored in nearly all the other chapters, despite the possibility that the very different modality of signing might have a significant effect on word categories---or not. Either result will surely throw interesting light on the question of the universality and innateness of categories. I would hope that future researchers on grammatical categories will rectify this omission.
In summary, this book---or rather, some portion of it!---should be required reading for anyone working on language documentation, syntax, morphology or lexicography. In my opinion, this recommendation should hold particularly for those linguists working within theories that assume innate grammatical categories, who might be persuaded to reconsider that decision, or at least consider why there is so much variation in categories across languages. Those working in other subfields of linguistics may find much to think about here as well.
REFERENCES
Clark, Alexander, and Shalom Lappin. 2010. "Unsupervised Learning and Grammar Induction." Pp. 197--220 in Clark, Alexander; Fox, Chris; and Shalom Lappin (eds.) The Handbook of Computational Linguistics and Natural Language Processing. Hoboken, New Jersey: Wiley-Blackwell. Available online at https://alexc17.github.io/static/pdfs/HANDBOOK2010CHAPTER.pdf. Accessed 2024-11-14.
Haspelmath, Martin. 2012. "How to compare major word-classes across the world's languages." Pp. 109--130 in Thomas Graf et al. (eds.) Theories of Everything: In Honor of Edward Keenan. Los Angeles: UCLA Working Papers in Linguistics 17. Available online at https://phonetics.linguistics.ucla.edu/wpl/issues/wpl17/papers/16_haspelmath.pdf. Accessed 2024-11-14.
Muralidaran, Vigneshwaran; Spasić, Irena; and Dawn Knight. 2021. "A systematic review of unsupervised approaches to grammar induction." Natural Language Engineering 27(6): 647--689. https://doi.org/10.1017/S1351324920000327. Accessed 2024-11-14.
Polian, Gilles. 2020. Tseltal-Spanish multidialectal dictionary. Dictionaria 10. 1-8109. DOI: https://doi.org/10.5281/zenodo.5526550. Accessed on 2024-11-11.
ABOUT THE REVIEWER
Dr. Maxwell is a retired researcher in computational morphology and other computational resources for low-density languages, formerly at the Center for Advanced Study of Language (later the Applied Research Laboratory for Intelligence and Security) at the University of Maryland. Before that he did research at the Linguistic Data Consortium at the University of Pennsylvania, and studied endangered languages of Ecuador and Colombia with the Summer Institute of Linguistics.
Page Updated: 21-Jan-2025