Date: Wed, 8 Oct 2003 15:13:55 +0100 From: Azra Nahid Ali Subject: Probabilistic Linguistics

Bod, Rens, Jennifer Hay and Stefanie Jannedy (2003) Probabilistic Linguistics, MIT Press, A Bradford book.

Azra N. Ali, School of Computing and Engineering, University of Huddersfield, England.

OVERVIEW The study of language has largely been categorical; however, we are now in an era where we cannot ignore that language shows probabilistic properties, and the book deals with this clearly. The book is all about probabilistic linguistics, and each chapter covers probabilistic modeling from a different theoretical linguistic view, from sociolinguistics to phonology. The book begins with an introductory chapter about probabilistic linguistics and a chapter on probability theory before delving deep into probabilistic linguistics. Each theoretical chapter is written by a specialist in the field. The book also has a well-documented glossary, and if you didn't know what 'hypothesis' meant, well, you do now.

Chapter 1: Introduction (by Rens Bod, Jennifer Hay, and Stefanie Jannedy) The chapter provides an overview of probabilistic linguistics and how probability plays a role in linguistics, showing with examples that not all of linguistics is categorical; in fact, it is gradient and shows probabilistic properties. You can quickly grasp how frequency and probability fit together in linguistic theories and approaches before even reading any further.

Chapter 2: Elementary Probability Theory (by Rens Bod) This is an introductory chapter on probability theory. The chapter starts off with simple linguistic examples (general probability calculations, joint and conditional probability) that all linguistic readers should be able to understand, before moving on to more complex examples, namely probabilistic grammars (probabilistic context-free grammars and data-oriented parsing models), which are still easy to follow.
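To give a flavour of the kind of calculation the chapter covers, here is a minimal Python sketch of joint and conditional probability estimated from counts, followed by the probability of a derivation in a toy probabilistic context-free grammar. The corpus, the tag names, the grammar rules and their probabilities are all invented for illustration and are not taken from the book; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy corpus of (word, tag) pairs; invented for illustration.
corpus = [
    ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
    ("the", "DET"), ("bark", "NOUN"), ("dogs", "NOUN"),
    ("bark", "VERB"), ("the", "DET"),
]

pair_counts = Counter(corpus)
word_counts = Counter(word for word, _ in corpus)
total = len(corpus)


def joint(word, tag):
    """Relative-frequency estimate of the joint probability P(word, tag)."""
    return Fraction(pair_counts[(word, tag)], total)


def conditional(tag, word):
    """P(tag | word) = P(word, tag) / P(word)."""
    return joint(word, tag) / Fraction(word_counts[word], total)


# Hypothetical toy PCFG: each rule (left-hand side, right-hand side) has a
# probability, and the probabilities for one left-hand side sum to 1.
pcfg = {
    ("S", ("NP", "VP")): Fraction(1),
    ("NP", ("DET", "NOUN")): Fraction(3, 4),
    ("NP", ("NOUN",)): Fraction(1, 4),
    ("VP", ("VERB",)): Fraction(1),
}


def derivation_probability(rules):
    """Probability of a derivation: the product of its rule probabilities."""
    probability = Fraction(1)
    for rule in rules:
        probability *= pcfg[rule]
    return probability


p_joint = joint("bark", "NOUN")
p_cond = conditional("NOUN", "bark")
p_tree = derivation_probability([
    ("S", ("NP", "VP")),
    ("NP", ("DET", "NOUN")),
    ("VP", ("VERB",)),
])
print(p_joint, p_cond, p_tree)
```

In this toy setting, 'bark' occurs twice, once as a noun, so the conditional probability of the noun reading given the word is one half, while the derivation probability is simply the product of the three rule probabilities used.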

Chapter 3 - Probabilistic Modeling in Psycholinguistics (by Dan Jurafsky) While we may fail to see probabilistic properties in linguistics, Jurafsky clearly highlights where they can be found and at the same time provides good support from the literature.

Jurafsky introduces the chapter by talking about frequency, showing that cognitive processing time is considerably shorter for high frequency words than for low frequency words. He explains that high frequency words have a shorter duration and that the final coda of a word is often unstable, with deletion of /d/ and /t/ being apparent. He then moves on to neighbouring words in a sentence, where probability is an important aspect of speech comprehension and production, followed by a different form of frequency - 'Syntactic subcategorization Frequencies' of verbs. In this section, conditional probability is discussed at some length.

The latter half of the chapter discusses 'Probabilistic Architectures and Model'. The section details different types of probabilistic models for sentence processing, for example, constraint-based models, the competition model, Markov models, stochastic context-free grammars, and Bayesian belief networks. Each model is described in detail with examples, and the weaknesses of the models are also highlighted.

Chapter 4 - Probabilistic Sociolinguistics (by Norma Mendoza-Denton, Jennifer Hay, and Stefanie Jannedy) The chapter provides a good introduction to sociolinguistic variation and points out that existing statistical techniques are poorly suited to analysing sociolinguistic data. Traditional statistical methods cannot be used by the sociolinguistics researcher because techniques like Analysis of Variance (ANOVA) require controlled data. The chapter discusses the need for more advanced multivariate probabilistic methods and shows how such techniques can be used to analyse sociolinguistic variation data.

The probabilistic techniques that are discussed are related to one particular case of language variation - the monophthongization of /ay/, which is apparent in African-American speakers in the southern states of the U.S. Data are analysed first using the traditional frequency approaches, then with the VARBRUL program and Classification and Regression Trees (CART).

The VARBRUL program is a form of logistic regression model, and the authors detail the framework of VARBRUL and discuss how this program compares with commercial applications like SPSS and SAS. The latter half of the chapter illustrates how the VARBRUL program can be used to collect and analyse data - monophthongization of /ay/ in Oprah Winfrey's speech. Oprah Winfrey is an African-American talk-show host, and the program is used to analyse the considerable style shifting that is apparent in her speech. In the final section, the CART approach is used to investigate patterns in the data.
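For readers unfamiliar with logistic regression, the following is a minimal Python sketch of the general idea: the probability of a variant (here, a monophthongal token) is modelled as a function of a conditioning factor. It is a generic logistic regression fitted by gradient ascent, not VARBRUL's own algorithm or output format, and the single binary factor, the token counts and all names in the code are invented for illustration rather than taken from the chapter.

```python
import math

# Hypothetical data, invented for illustration: one row per factor level,
# as (factor value, monophthongal tokens, total tokens).
cells = [
    (1, 30, 40),  # factor present: 30 of 40 tokens monophthongal
    (0, 10, 40),  # factor absent:  10 of 40 tokens monophthongal
]


def sigmoid(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(cells, learning_rate=0.5, steps=5000):
    """Fit intercept and slope by gradient ascent on the mean log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = sum(trials for _, _, trials in cells)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, successes, trials in cells:
            # Observed minus expected number of monophthongal tokens.
            residual = successes - trials * sigmoid(intercept + slope * x)
            grad_intercept += residual
            grad_slope += residual * x
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


intercept, slope = fit_logistic(cells)
p_present = sigmoid(intercept + slope)  # fitted probability, factor present
p_absent = sigmoid(intercept)           # fitted probability, factor absent
print(round(p_present, 3), round(p_absent, 3), round(slope, 3))
```

With a single binary factor the fitted probabilities simply reproduce the observed proportions in each cell; the approach becomes useful when several factors are entered together, since the model then estimates the contribution of each while the others are held constant.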

Chapter 5 - Probability in Language Change (by Kie Zuraw) The chapter looks at the role that probability plays in addressing the issue of language change. Language changes over time, and this is apparent in the changes of observed probabilities over time. Zuraw shows that applying probabilistic approaches to language change enables one to pinpoint the factors that cause a language to change.

Chapter 6 - Probabilistic Phonology: Discrimination and Robustness (by Janet B. Pierrehumbert) Pierrehumbert discusses a number of studies and presents evidence that probability can be found at all levels of representation, first illustrated through "probability distribution over the phonetic space" (p. 182). More important, and the focus of the chapter, is that speech perception, production and well-formedness are affected by frequency and are both gradient and predictable. Infants acquire words before phonemes, and phonemes are built up gradually. In adults, well-formedness judgments for novel words are affected by frequency, lexical neighbours and the phonotactics of existing words. Finally, Pierrehumbert highlights the fact that phonetic learning requires continuous updating of probability distributions.

Chapter 7 - Probabilistic Approaches to Morphology (by R. Harald Baayen) The opening pages of Baayen's chapter should really have been at the beginning of the book, as he encapsulates so nicely how probabilistic linguistics has come about. This has been due to the development and ready availability of statistical software that can analyse large amounts of data in a fraction of the time that manual processing would take. At the same time, technology has made it possible to collect and store large amounts of data, for instance the British National Corpus (BNC), a corpus consisting of 100 million words. With these two technologies at one's disposal, it is not surprising that we can now see probabilistic properties in linguistics.

The chapter concentrates on morphological productivity: why people use certain types of affixes in English and Dutch more than others. Baayen shows that frequency-based approaches to measuring productivity are not appropriate, as they do not tell you the degree to which certain affixes are productive. This is illustrated by the simple English morphological examples -th and -ness, using a subcorpus of the British National Corpus. Baayen therefore turns to probabilistic approaches to determine the factors that contribute to the degree of productivity. The final section of the chapter discusses the morphological segmentation problem, illustrated by computational models using the Matcheck program.

Chapter 8 - Probabilistic Syntax (by Christopher D. Manning) Manning highlights that little attention has been devoted to the area of probabilistic syntax. He therefore examines 'probabilistic models for explaining language structure' (p. 291) because there are a number of phenomena in syntax that categorical approaches cannot adequately explain. In fact, he emphasizes that probabilistic models should be used in addition to the categorical approaches to obtain a full understanding of language structure.

Manning shows throughout his chapter that probabilities can also be found in syntax, contrary to the statements made by Chomsky and others. This is demonstrated quickly to the reader by showing that the ungrammatical structure 'as least as' (first noted by Manning in Russo's 2001 book) does not appear to be a typo or speech error, as first thought. By searching through corpora, several instances of this ungrammatical structure were found in the New York Times newswire, and more instances were found when searching the web. The remainder of the first part of the chapter looks at verbal clausal subcategorization frames, to which probabilistic syntax models are applied. The final section gives an overview of Optimality Theory and Analysis, followed by Stochastic Optimality Theory, loglinear models and generalized linear models.

Chapter 9 - Probabilistic Approaches to Semantics (by Ariel Cohen) The final chapter discusses probabilistic techniques in semantics. Cohen opens the chapter by discussing 'probability': what the figures actually mean and what they tell us when it comes to semantics. The chapter applies this issue to generics and frequency adverbs, using ratio theories to show their extensibility.

EVALUATION The uniqueness of this book is that it starts off with an introductory chapter on probabilistic linguistics and a chapter on probability theory before delving deep into probabilistic linguistics. Credit must be given to the authors for producing a single book that covers the probabilistic properties that can be found in all areas of linguistics (phonology, syntax, sociolinguistics, etc.) and for showing how traditional statistical techniques may no longer be appropriate for complex data analysis. My only concern is that, although the book is supposed to be an introductory book on probabilistic linguistics, it is far from that. Some of the chapters contain complex probabilistic mathematics, which may overwhelm a linguistics student with limited mathematical experience.

Although I have not been able to provide a detailed account of the chapters that are not the main focus of my research, it has nevertheless been interesting to read these chapters and to learn how probabilistic techniques can be applied to other fields of linguistics too. I would therefore advise a linguistics reader to read the book selectively: start with the introduction and the probability theory chapter, which is a must for anyone with a limited background in probabilistic mathematics, and then read the chapters relevant to their own field of research.

ABOUT THE REVIEWER Azra Ali is a PhD student in the ARTFORM (Centre for Artificial Intelligence and Formal Methods) research group in the School of Computing and Engineering at the University of Huddersfield, England. Her research areas are audiovisual speech errors and phonology, and she is currently expanding her knowledge in the area of probabilistic linguistics.