Broeder, Peter, and Jaap Murre, eds. (2000) Models of Language Acquisition: Inductive and Deductive Approaches. Oxford University Press, hardback ISBN: 0-19-829989-3, ix+291pp., $85.00.
Christophe Parisse, INSERM (National Research Institute for Health in France), Paris, France
GENERAL DESCRIPTION OF THE BOOK The book is intended for research specialists from various domains: linguistics, computational linguistics, psychology, cognition, language acquisition, and behavioural science. It is composed of 11 papers and an introduction, for a total of 291 pages, and presents recent advances in computational modelling of language acquisition by leading researchers. The general aim of the book is to show what light can be shed on fundamental issues in language acquisition when powerful computational techniques are combined with real data.
1) Introduction (Broeder & Murre) The introductory text is disappointingly short (5 pages) and does not really help the non-specialist to understand the papers that follow. It gives a brief sketch of the computational study of language acquisition and divides it into three parts (i- Implementation: theory and experiment; ii- Discovery of existence proof; iii- Testing on empirical data). Somewhat confusingly, these parts do not correspond to the structure of the book and are scattered throughout it. Implementation goes with chapters 2, 3, 11, and 12. Existence proof goes with chapters 4, 5, 6, and 7. Empirical data goes with chapters 8, 9, and 10.
Part 1: 2) Lexical connectionism (MacWhinney) This paper starts with a short presentation of the differences between the traditional symbolic approach and connectionism. The limits of the connectionist approach are then underlined, and an extension of this approach is proposed using Self-Organizing Feature Maps, which make it possible to implement lexical access within the connectionist framework. The author describes how this solves several problems such as suffix extraction, word compounding and syntax learning, but it is not quite clear how far the proposals have actually been implemented.
3) Are SRNs sufficient for modelling language acquisition (Sharkey, Sharkey & Jackson) The authors of this paper attempt to evaluate whether SRNs (Simple Recurrent Nets) are powerful enough to model language acquisition by seeking answers to four questions: Q1) Is a specific initial state required for SRN learning to be successful? Q2) Can SRNs transfer old knowledge to new lexical items? Q3) Will SRNs forget information about old items? Q4) Are SRNs able to represent more than they can learn? The answer to Q1 is Yes, to Q2 No, and to Q3 Yes, which limits the chances of SRNs being good candidates for modelling language acquisition. However, the answer to Q4 is also Yes: it is possible to find node weights which, set manually, obtain results that cannot be achieved using the automatic learning procedure. This means that the problems with SRNs might be due to an inadequate learning algorithm, and improving the learning algorithm may allow future SRNs to provide better answers to the first three questions.
4) A distributed, yet symbolic, model for text-to-speech processing (van den Bosch & Daelemans) This paper presents a model of text-to-speech processing (SPC: Subword-Phoneme Correspondence). It is a single-route model that does not use a connectionist implementation but a procedure akin to lazy learning (learning by straightforward storage of examples combined with an example-based similarity matching procedure). It nevertheless allows good generalization and automatic learning from examples, and makes it possible to define a consistency metric that characterizes word pronunciation. The algorithm is implemented and has been tested in three languages: English, Dutch and French.
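To make the lazy-learning idea concrete for readers unfamiliar with it, here is a minimal sketch in Python. It is not the SPC model itself: it only stores letter windows verbatim and labels a new window by majority vote among the most similar stored examples, using a simple overlap count as the similarity measure. The toy memory, the three-letter window, and the names `MEMORY`, `overlap` and `classify` are assumptions made for illustration.

```python
from collections import Counter

# Stored examples: (left context, focus letter, right context) -> phoneme.
# These windows and labels are invented for illustration only.
MEMORY = [
    (("_", "c", "a"), "k"),
    (("_", "c", "o"), "k"),
    (("_", "c", "u"), "k"),
    (("_", "c", "e"), "s"),
    (("a", "c", "e"), "s"),
    (("i", "c", "e"), "s"),
]


def overlap(a, b):
    """Similarity = number of window positions holding the same letter."""
    return sum(x == y for x, y in zip(a, b))


def classify(window, memory=MEMORY):
    """Label a window by majority vote among its most similar stored examples."""
    best = max(overlap(window, stored) for stored, _ in memory)
    votes = Counter(
        label for stored, label in memory if overlap(window, stored) == best
    )
    return votes.most_common(1)[0][0]
```

No rule such as "c is soft before e" is ever extracted: a call like `classify(("o", "c", "e"))` generalizes to an unseen window purely through similarity to the stored examples.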
5) Lazy learning: Natural and machine learning of word stress (Gillis, Daelemans & Durieux) This paper presents another application of the lazy-learning principle, used for implementing a single-route model of word stress assignment in Dutch. The results are compared to those of 3- and 4-year-olds. The comparison shows that the model tends to display the same characteristics as the children, e.g. to regularize irregular words. The model is also able to explain how regular words can sometimes be irregularized. This model represents another alternative to rule-based models, different from the connectionist models discussed further on in the book.
Part 2: 6) Statistical and connectionist modelling of the development of speech segmentation (Shillcock, Cairns, Chater & Levy) This paper describes a succession of strategies that follow a developmental path and can be combined into an efficient algorithm for segmenting speech into words. The first algorithm uses connectionist modelling and purely bottom-up processes. It is based on the idea that the lower the probability of predicting the following phoneme, the higher the probability of being at the end of a word. This very simple algorithm already yields results above chance level. The next algorithm is based on the fact that low-frequency n-phones are more likely to contain a word boundary. The third algorithm uses the Metrical Segmentation Strategy (Cutler & Norris, 1988), which states that children take advantage of the fact that, in English, strong syllables are more likely than weak syllables to occur at the beginning of a word. Finally, the results can be improved again by storing lexical-level representations. All these algorithms, except the first one, are implemented with non-connectionist statistical procedures. All are bottom-up procedures and yield results that give children the means to learn how to segment their language. Later on, children will improve their performance by using top-down syntactic, semantic, and pragmatic information.
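The prediction-based idea behind the first algorithm can be illustrated with a small sketch. Note that this is not the chapter's connectionist model: as a stand-in, it estimates the probability of the next symbol from bigram counts and posits a word boundary wherever that probability falls below a threshold. The function names, the threshold of 0.75, and the use of letters in place of phonemes are all assumptions made for illustration.

```python
from collections import Counter


def transition_probs(stream):
    """Estimate P(next | current) from adjacent symbol pairs in the stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(stream, threshold=0.75):
    """Cut the stream wherever the next symbol was poorly predicted."""
    probs = transition_probs(stream)
    words, current = [], stream[0]
    for a, b in zip(stream, stream[1:]):
        if probs[(a, b)] < threshold:
            # Low predictability: treat this transition as a word boundary.
            words.append(current)
            current = b
        else:
            current += b
    words.append(current)
    return words


# Toy unsegmented stream built from three invented "words": abc, def, ghi.
toy_stream = "abcdefghiabcghidefabc"
toy_words = segment(toy_stream)
```

In the toy stream, transitions inside a word are fully predictable while transitions across a boundary are not, so the low-probability points coincide with the word ends. Real child-directed speech is of course far noisier, which is why the chapter goes on to add further cues.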
7) Learning word-to-meaning mappings (Siskind) This paper attempts to offer a solution to the word-to-meaning mapping problem. The goal of the algorithm presented here is to find the correspondences between the elements of a string of words and a representation of the world semantics, for example which elements in "quux bleen plugh baz" correspond to which elements in "RUN(Bill, TO(Mary))". The problem is made more complex by the fact that initially none of the correspondences are known, that more than one semantic structure may correspond to the same string of words, and that the action referred to may not correspond to the present action (for example in most past or future tense utterances, or in conditionals). A final complication comes from homonymy: one word can have more than one correspondence. The paper first describes a categorical algorithm for non-ambiguous, noise-free learning, and then a statistical algorithm for noisy and ambiguous learning. These algorithms are tested using a synthetic corpus, which allows for varying percentages of uncertainty, noise and homonymy. Performance is good until noise levels of 70% are reached. Results for high homonymy rates are also good. Note: There is an error in Table 7.3, which makes it useless. The appendix which should have contained the description of the pruning strategy is missing.
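For the noise-free, non-ambiguous case, the general flavour of learning across situations can be conveyed by a deliberately simplified sketch. It is not Siskind's algorithm: it treats a meaning as a flat set of symbols rather than a structured expression, and it simply keeps, for each word, the symbols present in every situation in which that word was heard. The function name `learn` and the pairing of the nonsense words with symbols are invented for illustration.

```python
def learn(observations):
    """Map each word to the symbols seen in every situation where it occurred.

    observations: iterable of (words, meaning_symbols) pairs.
    """
    candidates = {}
    for words, symbols in observations:
        for word in words:
            if word in candidates:
                # Keep only symbols that also appear in this new situation.
                candidates[word] &= set(symbols)
            else:
                candidates[word] = set(symbols)
    return candidates


# Invented utterance/situation pairs, for illustration only.
toy_observations = [
    (["quux", "bleen"], {"RUN", "Bill"}),
    (["quux", "plugh"], {"RUN", "Mary"}),
    (["bleen", "baz"], {"Bill", "TO"}),
]
toy_lexicon = learn(toy_observations)
```

A word heard in several situations is narrowed down quickly, while a word heard only once keeps all of its candidate symbols. A single noisy pairing would wrongly empty a candidate set, which suggests why a separate statistical algorithm is needed for noisy and ambiguous input.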
8) Children's overregularization and its implications for cognition (Marcus) The author presents a "rule and memory" model for the formation of the past tense of verbs. The argument is that pure connectionist models of the formation of verb morphology are unable to simulate the production of past tense by children correctly. When these models manage to do it, they do so by resorting to features that implement, in one way or another, the existence of a default rule. Marcus then argues for the existence of rule processes in cognitive domains other than language. He concludes by proposing that the cognitive system is better explained by two separate basic mental mechanisms, rule and associative memory, than by a single one (either rule or associative memory).
9) A recurrent network with short-term memory capacity learning the German -s plural (Goebel & Indefrey) In this paper, a connectionist implementation of the formation of the German plural is presented. The authors compare the results of the connectionist model with those of human native speakers at length and show that the -s ending is not the only default plural in German. In many cases, model and humans produce an -e ending for masculine words and an -n ending for feminine words. The authors then argue that the German -s plural cannot correspond to a language universal, because default plurals in other languages do not correspond to the same features. They propose that the -s ending appears when several phonetic cues are present, and that otherwise children learn to produce it when certain semantic cues are present, as is true in many other languages. However, models that implement only phonological rules cannot use semantic cues and cannot answer this question, which leaves the issue open, to be tested further using more complex models.
10) Single- and dual-route models of inflectional morphology (Nakisa, Plunkett & Hahn) This paper presents computer implementations of single- and dual-route models of inflectional morphology. These models are then tested with three languages: Arabic, German, and English. The results demonstrate that single-route models always perform better than dual-route models. This surprising result comes from the fact that dual-route models have an unfortunate weakness: they have to decide whether a word is regular or not before being able to apply the default rule. This leads to worse results than directly producing a derivation with a single-route model. All of this shows how fundamental it is to implement models and put them to the test, because doing so uncovers hidden weaknesses that are overlooked when devising theoretical structures.
Part 3: 11) Formal models for learning in the principles and parameters framework (Niyogi & Berwick) This paper gives a formal presentation of an algorithm for learning the parameters of the principles and parameters framework. The authors present a Markov model formulation of the Triggering Learning Algorithm of Gibson and Wexler (1994). This gives them a tool for better analysing the characteristics of the learning algorithm (e.g. convergence, or the number of examples needed to attain the result with high confidence) and allows them to propose better variants of the algorithm.
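The benefit of a Markov formulation can be seen in a small sketch: once the learner's hypotheses are treated as states of a Markov chain, questions about convergence become computations on a transition matrix. The code below is not taken from the chapter. The three-state chain and its transition probabilities are hypothetical, and the names `step` and `examples_needed` are invented for illustration.

```python
def step(dist, P):
    """One input example: push the distribution over grammars through P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def examples_needed(P, start, target, confidence=0.99, max_steps=10_000):
    """Smallest number of examples after which the learner is in `target`
    with probability >= confidence, or None if max_steps is not enough."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for n in range(max_steps + 1):
        if dist[target] >= confidence:
            return n
        dist = step(dist, P)
    return None


# Hypothetical chain: state 2 is the target grammar and is absorbing.
# P[i][j] is the probability of moving from grammar i to grammar j
# after one input example; the values are invented.
P_TOY = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
toy_examples = examples_needed(P_TOY, start=0, target=2)
```

The same function returns None when the requested confidence is not reached within `max_steps`, so both convergence and the number of examples required can be read off the chain.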
12) An output-as-input hypothesis in language acquisition (Elbers) This paper presents a theoretical model of language acquisition in which the child's own output serves as primary material for the building of her linguistic knowledge. Five main arguments are presented which justify the construction of such a model. These arguments are mainly based on Levelt's (1989) model of speech production. A model is then described which contains three phases: a) Intake and production of incompletely analyzed fragments; b) Analysis of own production and hypothesis formation; c) Testing of hypotheses against adult input. Examples from case studies are presented to provide some evidence for the validity of the model. No specific implementation of the model is yet proposed.
DISCUSSION OF THE BOOK The book is of very good quality, having been written by specialists in the field who all describe research that has been ongoing for a long time. For this reason, most of the information in the book could have been found elsewhere, but the authors were able to take advantage of their mastery of the subject to produce clear and thought-provoking presentations. The book is thus perhaps best suited to PhD students who want a wide-ranging presentation of computational linguistics applied to language acquisition. It will also be a good read for people already working in language acquisition or computational linguistics who want to discover how these two domains can profit from one another. Implementations are not limited to connectionist models, which attests to the wide coverage of the book. My only reservations are that the introduction is somewhat insufficient and that the appendix of chapter 7 is missing. Moreover, the technical complexity of the papers is uneven and might surprise some readers. Nonetheless, the book is more than worth reading and offers a comprehensive and valuable picture of this field of research.
REFERENCES Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121.
Gibson, T., & Wexler, K. (1994). Triggers. Linguistic Inquiry, 25, 407-454.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
BIOGRAPHICAL SKETCH My main research interests are in language development. My main work is on the initial development of syntax (children aged one to four). The tools I use include computer simulation as well as psycholinguistic experiments. I work with children with language disorders as well as normally-developing children.