LINGUIST List 8.692

Fri May 9 1997

Review: Regier: The Human Semantic Potential

Editor for this issue: Andrew Carnie <carnielinguistlist.org>


What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Andrew Carnie at carnielinguistlist.org

Directory

  1. Anne REBOUL, Review of Regier's book

Message 1: Review of Regier's book

Date: Thu, 8 May 1997 21:18:01 +0200 (MET DST)
From: Anne REBOUL <Anne.Reboulloria.fr>
Subject: Review of Regier's book

Regier, Terry (1996) The Human Semantic Potential. Spatial Language and
Constrained Connectionism, MIT Press, Cambridge Mass. 220 pages.
ISBN 0-262-18173-8.

Reviewed by Anne Reboul, CRIN-CNRS, France

Regier's book approaches the general problem of the human ability to
categorize through an investigation which relies on a computer simulation
of the acquisition of spatial linguistic concepts. The computer simulation
uses a new strategy, the strategy of constrained connectionism.

SUMMARY

0. Foreword

The foreword by George Lakoff stresses the links between Regier's
approach and cognitive linguistics. Lakoff concludes that Regier's
approach, in that it relies on perceptual and language acquisition
mechanisms, is especially well adapted to deal with the semantics of
spatial relations, which largely elude a purely logical approach.

1. Introduction

The goal of the book is to characterize what Regier calls the "human
semantic potential", that is, the human ability to categorize and its
expression in language. Given that languages are not always semantically
and conceptually equivalent, trying to circumscribe the possibilities of
variation is important. The problem of language variation is approached
from the angle of language acquisition and the learnability of linguistic
concepts. The book also deals with the problem of the use of connectionism
in cognitive science and the solution advocated by Regier is Constrained
Connectionism.
 Languages categorize space in many different ways, and some languages
in which space is conceived relative to absolute orientations such as
north, south, east and west (notably Guugu Yimithirr, a native Australian
language) may even affect their speakers' very perception of space.
Regier nevertheless claims that there are limits to language variation.
Those limits do not correspond to linguistic universals but rather to
perceptual universals: in other words, the limits on cross-linguistic
variation derive directly from the limits of the human perceptual system.
This strong hypothesis is what dictated Regier's answer to the problem of
connectionist modelling in cognitive research: a model into which
cognitive constraints are built (hence the name Constrained
Connectionism); in the case of Regier's model, these are perceptual
constraints. Thus the goal of the book is twofold: it is both scientific,
in that it investigates a cognitive faculty, the human semantic potential,
and methodological, in that it proposes a new strategy of cognitive
modelling, constrained connectionism, which relies both on simulation and
on independently motivated structures that make the system as a whole
easier to analyse.
 Regier has based his inquiry into the human semantic potential on the
semantics of space in a few languages. He has chosen space because space is
a foundational ontological category, often expressed in closed-class forms
(classes of linguistic items which contain few members and very rarely
admit new members), which structures other parts of the conceptual system
through metaphor and exhibits cross-linguistic variation. The model
concentrates on the acquisition of spatial semantics and as such has to
face the "no-negative-evidence" problem, i.e. the lack of negative
evidence in language acquisition. The solution proposed by Regier and
motivated by language acquisition studies is to take each explicit positive
instance of a concept as an implicit negative instance of all other
concepts. The model takes as input simple movies of two 2-dimensional
convex objects moving relative to one another, each movie being labelled as
a positive instance of a spatial term from a given language, e.g. a given
movie is labelled as a positive instance of "through". It learns the
spatial relations through back-propagation (a standard connectionist
learning strategy). It should then be able to label new movies correctly.
Each movie shows a static object (the landmark = LM) and another object
(the trajector = TR).
 Though the classic connectionist framework has some advantages,
notably flexibility and plasticity which allow connectionist networks to
simulate a wide range of cognitive and non-cognitive behaviours, it has the
important drawback that its explanatory power is very limited, severely
reducing its utility in cognitive enquiries. The solution advocated by
Regier is constrained connectionism, the construction of connectionist
models which would retain much of the plasticity and flexibility of
connectionism but in which the incorporation of independently motivated
structural devices enhances the analysability of the models and hence
their explanatory power.

2. The linguistic categorization of space

The goal of Regier is to do for space more or less what Berlin and Kay
(1969) did for colours. Very roughly, Berlin and Kay showed that there are
semantic universals in the domain of colours. Their work was supported by
further studies which showed that there are partial links between the basic
colour categories and the neurophysiology of the visual system. The
ambition of Regier is to provide similar insights in the domain of space.
 His main hypothesis, that linguistic categorization in general and
in the domain of space in particular is constrained by experience and, in
this case, by visual perception, is very similar to that of cognitive
linguistics which claims that language is influenced by non-linguistic
faculties and hence cannot be studied alone. This, indeed, means that what
is relevant is not so much the world or reality itself, but rather the
mental representation of it. The notions mainly associated with cognitive
linguistics are prototypicality (the rejection of Aristotelian semantics and
its replacement by graded membership in a category), deixis (the semantic
dependency between linguistic items and the physical setting in which they
are used) and polysemy (the fact that a word has several meanings).

3. Connectionism and cognitive models

Connectionism is an alternative to the classic Von Neumann model of
cognitive functioning (or GOFAI: Good Old Fashioned Artificial
Intelligence). It relies on Parallel Distributed Processing (PDP), that is
on the use of massive parallelism and the distributed nature of
representation. The first contrasts with GOFAI in that it relies on the
simultaneous processing of numerous units, while GOFAI relies on
sequential processing; the second contrasts with GOFAI in that computing
units are characterized subsymbolically rather than symbolically, i.e.
there is no symbolic interpretation of the function of a single computing
unit. Finally, it uses back-propagation as a learning algorithm.
 Classic connectionist models do not incorporate any prestructuring
and as a result they are very good at a great variety of learning tasks,
but have very poor analysability. Structured connectionist models
incorporate quite a lot of prestructuring and as a result their processing
units are interpreted symbolically (as in GOFAI) rather than
subsymbolically as in classic connectionism. They have much better
analysability but they lose a lot of the flexibility and learning power of
classic connectionism.
 Regier's suggestion, constrained connectionism, is an attempt to
capture the best of both worlds, both the learning ability of classic
connectionism and the analysability of structured connectionism. "The
essence of the idea is to build PDP networks that have built-in structural
devices that constrain their operation" (p 44) and the network is trained
under back-propagation. The structural devices incorporated must be
independently motivated. The benefits of constrained connectionism are that
the model, through its built-in structural devices, is better motivated as
a whole, its learning ability is good and its analysability is better.

4. Learning without explicit evidence

The no-negative-evidence problem is the problem of the limits of
generalisation: in the absence of negative evidence, how can
over-generalisation be avoided? Regier uses the mutual exclusivity
heuristic (positive evidence for an instance of a term is taken as
equivalent to negative evidence for all other terms). However, the
application of this heuristic is not straightforward and the difficulties
it encounters are discussed and solved in this chapter.
 The main difficulty arises when there is a semantic overlap between
terms, which is obviously the case with spatial terms. In such a case, the
risk is the generation of a great number of false implicit negatives. The
solution is to view mutual exclusivity "not as an absolute rule governing
acquisition, but as a probabilistic bias that can be overridden" (p 65). On
this view, explicit positive instances and implicit negative instances are
treated differently during training, implicit negative instances providing
weak evidence whereas explicit positive instances provide strong evidence.
This idea was implemented through incorporation of prior knowledge (in
keeping with the constrained connectionism philosophy), allowing a
distinction between antonyms (where implicit negative evidence is reliable)
and non-antonyms (where it is weak).
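
The weighting of explicit positives and implicit negatives can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not Regier's implementation: the function name, the
squared-error form, the value 0.1 for the weak weight and the explicit
list of antonyms are all hypothetical.

```python
def weighted_error(outputs, positive_term, antonyms=(), weak_weight=0.1):
    """Error for one training movie under a soft mutual exclusivity bias.

    outputs: dict mapping each spatial term to the network output in [0, 1].
    positive_term: the term the movie was explicitly labelled with.
    antonyms: terms assumed (hypothetically) to be antonyms of positive_term.
    weak_weight: hypothetical down-weighting factor for other implicit negatives.
    """
    total = 0.0
    for term, value in outputs.items():
        if term == positive_term:
            # Explicit positive instance: target 1, full weight.
            total += (1.0 - value) ** 2
        elif term in antonyms:
            # Implicit negative for an antonym: target 0, full weight.
            total += value ** 2
        else:
            # Implicit negative for an overlapping term: target 0, weak weight.
            total += weak_weight * value ** 2
    return total


outputs = {"above": 0.9, "below": 0.5, "in": 0.8}
print(weighted_error(outputs, "above", antonyms=("below",)))
```

With such a scheme, a movie labelled "above" penalizes a high output for
"in" only lightly, so overlapping terms are not driven apart by false
implicit negatives.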

5. Structures

The design of the structures is crucial in constrained connectionism: a
good design both enhances the performance of the network and motivates it,
whereas a bad design hampers the network and leaves it unmotivated.
Regier proposes three structures: orientation combination, map comparison
and source-path-destination. The first is a weighted combination of
different orientations: the direction of potential motion (which relies on
the computer equivalent of the mental representation of the forces acting
on the object), the proximal orientation (the imaginary straight segment
joining TR and LM where they are closest to each other) and the
center-of-mass orientation (the same between the centers of mass of TR and
LM). Orientational alignment, which is another part of the same mechanism,
is the degree to which an orientation, either relational, like those
mentioned above, or reference (upright vertical for instance), aligns with
another. The second is essentially topological and is used in the
detection of contact and inclusion. It operates through the comparison of
a boundary map for TR, a boundary map for LM and an interior map for LM.
The third deals with motion and distinguishes the first frame of the movie,
retained as the source, the last frame of the movie, the destination, and
the path, calculated from the frames occurring between the source and the
destination. There is no record of the exact time at which a specific
event in the motion occurred, but the event is recorded nonetheless.
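
As a rough illustration of the orientation structure, the following Python
sketch computes a proximal orientation, a center-of-mass orientation and an
alignment score. Everything here is an assumption made for the example:
objects are represented as lists of sampled (x, y) points, the center of
mass is approximated by the mean of those points, orientations are taken as
directed from LM to TR, and alignment is measured by a clipped cosine. None
of these choices is taken from the book.

```python
import math


def center_of_mass(points):
    """Approximate center of mass as the mean of the sampled points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def orientation(src, dst):
    """Angle (radians) of the directed segment from src to dst."""
    return math.atan2(dst[1] - src[1], dst[0] - src[0])


def proximal_orientation(lm_points, tr_points):
    """Orientation of the segment joining LM and TR where they are closest."""
    p, q = min(
        ((p, q) for p in lm_points for q in tr_points),
        key=lambda pair: math.dist(pair[0], pair[1]),
    )
    return orientation(p, q)


def center_of_mass_orientation(lm_points, tr_points):
    """Orientation of the segment joining the two centers of mass."""
    return orientation(center_of_mass(lm_points), center_of_mass(tr_points))


def alignment(angle, reference):
    """Hypothetical alignment score: 1 if aligned, 0 if perpendicular or worse."""
    return max(0.0, math.cos(angle - reference))


# A square TR directly above a square LM.
lm = [(0, 0), (2, 0), (2, 2), (0, 2)]
tr = [(0, 4), (2, 4), (2, 6), (0, 6)]
upright = math.pi / 2  # reference orientation: upright vertical
print(alignment(center_of_mass_orientation(lm, tr), upright))
print(alignment(proximal_orientation(lm, tr), upright))
```

In this configuration both relational orientations align closely with the
upright vertical, which is the kind of evidence a term like "above" could
draw on.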
 All three structures are at least partly motivated by work in the
neurology and psychology of perception.
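
The source-path-destination structure can be sketched as follows. This is a
toy illustration under assumptions of my own: each frame is reduced to a
dictionary of feature values in [0, 1], the feature name "inclusion" is
hypothetical, and the path is summarized by the maximum each feature
reaches over all frames (endpoints included, for simplicity), which records
that an event happened but not when.

```python
def summarize_movie(frame_features):
    """Reduce per-frame features to (source, path, destination).

    frame_features: non-empty list of dicts, one per frame, each mapping a
    feature name to a value in [0, 1].
    """
    source = frame_features[0]        # first frame
    destination = frame_features[-1]  # last frame
    # Hypothetical path summary: the maximum value of each feature over the
    # movie. The time at which the maximum was reached is discarded.
    path = {
        name: max(frame[name] for frame in frame_features)
        for name in source
    }
    return source, path, destination


def looks_like_through(source, path, destination, name="inclusion"):
    """Toy test: TR starts outside LM, is inside at some point, ends outside."""
    return (
        source[name] == 0.0
        and path[name] == 1.0
        and destination[name] == 0.0
    )


movie = [{"inclusion": v} for v in (0.0, 1.0, 1.0, 0.0)]
print(looks_like_through(*summarize_movie(movie)))
```

A movie that ends with TR still inside LM would fail this toy test, since
its destination shows inclusion.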

6. A model of spatial semantics

The three principles behind the architecture and design of the model are
adequacy (performance), motivation and simplicity. "The model's task is to
acquire visually grounded semantics for spatial terms" (p 122). The movies
which the model is presented with are arbitrary in length and the model's
response to the last frame is taken to be its response to the movie as a
whole. Yet the model responds to each frame (not knowing in advance the
length of the movie). The model is incomplete in that it does not account
for object segmentation, but, as Regier points out, this is not one of its
goals. Each successive frame is processed by the structures described
above, which feed their results to the PDP layer. This then produces the
output, that is, a categorisation, which counts as the categorisation of
the movie as a whole if the frame processed happens to be the last one.
Given this, the system is trained for several linguistic items
simultaneously, for instance "in" and "through".
 The model was trained on spatial terms from several languages
(Mixtec, German, Japanese, Russian and English), which means that its
results bear on cross-linguistic variation. In English, it was trained on
"above", "below", "left", "right", "around", "in", "on", "out of",
"through" and "over".
 As hoped, the model exhibits some prototype effects. It can adapt
itself to a range of dissimilar spatial systems and does so without
negative evidence via mutual exclusivity. There is, however, as Regier
acknowledges, something non-trivial missing from it: an account of
non-linguistic spatial conceptual development and its impact on the
linguistic categorization of space. This could affect the necessity of
simultaneous learning of several terms: for instance, if the notion of
inclusion is learned prelinguistically, the acquisition dependency between
"in" and "through" would disappear.

7. Extensions

In this chapter, Regier proposes several extensions to his model, pointing
out that the model was mainly conceived as a framework in which other
issues, such as deixis or polysemy, can be approached. He begins with
polysemy, distinguishing between the cases where polysemy allows a single
abstract sense subsuming the different meanings of the term and those where
it does not. In the first case, it helps the system to output the required
variations in meaning if some terms with related meanings are learned
simultaneously. In the second case, the best thing seems to be to allow the
system to learn simultaneously all the items in the contrast set. This is
not in itself sufficient to indicate a grouping mechanism inside a contrast
set though it does seem to indicate that paradigmatic and syntagmatic
contexts play a role in the development of polysemous representations.
 He then deals with deixis, for which an extension of his model was
actually constructed by including a simple indication in the movies.
This shows that the model as it stands can deal with deixis.
 The main concern of Regier is with prelinguistic conceptual
development. The principal problem for his model is that in order to deal
with it and notably with the acquisition of non-linguistic concepts of
space, the model would have to incorporate desire, action and reaction. He
makes a few tentative proposals toward such an extension.
 He also deals with the notion of key events, noting that it would
amount to the selection by the system of a few frames in the
movies it is presented with and that it might allow the system to give
graded judgments.
 The notion of distance could be treated by the simple device of
measuring the length of the proximal orientation and of the center-of-mass
orientation, yielding two measures, proximal distance and center-of-mass
distance. These, combined with the focus of attention, could extend the
system's learning ability to distance terms such as "near" and "far".
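
The two distance measures are simple to compute. The Python sketch below
assumes, purely for illustration, that each object is given as a list of
sampled (x, y) points and that the center of mass is approximated by the
mean of those points; the function names are hypothetical.

```python
import math


def proximal_distance(lm_points, tr_points):
    """Length of the shortest segment joining a point of LM to a point of TR."""
    return min(math.dist(p, q) for p in lm_points for q in tr_points)


def center_of_mass_distance(lm_points, tr_points):
    """Distance between the (approximate) centers of mass of LM and TR."""
    def center(points):
        n = len(points)
        return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

    return math.dist(center(lm_points), center(tr_points))


lm = [(0, 0), (2, 0), (2, 2), (0, 2)]
tr = [(0, 4), (2, 4), (2, 6), (0, 6)]
print(proximal_distance(lm, tr), center_of_mass_distance(lm, tr))
```

The two measures can diverge sharply for large or elongated objects, which
is presumably why both would be offered to the learning component.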
 The notion of convex hull could help to treat concave objects, as
well as spatial terms such as "between".
 The use of implicit paths, finally, would help the system to deal
with combinations of spatial terms.

8. Discussion

"In [Regier's] estimation, the degree to which the work succeeds varies
considerably depending on the particular aspect of the work examined. From
some standpoints, the model lives up to expectations; from others, it does
not" (p 186). Regier begins by comparing his model to other approaches:
Chomsky's and Kay and McDaniel's as far as the basic ideas behind his model
are concerned, and Miller and Johnson-Laird's as well as Landau and
Jackendoff's for purely spatial analysis.
 The main discussion in this chapter, however, centers on
falsifiability. Given that both Chomskyan linguistics and cognitive
linguistics, from which Regier has borrowed some of his central hypotheses,
have been attacked as not falsifiable, his concern is with whether his
model is open to the same reproach. It is not, in that it could fail to
learn a new spatial linguistic system. Indeed, as Regier points out, it is
in fact false, because it could not, in its present state, learn the
spatial system of Guugu Yimithirr, which uses absolute coordinates. Regier
distinguishes between shallow failures and profound failures (failures
which could be easily remedied through a minor modification of the model
and failures that could not) and between informative and non-informative
failures. He then shows that the failure of his model to learn Guugu
Yimithirr is a moderately informative failure and a shallow one and,
hence, does not discredit the model. In his view, however, the main
weakness of the system is its failure to accommodate non-linguistic
concepts of space.
 Yet the model has quite a few advantages: it is neurally,
psychologically and linguistically motivated; it illustrates constrained
connectionism, which is an advance on both classic and structured
connectionism; and it seems to avoid the difficulties which connectionism
has met with.

CRITICAL EVALUATION
 Regier's book is, to my mind, excellent. It deals seriously,
intelligently and honestly with a lot of central issues in current
cognitive science: those of computer modelling, language acquisition, space
and its categorization, and, as he calls it, the human semantic potential.
It is well-informed and has a very good bibliography. It acknowledges its
own weaknesses and highlights its failures. It never overplays its results.
I cannot see any criticism of it which would not be unfair. However, its
very excellence serves to illustrate the present difficulties in Artificial
Intelligence: though Regier does not claim to give a semantics for spatial
terms and indeed does not attempt to do so, it is slightly disappointing
that, in the end, and despite his system's success in learning spatial
terms, we do not seem to be nearer than we were before to a semantic
definition for spatial terms that is not "ad hoc and incohesive" (two
criticisms which he applies to Miller and Johnson-Laird's definitions).
Though, as
pointed out previously, this was not a part of his goal, it is nonetheless
a serious problem, in that it seems to show that the whole semantics
embodied in his model is either insufficient to account by itself for the
whole semantics of a spatial term (the structural devices do not by
themselves provide a semantic definition of any sort: they only provide
their results to the learning part of the system, i.e. the PDP layer) or
has very poor analysability (the PDP layer). It could be said that a
semantics for spatial terms can be provided, despite its poor analysability,
through the analysis of the working of the PDP layer. But this would mean
that the movies in the training set provide an exhaustive covering of
instances of the term, something which presumably is not the case given
that the movies only represented convex objects (and the notion of convex
hull given in chapter 7, though interesting, would not be sufficient for
concave objects). Again, this cannot be seen as a criticism of Regier's
enterprise, as it was not a part of it to provide a semantics of spatial
terms, but it is a good indication of the sort of difficulties
computational linguistics meets with all the time.
 In the same vein, I would say that I am not sure that Regier's
model yields truly original insights into language learning in general and
spatial terms acquisition in particular (the most interesting suggestion,
that some terms must be learned simultaneously, is seriously weakened by
the absence of an account of non-linguistic spatial concepts). However,
there is no doubt that it is a very good (and successful within the limits
which he himself indicates) test of the fundamental hypotheses on which the
system is built as well as of the particular structures which are
incorporated in it. In other words, human learning may not function exactly
in the way he assumes, but it could function that way, and that, in itself,
is an interesting result and a result which cannot be ignored by anyone
trying to build a semantics for spatial terms. And, as George Lakoff says,
it shows the importance of non-linguistic data for language acquisition.
 This book SHOULD be read by any informed reader interested in
cognitive science, the semantics of space, language acquisition, artificial
intelligence and linguistics, cognitive or otherwise.

References
Berlin, B. & Kay, P. (1969), Basic color terms: Their universality and
evolution, Berkeley, University of California Press.
Kay, P. & McDaniel, C.K. (1978), "The linguistic significance of the
meanings of basic color terms", Language 54, 610-646.
Landau, B. & Jackendoff, R. (1993), ""What" and "where" in spatial
language and spatial cognition", Behavioral and Brain Sciences 16, 217-265.
Miller, G.A. & Johnson-Laird, P.N. (1976), Language and perception,
Cambridge, Mass., Harvard University Press.

Reviewer: Anne Reboul, Research Fellow at the CNRS (National Center for
Scientific Research), France. PhD in Linguistics, PhD in Philosophy,
currently working at the Center for Computer Research in Nancy, in the
team dedicated to man-machine dialogue. Has written quite a few papers both
in French and in English. Co-author of the Dictionnaire Encyclopedique de
Pragmatique (Paris, Le Seuil. English translation in preparation for Basil
Blackwell, Oxford).

Reviewer's address:
Anne Reboul
CRIN
BP 239
54506 Vandoeuvre-les-Nancy
FRANCE
<Anne.Reboulloria.fr>

