Review of Defining Language
Review:
Date: Mon, 20 Jan 2003 10:06:57 -0600 (CST)
From: Stefano Bertolo <bertolo@cyc.com>
Subject: Barnbrook (2002) Defining Language
Barnbrook, Geoff (2002) Defining Language: A Local Grammar of Definition Sentences. John Benjamins Publishing Company, xv+280pp, hardback ISBN 1-58811-298-5, $79.00, Studies in Corpus Linguistics 11.
Stefano Bertolo, unaffiliated reader of LINGUIST
SUMMARY OF THE BOOK'S PURPOSE AND CONTENT
The book describes the main features of the language used in definition sentences in the Collins Cobuild Student's Dictionary (CCSD). The reason for what might otherwise appear as a rather narrow choice of topic is explained very clearly and with an abundance of examples: in the CCSD a definition uses the head-word to be defined in a context that
a) uses a predictable and simplified syntactic format designed to sound natural (and so to be easily understood by a student);
b) introduces enough information to disambiguate the intended meaning of the head-word.
As an example of a), transitive verbs in the CCSD are commonly defined using the format
"If you [Verb1 an Obj-1], you [[Verb2 Anaphora-1] Adjunct]" as in "If you MANHANDLE someone, you TREAT them very roughly"
As an example of b), two common senses of the word "breast" are disambiguated by introducing the appropriate modifier ("woman's" vs "bird's") as in "A woman's breasts are ..." "A bird's breast is ..."
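The predictability of the format in example a) is what makes it attractive for automatic processing. As a minimal sketch (not Barnbrook's program, which the book does not describe in detail), the transitive-verb format can be approximated with a regular expression. The pattern name, the group names, and the short lists of object words ("someone", "something") and anaphors ("them", "it") are assumptions made for this illustration only.

```python
import re

# Hypothetical pattern for the format
#   "If you [Verb1 an Obj-1], you [[Verb2 Anaphora-1] Adjunct]"
# The object and anaphor word lists are illustrative assumptions,
# not an inventory taken from the CCSD.
TRANSITIVE_DEF = re.compile(
    r"^If you (?P<headword>\w+) (?P<object>someone|something), "
    r"you (?P<verb2>\w+) (?P<anaphor>them|it)(?: (?P<adjunct>.+?))?\.?$"
)

def parse_transitive_definition(sentence):
    """Return the components of the sentence as a dict, or None if it does not fit."""
    match = TRANSITIVE_DEF.match(sentence.strip())
    return match.groupdict() if match else None

parts = parse_transitive_definition(
    "If you MANHANDLE someone, you TREAT them very roughly"
)
print(parts)
```

A sentence in a different format, such as the "breast" definitions in example b), simply fails to match and yields None, so each format would need its own pattern.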
Barnbrook's main insight, as I understand it, is the following: given that it is theoretically possible to explain a word to a learner in infinitely many ways, how is it that the CCSD manages to do a very good job using only very few patterns of the kind described above? Evidently, those patterns represent a (or possibly "the") minimal set required to cover all communication needs associated with situations in which definitional information needs to be exchanged. Such patterns, Barnbrook goes on to explain, cannot be directly aligned with the constituents that would be returned by a syntactic parser. The goal of his book is to offer a detailed analysis of those patterns from a functional perspective that is orthogonal to that of current research in syntax/parsing.
I found it very difficult to understand the idea behind this approach until I happened to recognize the analogy between Barnbrook's definitional patterns and "patterns" as understood in the practice of software engineering; see for example Gamma et al. (1995). I mention it here in the hope that other readers might find it as enlightening as I did. The analogy goes like this: when a programmer wants to build a web application she might decide to use the client/server pattern, because this satisfies the needs of the application (many agents need to access information maintained by a central entity). Nevertheless, when the program has been completed in, say, Java, analyzing the program with respect to the syntax of Java is the wrong level of abstraction to choose if one wants to understand what the intent of the program is: knowing which parts of the program are variables, method definitions, method calls, etc. doesn't help you in the least in recognizing that you are looking at an instantiation of the client/server pattern.
Having summarized the general purpose of the book, I now give a brief description of each chapter.
Chapter 1 introduces the definition format of the Cobuild dictionaries described above, i.e. the fact that head-words are explained in the context of simple sentences of a predictable syntactic format. Barnbrook aptly points out that, in addition to being helpful for a learner, such predictability can and should be exploited by programs for automatic information extraction. Several Cobuild dictionaries are introduced and some of their characteristics explained. Considering that the rest of the book contains a very large number of tables, I would have expected and enjoyed finding a table listing all the Cobuild dictionaries with a summary of their similarities and differences.
Chapter 2 is devoted to a discussion of features of monolingual English dictionaries. Barnbrook summarizes earlier discussion to the effect that, in monolingual dictionaries, the language whose words are being defined acts as its own metalanguage in defining its semantics. I find this unconvincing, considering that, with the exception of entries that record usage information (register, regional variation, etc.), most Cobuild entries do not mention the words that are part of the definition, but simply use them in particularly well chosen examples. The rest of the chapter is devoted to a short history of the development of monolingual dictionaries, showing how they evolved from lists of 'difficult' (i.e. usually latinate) words which could be explained by means of the corresponding English word to dictionaries which, for the benefit of non-native speakers of English, list an English definition for every English word.
Chapter 3 articulates what, in my opinion, is the central point of the book, i.e. the claim that monolingual definitional sentences have a recognizable structure (i.e. can be expected to contain certain elements in a certain order) which requires a specialized vocabulary to be described, i.e. cannot be described by (reduced to) the vocabulary of syntactic theory. Philosophers of mind will find here an interesting parallel to the debate on functionalism and the irreducibility of (some) private or public mental constructs to their physical substrate (e.g. the fact that "money" is a symbolic entity that cannot be reduced to its physical realization: precious metals, paper bills, wampum, ...). This is a rather important point that, in my opinion, is obfuscated by the insistence on the parallel between the grammar of English and the grammar of definition, which results in rather perfunctory attempts to place the grammar of definition in the class of context-sensitive grammars (page 71) and to show that the definition language so produced is a subset of English closed under some operations that are never defined (page 73). The discussion becomes much more interesting when empirical facts about definitional sentences are listed that are independent of any syntactic analysis, for example the fact that the entropy of the CCSD is much lower than that of a corpus of the same size from "The Times"; the fact that definitional sentences never contain inter-sentential coreferences; the fact that only very few of the possible senses of an ambiguous word are used in definition sentences.
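To give a concrete sense of what such an entropy comparison involves, here is a minimal sketch that computes the Shannon entropy of a word-frequency distribution. How the entropy figures in the book were actually computed is not stated in this review, so the unigram model, the whitespace tokenization, and the two toy strings (invented here, each 16 tokens long) are all assumptions for illustration.

```python
import math
from collections import Counter

def unigram_entropy(text):
    """Shannon entropy, in bits per token, of the word-frequency distribution."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy strings invented for this sketch; they are not corpus data.
definitions = (
    "if you manhandle someone you treat them roughly "
    "if you shove someone you push them roughly"
)
news = (
    "ministers met yesterday to discuss rising fuel prices "
    "amid protests across several northern cities and ports"
)

print(unigram_entropy(definitions), unigram_entropy(news))
```

The formulaic text reuses a handful of frame words, so its distribution is more skewed and its entropy lower than that of the free text of equal length, where every token is distinct.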
Chapter 4 describes the methodology employed by the author in creating a taxonomy of definition questions able to subsume almost without exceptions all the definitional sentences in the CCSD. Most of the steps described reflect the train of thought that any reasonable person without any more sophisticated tools than grammar school linguistic categories would follow in classifying CCSD definitional sentences. In other words, exactly because definitions are analyzed according to criteria that are orthogonal to current sophisticated theories of syntax (and so cannot rely on what could be referred to as theory-internal constructs), the whole classification has an extremely empiricist feel to it, to the point that one is not quite sure if the resulting taxonomy could be used to make testable predictions or if, instead, it is just a convenient way to summarize the data. It is also to be noted, as Barnbrook's discussion makes quite clear, that the markup information made available by the CCSD database was relied upon extensively, a fact not without consequence, which will be discussed later.
Chapter 5 is devoted to an exposition of the definition type taxonomy. Four major types are identified and described in section 5.4. Each of these types is further subdivided into several subtypes yielding a total of 17. Descriptively, I found this to be the most interesting chapter of the book. Most bodies of definitions a reader would encounter (from Aristotle on) really are definitions of collections, which in turn are typically denoted (at least in the Western languages I am familiar with) by nouns. In such cases, as Aristotle had observed, a definition can be assembled using a super-ordinate term of the term to be defined (often referred to as its 'genus' i.e. a term denoting a collection that properly includes the one denoted by the term to be defined) together with a 'differentia', a term denoting a collection that subsumes the one denoted by the term to be defined and does not subsume any of the other possible sub-ordinate terms of the super-ordinate term. A monolingual dictionary, however, cannot always employ this strategy as it is required to provide definitions for every word, even those that are not nouns, e.g. verbs. In formal semantics, e.g. Chierchia and McConnell-Ginet (1991), it is common to represent verbs as relations, i.e. as collections not of individuals but of n-tuples of individuals.
Given this analysis it is easy to show why defining a verb cannot always be done easily or effectively by means of the intersection of a 'genus' and a 'differentia'. For example, in German the word "eats" in "X eats Y" is translated differently depending on whether X is a person ("essen") or an animal ("fressen"). This goes to show that, although it is easy to find a 'genus' for either of "essen" and "fressen" (e.g. the collection of pairs <X, Y> such that X ingests Y), the required 'differentiae' are really best expressed not as collections of pairs to be intersected with the 'genus' but rather as type restrictions on the X and Y elements of the pair: person vs animal on X to realize the "essen"/"fressen" distinction and solid vs liquid on Y to realize the "eat"/"drink" distinction. Although he doesn't cast his analysis in these terms, Barnbrook does an excellent job of presenting an extremely rich selection of examples that show what definitional strategies are commonly employed when the 'genus' + 'differentia' strategy cannot be easily followed. I found this completely fascinating and I hope Barnbrook or others will follow up on these leads in future research. Among the obvious research questions to be asked are:
a) to what extent are Barnbrook's types 'stable'? For example, would a clustering algorithm -- see Manning and Schuetze (1999), section 14 -- generate the same types for most reasonable similarity metrics?
b) is it possible to predict aggregate properties of these types given their description? For example, one might predict that because verbs are often defined by means of argument type restrictions that cannot be lexically realized (but require, say, entire locative or instrumental prepositional phrases) the entropy of verb definitions will turn out to be much higher than the entropy of noun definitions.
c) is it possible to show that definition types cut across part-of-speech classification as long as the underlying semantics of the term to be defined is the same? For example, can one expect to find verbs and their nominalizations (e.g. "destroy" and "destruction") to be defined according to the same pattern?
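The "essen"/"fressen" analysis that precedes these questions can be made concrete with a small sketch: the 'genus' is a relation (a set of pairs) and the 'differentiae' are type restrictions on the two argument positions. The toy universe, the type labels, and the helper name `restrict` are invented for this illustration and are not drawn from the book.

```python
# Toy universe; individuals and type labels are made up for illustration.
TYPES = {"Anna": "person", "Rex": "animal", "bread": "solid", "water": "liquid"}

# 'Genus': the relation "X ingests Y", represented as a set of pairs <X, Y>.
INGESTS = {
    ("Anna", "bread"), ("Anna", "water"),
    ("Rex", "bread"), ("Rex", "water"),
}

def restrict(relation, x_type=None, y_type=None):
    """Keep only the pairs whose arguments satisfy the given type restrictions."""
    return {
        (x, y)
        for (x, y) in relation
        if (x_type is None or TYPES[x] == x_type)
        and (y_type is None or TYPES[y] == y_type)
    }

# 'Differentiae' expressed as type restrictions on X and Y.
essen = restrict(INGESTS, x_type="person", y_type="solid")
fressen = restrict(INGESTS, x_type="animal", y_type="solid")
drink = restrict(INGESTS, y_type="liquid")

print(essen, fressen, drink)
```

The point of the design is that no second relation has to be intersected with the genus; each distinction is carried entirely by a restriction on one argument position.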
Chapter 6, to quote from its introduction, "describes the functional components of the definition sentences, the structural combinations of those components and the variations in structure between the different definition types, together with an outline of the processing involved in the analysis of the definition sentences". I found this chapter tantalizing: it contains more than twenty tables displaying how the elements of definitions of different types are detected and analyzed by a program written by Barnbrook. All of these analyses are eminently sensible, so that one would want to learn by what algorithm exactly they have been generated in order to reproduce those results on a different domain. Unfortunately, the algorithm is never described in any detail (except for several remarks that explain how the markup of the CCSD database is exploited to detect head-word boundaries and register information), with the consequence that the results reported are essentially irreproducible. On page 187, Barnbrook reports an interesting proposal by Schnelle (1995, section 2) (inconsistently listed as Schnelle (1996) in the bibliography). Schnelle's idea is that definition patterns can be standardized to the most expressive pattern(s) in order to achieve uniformity (which in turn would be desirable as a machine readable format). For example, even definitions for nouns that can be expressed using the simplest 'genus' + 'differentia' pattern can be recast into the more complex 'If/then' pattern often required by verbs for the reasons explained above. To exemplify, "A bachelor is an adult unmarried male" can be recast as "If someone is a bachelor, then they are adult, male and unmarried".
I find this very interesting, because it hints at the possibility that most (all?) definitions might be analyzed using the "logical form"

(AND P-{n}(x1, ... , xn) Q-{m}-1(x1.1, ... , x1.m) ... Q-{j}-k(xk.1, ... , xk.j)) implies (AND Q-{p}-k+1(xk+1.1, ... , xk+1.p) ... Q-{q}-k+n(xk+n.1, ... , xk+n.q))

where P-{n} is a predicate of the appropriate arity for the definiendum (i.e. unary for nouns, binary for transitive verbs, ternary for ditransitive verbs), each of the Q-{m}-i's is a formula of m arguments (closed terms or variables, either bound within the Q-{m}-i literal itself or by the implicit universal quantifier taking scope over the entire "logical form") imposing some restriction on the arguments of P-{n} and each of the Q-{p}-k+i's is a formula of p arguments stating conditions that apply to p-tuples that satisfy the conditions of the conjunction in the antecedent. It is easy to see how such formulae could be classified according to different, independent, facets. To name a few that come readily to mind:
a) the arity of P-{n};
b) the number of clauses in the antecedent/consequent;
c) the maximum arity of any of the Q's.
It would be nice if one could prove that this format is sufficiently expressive to represent each of the 17 types identified by Barnbrook.
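As a rough sketch of how definitions in this format could be stored and classified by the facets just listed, the following represents a definition as an antecedent (the definiendum literal plus restricting literals) and a consequent. The class names, the field names, and the encodings of the "bachelor" and "manhandle" examples (including the `person(y)` restriction standing in for "someone") are assumptions made for this illustration; they are not taken from Barnbrook or Schnelle.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class Literal:
    predicate: str
    args: Tuple[str, ...]

    @property
    def arity(self):
        return len(self.args)

@dataclass
class Definition:
    definiendum: Literal         # the P literal
    restrictions: List[Literal]  # Q literals in the antecedent
    consequent: List[Literal]    # Q literals in the consequent

    def facets(self):
        qs = self.restrictions + self.consequent
        return {
            "definiendum_arity": self.definiendum.arity,
            # Counting P itself as one antecedent clause is a choice made here.
            "antecedent_clauses": 1 + len(self.restrictions),
            "consequent_clauses": len(self.consequent),
            "max_q_arity": max((q.arity for q in qs), default=0),
        }

# "If someone is a bachelor, then they are adult, male and unmarried"
bachelor = Definition(
    definiendum=Literal("bachelor", ("x",)),
    restrictions=[],
    consequent=[
        Literal("adult", ("x",)),
        Literal("male", ("x",)),
        Literal("unmarried", ("x",)),
    ],
)

# "If you MANHANDLE someone, you TREAT them very roughly" (illustrative encoding)
manhandle = Definition(
    definiendum=Literal("manhandle", ("x", "y")),
    restrictions=[Literal("person", ("y",))],
    consequent=[Literal("treat_very_roughly", ("x", "y"))],
)

print(bachelor.facets())
print(manhandle.facets())
```

Grouping a collection of such objects by their facet values would give one simple, independent classification to compare against the 17 types, though whether the format can express all of them remains the open question raised above.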
Also of note is the fact that this format invariably expresses only necessary and never sufficient conditions. In other words it tells you what follows from the fact that an x is a P (or that it P's something) but it doesn't tell you what you need to observe in order to be sure that x is a P (or that it P's something). Philosophers such as Jerry Fodor (1975) have argued that this is evidence in favor of the existence of a "Language of Thought", with the association between words and concepts (Harnad's 1990 "grounding problem") being triggered by mechanisms other than testing of "meaning hypotheses". Linguists such as Wierzbicka (1996) have equally pointed to the likely existence of language primitives that cannot be further "defined away". Finally, practicing knowledge engineers encounter this problem on a daily basis as they expand their ontologies. If someone has found an evolutionary explanation for why this should be so (i.e. why, as a species, we crave for necessary conditions but can live reasonably well without sufficient conditions), I would be very interested in learning about it.
Chapter 7 is concerned with the evaluation of the taxonomy of definitions and its possible applications. I found the evaluation part only marginally interesting because it reports on the evaluation of an algorithm that is never fully described and because it delves into some idiosyncrasies of the CCSD database which might not be of general interest, especially considering that the format of the CCSD is proprietary. The applications part contains some interesting ideas. For example, to use the notation introduced above in order to explain Schnelle's idea, a database of CCSD definitions broken down into the components identified by Barnbrook's definition grammar can be queried for Q-{m}-i formulas, either in the antecedent or in the consequent, to reveal robust word clusters that could hardly be listed exhaustively even by a native speaker. The reader who doubts this is invited to write down a list of verbs that can be naturally defined as "V-ing something somewhere" and compare it with the very interesting list Barnbrook produces on pages 231-2. For a publicly accessible database, compiled according to a similar design philosophy from a corpus of newspaper articles, one could visit Dekang Lin's page at http://www.cs.ualberta.ca/~lindek
A second interesting application is the harvesting of definition sentences for contextual information that could be used for word sense disambiguation, an idea that is being taken to its logical conclusion by the WordNet 2.0 release team (see http://www.cogsci.princeton.edu/~wn), presently engaged in the tagging of all word tokens included in WordNet definitions by means of the synset that corresponds to the one meaning, from those available for that word type, which is the one intended in the definitional context in which the word token is found. A third idea is to use Barnbrook's definition grammar as a quality control tool to verify that definitions in a dictionary really do what they are intended to do (i.e. help a learner understand the meaning of words, differentiating every word from any other) and do so using a consistent and easily interpretable format. One possibility worth exploring would be having definitional assertions about concepts be stored in a language-independent knowledge base (in the format mentioned above in the context of Schnelle's proposal) and definitions be automatically generated into any number of target languages into which the knowledge base contents can be paraphrased. A knowledge base of that kind is freely available for download at http://www.opencyc.org
CRITICAL EVALUATION
My specific comments have already been worked into the review of the individual chapters. Here I will just summarize them as follows: the book is clearly written and filled with detailed discussions of a very large number of well chosen examples and as such it is pedagogically exemplary. At times, however, the reader finds herself wishing that something like a theory capable of delivering testable predictions would emerge from all those details. In this review I have tried to list what I consider the most promising candidates for such theory-like developments. I benefited a great deal from reading this book, as it has forced me to think about possible dimensions of the analysis of language independent of the more traditional ones of syntax and semantics. In this respect it reminds me of the kind of "goal oriented" analysis that can be found in the short essays in Rubinstein (2000). My main reason for disappointment, as a software developer, is that, due to copyright reasons, Barnbrook was not able to be more explicit in the description of his definition analysis algorithm. Also, I wonder how easy it would be to generalize the algorithm to operate on free form text (as opposed to the richly annotated CCSD database format).
REFERENCES
Chierchia, Gennaro and Sally McConnell-Ginet (1991) Meaning and Grammar. MIT Press.
Fodor, Jerry (1975) The Language of Thought: A Philosophical Study of Cognitive Psychology. T. Y. Crowell.
Gamma, Erich, Richard Helm, Ralph Johnson and John Vlissides (1995) Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley.
Harnad, Stevan (1990) "The Symbol Grounding Problem". Physica D 42: 335-346.
Manning, Chris and Hinrich Schuetze (1999) Foundations of Statistical Natural Language Processing. MIT Press.
Rubinstein, Ariel (2000) Economics and Language. Cambridge University Press.
Wierzbicka, Anna (1996) Semantics, Primes and Universals. Oxford University Press.
ABOUT THE REVIEWER:
Stefano Bertolo currently works as a software developer on
projects that require expertise in information extraction,
knowledge representation and inference. After receiving a
Ph.D. in Philosophy and a Diploma in Cognitive Science from
Rutgers University he spent three years as a Post-Doctoral
associate at the Brain and Cognitive Science Department of
MIT, working on formal theories of human language learning.