LINGUIST List 9.96
Wed Jan 21 1998
FYI: Novel, Release of CoreLex
Editor for this issue: Elaine Halleck <elaine@linguistlist.org>
Directory
Daniel L. Everett, Interesting novel
Paul Buitelaar, Release of CoreLex
Message 1: Interesting novel
Date: Mon, 19 Jan 1998 12:31:38 -0500 (EST)
From: Daniel L. Everett <dever@verb.linguist.pitt.edu>
Subject: Interesting novel
Folks,
There is an interesting novel out that linguists ought to enjoy. It is
entitled The Sparrow. The author is Mary Doria Russell, a Ph.D. in
paleoanthropology. It is published by Random House/Ballantine. The story
is largely about Emilio Sandoz, a Jesuit with a Ph.D. in linguistics, who
travels with a small group to the planet Rakhat. He has to conduct
fieldwork on the languages of this planet (and sends articles back to
Earth for publication). Some interesting aspects of fieldwork are captured
well by Russell. There is a lot more to the novel than linguistics,
though. I highly recommend it. And I do not usually read or enjoy novels
very much.
- Dan Everett
******************************
Daniel L. Everett
Department of Linguistics
University of Pittsburgh
2816 CL
Pittsburgh, PA 15260
Phone: 412-624-8101; Fax: 412-624-6130
http://verb.linguist.pitt.edu/~dever
Message 2: Release of CoreLex
Date: Tue, 20 Jan 1998 18:36:07 -0500
From: Paul Buitelaar <paulb@cs.brandeis.edu>
Subject: Release of CoreLex
Announcing the release of CoreLex
An ONTOLOGY, LEXICAL SEMANTIC DATABASE and TAGSET for nouns,
organized around SYSTEMATIC POLYSEMY and UNDERSPECIFICATION.
CoreLex developed out of a thesis on systematic polysemy and
underspecification of nouns, establishing an ontology and semantic
database of 126 semantic types, covering around 40,000 nouns and
defining a large number of systematic polysemous classes that are
derived by a careful analysis of sense distributions in WordNet. The
semantic types are underspecified representations based on Generative
Lexicon theory and are used in an underspecified approach to semantic
tagging, addressing two problems: sense enumeration (the difficulty of
deciding the number of discrete senses), due to systematic polysemy;
and multiple reference (NPs denoting more than one model-theoretic
referent), due to underspecification. Semantic tags that are based on
traditional, discrete senses tend to be too fine-grained for practical
use. For instance, WordNet has, on the lowest level, around 60,000
different tags (synsets) for nouns alone. The CoreLex approach, on the
other hand, offers a concise set of 126 tags that are inherently more
coarse-grained, by taking into account systematic polysemy and
underspecification.
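
As a rough illustration of the idea described above, the following Python
sketch groups nouns by the set of coarse types their senses fall under, and
treats any multi-type set shared by several nouns as a systematically
polysemous class with a single underspecified tag. Everything in it is an
assumption made for illustration: the toy noun list, the type names, the
"+"-joined tag format and the function names are hypothetical and are not
taken from CoreLex or WordNet.

```python
from collections import defaultdict

# Hypothetical toy data: noun -> coarse types covered by its senses.
# These assignments are invented for illustration, not CoreLex data.
TOY_SENSE_TYPES = {
    "book": {"artifact", "communication"},
    "newspaper": {"artifact", "communication"},
    "letter": {"artifact", "communication"},
    "chicken": {"animal", "food"},
    "lamb": {"animal", "food"},
    "table": {"artifact"},
}


def group_by_type_distribution(sense_types):
    """Group nouns that share exactly the same set of coarse types."""
    groups = defaultdict(list)
    for noun, types in sense_types.items():
        groups[frozenset(types)].append(noun)
    return {types: sorted(nouns) for types, nouns in groups.items()}


def systematic_classes(sense_types, min_members=2):
    """Keep only multi-type sets shared by at least `min_members` nouns."""
    groups = group_by_type_distribution(sense_types)
    return {
        types: nouns
        for types, nouns in groups.items()
        if len(types) > 1 and len(nouns) >= min_members
    }


def underspecified_tag(noun, sense_types):
    """Return one coarse tag covering all of a noun's types at once."""
    return "+".join(sorted(sense_types[noun]))


if __name__ == "__main__":
    for types, nouns in systematic_classes(TOY_SENSE_TYPES).items():
        print("+".join(sorted(types)), "->", ", ".join(nouns))
    print("book:", underspecified_tag("book", TOY_SENSE_TYPES))
```

In this sketch a noun receives one tag for its whole type set rather than
one tag per discrete sense, which is why the tag inventory stays small.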
The CoreLex database is freely available for research purposes, including
commercial ones. For more information on the database and on the thesis that
describes its motivation, construction and use, see the CoreLex webpage:
http://www.cs.brandeis.edu/~paulb/CoreLex/corelex.html