LINGUIST List 9.96

Wed Jan 21 1998

FYI: Novel, Release of CoreLex

Editor for this issue: Elaine Halleck


  1. Daniel L. Everett, Interesting novel
  2. Paul Buitelaar, Release of CoreLex

Message 1: Interesting novel

Date: Mon, 19 Jan 1998 12:31:38 -0500 (EST)
From: Daniel L. Everett
Subject: Interesting novel


There is an interesting novel out that linguists ought to enjoy. It is 
entitled The Sparrow. The author is Mary Doria Russell, a Ph.D. in 
paleoanthropology. It is published by Random House/Ballantine. The story 
is largely about Emilio Sandoz, a Jesuit with a Ph.D. in linguistics, who 
travels with a small group to the planet Rakhat. He has to conduct 
fieldwork on the languages of this planet (and sends articles back to 
Earth for publication). Some interesting aspects of fieldwork are captured 
well by Russell. There is a lot more to the novel than linguistics, 
though. I highly recommend it. And I do not usually read or enjoy novels 
very much. 

- Dan Everett


Daniel L. Everett
Department of Linguistics
University of Pittsburgh
2816 CL
Pittsburgh, PA 15260
Phone: 412-624-8101; Fax: 412-624-6130

Message 2: Release of CoreLex

Date: Tue, 20 Jan 1998 18:36:07 -0500
From: Paul Buitelaar
Subject: Release of CoreLex

		Announcing the release of CoreLex


CoreLex developed out of a thesis on systematic polysemy and
underspecification of nouns, establishing an ontology and semantic
database of 126 semantic types, covering around 40,000 nouns and
defining a large number of systematic polysemous classes that are
derived by a careful analysis of sense distributions in WordNet. The
semantic types are underspecified representations based on Generative
Lexicon theory and are used in an underspecified approach to semantic
tagging, addressing two problems: sense enumeration (the difficulty of
deciding the number of discrete senses), due to systematic polysemy;
and multiple reference (NPs denoting more than one model-theoretic
referent), due to underspecification. Semantic tags that are based on
traditional, discrete senses tend to be too fine-grained for practical
use. For instance, WordNet has, on the lowest level, around 60,000
different tags (synsets) for nouns alone. The CoreLex approach, on the
other hand, offers a concise set of 126 tags that are inherently more
coarse-grained, by taking into account systematic polysemy and

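As a rough illustration of how systematic polysemous classes might be read off sense distributions, the sketch below groups nouns by the set of coarse types their senses cover and keeps only the sets shared by several nouns. The toy lexicon, the type names, the function name and the `min_members` threshold are all hypothetical and are not taken from the CoreLex database or from WordNet; this is a minimal sketch of the general idea, not the procedure described in the thesis.

```python
from collections import defaultdict

# Hypothetical toy data: each noun is listed with the coarse types of its
# senses. Neither the nouns' assignments nor the type names come from CoreLex.
NOUN_SENSE_TYPES = {
    "book": ["artifact", "communication"],
    "newspaper": ["artifact", "communication", "group"],
    "novel": ["communication", "artifact"],
    "chicken": ["animal", "food"],
    "lamb": ["food", "animal"],
    "rock": ["substance"],
}


def polysemy_classes(noun_types, min_members=2):
    """Group nouns by the set of coarse types their senses cover.

    A type set counts as a candidate systematic class only if it contains
    more than one type and is shared by at least `min_members` nouns.
    """
    groups = defaultdict(list)
    for noun, types in noun_types.items():
        groups[frozenset(types)].append(noun)
    return {
        type_set: sorted(nouns)
        for type_set, nouns in groups.items()
        if len(type_set) > 1 and len(nouns) >= min_members
    }


classes = polysemy_classes(NOUN_SENSE_TYPES)
for type_set, nouns in classes.items():
    print(" ".join(sorted(type_set)), "->", ", ".join(nouns))
```

In this toy lexicon, "book" and "novel" fall into one class and "chicken" and "lamb" into another, while "newspaper" and "rock" are left out because no other noun shares their type set or because they have only one type.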
The CoreLex database is freely available for research purposes, including 
commercial ones. For more information on the database and on the thesis that 
describes its motivation, construction and use, see the CoreLex webpage: