LINGUIST List 2.55

Thursday, 28 Feb 1991

FYI: LINGUIST Archive, UK Libraries, Lexical Research

Editor for this issue: <>


  1. The LINGUIST Moderators, Retrieving Back Issues of LINGUIST
  2. Natalie Maynor, Accessing UK Libraries
  3. , Consortium for Lexical Research

Message 1: Retrieving Back Issues of LINGUIST

Date: Thur, 28 Feb 91
From: The LINGUIST Moderators <>
Subject: Retrieving Back Issues of LINGUIST
Back issues of LINGUIST may now be obtained in one of two ways:

 1. You may send the server (address:
 the following message:

 GET archive/<issue-number>

e.g. GET archive/vol-2-43

 If you do not know what issue you want, you can get a directory
 of all issues, with their headings, by sending the following command
 to the server:

 GET archive/directory

 Note that this is the ONLY way you can obtain back issues
 if you do not have Internet access.

 2. If you have Internet access, you may FTP all back issues
of LINGUIST in a single file from the University of Michigan. The
issues exist as a file LINGUIST.LST in the subdirectory LING on two 
machines, and
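
The server commands in option 1 above can be sketched as a small helper.
This is only an illustration: the function name archive_command is
hypothetical, the vol-<volume>-<issue> naming is generalized from the
single example "vol-2-43" given above, and nothing is actually mailed,
since the server address does not appear in this copy of the message.

```python
def archive_command(volume=None, issue=None):
    """Build the one-line message body for the LINGUIST mail server.

    Called with no arguments, it returns the request for the directory
    of all issues. The vol-<volume>-<issue> pattern is an assumption
    based on the one example in the announcement.
    """
    if volume is None and issue is None:
        return "GET archive/directory"
    if volume is None or issue is None:
        raise ValueError("give both volume and issue, or neither")
    return f"GET archive/vol-{volume}-{issue}"


if __name__ == "__main__":
    print(archive_command(2, 43))  # request for issue 2.43
    print(archive_command())       # request for the directory listing
```

The returned string would be sent as the body of an ordinary mail
message to the server.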

Message 2: Accessing UK Libraries

Date: Wed, 27 Feb 91 06:47:29 CST
From: Natalie Maynor <nm1Ra.MsState.Edu>
Subject: Accessing UK Libraries
Information on accessing UK libraries is available via anonymous ftp
from The file is in docs/words-l and is named uk-

Message 3: Consortium for Lexical Research

Date: Wed, 27 Feb 91 08:41:48 EST
From: <>
Subject: Consortium for Lexical Research

 ------- Forwarded Message

 From: (Ted Dunning)
 Subject: the consortium for lexical research
 Date: 20 Feb 91 18:33:45 GMT

 The Consortium for Lexical Research

 Rio Grande Research Corridor
 Computing Research Laboratory
 New Mexico State University
 Box 30001, Las Cruces, NM 88003.
 (505) 646-5466
 Fax: (505) 646-6218

 Work in computational linguistics has reached the point where the
 performance of many natural language processing systems is limited by
 a "lexical bottleneck". That is, such systems could handle much more
 text and produce much more impressive application results were it not
 for the fact that their lexicons are too small.

 The Association for Computational Linguistics has established the
 Consortium for Lexical Research (CLR), and DARPA has agreed to fund
 this. It will be sited at the Computing Research Laboratory, New
 Mexico, under its Director, Yorick Wilks, and an ACL committee
 consisting of Roy Byrd, Ralph Grishman, Mark Liberman and Don Walker.

 The Consortium for Lexical Research will be an organization for
 sharing lexical data and tools used to perform research on natural
 language dictionaries and lexicons, and for communicating the results
 of that research. Members of the Consortium will contribute resources
 to a repository and withdraw resources from it in order to perform
 their research. There is no requirement that withdrawals be
 compensated by contributions in kind.

 A basic premise of the proposal for cooperation on lexical
 research is that the research must be "precompetitive". That is, the
 CLR will not have as its goal the creation of commercial products.
 The goal of precompetitive research would be to augment our
 understanding of what lexicons contain and, specifically, to build
 computational lexicons having those contents.

 The task of the CLR is primarily to facilitate research, making
 available to the whole natural language processing community certain
 resources now held only by a few groups that have special
 relationships with companies or dictionary publishers. The CLR would,
 as far as is practically possible, accept contributions from any
 source, regardless of theoretical orientation, and make them available
 as widely as possible for research. There is also an underlying
 theoretical assumption or hope: that the contents of major lexicons
 are very similar, and that some neutral, or "polytheoretic," form of
 the information they contain can be at least a research goal, and
 would be a great boon if it could be achieved. A major activity of
 the CLR will be to negotiate agreements with "providers" on terms that
 are reassuring and advantageous to both suppliers and researchers. Major
 funders of work in this area in the US have indicated interest in
 making participation in the CLR a condition for financial support of
 research. An annual fee will be charged for membership. It is
 intended that after an initial start-up period, the Consortium become

 The Computing Research Lab (CRL) already has an active research
 program in computational lexicons, text processing, machine
 translation, etc., funded by DARPA and NSF, as well as a range of
 machines appropriate for advanced computing on dictionaries.

 Resources and Services of the Consortium

 The following lists of lexical data and tools seem to provide a
 reasonable starting content for the repository. We will continually
 solicit and encourage additions to this list.


 Data:

 1. word lists (proper nouns, count/mass nouns, causative verbs,
 movement verbs, predicative adjectives, etc.)

 2. published dictionaries

 3. specialized terminology, technical glossaries, etc.

 4. statistical data

 5. synonyms, antonyms, hypernyms, pertainyms, etc.

 6. phrase lists


 Tools:

 1. lexical database management tools

 2. lexical query languages

 3. text analysis tools (concordance, KWIC, statistical analysis,
 collocation analysis, etc.)

 4. SGML tools (particularly tuned to dictionary encoding)

 5. parsers

 6. morphological analyzers

 7. user interfaces to dictionaries

 8. lexical workbenches

 9. dictionary definition sense taggers


 Repository management will involve cataloging and storing
 material in disparate formats, and providing for its retransmission
 (with conversion, where appropriate tools exist). In addition, it
 will be necessary to maintain a library of documentation describing
 the repository's contents and containing research papers resulting
 from projects that use the material. A brief description of the
 services to be provided is as follows:

 a. CRL will provide a catalog of, and act as a clearing-house for,
 utility programs that have been written for existing online
 lexical data.

 b. CRL will compile a list of known mistakes, misprints, etc. that
 occur in each of the major published sources (dictionaries etc.).

 c. CRL will set up a new memorandum series explicitly devoted to the
 lexical center.

 d. CRL will also be a clearinghouse for preprints and hard-to-find
 reprints on machine-readable dictionaries.

 e. CRL also expects to conduct workshops in this area, including an
 inaugural workshop in late 1991 or early 1992.

 f. CRL would provide a catalog for access to repositories of
 corpus-manipulation tools held elsewhere.

 g. CRL has already set up a network-accessible file transfer service.

 We invite you to participate in the Consortium for Lexical
 Research. Anyone interested in participating even in principle as a
 provider or consumer of data, tools, or services should send a message

 as should anyone who would like to be on our lexical information

 ------- End of Forwarded Message