LINGUIST List 5.1379

Fri 02 Dec 1994

FYI: INL 5 Million Words Corpus, CLIN IV 1994 proceedings

Editor for this issue: <>


  1. Repeated message: Announcement INL 5 Million Words Corpus
  2. Gosse Bouma, CLIN IV (Groningen 1994) proceedings

Message 1: Repeated message: Announcement INL 5 Million Words Corpus

Date: Fri, 02 Dec 1994 12:13:51
From: <>
Subject: Repeated message: Announcement INL 5 Million Words Corpus


On-line access to a 5-million-word text corpus of Dutch for non-commercial
research purposes

The Institute for Dutch Lexicology (INL) offers on-line access, via the
international computer network, to a text corpus of ca. 5 million words of
present-day Dutch. This corpus is different from the Dutch INL corpora on
the ECI/MCI CD-ROM distributed by the Linguistic Data Consortium and ELSNET.
The texts are derived from books, magazines, newspapers and TV broadcasts,
and cover topics such as journalism, politics, environment, linguistics,
leisure, and business & employment. You can easily define subcorpora on the
basis of these parameters.

The retrieval system allows you to search for single words or for word
patterns, including some predefined syntactic patterns that can be changed
by the user. Searches can address the levels of word form, part of speech
(POS), and headword, both separately and in combination, using Boolean
operators and proximity searches. During the search, data on frequency and
distribution over the texts are provided at several levels. The output is
usually a list of items or a series of concordances (words in context) with
a variable, user-defined textual context. Sorting facilities help with the
analysis of the output. With some limitations due to copyright, the output
of your searches can be transferred to your own computer by e-mail.
Transferring complete texts or substantial parts of texts is not permitted.
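
As an illustration only, the kind of search described above (word form,
POS and headword criteria combined with AND, plus a proximity condition,
returned as concordance lines) can be sketched in Python. This is not the
INL retrieval system: the Token class, the tag names, the sample sentence
and both function names are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str      # word form as it appears in the text
    pos: str       # part-of-speech tag (hypothetical tag set)
    headword: str  # headword (lemma) assigned to the word form


def matches(token, form=None, pos=None, headword=None):
    """True if the token satisfies every criterion given (Boolean AND)."""
    return ((form is None or token.form.lower() == form.lower())
            and (pos is None or token.pos == pos)
            and (headword is None or token.headword == headword))


def concordance(tokens, query, context=3, near=None, distance=5):
    """Return keyword-in-context lines for tokens matching `query`.

    `query` and `near` are dicts of criteria passed to `matches`.
    If `near` is given, a hit counts only when a token matching `near`
    occurs within `distance` tokens on either side (proximity search).
    `context` is the number of tokens shown on each side of the hit.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if not matches(tok, **query):
            continue
        if near is not None:
            lo, hi = max(0, i - distance), i + distance + 1
            window = tokens[lo:i] + tokens[i + 1:hi]
            if not any(matches(t, **near) for t in window):
                continue
        left = " ".join(t.form for t in tokens[max(0, i - context):i])
        right = " ".join(t.form for t in tokens[i + 1:i + 1 + context])
        lines.append(f"{left} [{tok.form}] {right}".strip())
    return lines


# Invented sample sentence; the tags and headwords are hand-assigned.
SAMPLE = [
    Token("De", "DET", "de"), Token("kat", "N", "kat"),
    Token("zit", "V", "zitten"), Token("op", "PREP", "op"),
    Token("de", "DET", "de"), Token("mat", "N", "mat"),
    Token("en", "CONJ", "en"), Token("de", "DET", "de"),
    Token("katten", "N", "kat"), Token("slapen", "V", "slapen"),
]

# All word forms of the headword "kat", two tokens of context per side.
by_headword = concordance(SAMPLE, {"headword": "kat"}, context=2)

# The same search, restricted to hits with the verb "slapen" nearby.
with_proximity = concordance(
    SAMPLE, {"headword": "kat"}, context=2,
    near={"pos": "V", "headword": "slapen"}, distance=2,
)
```

Searching on the headword finds both "kat" and "katten", which a search
on the word form alone would not; the proximity condition then narrows
the hits to those with a matching token inside the window.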

Most of the data has not been corrected, either at the level of the text
or at the level of POS and headword. POS and headword have been assigned
automatically to the word forms in the electronic text by lingware
developed at the INL.

The providers of the texts have given permission for use of their materials
for non-commercial research purposes only. The conditions for commercial
use are still under discussion.

Please note that for optimal use of the retrieval system, a VT 220 (or
higher) terminal or an appropriate terminal emulator (e.g. Kermit) is
recommended.

To get access to this corpus, you must sign an individual user agreement.
An electronic user agreement form can be obtained from our mailserver,
Mailserv@Rulxho.Leidenuniv.NL. Type in the body of your e-mail message:
SEND [5MLN94]AGREEMNT.USE. For Dutch users, a Dutch version is available in
the same directory; its filename is OVERKMST.GEB. Please make a hard copy
of the agreement form, sign it, and return the signed copy to:

Institute for Dutch Lexicology INL
P.O. Box 9515, 2300 RA Leiden
fax: 31 71 27 2115
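
For those who script their mail, the mailserver request described above
can be sketched with Python's standard email package. This is a minimal
sketch that only builds the message and does not send it. The function
name and the sender address are hypothetical, and the Dutch request line
assumes that "the same directory" means the same [5MLN94] prefix.

```python
from email.message import EmailMessage

# Mailserver address as given in the announcement.
MAILSERVER = "Mailserv@Rulxho.Leidenuniv.NL"


def build_agreement_request(sender, dutch=False):
    """Build (but do not send) the request for the user agreement form."""
    # Assumption: the Dutch form is requested with the same [5MLN94] prefix.
    filename = "OVERKMST.GEB" if dutch else "AGREEMNT.USE"
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = MAILSERVER
    # The command goes in the message body, not in the subject line.
    msg.set_content(f"SEND [5MLN94]{filename}")
    return msg


# "user@example.org" is a placeholder sender address.
request = build_agreement_request("user@example.org")
dutch_request = build_agreement_request("user@example.org", dutch=True)
```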

After receipt of the signed user agreement, you will be informed about your
username and password.

If you need additional information, please send an e-mail message to
Helpdesk5mln@Rulxho.Leidenuniv.NL, or send a fax to Mrs. dr. J.G. Kruyt.

Message 2: CLIN IV (Groningen 1994) proceedings

Date: Fri, 2 Dec 1994 11:20:12
From: Gosse Bouma <>
Subject: CLIN IV (Groningen 1994) proceedings

CLIN IV (Groningen 1994) proceedings electronically available

The proceedings of the fourth CLIN meeting are now available
through either WWW or FTP.

Please report any problems to:

Gertjan van Noord, Alfa-informatica RUG
Postbus 716, NL 9700 AS Groningen
tel. +31 50 635935 fax +31 50 634900

aarts.* Erik Aarts and Kees Trautwein
 Non-associative Lambek Categorial Grammar in Polynomial Time
bouma.* Gosse Bouma and Gertjan van Noord
 A Lexicalist Account of the Dutch Verbal Complex
dorrepaal.* Joke Dorrepaal
 An Alternative to the Binding Theory
erjavec.* Tomaz Erjavec
 Formalising Realizational Morphology in Typed Feature Structures
huijsen.* Willem-Olaf Huijsen
 Genetic Grammatical Inference
kirkeby.* Trond Kirkeby-Garstad and Krisztina Polgardi
 Against Prosodic Composition
lankhorst.* Marc M. Lankhorst
 A Genetic Algorithm for the Induction of Context-Free Grammars
mey.* Sjaak de Mey
 On the (In)dispensibility of Senses
ruessink.* Herbert Ruessink
 Tree Logic: A Formal Perspective on Transformations
schotel.* Henk Schotel
 SeSynPro, Towards a Workbench for Semantic Syntax
seuren.* Pieter A.M. Seuren
 Translation Relations in Semantic Syntax
wilco.* Wilco G. ter Stal and Paul E. van der Vet
 Two-level Semantic Analysis of Compounds: a
 Case Study in Linguistic Engineering

CLIN IV: Ordering Information

Copies can also be ordered from the address below. The price is
30 Dutch guilders (Dutch orders) or 35 guilders (international
orders), including postage.

 CLIN IV 1993
 G. van Noord
 Alfa-informatica RUG
 Postbox 716
 NL 9700 AS Groningen Netherlands
 +31 50 635935
Gosse Bouma, Alfa-informatica, RUG, Postbus 716, 9700 AS Groningen tel. +31-50-635937 fax +31-50-635979