LINGUIST List 5.279

Wed 09 Mar 1994

Sum: Natural Language Processing/Data Mining



Directory

  1. Y. Shum, Natural Language Processing,Data Mining

Message 1: Natural Language Processing,Data Mining

Date: Mon, 7 Mar 94 11:41:44 EST
From: Y. Shum <shuychonmehta.anu.edu.au>
Subject: Natural Language Processing,Data Mining


Hi there,
 Firstly, I apologise for not compiling all the responses
I got immediately. Here are some of the responses I got:

Data Mining:
from matthewcs.williams.edu :
You can ftp four papers that I've worked on (only 2 are really data
mining - the other two are cooperative database stuff) from
cs.williams.edu (anonymous login)

The papers are in pub/matthew

Another data mining paper can be obtained from
 ftp.cwi.nl in the directory pub.CWIreports/AA in the
file CS-R9406.ps.Z.

A part of speech tagger is available by anonymous ftp from:
 lightning.lcs.mit.edu in Pub/BRILL/programs and its documentation
in pub/BRILL/Papers

An online lexicon and a semantic concordance that goes along with it can
be found in /pub on clarity.princeton.edu

from CLAYKEdelphi.com
 NLP or NLU (using) is a big subject and you may be able to find a special
purpose system to help you with your task, but in reality it will be a
lexical analyser. Successful general-purpose NLP does not exist - yet. :-)

As for the general issue of indexing, I can recommend the book:

 INDEXERS ON INDEXING by The Royal Society of Indexers (London)

Sorry, I don't know the publisher's name or year of publication.


A PhD defence on Automatic Terminology Extraction can be obtained from
beaccv.fr; it is in French.

-------------------------------------

I received your question through Linguist List. I am doing some
research on data extraction from text. I have a list of parameters
that must be extracted from texts about car accidents. The parameters
include weather, speed, whether the seat belt was fastened, whether the
driver was drunk, and so on.
The aim is to provide a tool that can do this automatically, or that can
help an operator to do it. The first solution would use NLP and the second
would use information retrieval techniques.
Here is MY idea about the subject:
 - The problem with NLP is that it does not seem to give enough precision in
the interpretation of long texts.
 - It seems to be easier to use information retrieval techniques, but they
cannot extract data automatically, since they do not interpret
what is said.
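
To make the second point concrete, here is a minimal Python sketch of the keyword-spotting style of retrieval, applied to the accident parameters listed above. It is only an illustration under assumptions: the patterns, the names PATTERNS and extract_parameters, and the sample sentence are all hypothetical and are not taken from any system mentioned in this message.

```python
import re

# Hypothetical keyword patterns, one per parameter named in the message.
# They only spot surface phrases; they do not interpret the sentence.
PATTERNS = {
    "weather": re.compile(r"\b(?:rain|snow|fog|ice)\w*", re.IGNORECASE),
    "speed": re.compile(r"\b\d{1,3}\s*(?:km/h|mph)", re.IGNORECASE),
    "seat_belt": re.compile(r"\bseat ?belt\b", re.IGNORECASE),
    "driver_drunk": re.compile(r"\b(?:drunk|intoxicated)\b", re.IGNORECASE),
}


def extract_parameters(text):
    """Map each parameter to the first phrase that matches it, or None."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(0).lower() if match else None
    return result


if __name__ == "__main__":
    report = (
        "It was raining heavily. The car was travelling at 90 km/h "
        "and the driver was not wearing a seat belt."
    )
    for name, value in extract_parameters(report).items():
        print(name, "=", value)
```

The sketch finds the phrase "seat belt" but cannot tell whether the belt was fastened, and a sentence such as "The driver was not drunk." still matches the drunk pattern. An operator would have to read the spotted passages and decide, which is the limitation described above.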
Maybe you could be interested in the proceedings of the Message
Understanding Conferences (MUC).
Regards

Thierry.

Thierry PERRON