LINGUIST List 19.3048
Wed Oct 08 2008
Software: Computational Ling/Text&Corpus Ling/Software for automatic text..
Editor for this issue: Susanne Vejdemo
Message 1: Software for automatic text processing
From: Slava Yatsko <iatsko@gmail.com>
Subject: Software for automatic text processing
The Computational Linguistics Laboratory at Katanov State University of
Khakasia (CLL at KSU) is pleased to announce the release of Linguistic
Toolbox (LIT), a package of programs for automatic text processing. LIT is
a concordancer that differs from existing analogues in the following respects:
- It has an integrated part-of-speech tagger, allowing users to create
their own annotated corpora. In-depth linguistic research is often based on
a specific text genre (e.g. fiction, scientific text), linguistic category
(e.g. possession), or the works of a particular author (e.g. Maugham).
Publicly available annotated national corpora with evenly distributed
genres often fail to meet the demands of such research, and LIT has been
designed to fill this gap. With LIT, users can run various searches on
their own corpora and obtain statistical information on the distribution
of words, patterns, and phrases.
- Union, subtraction, and intersection operations. These operations are
used in set theory to construct new sets from existing ones. Why not
perform them on texts, so as to construct new texts from existing ones?
For example, using the subtraction operation the user can remove stopwords
from a text, and using the intersection operation he/she can get a list of
words that occur in two or more texts, with raw counts
assigned to each word. These functions may be of use for computing
distances between texts for the purposes of text classification and
- LIT has an integrated spreadsheet. Having obtained statistical
information with LIT, the user can perform computations in LIT itself
without resorting to commercial products such as MS Excel.
- LIT has an integrated WordNet module that lets the user search not only
for a given word but also for words semantically related to it.
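
To illustrate the set operations on texts described above, here is a minimal Python sketch. It is not LIT's implementation: the function names (tokenize, subtract, intersect, union, jaccard_distance) are hypothetical, the tokenizer is a simple regular expression, and the intersection is taken to mean words common to all of the given texts. The Jaccard distance is only one possible way of computing a distance between texts; the announcement does not say which measure LIT uses.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplistic)."""
    return re.findall(r"[a-z']+", text.lower())


def subtract(tokens, stopwords):
    """Subtraction: drop every token that appears in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def intersect(*token_lists):
    """Intersection: words found in all texts, with raw counts per text."""
    counters = [Counter(tokens) for tokens in token_lists]
    shared = set(counters[0])
    for counter in counters[1:]:
        shared &= set(counter)
    return {word: tuple(c[word] for c in counters) for word in shared}


def union(*token_lists):
    """Union: every word found in any text, with counts summed over texts."""
    total = Counter()
    for tokens in token_lists:
        total.update(tokens)
    return total


def jaccard_distance(tokens_a, tokens_b):
    """Assumed distance measure: 1 - |A & B| / |A | B| over vocabularies."""
    vocab_a, vocab_b = set(tokens_a), set(tokens_b)
    if not vocab_a and not vocab_b:
        return 0.0  # two empty texts are treated as identical
    return 1 - len(vocab_a & vocab_b) / len(vocab_a | vocab_b)


if __name__ == "__main__":
    a = tokenize("The cat sat on the mat")
    b = tokenize("The dog sat on the log")
    print(subtract(a, {"the", "on"}))        # ['cat', 'sat', 'mat']
    print(sorted(intersect(a, b).items()))   # shared words, counts per text
    print(union(a, b).most_common(3))        # most frequent words overall
    print(round(jaccard_distance(a, b), 3))  # 0.571
```

In this sketch the intersection keeps one raw count per input text, so the same result can feed both a shared-vocabulary list and a simple distance computation.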
LIT is distributed as freeware and can be downloaded from the CLL's site at
The current version supports English and works on Windows machines.
V.Yatsko, Head of the CLL at KSU