Software Details
| Title: | Software for automatic text processing |
|---|---|
| Submitter: | Slava Yatsko |
| Description: | Dear Colleagues, The Computational Linguistics Laboratory at Katanov State University of Khakasia (CLL at KSU) is pleased to announce the release of Linguistic Toolbox (LIT), a package of programs for automatic text processing. LIT is a concordancer that differs from comparable tools in the following respects. - It has an integrated part-of-speech tagger, which allows users to create their own annotated corpora. In-depth linguistic research is often based on a specific text genre (e.g. fiction, scientific text), a linguistic category (e.g. possession), or the works of a particular author (e.g. Maugham). Publicly available annotated national corpora with evenly distributed genres often fail to meet the demands of such research, and LIT has been designed to fill this gap. With LIT, users can run various searches on their own corpora and obtain statistical information on the distribution of words, patterns, and phrases. - It supports union, subtraction, and intersection operations. In set theory these operations are used to construct new sets from existing ones. Why not perform them on texts, to construct new texts from existing ones? For example, with the subtraction operation the user can subtract stopwords from a text, and with the intersection operation the user can get a list of the words that occur in two or more texts, with raw counts assigned to each word. These functions may be of use for computing distances between texts for the purposes of text classification and categorization. - It has an integrated spreadsheet. Having obtained statistical information with LIT, the user can perform computations in LIT itself without resorting to commercial products such as MS Excel. - It has an integrated WordNet module, with which the user can search not only for a given word but also for words semantically related to it. LIT is distributed as freeware and can be downloaded from the CLL's site (http://www.cll.khsu.ru/cll/products.aspx?productid=5). The current version supports English and runs on Windows machines. V. Yatsko, Head of the CLL at KSU |
| Linguistic Field(s): | Computational Linguistics; Text/Corpus Linguistics |
| LL Issue: | 19.3048 |
| Date Posted: | 08-Oct-2008 |
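
The union, subtraction, and intersection operations mentioned in the description can be illustrated with a short Python sketch. This is not LIT's implementation, which the announcement does not describe; it only shows one plausible reading of the idea using word-count tables. The function names, the tokenization rule, the sample sentences, and the choice of Jaccard distance as the text-distance measure are all assumptions made for illustration. Here the intersection keeps words that occur in every text supplied.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def subtract(counts, stopwords):
    """Drop every word that appears in the stopword collection."""
    return Counter({w: n for w, n in counts.items() if w not in stopwords})


def intersect(*texts):
    """Words occurring in every given text, mapped to their raw count in each text."""
    shared = set(texts[0]).intersection(*texts[1:])
    return {w: tuple(t[w] for t in texts) for w in sorted(shared)}


def union(*texts):
    """All words from all texts, with raw counts summed."""
    total = Counter()
    for t in texts:
        total.update(t)
    return total


def jaccard_distance(a, b):
    """One minus the ratio of shared vocabulary size to combined vocabulary size."""
    va, vb = set(a), set(b)
    if not va and not vb:
        return 0.0
    return 1.0 - len(va & vb) / len(va | vb)


# Hypothetical sample texts and stopword list, for illustration only.
text_a = Counter(tokenize("The cat sat on the mat."))
text_b = Counter(tokenize("The dog sat on the log."))
stopwords = {"the", "on"}

content_a = subtract(text_a, stopwords)   # text A without stopwords
content_b = subtract(text_b, stopwords)   # text B without stopwords
shared = intersect(content_a, content_b)  # words common to both, with per-text counts
combined = union(content_a, content_b)    # merged vocabulary with summed counts
distance = jaccard_distance(content_a, content_b)
```

In this sketch the only word left in both texts after stopword subtraction is "sat", so the intersection has a single entry, and the distance is computed from the vocabularies alone rather than from the counts. A count-sensitive measure could be substituted without changing the other functions.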


