LINGUIST List 8.1822

Sun Dec 21 1997

FYI: Software for Corpus Searches

Editor for this issue: Anita Huang <anita@linguistlist.org>


Directory

  1. Lee Hartman, Software for Corpus Searches

Message 1: Software for Corpus Searches

Date: Wed, 17 Dec 1997 09:30:11 -0600 (CST)
From: Lee Hartman <lhartman@siu.edu>
Subject: Software for Corpus Searches

Software for corpus searches

I'm announcing the release of a software program named
"Busca: A Searcher for word patterns in texts" (Version 3 -- December 1997).

Busca is a DOS-based program that searches a set of text
files for a specified pattern of words or for a string of
characters. When searching for a word pattern, Busca uses the
punctuation of the text to search sentence by sentence. The
word pattern is defined in terms of a focus word, with
possibilities for specifying the first, second, and/or third
neighboring word before and/or after it, as well as a "floating"
word located anywhere in the sentence. Words in the search
template can be defined in terms of their beginning (xxx-),
their ending (-xxx), a contained string (-xxx-), or their
entirety (xxx). Each word position in the template may contain
up to ten alternative forms.
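
The search template described above can be illustrated with a short Python sketch. This is not Busca's code (Busca is a DOS program whose source is not shown here); the function names, the dictionary of neighbor offsets, and the tokenizing rules are all assumptions made for illustration. The sketch also assumes that the "floating" word may coincide with any word of the sentence, including the focus word.

```python
import re

def word_matches(word, spec):
    """Match one word against a spec written as xxx-, -xxx, -xxx-, or xxx."""
    word, spec = word.lower(), spec.lower()
    if len(spec) > 2 and spec.startswith("-") and spec.endswith("-"):
        return spec[1:-1] in word          # contained string (-xxx-)
    if spec.startswith("-"):
        return word.endswith(spec[1:])     # ending (-xxx)
    if spec.endswith("-"):
        return word.startswith(spec[:-1])  # beginning (xxx-)
    return word == spec                    # entire word (xxx)

def slot_matches(word, alternatives):
    """A template position holds a list of alternative forms; any may match."""
    return any(word_matches(word, spec) for spec in alternatives)

def split_sentences(text):
    """Split after '.', '?' or '!' followed by whitespace (assumed rule)."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]

def tokenize(sentence):
    """Treat each run of letters as a word (assumed rule)."""
    return re.findall(r"[^\W\d_]+", sentence)

def search(text, focus, neighbors=None, floating=None):
    """Return the sentences that satisfy the template.

    focus     -- list of alternative specs for the focus word
    neighbors -- hypothetical dict mapping an offset (-3..-1, 1..3)
                 to a list of alternative specs for that position
    floating  -- list of alternative specs for a word anywhere in the sentence
    """
    neighbors = neighbors or {}
    hits = []
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        if floating and not any(slot_matches(w, floating) for w in words):
            continue
        for i, word in enumerate(words):
            if not slot_matches(word, focus):
                continue
            ok = True
            for offset, alternatives in neighbors.items():
                j = i + offset
                if not (0 <= j < len(words)) or not slot_matches(words[j], alternatives):
                    ok = False
                    break
            if ok:
                hits.append(sentence)
                break  # report each sentence once
    return hits
```

For example, search(text, focus=["habl-"], neighbors={1: ["de", "mucho"]}) would return the sentences in which a word beginning with "habl" is immediately followed by "de" or "mucho".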

Busca can search a set of texts spread across a large number of
files, and these files may reside in different DOS directories.
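
As a rough illustration of searching a corpus spread over several directories, the following Python sketch gathers text files and yields their sentences one by one. It is not taken from Busca; the function names, the "*.txt" file pattern, and the Latin-1 encoding are assumptions chosen for the example.

```python
import re
from pathlib import Path

def collect_files(directories, pattern="*.txt"):
    """Gather matching files from each directory, in the order given."""
    files = []
    for directory in directories:
        files.extend(sorted(Path(directory).glob(pattern)))
    return files

def iter_sentences(files, encoding="latin-1"):
    """Yield (file name, sentence) pairs, splitting after '.', '?' or '!'."""
    for path in files:
        text = Path(path).read_text(encoding=encoding)
        for chunk in re.split(r"(?<=[.?!])\s+", text):
            sentence = " ".join(chunk.split())  # collapse line breaks and extra spaces
            if sentence:
                yield str(path), sentence
```

Each sentence can then be tested against a word pattern while the file name is kept for reporting where a match was found.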

Busca was originally designed to be used with a corpus in
Spanish -- the Argentine and Chilean texts of the "Corpus de
Referencia de la Lengua Española Contemporánea" (CRLEC),
accessible at http://lola.lllf.uam.es -- but it can be used with
any set of ASCII text files that use conventional sentence
punctuation ("." and "?" and "!"). The program is available
both in English (busc3eng.zip) and in Spanish (busc3esp.zip).

Busca is intended for free, non-profit distribution. Users
are requested to acknowledge Busca in publication of any
research that benefits from use of the program.

Here is the address from which to download Busca:

 http://www.siu.edu/~nmc/busca.html


------------------------------------------------------------------
Lee Hartman
Dept. of Foreign Languages
Southern Illinois University
Carbondale, IL 62901-4521
U.S.A.