Summary Details


Query:   GREPPING SUMMARY
Author:  Dr R Doctor
Linguistic Field(s):   Computational Linguistics, Morphology

Summary:   Thanks to all who responded to my request for grepping under a DOS
environment with the following syntax:

< grep -r <FN1> <FN2> FN3 >

< where FN1 is the file with the set of strings to be grepped >

< FN2 is the data-base >

< FN3 is the output. >

With the suggestions and help I received, I have literally
"grepped" all about GREP. I got a whole lot of answers, which I
summarise below:

1. The first was to write my own GREP as a Perl script. On both Unix
and DOS, Perl is a language that makes it easy to create a small
program that will do what you ask. More information about Perl,
including free downloads for many environments, can be found on the
Perl language home page, http://www.perl.com/perl/index.html.
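
As a rough illustration of how small such a program can be, here is a minimal sketch. It is written in Python rather than Perl, it is not any respondent's actual script, and the function name grep_patterns and its three parameters are hypothetical names chosen to mirror FN1, FN2 and FN3 above. It assumes one regular expression per line in the pattern file.

```python
import re
from pathlib import Path


def grep_patterns(pattern_file, database_file, output_file):
    """Copy every line of database_file that matches any pattern to output_file.

    Hypothetical helper: pattern_file plays the role of FN1, database_file
    of FN2 and output_file of FN3. Returns the number of lines written.
    """
    # One pattern per line; blank lines are skipped because an empty
    # pattern would match every line of the database.
    patterns = [
        re.compile(line)
        for line in Path(pattern_file).read_text().splitlines()
        if line.strip()
    ]
    matched = 0
    with open(database_file) as database, open(output_file, "w") as output:
        for line in database:
            # Write the line once, however many patterns match it.
            if any(p.search(line) for p in patterns):
                output.write(line)
                matched += 1
    return matched
```

Calling grep_patterns("fn1", "fn2", "fn3") would then play roughly the same role as the grep command line in the original query. Python's regular expression syntax is close to, but not identical with, that of egrep.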

2. The second suggestion was similar in nature: use AWK and LEX tools
for the job.

3. Under a UNIX environment, three types of GREP were proposed:
a. egrep

egrep -f fn1 fn2 > fn3

where fn1 is a file containing the search patterns (one per line). If
you only want to search for literal strings (no special characters)
then you can use fgrep instead of egrep. Do 'man grep' for more
details.

b. fgrep

fgrep -f patt-file-name < database-to-search > results-file
will work, assuming patt-file-name is a file of _strings_, one per
line. Say
man fgrep
to get the details.
One hitch, however: fgrep will only match literal strings, not
regular expressions, so metacharacters in the pattern file have no
special meaning.
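
The difference between the two kinds of matching can be sketched as follows. This is only an illustration in Python, not the implementation of either tool: the names matches_literal and matches_regex are hypothetical, and Python's re module stands in for egrep's pattern syntax, which differs in some details.

```python
import re


def matches_literal(pattern, line):
    """fgrep-style test: the pattern is an ordinary string, taken literally."""
    return pattern in line


def matches_regex(pattern, line):
    """egrep-style test: the pattern is interpreted as a regular expression."""
    return re.search(pattern, line) is not None
```

With the pattern a.c, matches_regex accepts the line abc because the dot stands for any character, whereas matches_literal accepts only lines that contain the three characters a.c themselves.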

c. sgrep
The sgrep utility (not standard UNIX) permits complex (and nested)
patterns to be searched for.


4. Under DOS
The GNU tools are now available under DOS; GNU has only one
grep, which lets you do this (according to the manual) with
grep -f F1 F2 >F3
This works and I have used it with success. Thanks to Andreas Mengel.

Incidentally, egrep, sgrep and fgrep versions for DOS exist and can be
found at:

ftp.rediris.es/mirror/simtelnet/gnu/gnuish/grep15.zip
Thanks to Susana Sotelo Docio.

5. Another suggestion was to use sed:
sed -n -f <file> permits many patterns to be searched for (with
some problems when multiple matches occur on a line).

6. Another solution under DOS was to grep for a large number of
strings at once by combining them into a single regular expression.

A second alternative was to run the operation from a batch file, which
is the solution I am using at present, although I wanted something
more functional.
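
The first alternative in item 6, searching for many strings at once with one regular expression, can be sketched as below. This is an assumption about what the respondent meant rather than a quoted solution; the function name combine_strings is hypothetical, and the sketch uses Python's re module rather than a DOS grep.

```python
import re


def combine_strings(strings):
    """Build one compiled regex that matches any of the given literal strings."""
    unique = {s for s in strings if s}
    if not unique:
        raise ValueError("at least one non-empty search string is required")
    # Longest strings first, so that a longer string is preferred over a
    # shorter one that is its prefix (e.g. "catalog" before "cat").
    ordered = sorted(unique, key=lambda s: (-len(s), s))
    # re.escape keeps characters such as "." or "*" literal.
    return re.compile("|".join(re.escape(s) for s in ordered))
```

The database then needs to be scanned only once, testing each line with the search method of the compiled expression, instead of once per search string.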

7. A commercial solution was also proposed: MKS (Mortice Kern Systems)
in Canada makes a commercial set of Unix apps and commands for use in
DOS and Windows environments, including ksh, awk, grep, gres. Their
grep syntax is:
grep -f pattfile file > output


______________________________________________________________________
Many Thanks to:

Martin Wynne <eiamjw@comp.lancs.ac.uk>
Will Dowling <willd@spectranet.ca>
Kevin Bretonnel Cohen <kevin@cmhcsys.com>
Mark Liberman <myl@unagi.cis.upenn.edu>
John E. Koontz <koontz@boulder.nist.gov>
Peter Hamer <P.G.Hamer@nortel.co.uk>
Stuart Luppescu <s-luppescu@uchicago.edu>
Stephen P Spackman <stephen@softguard.com>
D.Lee <d.lee@lancaster.ac.uk>
Chris Culy <cculy@blue.weeg.uiowa.edu>
David Palmer <palmer@linus.mitre.org>
Shravan Vasishth <vasishth@ling.ohio-state.edu>
Susana Sotelo Docio <fesdocio@usc.es>
Andreas Mengel <mengel@babylon.kgw.tu-berlin.de>

for their prompt and helpful replies to my query.

LL Issue: 8.1178
Date Posted: 14-Aug-1997