Editor for this issue: Ann Dizdar <dizdar@tam2000.tamu.edu>
Dear all:

A while back my colleague Maria Paula Santalla and I (Jose Luis Sancho) posted an enquiry about corpus analysis resources for Spanish. The following is a summary of what we have been referred to.

We would like to thank the following people for their kind responses (in no particular order): Max Louwerse, Mike Scott, Carlos Subirats, Ken Litkowski, Jean Véronis, Yorick Wilks, Sandro Pedrazzini, John Aberdeen, Ana Martínez, Nuno Miguel Cavalheiro Marques and Ken Beesley. This list exhausts our 'inbox'; we therefore beg anyone else who responded and is not mentioned above to forgive us (or our server). In that case, please write to us again.

Note that the enquiry was posted to various lists, so some of the information quoted below may not come from this list. We apologize for any duplication.

##Max Louwerse (<M.M.Louwerse@stud.let.ruu.nl>) told us about the Qualrs-lst, on which a lot of tagging software has been discussed. As for software, he mentioned NUDIST (Sage Publishers) and Notabene, whose homepage is
   http://sls-www.lcs.mit.edu/~flammia/Nb.html
   ftp://sls-www.lcs.mit.edu/pub/flammia/Nb
You can also email Giovanni Flammia (flammia@mit.edu).

##Mike Scott (<ms2928@ac.uk>) suggested
   http://www.liv.ac.uk/~ms2928/wordsmit.html
This gives access to WordSmith Tools (Oxford Univ. Press 1996).

##Carlos Subirats (<lali1@uab.es>) pointed to an 'Etiquetador y desambiguizador del espanol' (a tagger and disambiguator for Spanish), developed by the Laboratorio de Linguistica Informatica de la Universidad Autonoma de Barcelona. The address provided is
   Carlos Subirats Ruggeberg
   Universidad Autonoma de Barcelona
   Laboratorio de Linguistica Informatica
   Edificio B
   08193 Bellaterra, Spain
   e-mail: c.subirats@oasis.uab.es
   e-mail: c.subirats@cc.uab.es
   Fax: (343)-581-16-86
   Tel: (343)-581-22-29

##Ken Litkowski (<71520.307@CompuServe.COM>) directed us to some dictionary utilities for creating and maintaining lexica. A description of this software is available at
   http://www.clres.com

##Jean Véronis (<veronis@univ-aix.fr>) suggested a look at
   http://www.lpl.univ-aix.fr/projects/multext/
and contacting Nuria Bel (nuria@gilcub.es).

##Yorick Wilks (<yorick@dcs.shef.ac.uk>) referred us to david@crl.nmsu.edu

##Sandro Pedrazzini (<sandro@idsia.ch>) pointed to a system with which you can not only create and maintain lexica, but also generate different kinds of taggers and lemmatizers. A description of it can be found at
   http://www.ifi.unibas.ch/grudo/grudo.html
   http://www.idsia.ch/wordmanager.html

##John Aberdeen (<aberdeen@mitre.org>) mentioned a fast part-of-speech tagger, based on Eric Brill's notion of transformation-based error-driven learning.

##Ana Martínez (<sysnet@bitmailer.net>) mentioned MABLe, a 'multilingual letter authoring tool'.

##Nuno Miguel Cavalheiro Marques (<nmm@di.fct.unl.pt>) brought to our attention two POS taggers, one using Viterbi tagging with an HMM and the other using neural networks. You can find a short review of this work at
   http://www-ia.di.fct.unl.pt/~nmm
   http://www-ia.di.fct.unl.pt/~glint/Glint
There you can also access an article about POLARIS: a morphological lexical acquisition and retrieval data base system. He also suggested contacting Gabriel Lopes (gpl@fct.unl.pt).

##Ken Beesley (<Ken.Beesley@Grenoble.RXRC.Xerox.com>) noted that the Rank Xerox Research Centre in Grenoble, France, has developed the following systems for Spanish:
   tokenization (word/term division)
   morphological analysis (for syntax, or, less detailed, for tagging)
   part-of-speech "guesser" (for words not found by the morphological analysis)
   tagging (based on an HMM tagger, trained on a corpus)
You can experiment with the morphological analysis and tagger at
   http://www.xerox.fr/grenoble/mltt/home.html

Thank you very much again.

See you on the net

Jose Luis Sancho       sancho@crea.rae.es
Maria Paula Santalla   santalla@crea.rae.es