Summary Details


Query:   Corpora English and German
Author:  Frank Oswalt
Linguistic Field(s):   Language Documentation; Text/Corpus Linguistics

Summary:   For Query: Linguist 11.1877

Howdy y'all,

A long while back, I asked for information on German and English corpora that are tagged for grammatical functions, as well as on accessible parallel English-German corpora. Here is a summary of the replies I got.


ENGLISH GRAMMATICALLY TAGGED CORPORA

Joybrato Mukherjee (j.mukherjee@uni-bonn.de) drew my attention to the International Corpus of English, which can be ordered from the following website (a very nice demo version is also available for download there):

http://www.ucl.ac.uk/english-usage/ice/


GERMAN GRAMMATICALLY TAGGED CORPORA

George Smith (george@bloomfield.phil1.uni-potsdam.de) drew my attention to the NEGRA and TIGER projects, which can be reached via the following websites:

http://www.coli.uni-sb.de/sfb378/negra-corpus/
http://www.coli.uni-sb.de/cl/projects/tiger/


PARALLEL CORPORA GERMAN-ENGLISH

Anatol Stefanowitsch (anatol@rice.edu) drew my attention to a small web-accessible parallel corpus at the University of Chemnitz:

http://www.tu-chemnitz.de/phil/InternetGrammar/

Some people have their own collections of parallel texts, which they may or may not be willing to share with others (there may be copyright issues here).
The two who agreed to be mentioned here are:
- Raphael Salkie (R.M.Salkie@bton.ac.uk), who has a collection of parallel texts from websites, literature, manuals, EU documents, political writing, and speeches, coming to about 800,000 words in each language.
- Anatol Stefanowitsch, who has a small collection of parallel texts from news magazines (about 15,000 words), and who is in the process of assembling a larger parallel corpus of narrative writing.


VARIOUS

Martin Frost (Martin@sinequa.com) drew my attention to the following websites:

http://www.mpi.nl/world/tg/corpora/corpora.html
http://www.ifi.unizh.ch/CL
http://www.ims.uni-stuttgart.de/projekte/corplex/
http://www.icp.grenet.fr/ELRA/fr/cata/tabtext.html

Thanks also to Klaus Abels, Petra Steiner, and Monika Budde for other helpful hints.

Take care now,
Frank Oswalt

LL Issue: 12.526
Date Posted: 25-Feb-2001

