Summary Details


Query:   Corpora English and German
Author:  Frank Oswalt
Linguistic Field(s):   Language Documentation; Text/Corpus Linguistics

Summary:   For Query: Linguist 11.1877

Howdy y'all,

a long while back I asked for information on German and English corpora which are tagged for grammatical functions, as well as for accessible parallel English-German corpora. Here is a summary of the replies I got.


ENGLISH GRAMMATICALLY TAGGED CORPORA

Joybrato Mukherjee (j.mukherjee@uni-bonn.de) drew my attention to the International Corpus of English, which can be ordered at the following website (which also allows you to download a very nice demo version):

http://www.ucl.ac.uk/english-usage/ice/


GERMAN GRAMMATICALLY TAGGED CORPORA

George Smith (george@bloomfield.phil1.uni-potsdam.de) drew my attention to the NEGRA and TIGER projects, which can be reached via the following websites:

http://www.coli.uni-sb.de/sfb378/negra-corpus/
http://www.coli.uni-sb.de/cl/projects/tiger/


PARALLEL CORPORA GERMAN-ENGLISH

Anatol Stefanowitsch (anatol@rice.edu) drew my attention to a small web-accessible parallel corpus at the University of Chemnitz:

http://www.tu-chemnitz.de/phil/InternetGrammar/

Some people have their own collections of parallel texts, which they may or may not be willing to share with others (there may be copyright issues here).
The two who agreed to be mentioned here are:
- Raphael Salkie (R.M.Salkie@bton.ac.uk), who has a collection of parallel texts from websites, literature, manuals, EU documents, political writing, and speeches, coming to about 800,000 words in each language.
- Anatol Stefanowitsch, who has a small collection of parallel texts from news magazines (about 15,000 words), and who is in the process of assembling a larger parallel corpus of narrative writing.


VARIOUS

Martin Frost (Martin@sinequa.com) drew my attention to the following websites:

http://www.mpi.nl/world/tg/corpora/corpora.html
http://www.ifi.unizh.ch/CL
http://www.ims.uni-stuttgart.de/projekte/corplex/
http://www.icp.grenet.fr/ELRA/fr/cata/tabtext.html

Thanks also to Klaus Abels, Petra Steiner, and Monika Budde for other helpful hints.

Take care now,
Frank Oswalt

LL Issue: 12.526
Date Posted: 25-Feb-2001