LINGUIST List 14.505

Thu Feb 20 2003

FYI: Grad Course, New PhD Program, Spanish Corpus

Editor for this issue: James Yuells


  1. cstit-enquiries, Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
  2. Wolfgang Schulze, International Doctoral Program in Linguistics
  3. Mark Davies, Corpus del Español - 100 million words wide range


Date: Tue, 18 Feb 2003 02:41:15 +0000
From: cstit-enquiries



 This course has replaced the highly successful M.Phil in
 Computer Speech and Language Processing. 

 Like its predecessor, a key aim of the masters course in
 Computer Speech, Text and Internet Technology is to teach the
 fundamental theory of speech and natural language processing.
 However, the new course also focuses on its application to
 information management and access within the framework of emerging
 Internet and W3C standards, such as XML text and speech annotation.

 It runs from early October to the end of July and consists of two terms
 of lectures and practicals followed by a three-month project. The
 final degree is awarded on the basis of coursework, examination
 and project.

 The course differs from some other programmes by providing an
 in-depth practical and theoretical grounding in the techniques for
 speech and language processing which form the basis for today's
 commercial and research prototype systems. There are strong links
 with industry and many of our past students have gone on to work
 for high-tech start-ups and industrial research laboratories, either
 immediately or after completing a PhD.

 To further strengthen our links with industry, we are making this
 course available to students wishing to pursue it on a part-time
 basis. (Note that part-time enrolment requires attendance in
 Cambridge one and a half days per week during term time.)

 Cambridge is a major international centre for research in both
 speech and language processing. The course is taught by leading
 researchers in these areas who have active collaborations with
 industrial and academic laboratories in Europe, the US and Japan.

 The EPSRC have funded a number of studentships for the course which
 are currently available to qualifying applicants. We especially
 encourage applications from students with a background in
 engineering, computer science, mathematics, and/or linguistics.

For further details please consult the course URL:

or contact:

Lise Gough 
University of Cambridge 
Computer Laboratory 
William Gates Building 
15 JJ Thomson Avenue 
Cambridge CB3 0FD 
Tel: +44 (0) 1223 334656 

Message 2: International Doctoral Program in Linguistics

Date: Wed, 19 Feb 2003 09:48:51 +0000
From: Wolfgang Schulze
Subject: International Doctoral Program in Linguistics

University of Munich (Germany)
International Doctoral Program in Linguistics (LIPP)
Language Theory and Applied Linguistics

The Ph.D. program LIPP (Language Theory and Applied Linguistics)
offers a research-oriented, well-structured and optimally supervised
doctoral study program, which will lead to a Ph.D. within three years.

The courses in the program are planned and taught by professors from
the 12 participating disciplines; they are specialists in their field
and represent a broad and interrelated spectrum of theoretical
positions, methodological approaches and practical applications. The
program is interdisciplinary and focusses on linguistic theory as well
as language use and their interrelation. Doctoral dissertations
comprise studies from a general linguistic viewpoint, comparative
approaches, the investigation of a specific language (synchronic or
historical), as well as studies of texts and discourses with a view to
their institutional context and the social significance and impact of
their language.

One of LIPP's goals is to improve international cooperation and
scholarly multi-lingualism. Highly qualified doctoral candidates from
home and abroad are therefore especially encouraged to apply.

Profile: LIPP is a program which takes six semesters (i.e. three
years). Credits are acquired in courses which are specifically geared
to the needs of the doctoral candidates. All candidates are
individually supervised.

The interdisciplinary program comprises four modules approaching
language from different perspectives: 
- Language Phenomenology and Typology
- Empirical Study of Natural Languages and its Methodology
- Language and Society
- Language Theory and Language Modelling

The doctoral candidates participate in courses from all four
modules. After the doctoral dissertation has been accepted, the final
step is a defence of the dissertation.

Eligible candidates should have an excellent degree
(Magister Artium, diploma, Staatsexamen, Master of Arts (with Thesis),
Maîtrise, Laurea, etc.) in one of the subjects affiliated to the
doctoral program, in particular:

Albanian Studies, English Philology/Linguistics, Finno-Ugric
Studies/Uralic Studies and Siberian Languages, General and Typological
Linguistics, German as a Foreign Language/Transnational German
Studies, German Philology/Linguistics, Indo-European Studies,
Phonetics and Speech Communication, Psycholinguistics and Elocution,
Romance Philology/Linguistics, Slavonic Philology/Linguistics,
Theoretical Linguistics.

Admission is for the winter semester of each year. Further information
on registration and the selection procedure can be obtained from the
address below. Please apply well in advance.

Linguistik - Internationales Promotions-Programm LIPP
Sprachtheorie und Angewandte Sprachwissenschaft
Ludwigstr. 27, 
D-80539 München
Tel.: +49 89 2180 3846 
Tel.: +49 89 2180 5382 (office)
Fax.: +49 89 2180 13990

Message 3: Corpus del Español - 100 million words wide range

Date: Wed, 19 Feb 2003 14:12:15 -0600
From: Mark Davies
Subject: Corpus del Español - 100 million words wide range

A new version of the 100 million word [Corpus del Español] is now online.
This corpus has been created by Mark Davies of Illinois State University
(with funding from the National Endowment for the Humanities), and is
available for free access and use at

This searchable collection of more than 10,000 texts from the 1200s to the 1900s
allows a wider range of searches than any other corpus of Spanish. Users
can search by:

- synonyms [30,000 word sets]: e.g. what are the most common synonyms of
[inteligente] or [rico]
- collocations [what words occur most with others]: e.g. the most common
adjectives with [cara], the most common nouns that occur after [suave], or
the most common verbs with [chistes]
- frequency: e.g. what new verbs have arisen since the 1800s, or what
synonyms of [roto] are more common in written than in spoken Spanish
- grammatical category: e.g. the most common infinitives occurring after
[imposible de], or the most common adjectives after [noche]
- lemma [word forms]: e.g. the frequency of all of the forms of [decir] -
in the 1200s, 1500s, or 1900s.
- word patterns: e.g. words ending in [-azo], or with [-camin-] anywhere
in the word
- user-defined lists: create your own lists (e.g. words related to
emotions or clothing), and then re-use them in subsequent searches
- any combination of any of the previous searches (example: all forms of
all synonyms of [decir], followed by all forms of all synonyms of [chiste]).

Please feel free to pass along this information to any other teachers or
students who you think might be interested.