LINGUIST List 13.511
Sat Feb 23 2002
FYI: Master's in Computational Ling, New Corpora
Editor for this issue: Marie Klopfenstein <marie@linguistlist.org>
Directory
jpmg, Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
LDC Office, New LDC Corpora
Message 1: Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
Date: Fri, 22 Feb 2002 09:41:56 GMT
From: jpmg <jpmg@eng.cam.ac.uk>
Subject: Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
*** ONE YEAR GRADUATE COURSE ***
** STUDENTSHIPS AVAILABLE **
----------------------------------------------------------------------
COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
ONE YEAR MASTERS COURSE
THE DEPARTMENT OF ENGINEERING AND THE COMPUTER LABORATORY
UNIVERSITY OF CAMBRIDGE
----------------------------------------------------------------------
This new Masters course has replaced the highly successful M.Phil in
Computer Speech and Language Processing.
Like its predecessor, a key aim of the new Masters course in
Computer Speech, Text and Internet Technology is to teach the
fundamental theory of speech and natural language processing.
However, the new course also focuses on its application to
information management and access within the framework of emerging
Internet and W3C standards, such as XML text and speech annotation.
It runs from early October to August and consists of two terms of
lectures and practicals followed by a three-month project. The
final degree is awarded on the basis of coursework, examination
and project.
The course differs from some other programmes by providing an
in-depth practical and theoretical grounding in the techniques for
speech and language processing which form the basis for today's
commercial and research prototype systems. There are strong links
with industry and many of our past students have gone on to work
for high-tech start-ups and industrial research laboratories, either
immediately or after completing a PhD.
To further strengthen our links with industry, we are making this
course available to students wishing to pursue it on a part-time
basis. We currently have two part-time students in their first year
and strongly encourage others to apply. (Note that part-time enrolment
requires attendance in Cambridge for one and a half days per week
during term time.)
Cambridge is a major international centre for research in both
speech and language processing. The course is taught by leading
researchers in these areas who have active collaborations with
industrial and academic laboratories in Europe, the US and Japan.
The EPSRC have funded a number of studentships for the course which
are currently available to qualifying applicants. We especially
encourage applications from students with a background in
engineering, computer science, mathematics, linguistics and/or
psychology.
For further details please consult the course URL:
http://svr-www.eng.cam.ac.uk/cstit/
or contact:
Mrs Mavis Barber (Computer Speech, Text, and Internet Technology)
Department of Engineering, University of Cambridge
Trumpington Street, Cambridge CB2 1PZ, UK
Tel: +44-1223-332752
Fax: +44-1223-332662
Email: cstit-enquiries@eng.cam.ac.uk
----------------------------------------------------------------------
Message 2: New LDC Corpora
Date: Fri, 22 Feb 2002 11:58:45 -0500
From: LDC Office <ldc@ldc.upenn.edu>
Subject: New LDC Corpora
* RST Discourse Treebank *
* Multiple-Translation Chinese Corpus *
The Linguistic Data Consortium (LDC) is pleased to announce the
availability of the RST Discourse Treebank. This ftp publication
has been authored by Lynn Carlson, Daniel Marcu, and Mary Ellen
Okurowski. It contains a selection of 385 Wall Street Journal articles
from the Penn Treebank which have been annotated with discourse
structure in the framework of Rhetorical Structure Theory (RST).
Additionally, the corpus includes a number of human-generated extracts
and abstracts associated with the original documents.
For further information, including a link to the discourse annotation
tool used for this database, please visit:
http://www.ldc.upenn.edu/Catalog/LDC2002T07.html
Institutions that have membership in the LDC during the 2002
Membership Year will be able to receive this corpus free of charge.
Nonmembers may purchase this publication for $100.
*
The Linguistic Data Consortium (LDC) would like to announce the
availability of the Multiple-Translation Chinese Corpus. This ftp
publication was designed to support the development of automatic means
for evaluating translation quality. The corpus consists of 105 stories
drawn from Mandarin Chinese journalistic text. These stories were
translated several times into English by both human translators and MT
systems.
For further information, including a Chinese text with a sample English
translation, please visit:
http://www.ldc.upenn.edu/Catalog/LDC2002T01.html
Institutions that have membership in the LDC during the 2002
Membership Year will be able to receive this corpus free of charge.
Nonmembers may purchase this publication for $400.
*
If you need additional information before placing your order, or
would like to inquire about membership in the LDC, please send email to
<ldc@ldc.upenn.edu> or call (215) 573-1275.
------------------------------------------------------------------
Linguistic Data Consortium      Phone: (215) 573-1275
3615 Market Street              Fax:   (215) 573-2175
Suite 200                       email: ldc@unagi.cis.upenn.edu
Philadelphia, PA 19104-2608     www:   http://www.ldc.upenn.edu