LINGUIST List 13.511

Sat Feb 23 2002

FYI: Master's in Computational Ling, New Corpora

Editor for this issue: Marie Klopfenstein <marie@linguistlist.org>


Directory

  • jpmg, Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY
  • LDC Office, New LDC Corpora

    Message 1: Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY

    Date: Fri, 22 Feb 2002 09:41:56 GMT
    From: jpmg <jpmg@eng.cam.ac.uk>
    Subject: Graduate Course: COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY


    *** ONE YEAR GRADUATE COURSE ***

    ** STUDENTSHIPS AVAILABLE **

    ----------------------------------------------------------------------

    COMPUTER SPEECH, TEXT, AND INTERNET TECHNOLOGY

    ONE YEAR MASTERS COURSE

    THE DEPARTMENT OF ENGINEERING AND THE COMPUTER LABORATORY

    UNIVERSITY OF CAMBRIDGE

    ----------------------------------------------------------------------

    This new Masters course has replaced the highly successful M.Phil in Computer Speech and Language Processing.

    Like its predecessor, a key aim of the new Masters course in Computer Speech, Text and Internet Technology is to teach the fundamental theory of speech and natural language processing. However, the new course also focuses on its application to information management and access within the framework of emerging Internet and W3C standards, such as XML text and speech annotation.

    It runs from early October to August and consists of two terms of lectures and practicals followed by a three-month project. The final degree is awarded on the basis of coursework, examination and project.

    The course differs from some other programmes by providing an in-depth practical and theoretical grounding in the techniques for speech and language processing which form the basis for today's commercial and research prototype systems. There are strong links with industry and many of our past students have gone on to work for high-tech start-ups and industrial research laboratories, either immediately or after completing a PhD.

    To further strengthen our links with industry, we are making this course available to students wishing to pursue it on a part-time basis. We currently have two part-time students in their first year and strongly encourage others to apply. (Note that part-time enrolment requires attendance in Cambridge for one and a half days per week during term time.)

    Cambridge is a major international centre for research in both speech and language processing. The course is taught by leading researchers in these areas who have active collaborations with industrial and academic laboratories in Europe, the US and Japan.

    The EPSRC have funded a number of studentships for the course which are currently available to qualifying applicants. We especially encourage applications from students with a background in engineering, computer science, mathematics, linguistics and/or psychology.

    For further details please consult the course URL:

    http://svr-www.eng.cam.ac.uk/cstit/

    or contact:

    Mrs Mavis Barber (Computer Speech, Text, and Internet Technology)
    Department of Engineering, University of Cambridge
    Trumpington Street, Cambridge CB2 1PZ, UK
    Tel: +44-1223-332752
    Fax: +44-1223-332662
    Email: cstit-enquiries@eng.cam.ac.uk

    ----------------------------------------------------------------------

    Message 2: New LDC Corpora

    Date: Fri, 22 Feb 2002 11:58:45 -0500
    From: LDC Office <ldc@ldc.upenn.edu>
    Subject: New LDC Corpora


    * RST Discourse Treebank *

    * Multiple-Translation Chinese Corpus *

    The Linguistic Data Consortium (LDC) is pleased to announce the availability of the RST Discourse Treebank. This ftp publication has been authored by Lynn Carlson, Daniel Marcu, and Mary Ellen Okurowski. It contains a selection of 385 Wall Street Journal articles from the Penn Treebank which have been annotated with discourse structure in the framework of Rhetorical Structure Theory (RST). Additionally, the corpus includes a number of human-generated extracts and abstracts associated with the original documents.

    For further information, including a link to the discourse annotation tool used for this database, please visit:

    http://www.ldc.upenn.edu/Catalog/LDC2002T07.html

    Institutions that have membership in the LDC during the 2002 Membership Year will be able to receive this corpus free of charge. Nonmembers may purchase this publication for $100.

    *

    The Linguistic Data Consortium (LDC) would like to announce the availability of the Multiple-Translation Chinese Corpus. This ftp publication was designed to support the development of automatic means for evaluating translation quality. The corpus consists of 105 stories drawn from Mandarin Chinese journalistic text. These stories were translated several times into English by both human translators and MT systems.

    For further information, including a Chinese text with a sample English translation, please visit:

    http://www.ldc.upenn.edu/Catalog/LDC2002T01.html

    Institutions that have membership in the LDC during the 2002 Membership Year will be able to receive this corpus free of charge. Nonmembers may purchase this publication for $400.

    *

    If you need additional information before placing your order, or would like to inquire about membership in the LDC, please send email to <ldc@ldc.upenn.edu> or call (215) 573-1275.

    ------------------------------------------------------------------
    Linguistic Data Consortium     Phone: (215) 573-1275
    3615 Market Street             Fax: (215) 573-2175
    Suite 200                      email: ldc@unagi.cis.upenn.edu
    Philadelphia, PA 19104-2608    www: http://www.ldc.upenn.edu