LINGUIST List 21.2291

Thu May 20 2010

Diss: Comp Ling/Translation: Barreiro: 'Make It Simple...'

Editor for this issue: Mfon Udoinyang <mfonlinguistlist.org>


Directory
        1.    Anabela Barreiro, Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Message 1: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation
Date: 19-May-2010
From: Anabela Barreiro <barreiro_anabelahotmail.com>
Subject: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Institution: Universidade do Porto
Program: Machine Translation
Dissertation Status: Completed
Degree Date: 2008

Author: Anabela Marques Barreiro

Dissertation Title: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Dissertation URL: http://www.linguateca.pt/Repositorio/AB-Thesis_030409.pdf

Linguistic Field(s): Translation
                            Computational Linguistics

Subject Language(s): English (eng)
                            Portuguese (por)

Dissertation Directors:
Belinda Maia
Adam Meyers

Dissertation Abstract:

This dissertation introduces a novel approach to improving machine
translation by focusing on paraphrasing of support verb constructions. The
challenge of the research was to paraphrase predicate nominal expressions
such as fazer uma análise (to do an analysis) with predicate verbals, such
as analisar (to analyse), applying language paraphrasing capabilities to
produce better machine translation results. In particular cases, the
paraphrasing consisted in replacing the semantically weak support verb of
the predicate nominal construction with lexical-syntactic and stylistic
variants, such as realizar uma análise or efectuar uma análise (to perform
an analysis). When support verb constructions were identified and replaced
with semantically equivalent or similar verbal expressions as a
pre-processing step to translating, an average 21% improvement was observed
in the evaluated quality of the results of Portuguese-English machine
translation, and an average 31% improvement in the results of
English-Portuguese machine translation. The research was based on a
contrastive linguistic analysis of support verb constructions and of their
paraphrases, which were organized in several syntactic-semantic subclasses
according to the theoretical and methodological principles of the
Lexicon-Grammar Theory, established in the Harrisian framework of
Transformational Operator Grammar. This study looked into one particular
category of multiword expression, the support verb construction, but it was
designed to be repeatable and extensible to other types of multiword
expression, namely to idiomatic expressions such as dar o braço a torcer
(to give up) and to syntactically free constructions, such as noun phrase
coordination or the passive voice. All linguistic information was
formalized in dictionaries and grammars developed with the NooJ linguistic
environment. This linguistic information was explored for several natural
language processing tasks, from both a monolingual and a bilingual
perspective. The Portuguese-English bilingual resources of the open source
Port4NooJ natural language processing system were built as groundwork for
the study. They integrate the SAL ontology of the OpenLogos system. Based
on Port4NooJ, the automated paraphrasing tools ReWriter and ParaMT
were also created to re-write and translate support verb constructions.
ReEscreve, the Portuguese version of ReWriter, is being used as an
authoring aid online public service and its interface is described in this
dissertation. The automated paraphrasing of support verb constructions
through ReEscreve yields a 40% improvement in the quality of the machine
translation results in that context.
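
The pre-processing idea described in the abstract can be illustrated with a
minimal Python sketch. It is not the dissertation's implementation, which
relies on NooJ dictionaries and grammars. The tiny lexicon, the function name
paraphrase_svc, and the restriction to infinitive forms are all assumptions
made for illustration; only the example expressions come from the abstract.

```python
import re

# Hypothetical toy lexicon: predicate noun -> equivalent full verb.
# The single entry is the abstract's own example; this is NOT the
# Port4NooJ dictionary.
PREDICATE_NOUNS = {
    "análise": "analisar",
}

# Support verbs mentioned in the abstract (infinitive forms only;
# inflected forms are deliberately out of scope for this sketch).
SUPPORT_VERBS = ("fazer", "realizar", "efectuar")

# Pattern: support verb + determiner + predicate noun.
SVC_PATTERN = re.compile(
    r"\b(?:{verbs})\s+(?:uma|um|a|o)\s+({nouns})\b".format(
        verbs="|".join(SUPPORT_VERBS),
        nouns="|".join(re.escape(n) for n in PREDICATE_NOUNS),
    )
)


def paraphrase_svc(text: str) -> str:
    """Replace each matched support verb construction with its verb."""
    return SVC_PATTERN.sub(lambda m: PREDICATE_NOUNS[m.group(1)], text)


if __name__ == "__main__":
    # The rewritten sentence would then be passed to a translation system.
    print(paraphrase_svc("Vamos fazer uma análise."))
```

A real system would also need to handle inflected verb forms, inserted
modifiers, and the complements of the predicate noun, which this sketch
ignores.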






Page Updated: 20-May-2010
