LINGUIST List 21.2291

Thu May 20 2010

Diss: Comp Ling/Translation: Barreiro: 'Make It Simple...'

Editor for this issue: Mfon Udoinyang <mfonlinguistlist.org>


        1.    Anabela Barreiro, Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Message 1: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation
Date: 19-May-2010
From: Anabela Barreiro <barreiro_anabelahotmail.com>
Subject: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Institution: Universidade do Porto
Program: Machine Translation
Dissertation Status: Completed
Degree Date: 2008

Author: Anabela Marques Barreiro

Dissertation Title: Make It Simple with Paraphrases: Automated paraphrasing for authoring aids and machine translation

Dissertation URL: http://www.linguateca.pt/Repositorio/AB-Thesis_030409.pdf

Linguistic Field(s): Computational Linguistics, Translation
Subject Language(s): English (eng), Portuguese (por)
Dissertation Director:
Belinda Maia
Adam Meyers
Dissertation Abstract:

This dissertation introduces a novel approach to improving machine translation by focusing on paraphrasing of support verb constructions. The challenge of the research was to paraphrase predicate nominal expressions such as fazer uma análise (to do an analysis) with predicate verbals, such as analisar (to analyse), applying language paraphrasing capabilities to produce better machine translation results. In particular cases, the paraphrasing consisted in replacing the semantically weak support verb of the predicate nominal construction with lexical-syntactic and stylistic variants, such as realizar uma análise or efectuar uma análise (to perform an analysis). When support verb constructions were identified and replaced with semantically equivalent or similar verbal expressions as a pre-processing step to translating, an average 21% improvement was observed in the evaluated quality of the results of Portuguese-English machine translation, and an average 31% improvement in the results of English-Portuguese machine translation. The research was based on a contrastive linguistic analysis of support verb constructions and of their paraphrases, which were organized in several syntactic-semantic subclasses according to the theoretical and methodological principles of the Lexicon-Grammar Theory, established in the Harrisian framework of Transformational Operator Grammar. This study looked into one particular category of multiword expression, support verb construction, but it was designed to be repeatable and extensible to other types of multiword expression, namely to idiomatic expressions such as dar o braço a torcer (to give up) and to syntactically free constructions, such as noun phrase coordination or the passive voice. All linguistic information was formalized in dictionaries and grammars developed with the NooJ linguistic environment. This linguistic information was explored for several natural language processing tasks, from both a monolingual and a bilingual perspective.
The Portuguese-English bilingual resources of the open source Port4NooJ natural language processing system were built as groundwork for the study. They integrate the SAL ontology of the OpenLogos system. Based on Port4NooJ, automated paraphrasing software tools ReWriter and ParaMT were also created to re-write and translate support verb constructions. ReEscreve, the Portuguese version of ReWriter, is being used as an authoring aid online public service and its interface is described in this dissertation. The automated paraphrasing of support verb constructions through ReEscreve allows a 40% improvement of the quality of the machine translation results in that context.
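
As a rough illustration of the pre-processing idea described in the abstract, the sketch below rewrites a support verb construction as a single full verb before the text would be passed to a translation system. It is only a toy under stated assumptions: the lexicon SVC_LEXICON, the function paraphrase_svc, and the regular-expression matching are hypothetical and are not taken from the dissertation, which formalizes this information in NooJ dictionaries and grammars. The sketch handles only infinitive forms with no following complement.

```python
import re

# Hypothetical toy lexicon: (support verb, predicate noun) -> full verb.
# The examples come from the abstract; the data structure itself is an assumption.
SVC_LEXICON = {
    ("fazer", "análise"): "analisar",
    ("realizar", "análise"): "analisar",
    ("efectuar", "análise"): "analisar",
}

# Determiners allowed between the support verb and the noun (assumed, not exhaustive).
DETERMINERS = r"(?:uma|a)"


def paraphrase_svc(sentence: str) -> str:
    """Replace known support verb constructions with their full-verb paraphrase."""
    for (verb, noun), full_verb in SVC_LEXICON.items():
        pattern = rf"\b{verb}\s+{DETERMINERS}\s+{noun}\b"
        sentence = re.sub(pattern, full_verb, sentence)
    return sentence


if __name__ == "__main__":
    print(paraphrase_svc("Vamos fazer uma análise."))
```

A realistic system would also need inflected verb forms, modifiers inside the construction, and adjustment of the noun's complement (for example, the preposition in fazer uma análise de), which is the kind of information a grammar-based approach is meant to capture.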



Page Updated: 20-May-2010