LINGUIST List 30.2627
Wed Jul 03 2019
Diss: German; Greek, Modern; Computational Linguistics; Morphology; Text/Corpus Linguistics; Translation: Christina Valavani: ''Analysis and Processing of German Multi-word Financial Terms in Bilingual and Multilingual Applications''
Editor for this issue: Sarah Robinson <srobinsonlinguistlist.org>
Date: 01-Jul-2019
From: Christina Valavani <cvalavani hotmail.com>
Subject: Analysis and Processing of German Multi-word Financial Terms in Bilingual and Multilingual Applications
Institution: National and Kapodistrian University of Athens
Program: Department of German Language and Literature
Dissertation Status: Completed
Degree Date: 2019
Author: Christina Valavani
Dissertation Title: Analysis and Processing of German Multi-word Financial Terms in Bilingual and Multilingual Applications
Linguistic Field(s): Computational Linguistics
Morphology
Text/Corpus Linguistics
Translation
Subject Language(s): German (deu)
Greek, Modern (ell)
Dissertation Director: Christina Alexandris
Georgios Mikros
Batsalia Freideriki
Dissertation Abstract:
The present thesis concerns the analysis, translation and processing of German multi-word compounds used as financial and economic terms in journalistic texts and business news. These compounds are analyzed with respect to Modern Greek, and their machine translation is evaluated with available online machine translation tools. In particular, the Google Translate machine translation tool is used, as well as its latest updated version (with Deep Learning). Finally, an algorithm and a statistical approach are proposed for the correct analysis and processing of German multi-word financial and economic terms.
The present study compares theoretical models in German and in Modern Greek (for example, Sternefeld 2006, Elsen 2011 and Ralli 2007) for the analysis of compound words and multi-word compounds. The analysis is based on empirical data from a large corpus of German financial texts and business news collected online from major German media and the German press. From the empirical data, the sixteen (16) most commonly occurring structures of German multi-word compounds used as financial and economic terms are determined (1).
In addition, a parallel corpus is constructed from the German financial texts and business news and the respective equivalent terms in Modern Greek. Alongside the parallel corpus, a separate database records the errors and error types in the machine translation of German multi-word financial and economic terms into Modern Greek. The errors and error types are determined according to a specified set of criteria and related research in the domain of Terminology. Five (5) main categories and various sub-categories of machine translation errors are defined and evaluated (2). We note here that, due to the particularities of the German-Greek language pair, most error categories persist despite the latest developments in Machine Translation. From the empirical data and findings, a set of translation models is determined (3). The constructed translation models form the basis for the proposed theoretical model, which integrates the existing theoretical models in German and in Modern Greek (Sternefeld 2006, Elsen 2011 and Ralli 2007) (4). The theoretical model is connected to the proposed algorithm for analyzing German multi-word financial and economic terms. The algorithm (5), which involves the use of IBM Models, produces re-ordered (re-ordering algorithm), re-constructed and re-phrased equivalent financial terms and expressions in Modern Greek, aiming at precision and correctness. The proposed algorithm is reinforced with statistical models, for example Bhattacharyya (2015), which show a clear difference in output quality and efficiency compared with lexical-based approaches.
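The abstract does not say which IBM Models are used or how they are integrated into the proposed algorithm. For orientation only, the following is a minimal sketch of the simplest member of that family, IBM Model 1, which learns word translation probabilities from sentence-aligned data by expectation-maximization. The function name `train_ibm_model1`, the toy German-Greek pairs and the pre-split compound components are all assumptions made for illustration; none of them are taken from the dissertation.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """EM training for IBM Model 1 (hypothetical helper, not the thesis algorithm).

    pairs: list of (german_tokens, greek_tokens) tuples.
    Returns a dict t with t[(g, k)] = P(german word g | greek word k).
    """
    german_vocab = {g for german, _ in pairs for g in german}
    # Start from a uniform distribution over the German vocabulary.
    t = defaultdict(lambda: 1.0 / len(german_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected co-occurrence counts
        total = defaultdict(float)   # expected counts per Greek word
        for german, greek in pairs:
            for g in german:
                # E-step: spread one unit of mass for g over the Greek words.
                norm = sum(t[(g, k)] for k in greek)
                for k in greek:
                    delta = t[(g, k)] / norm
                    count[(g, k)] += delta
                    total[k] += delta
        # M-step: re-estimate the translation probabilities.
        t = defaultdict(float, {(g, k): c / total[k] for (g, k), c in count.items()})
    return t

# Invented toy data: German compounds already split into components (assumption).
toy_pairs = [
    (["aktien", "markt"], ["αγορά", "μετοχών"]),
    (["aktien", "kurs"], ["τιμή", "μετοχών"]),
]
t = train_ibm_model1(toy_pairs)
for (g, k), p in sorted(t.items()):
    print(f"P({g} | {k}) = {p:.3f}")
```

On this toy data the probability mass for "μετοχών" shifts toward "aktien", the component it co-occurs with in both pairs. Model 1 ignores word order, so the re-ordering step mentioned in the abstract would need a separate component, which is not sketched here.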
Page Updated: 03-Jul-2019