Publishing Partner: Cambridge University Press CUP Extra Publisher Login

New from Cambridge University Press!


Revitalizing Endangered Languages

Edited by Justyna Olko & Julia Sallabank

Revitalizing Endangered Languages "This guidebook provides ideas and strategies, as well as some background, to help with the effective revitalization of endangered languages. It covers a broad scope of themes including effective planning, benefits, wellbeing, economic aspects, attitudes and ideologies."

E-mail this page 1

We Have a New Site!

With the help of your donations we have been making good progress on designing and launching our new website! Check it out at!
***We are still in our beta stages for the new site--if you have any feedback, be sure to let us know at***

Dissertation Information

Title: Towards an image-term co-occurence model for multilingual terminology alignment and cross-language image indexing Add Dissertation
Author: Diego Burgos H. Update Dissertation
Email: click here to access email
Institution: Universitat Pompeu Fabra, Linguistic Sciences and Applied Linguistics
Completed in: 2014
Linguistic Subfield(s): Computational Linguistics; Text/Corpus Linguistics; Translation;
Subject Language(s): English
Director(s): Leo Wanner

Abstract: This thesis addresses the potential that the relation between terms and images in multilingual specialized documentation has for glossary compilation, terminology alignment, and image indexing. It takes advantage of the recurrent use of these two modes of communication (i.e., text and images) in digital documents to build a bimodal co-occurrence model which aims at dynamically compiling glossaries of a wider coverage. The model relies on the developments of content-based image retrieval (CBIR) and text processing techniques. CBIR is used to make two images from different origin match, and text processing supports term recognition, artifact noun classification, and image-term association. The model aligns one image with its denominating term from collateral text, and then aligns this image with another image of the same artifact from a different document, which also enables the alignment of the two equivalent denominating terms. The ultimate goal of the model is to tackle the limitations and drawbacks of current static terminological repositories by generating bimodal, bilingual glossaries that reflect real usage, even when terms and images may originate from noisy corpora.