|
Description:
|
This book provides an overview of various techniques for the alignment of
bitexts. It describes general concepts and strategies that can be applied to
map corresponding parts in parallel documents on various levels of
granularity. Bitexts are valuable linguistic resources for many different
research fields and practical applications. The predominant application
is machine translation, in particular statistical machine translation. However,
there are various other lines of work that can draw on the rich linguistic
knowledge implicitly stored in parallel resources. Bitexts have been
explored in lexicography, word sense disambiguation, terminology
extraction, computer-aided language learning, and translation studies, to name
just a few. The book covers the essential tasks that have to be carried out
when building parallel corpora, from the collection of translated
documents up to sub-sentential alignments. In particular, it describes various
approaches to document alignment, sentence alignment, word alignment and
tree structure alignment. It also includes a list of resources and a
comprehensive review of the literature on alignment techniques.
Table of Contents: Introduction / Basic Concepts and Terminology / Building
Parallel Corpora / Sentence Alignment / Word Alignment / Phrase and Tree
Alignment / Concluding Remarks
|