Dissertation Information

Title: Cross-Lingual Voice Conversion
Author: Oytun Turk
Institution: Boğaziçi University, Electrical and Electronics Engineering
Completed in: 2007
Linguistic Subfield(s): Computational Linguistics
Director(s): Prof. Dr. Levent Arslan

Abstract: Cross-lingual voice conversion refers to the automatic transformation of a
source speaker’s voice to a target speaker’s voice in a language that the
target speaker cannot speak. It involves a set of statistical analysis,
pattern recognition, machine learning, and signal processing techniques.
This study focuses on the problems related to cross-lingual voice
conversion by discussing open research questions, presenting new methods,
and performing comparisons with state-of-the-art techniques. In the
training stage, a Phonetic Hidden Markov Model based automatic segmentation
and alignment method is developed for cross-lingual applications that
support text-independent and text-dependent modes. The vocal tract
transformation function is estimated in more detail using weighted speech
frame mapping. By adjusting the weights, similarity to the target voice and
output quality can be balanced depending on the requirements of the
cross-lingual voice conversion application. A context-matching algorithm is
developed to reduce one-to-many mapping problems and enable non-parallel
training. Another set of improvements is proposed for prosody
transformation, including stylistic modeling and transformation of pitch
and speaking rate. A high-quality cross-lingual voice conversion database
is designed for the evaluation of the proposed methods. The database
consists of recordings from bilingual speakers of American English and
Turkish. It is employed in objective and subjective evaluations, and in
case studies for testing new ideas in cross-lingual voice conversion.
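
The abstract does not give the details of the weighted speech frame mapping, so the following is only an illustrative sketch of the general idea, not the dissertation's method. It assumes paired training frames (source and target feature vectors that have already been aligned) and a hypothetical `sharpness` parameter that controls the weights: a large value approaches a nearest-neighbour lookup of a single target frame, while a small value averages over many target frames and gives a smoother output. The function name `convert_frame` and the exponential weighting are assumptions made for this example.

```python
import math


def convert_frame(frame, source_frames, target_frames, sharpness=1.0):
    """Map one source feature vector to a weighted average of target frames.

    source_frames[i] and target_frames[i] are assumed to be an aligned pair.
    `sharpness` is a hypothetical knob, not a parameter from the dissertation.
    """
    # Euclidean distance from the input frame to every training source frame.
    dists = [math.dist(frame, s) for s in source_frames]

    # Exponential weights; subtracting the minimum distance keeps exp() stable.
    d_min = min(dists)
    weights = [math.exp(-sharpness * (d - d_min)) for d in dists]
    total = sum(weights)

    # Weighted average of the paired target frames, one dimension at a time.
    dim = len(target_frames[0])
    return [
        sum(w * t[i] for w, t in zip(weights, target_frames)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    src = [[0.0, 0.0], [1.0, 1.0]]
    tgt = [[10.0, 10.0], [20.0, 20.0]]
    print(convert_frame([0.0, 0.0], src, tgt, sharpness=0.0))
    print(convert_frame([0.0, 0.0], src, tgt, sharpness=50.0))
```

With `sharpness=0.0` every training frame contributes equally, and with a large value the output is dominated by the target frame paired with the closest source frame. This mirrors, in a simplified form, the kind of trade-off between smoothness and closeness to the target that the abstract attributes to adjusting the weights.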