LINGUIST List 18.2951
Wed Oct 10 2007
Diss: Comp Ling: Turk: 'Cross-Lingual Voice Conversion'
Editor for this issue: Luiza Newlin Lukowicz
<luizalinguistlist.org>
Directory
1. Oytun Turk, Cross-Lingual Voice Conversion
Message 1: Cross-Lingual Voice Conversion
Date: 10-Oct-2007
From: Oytun Turk <oytunturkgmail.com>
Subject: Cross-Lingual Voice Conversion
Institution: Boğaziçi University
Program: Electrical and Electronics Engineering
Dissertation Status: Completed
Degree Date: 2007
Author: Oytun Turk
Dissertation Title: Cross-Lingual Voice Conversion
Linguistic Field(s):
Computational Linguistics
Dissertation Director:
Prof. Dr. Levent Mustafa Arslan
Dissertation Abstract:
Cross-lingual voice conversion refers to the automatic transformation of a source speaker's voice to a target speaker's voice in a language that the target speaker cannot speak. It involves a set of statistical analysis, pattern recognition, machine learning, and signal processing techniques. This study focuses on the problems related to cross-lingual voice conversion by discussing open research questions, presenting new methods, and performing comparisons with state-of-the-art techniques. In the training stage, a phonetic Hidden Markov Model based automatic segmentation and alignment method is developed for cross-lingual applications; it supports both text-independent and text-dependent modes. The vocal tract transformation function is estimated in more detail using weighted speech frame mapping. By adjusting the weights, similarity to the target voice and output quality can be balanced depending on the requirements of the cross-lingual voice conversion application. A context-matching algorithm is developed to reduce one-to-many mapping problems and enable non-parallel training. Another set of improvements is proposed for prosody transformation, including stylistic modeling and transformation of pitch and speaking rate. A high-quality cross-lingual voice conversion database is designed for the evaluation of the proposed methods. The database consists of recordings from bilingual speakers of American English and Turkish. It is employed in objective and subjective evaluations, and in case studies for testing new ideas in cross-lingual voice conversion.