LINGUIST List 24.3117

Wed Jul 31 2013

Qs: Issues in creating a speech corpus

Editor for this issue: Alex Isotalo

Date: 26-Jul-2013
From: Pankaj Dwivedi
Subject: Issues in creating a speech corpus

Hello all,

I am working on a lesser-known dialect of Hindi. I have around 15 hours of speech data in this dialect, recorded with a professional recorder (Olympus LS100). The data consist mainly of free discourse on a variety of topics, such as stories, daily routines, recipes, and personal experiences, along with common words spoken in isolation. I have also created text files/TextGrids for the audio files using Praat. I am wondering whether I can create a small speech corpus out of this material. If so, how? What should my next step be? I would also like to build a TTS system for the dialect. Is that possible? Please explain the process to me step by step.

Your help will be duly acknowledged in research publications, in the form of co-authorship.

Thank you!

Linguistic Field(s): Computational Linguistics; Text/Corpus Linguistics

Page Updated: 31-Jul-2013