Corpus-based methods will be found at the heart of many language and speech processing systems. This book provides an in-depth introduction to these technologies through chapters describing basic statistical modeling techniques for language and speech, the use of Hidden Markov Models in continuous speech recognition, the development of dialogue systems, part-of-speech tagging and partial parsing, data-oriented parsing and n-gram language modeling.
The book attempts to give both a clear overview of the main technologies used in language and speech processing, along with sufficient mathematics to understand the underlying principles. There is also an extensive bibliography to enable topics of interest to be pursued further. Overall, we believe that the book will give newcomers a solid introduction to the field and it will give existing practitioners a concise review of the principal technologies used in state-of-the-art language and speech processing systems.