A key concern of researchers involved in the creation and sharing of language resources is to attain maximum usability, reliability and longevity of these resources for present and future researchers in the language sciences. The view developed in this volume is that spoken corpora construction and sharing are major research endeavours that should also be laid open to academic debate in a manner that is more visible than is currently the case in corpus linguistics.
The present volume brings multiple research perspectives to bear on the question of what constitutes best practice in the construction of spoken corpora. The book brings into closer contact scholars whose specializations have often remained in relatively separate streams of scientific investigation: on the one hand, scholars whose work falls primarily in conversation analysis, pragmatics and discourse analysis, but who are involved in spoken corpus compilation; on the other hand, scholars who also specialize in linguistics but who have been intensively involved in developing various infrastructures for spoken corpora. This combination of scholars brings into sharper relief the concerns of data providers, data curators and data users in linguistic research.
This book is thus unique in that it highlights best practices both from the perspective of assembling, annotating and linguistically analysing spoken corpora and from the perspective of processing, archiving and disseminating spoken language. In doing so, the contributions emphasise not only the considerable promise offered by the rapid technological changes that society continues to experience in this area, but also the possible dangers for the unwary.