Call Deadline: 30-Sep-2016
Computer science has allowed linguists to tackle big data in big corpora. Technology has stirred many debates about data collection, corpus design and statistical significance. Corpus linguistics has gained in theoretical strength as corpus linguists have explored both the notion of the corpus and the corpus as an object. Large corpora have been extensively dealt with - but what about "small corpora"? Small corpora (SC), in contrast, have drawn little attention from the scientific community so far. What is a small corpus? Are there several types? When and how are they used? Could we - or should we - do without them? SC raise two main questions:
1) Is a SC any corpus that cannot be categorized as a big corpus?
2) What are the epistemological and methodological implications of SC?
This special issue aims to discuss SC directly, steering the debate away from notions that have already been widely discussed (the quantitative/qualitative opposition, representativeness, implementation, etc.). We want to focus on SC in their own right, starting by acknowledging their existence and accounting for their actual uses: linguists and language science scholars use them constantly.
Common SC include doctoral research corpora, test corpora (used to train or test an automatic process or implementation), and exploratory corpora used to investigate the validity of a topic, a method or a data set. Under-documented or little-documented languages are studied on small(er) corpora. Dead languages provide us with finite corpora: do they share issues with SC? Linguists can also work with a zero corpus or with created data (as with invented languages). It thus appears that SC play a seminal role in language research, all the more so since they are usually used at early stages. Their use is ubiquitous in language studies today, and this raises issues in multiple domains of the field.
SC force us to consider our research practices for what they are. As Antoinette Renouf (2007) explains, "my analysis of the continuing creation of small corpora when it is technologically possible to create larger ones is that here necessity is playing a larger role".
Whether searchable "by hand" (Cameron & Deignan 2003) or analyzed by the linguist (Koester 2010), SC can be defined by their practicality. Other definitions are quantitative and set different minimum word counts (Facchinetti 2007, McEnery & Wilson 1996). But Vaughan & Clancy (2013) noticed how blurry the limit actually is.
We invite contributions dealing with SC as such, with their potential specificities in design, use and purpose. Contributions from any field of language science are welcome. Theoretical and epistemological considerations will be particularly appreciated.
Possible axes of investigation include, but are not limited to: the practical problem (why choose SC? Is it always a choice?); the specialization problem (why would SC be specialized? What does specialization mean exactly?); the purpose problem (do certain types of SC depend on research domains, for example? What about SC for purposes other than research, e.g. language learning?).
Please send a long abstract (max. 2 pages) in both PDF and text formats, together with a statement of purpose, in French, English, Italian or Spanish, to firstname.lastname@example.org by September 30, 2016.
Sept. 30, 2016: deadline for abstract submission
Nov. 30, 2016: notification of acceptance
May 15, 2017: deadline for complete articles
Jan. 15, 2018: publication of Corpus special issue (digital and printed versions)
Charlotte Danino (Associate professor, Université Paris 3 - Sorbonne Nouvelle): email@example.com