LINGUIST List 35.3417

Wed Dec 04 2024

Confs: Scoping Workshop: Corpus linguistics 2040: Which data, which methods, which models?

Editor for this issue: Erin Steitz <ensteitz@linguistlist.org>



Date: 28-Nov-2024
From: Christian Mair <christian.mair@anglistik.uni-freiburg.de>
Subject: Scoping Workshop: Corpus linguistics 2040: Which data, which methods, which models?

Scoping workshop: “Corpus linguistics 2040: Which data, which methods, which models?”
Short Title: FutureCorp

Date: 10-Jul-2025 - 11-Jul-2025
Location: Institut für Deutsche Sprache (IDS), Mannheim, Germany
Contact: Andreas Witt
Contact Email: [email protected]
Meeting URL: https://www.ids-mannheim.de/fi/veranstaltungen/workshop-corpus-linguistics-2040/

Linguistic Field(s): Computational Linguistics; General Linguistics; Text/Corpus Linguistics; Translation
Subject Language(s): English (eng)
German (deu)
Italian (ita)
Lithuanian (lit)
Spanish (spa)
Language Family(ies): Indo-European

Meeting Description:

This two-day event, jointly organised by the English Department of the University of Freiburg and the Institut für Deutsche Sprache (IDS) in Mannheim, is designed as a scoping workshop on the future of corpus linguistics, highlighting empirical, methodological and conceptual issues facing our research community. Although the two organising institutions focus on English and German, corpus linguists working on other languages are explicitly invited to attend and contribute. We are convinced that debate across specialisations and language boundaries will be mutually beneficial.

The study of language structure, variation and change with digital corpora has moved from the margins to the centre of linguistics over the past five decades, promoting usage-based models within linguistics and making (corpus-)linguistics relevant and attractive in the wider domain of the Digital Humanities. In spite of this overall success, progress has been uneven. For example, all but a handful of languages are still under-resourced, and even in those boasting rich corpus-linguistic working environments, specific text types (e.g. spontaneous conversation) are under-represented.

Recently, corpus-linguistic routines have been disrupted by advances in AI-based text generation and machine translation. Some challenges are practical, such as the question of how future corpora should handle data that are partly or fully machine-generated. Others are conceptual. Today, large reference corpora of pluricentric languages such as English, German and Spanish commonly use national standard varieties as a major ordering principle. By 2040, however, the widespread use of AI-based language technologies in everyday communication may make national boundaries less important; algorithms may partly take over from educated elites as agents of linguistic standardisation. Whatever future is envisaged for corpus linguistics, one thing remains clear: more numerous, more diverse and more complex corpora will also require more attention to issues of sustainable infrastructure for data preservation and enrichment.

The following four colleagues have agreed to offer keynotes:
- Mark Davies (Provo UT, USA): “English-Corpora.org: Challenges, innovation, and sustainability”
- Silvia Bernardini (Bologna, Italy): “Beyond monolingualism: Challenges and affordances of corpus linguistics for the study of multilingualism and new forms of translation”
- Michaela Mahlberg (Erlangen, Germany): “Corpus linguistics and storytelling: Data and connections”
- Ramunė Kasperė (Kaunas, Lithuania): “Corpus construction and technologisation in the age of AI: Challenges for ‘smaller’ languages”

Bursaries: Funding permitting, we will be able to provide a limited number of travel grants, awarded on a competitive basis after the abstract submission deadline, to applicants who are early career researchers, employed on part-time or short-term contracts or facing similar challenges. Applications can be submitted during the registration process.

Organisers:
Andreas Witt, Institut für Deutsche Sprache (IDS), Mannheim
Christian Mair, English Department, University of Freiburg

Key dates:
From 1 Dec. 2024: Submission of abstracts
16 Feb. 2025: Abstract submission deadline
1 March 2025: Notification of acceptance; registration

For further information, see the conference website (https://www.ids-mannheim.de/fi/veranstaltungen/workshop-corpus-linguistics-2040/). A full programme will be available from 15 March 2025. All inquiries should be addressed to: [email protected]




Page Updated: 03-Dec-2024

