LINGUIST List 21.2205
Wed May 12 2010
Internships: Historical Ling, Ling & Lit, Text/Corpus Ling: Intern, Tufts U.
Editor for this issue: Matthew Lahrman
The LINGUIST List strongly encourages employers to use non-discriminatory standards in hiring policy. In particular we urge that employers do not discriminate on the grounds of race, ethnicity, nationality, age, religion, gender, or sexual orientation. However, we have no means of enforcing these standards.
To submit an internship, use our convenient web form at http://linguistlist.org/internship
Historical Linguistics, Linguistics & Literature, Text/Corpus Linguistics : Undergraduate Intern, Neo-Latin Metadata, Perseus Digital Library, Tufts University, Medford, Massachusetts, USA
From: Anne Mahoney <anne.mahoney@tufts.edu>
Subject: Historical Linguistics, Linguistics & Literature, Text/Corpus Linguistics : Undergraduate Intern, Neo-Latin Metadata, Perseus Digital Library, Tufts University, Medford, Massachusetts, USA
University or Organization: Tufts University
Department: Perseus Digital Library, Classics Department
Web Address: http://www.perseus.tufts.edu
Type of Work: Annotation
Ling & Literature
Internship Location: Medford, Massachusetts, USA
Minimum Education Level: No Minimum
Special Qualifications: Reading knowledge of Latin.
Two undergrad positions at Perseus Digital Library, Tufts U, summer 2010.
With the rise of large open digitization projects such as the Internet
Archive and Google Books, we are witnessing an explosive growth in the
number of source texts becoming available to researchers in historical
languages. The Internet Archive alone contains over 12,585 texts
catalogued as Latin, including classical prose and poetry written under
the Roman Empire, ecclesiastical treatises from the Middle Ages, and
dissertations from 19th-century Germany written -- in Latin -- on the
philosophy of Hegel. At 1.7 billion words, this collection eclipses the
extant corpus of Classical Latin by several orders of magnitude and
begins to offer insight into grand questions such as the evolution of a
language over both time and space.
One of Tufts' goals in this data-intensive computing project is to be
able to track the spread of linguistic features within a language and
ideas across languages over the two millennia that Latin was used as a
lingua franca across Europe. While much of this research operates on the
textual data itself, the ability to chart such movement in both space
and time requires accurate extra-textual metadata, including both the
place and date of a work's composition. The library records available to
us, in contrast, report the place and date of publication for a specific
edition -- which, for historical texts, is often far removed from the
time and place of original composition. For establishing the differences
in usage between the Latin of Vergil's Aeneid and that of Jean Calvin's
Institutio Christianae Religionis, it is far more important for us to
know that the former was composed ca. 19 BCE and the latter in 1536 CE
than the date of any later editions.
In this project, undergraduates will supplement the existing million
book metadata by researching the dates and locations of composition for
the subset of the Internet Archive collection that has been catalogued
(or otherwise identified) as being written in Latin. While some authors
(like Vergil and Calvin) have more-or-less established dates and places
of composition for their works, others (such as more obscure medieval
authors) do not. In either case, the student will need to
conduct substantial research to determine the date of original
publication (if one exists) or to delimit the smallest time window
possible given the state of current research on each author. This will
require students to leverage their skills as nascent humanists while
also placing an emphasis on computational thinking, exposing them to the
far wider range of tasks to which traditional modes of scholarship can
be applied.
The data produced as part of this internship are
crucial for allowing us to begin analyzing the spread of linguistic
features across space and time -- it simply cannot be done with the
existing metadata in the collection. The research experience of
undergraduates here will, in a very tangible way, contribute to the
success of the larger project.
Note: This internship position has already been filled.
Application Deadline: This internship position has been filled.
Prof. Gregory Crane