Over the past 10 years, ILIT and its precursor, The LINGUIST List, have received generous support for their activities from several major funding sources. In particular, they have been awarded eight National Science Foundation grants to support linguistic infrastructure development as well as innovative, technology-based research projects. Some of ILIT's current NSF-funded research projects are described below.
As the world's largest online linguistics resource, The LINGUIST List is dedicated to disseminating information on languages and language analysis, and to providing the discipline of linguistics with the infrastructure necessary to function in the digital world. LINGUIST maintains a website with over 2000 pages and runs a mailing list with more than 25,000 subscribers worldwide. LINGUIST also hosts searchable archives of more than 130 other linguistic mailing lists. LINGUIST is a free resource, run by linguistics professors and graduate students, and supported largely by donations.
LL-MAP is a project designed to integrate language information with data from the physical and social sciences by means of a Geographical Information System (GIS). Its online geodata collection facility integrates maps of language distribution and linguistic traits with socioeconomic and climate data. We are creating an interface that will allow scholars to build maps based on their own observations and to amend the information on language distribution drawn from public sources. The interface will also be integrated with the existing LINGUIST List databases and with the MultiTree project. LL-MAP is a three-year joint project of Eastern Michigan University and Stockholm University, in collaboration with five other projects and archives, and is funded by a National Science Foundation grant (HSD 0527512).
MultiTree is a digital library of scholarly hypotheses about language relationships and subgroupings. This information is organized in a searchable database with a web interface, and each hypothesis is presented graphically as an interactive hyperbolic display of a family tree, accompanied by information on all of the languages involved and the bibliographical sources of the hypothesis. MultiTree interacts with the LL-MAP Project, a geolinguistic database that provides users with a fully functional Geographical Information System through which linguistic data, including subgrouping information, can be viewed in its geographical context. Both databases are integrated with other existing LINGUIST List databases, providing access to a wealth of information on related books, articles, dissertations, and conferences. MultiTree is a three-year project funded by a National Science Foundation grant (BCS 04040000).
RELISH is a two-year project funded by the NEH and the German Research Foundation (DFG). ILIT, the Max Planck Institute for Psycholinguistics, and the University of Frankfurt are collaborating in this project to unify standardization efforts in lexicon formats and data category description. The goal of the project is to create a unified, searchable virtual archive of lexicons of endangered languages. We will make six to eight of the following lexicons interoperable: Archi, Iwaidja, Kayardild, Mocovi, Salar, Tofa, Udi, and Wichita.
This project is a collaborative effort with the Alaska Native Language Center (ANLC), supported by a National Science Foundation grant (OPP-0326805). The goals are to digitize Dena'ina language legacy materials held at the ANLC and to create the Qenaga website, which will give Dena'ina community members online access to those materials. The project also provides training in linguistic fieldwork and in best practices for language data digitization and archiving, both to the Dena'ina community and to graduate students in linguistics.
The "Union Catalogue" of the Open Language Archives Community (OLAC) is hosted by The LINGUIST List. Currently, there are more than 28,000 searchable records in 26 linguistics-related archives, such as the Alaska Native Language Archive, the Perseus Project, and the Oxford Text Archive.
In collaboration with the University of Washington, ILIT is "Implementing the GOLD Community of Practice: Laying the Foundations for a Linguistics Cyberinfrastructure." The primary goal of GOLDComm is to increase the amount of ontology-aware linguistic data available to the researcher. The specific objectives of GOLDComm are fourfold: to improve intelligent harvesting of linguistic data for the Online Database of INterlinear text (ODIN); to mark up and integrate such data within the ontology-driven framework known as the General Ontology for Linguistic Description (GOLD); to develop a general search facility that models general linguistic knowledge with specific analytical knowledge of particular languages; and to provide an interactive and dynamic environment in which the linguistic community can contribute to and modify the core ontology at the heart of the data integration process. The long-term goal of the project is to offer the average linguist access to large amounts of structured and searchable linguistic data. The two-year GOLDComm project is funded by a National Science Foundation grant (BCS 0720122).
The LEGO project is digitizing a number of lexicons (provided by their creators) and storing them in a database with an online viewing and search facility. The lexicons are tagged with terms from GOLD, the General Ontology for Linguistic Description, which makes them interoperable with respect to the grammatical information they contain. As a result, users will be able to construct linguistically interesting queries over these lexicons, which are drawn from 16 different projects covering more than 300 languages. This will be a significant resource for typologists, semanticists, lexicographers, translators, and other researchers. In the future, the process established for digitizing, uploading, and tagging these lexicons can be applied to other lexicons as well, so that the network of information can continue to grow.
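The idea of interoperability through shared tags can be illustrated with a minimal Python sketch. It is not the LEGO implementation: the lexicon names, headwords, local part-of-speech labels, and the `TAG_MAPS` table are invented for illustration, and the concept names "Noun" and "Verb" are stand-ins that have not been checked against the GOLD ontology itself. The sketch only shows why mapping each lexicon's own labels to a common vocabulary allows a single query to run across all of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    lexicon: str    # which source lexicon the entry comes from
    headword: str   # placeholder headword (invented, not real data)
    local_pos: str  # part-of-speech label as written in that lexicon


# Hypothetical mapping from each lexicon's own labels to shared concept names.
TAG_MAPS = {
    "Lexicon A": {"n": "Noun", "v": "Verb"},
    "Lexicon B": {"N": "Noun", "VB": "Verb"},
}

ENTRIES = [
    Entry("Lexicon A", "headword-1", "n"),
    Entry("Lexicon A", "headword-2", "v"),
    Entry("Lexicon B", "headword-3", "N"),
    Entry("Lexicon B", "headword-4", "ADJ"),  # label with no mapping
]


def shared_tag(entry):
    """Return the shared concept for an entry, or None if its label is unmapped."""
    return TAG_MAPS.get(entry.lexicon, {}).get(entry.local_pos)


def find_by_concept(entries, concept):
    """Select entries from any lexicon whose local label maps to `concept`."""
    return [e for e in entries if shared_tag(e) == concept]


nouns = find_by_concept(ENTRIES, "Noun")
print([e.headword for e in nouns])
```

Here the query for "Noun" retrieves entries from both lexicons even though one labels nouns `n` and the other `N`; an entry whose label has no mapping is simply left out rather than guessed at.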