LINGUIST List 19.694
Sat Mar 01 2008
FYI: Funding Opportunity: Automating Deep Language Understanding
Editor for this issue: Ann Sawyer
From: Ann Sawyer <sawyerlinguistlist.org>
Readers, please note the time-sensitivity of the following announcement:
IARPA (the [US] Intelligence Advanced Research Projects Activity) is
seeking proposals for the initial phase of a new program dedicated to
automating deep language understanding through the discovery of
human-language indicators of social meaning. IARPA is the advanced
research organization established by the Office of the Director of National
Intelligence (ODNI) in October 2007. IARPA's principal mission is to impact
fundamentally and positively the quality of the future operational
processes of the Intelligence Community.
The preceding and following paragraphs are extracted from the full
solicitation, available at:
Researchers who are interested in submitting a proposal to this
solicitation are urged to read it right away, as there are many details and
the deadline for applications is March 22, 2008.
The Socio-Cultural Content in Language (SCIL) Program intends to explore
and develop innovative designs, algorithms, methods, techniques and
technologies to extend language understanding into the socio-cultural
arena. The program will, in the end, develop automated resources that
provide users with a broadened understanding of the contextual and social
value of the information with which they work.
Human language use reflects social and cultural norms, contexts and
expectations. Social variables (such as religion, status, gender,
education) and contextual features (such as formality, participant beliefs,
social situation) can influence the form and features of language. Because
language use responds to such social and cultural influences,
correlating social goals with language forms and content should provide a
rich and expanded understanding of the attributes, roles and nature of the
associations and intentions of the users of the language.
Current human language technologies show little ability to "understand" or
capture the social dimensions of language. Today, information analysts
gather facts, generally without the context in which these facts occur.
Yet, human language does more than serve as a means of transferring factual
information. Referential meaning (i.e., conveying information about the
real world) is only one aspect of language use. Language can also convey
feelings and other unstated meaning; elicit behaviors from others; and
build and maintain relationships.... Understanding the global community of
today requires access to the varying worldviews of the players on the
world stage. Many dimensions of these worldviews are reflected in language.
Strides have been made in addressing the handling and processing of human
language data, in areas such as information retrieval and extraction,
machine translation, categorization, and speech and hard-copy processing.
Although challenges remain in these areas, researchers in human language
technology are positioned to extend their capabilities to a new arena. That
new arena is the discovery and representation of social and cultural
insights from human language use.
The goal of the SCIL Program is to develop a methodology for identifying
language indicators (i.e., their form, meaning and strength) of the social
characteristics and objectives of members of a social group. The
relationship between language indicators and social objectives will be
culture- and language-specific but the aim is to generalize across
languages and cultures. People tend to want to accomplish similar social
goals; it is how they do this that differs.
The social sciences have developed theories of behavior that are relevant
to this effort. These theories and systems can serve as the framework for
understanding social principles as well as for generalizing across
cultures. (As an example, Brown and Levinson in the 1980s proposed a
theory of politeness that abstracted away from language forms and
culture-specific strategies and provided a generalized view of politeness
that (presumably) can apply across languages or cultures.) The goal of the
Program, then, is to develop a methodology for addressing similar social
goals in different languages and cultures. Although using one language as a
baseline is permitted, proposers should keep in mind that the goal is to be able to
apply insights on linguistic indicators of a social function to a new
language and culture.
The SCIL Program is envisioned as a five-year effort that will be initiated
at the beginning of the second half of FY2008. Phase 1 of the Program will
consist of a base period of 14 months with two possible option years. The
final deliverable for the base period will be made at the 12-month mark.
Work may continue in the following two months but, based on the work
accomplished in the first 12 months, the Government will determine whether
to exercise the first option year. Year 1 of the Program will focus on
development of a proof-of-concept that automates techniques
and resources that link linguistic features with social goals and extended
meaning. Based on the results of the prior period, option years may be
exercised to expand the work. Proposals for an additional Phase 2 of two
years will be solicited under this BAA at the end of the third year.
The primary focus of the Program is on human language. The aim is to
associate linguistic cues and features with particular social goals and
constructs of a social group (e.g., leadership, coercion, politeness).
Because much social research on social norms and rules exists, it is not
the intent of the program to develop new social theories. The research is
focused on the automation of the association of linguistic features with
Traditional approaches to social network analysis are not of interest, but
social groups and the behaviors of their members, as conducted through or
supported by language, are.
Enhancement to information extraction technologies is also not of value to
the Program, although such techniques can be used if it is demonstrated
that the correlation between social goals and linguistic cues can be met.
There are three dimensions to this effort: the social features and
activities of the group and its members; the linguistic features that serve
as evidence of social goals; and the social science theories that help to
define the social features. It is the correlation of these three dimensions
that is important to the Program, showing how language serves as evidence
of social functions.
Because of the expected diversity in the problems that will be addressed,
the Program will not supply data to the participants. Data collection will
be the responsibility of the proposer.
The proposer must make clear what data will be used, what the features of
the data are (i.e., language, source, participants, size, etc.), how the
data are relevant to the topic of interest and how the data sets are
sufficiently large and rich to enable the identification of correlations
between the specific social problem being addressed and the language of the
The amount of data should support the research question and the development
of a convincing proof of concept. There is particular interest in the
proposed use of blogs, emails, conversations, text messaging and chat. It
is not expected that newswire will provide a rich source of information
because it generally reports on interactions rather than documenting them. Data
from languages other than English and cross-cultural data are of special
interest and will be considered positively.
The goal of the Program is to provide analysts with language indicators of
social phenomena, and the strength of those indicators, in one or a large
group of documents or interactions. It is envisioned that the individual
efforts in the Program will result, in the end, in an integrated resource
that provides insights into multiple social and cultural dimensions of a
dataset. It is the responsibility of the proposer to specify how the
insights gathered will be represented and automated.
SCIL is open to all research and development organizations, including
academic and eligible non-profit and not-for-profit institutions; large and
small businesses; collaborative ventures from mixed sources; and Federally
Funded Research and Development Centers (FFRDCs) and Laboratories. All
international organizations will be required to team in a subcontract role
with a U.S.-based organization.
Proposers are invited to submit proposals for a base period of 14 months
with two possible option years, indicating how the anticipated work of the
base year would be extended and enhanced in the option year(s). The
Government anticipates funding approximately 6-10 proposals for the first
year at varying levels of effort. The base period is expected to fall
within the $300,000 to $500,000 range. This funding range is an
approximation. Cost proposals should reflect the realistic cost of the
proposed work. Option years will be in the same funding range.
The initial set of proposals is due on March 22, 2008, no later than 3:00 p.m. (MST),
to the Department of the Interior/National Business Center address.
Proposals must be submitted in accordance with the requirements and
procedures identified in the BAA and this PIP. To be considered, full,
complete proposals (in original, one copy, and electronic media) must be
received. For overnight package delivery, proposals should be addressed to
the following address:
Dept of the Interior
National Business Center, Acquisition Services Directorate
Sierra Vista Branch
Augur & Adair Streets (Bldg. 22208, 2nd Floor)
Fort Huachuca, AZ 85613
Linguistic Field(s): General Linguistics