LINGUIST List 6.748

Wed 31 May 1995

FYI: Program announcement from National Science Foundation

Editor for this issue: <>


  1. Gary Strong, Program Announcement from NSF of Interest

Message 1: Program Announcement from NSF of Interest

Date: Wed, 24 May 1995 16:11:53
From: Gary Strong <>
Subject: Program Announcement from NSF of Interest


Program Solicitation


The Information, Robotics and Intelligent Systems Division (IRIS) and the
Cross-Disciplinary Activities Office (CDA) of the Computer, Information
Science and Engineering Directorate (CISE) of the National Science
Foundation (NSF) and the Software and Intelligent Systems Technology Office
(SISTO) of the Advanced Research Projects Agency (ARPA) plan to jointly
support research and development devoted to developing linguistic resources
for use in human language technology.

The aim of this joint initiative between NSF and ARPA is to accelerate the
progress in human language technology by supporting the research and
development of widely-accessible and affordable language resources and
closely related data resources. It is also of interest to encourage access
to these resources by exploring alternative delivery mechanisms that the
research community may incorporate as requested resources in their

Technical advances in spoken and written information technology have
resulted in the escalation of the number of information services and their
importance to the economy. The impact of the resulting national
information infrastructure on the well-being of the nation is acknowledged
and strongly supported by the academic, industrial and governmental
segments of our society. In order to provide full access to and full
benefit from these information services, significant advances are needed in
their ease of access and use. Spoken and written language technology is a
key means of bridging the gap between suppliers and consumers of
information services because it is the principal way in which
human-computer communication can become seamless with human-human
communication. Therefore, it is critically important to develop the human
language technologies required to unleash the potential benefits of future
information services.

Continuing advances in computing performance and new trainable language
models will allow consistent improvements in natural language processing if
appropriate corpora are available. Rapid advances are being made, and
computer-based human language technology research is now having an impact
in application areas such as telephony and multimodal communication. Yet,
current capabilities fall short of what is possible and what is needed.
Far more powerful language models must be created, trained, and compared
against realistic language data. This creates a demand for more
comprehensive national and multilingual language resources with which to
train the models, and for a wider variety of contextual linguistic data,
such as video and audio as an accompaniment to text and dialogue corpora.

This initiative has three main foci: (1) the continued improvement and
extension of speech, text, and closely related language resources to
support research and development in human language technology and
associated areas, such as interlanguage communications; (2) focused
experimental research and data collection involving multimodal types of
human language data resources; and (3) innovative ways to make these
resources widely available to potential users for both research and
education. The last two foci are described in Type II awards below.

Type I Award. Improvement in Basic Speech and Text Data Resources
Resources of interest are those created, maintained, and distributed to
provide broad training and evaluation data for basic research and
technological advances in the following areas:
- Speech recognition, including the transcription of high-quality
continuous speech and other contextual information from talkers unknown to
the system.
- Speech understanding, in which the focus is primarily on
domain-specific database query and update by voice.
- Information retrieval, in which the retrieval request is made in terms
of speech, text, or other closely associated modalities.
- Machine translation, including computer-aided human translation and
interlanguage dialog.

Human language data resources include, but are not limited to, annotated
and unannotated corpora of speech, speech with contextual accompaniment,
parallel speech and text in multiple languages, and common lexicons.
Languages of interest include, but are not limited to, English, major
European languages, Japanese, Chinese, and Arabic. Resources created under
this focus of the initiative must be of enduring value to a broad community
of human language technology researchers. The resources must be well
publicized and openly accessible to the general linguistic research

Delivery mechanisms may involve, for example, the use of high-performance
computer and communication networks or conventional compact disks, but
other innovative distribution methods are encouraged. The coordination of
resource development along with consortial arrangements with other
interested agencies and organizations, including those in other countries,
is also of critical interest.

Type II Awards. New Approaches and Means of Data Collection and
While the primary interest of this initiative is resource support for
research in speech and text recognition and understanding, related support
on a smaller scale is also available for the following areas of innovation:
- Development of innovative resources. Examples include the collection
and annotation of video of facial gestures and hand movements during
speech, to advance research on multimodal communication using kinesics,
and dialogue data collection and annotation to serve as a foundation for
the advancement of research on natural language understanding in realistic
situations of human-to-human communication.
- Novel methods of delivery for multimedia resources to support, for
example, such areas as the study of prosody, facial expression
understanding, multi-agent dialogues, or others.
- Transportable software tools for speech and written language data
access and analysis.
- Novel mechanisms for language data capture. Means to capture and make
available samples such as contrived on-line speech understanding
experiments or scenarios for public access and data collection.
Experiments using such data to advance language research on speech
recognition in noisy environments over telephones by ordinary users.

This initiative is expected to provide a total of approximately
$3.5 million, depending on funding availability, to one or more awardees in
the following two categories:
- One large, standard award in the broad area of data collection,
archival and distribution of speech, text, and closely related modalities
or supportive annotations (Type I Award above). This award may be in the
form of an NSF grant or cooperative agreement, depending on the structure
of the project. Funding for this award will begin in late FY95. The total
budget should not exceed $2 million over a 30-month period. Its duration
may depend on the proposer's method for achieving self-sufficiency.
- Several smaller grants in the range of $150K to $250K per year for up
to three years toward one or more innovative approaches to language data or
its delivery (Type II Awards above). Funding for these awards will be made
when FY96 funds are available.

Proposers must state which of the two above categories best describes their
effort and should propose a budget accordingly. A single institution may
propose in both categories in separate proposals.

All proposals should refer to this Program Solicitation by number, and
should be prepared and submitted in accordance with the guidelines
contained in the Grant Proposal Guide (NSF 94-2, January 1994). In addition,
Type I proposals must include, within the regular page limits, special
sections for Improvement in Basic Speech and Text Data Resources proposals
(Type I above) as follows:
- Evidence for Financial Self-Sufficiency. Proposers should provide
convincing arguments that self-sufficiency can be achieved by the end of
the award. The case can be made on the basis of revenues, industrial
participation, memberships, or other assistance.
- Revenue Plan. A plan should be given for how fees for the use of
resources will be determined and how revenues from fees charged will be
allocated within the project.
- Data Offer. A statement should be included that details the types and
volumes of data that the proposers could provide.

Nine (9) copies of each proposal, including one bearing original
signatures, should be addressed to:

 Human Language Resources
 National Science Foundation
 Proposal Processing Unit
 4201 Wilson Blvd. Room P60
 Arlington, VA 22230

One information copy should be sent to:

 Gary W. Strong, Program Director
 Interactive Systems
 National Science Foundation
 4201 Wilson Blvd. Room 1115
 Arlington, VA 22230


Academic and other not-for-profit research institutions in the United
States with computer and information science research capability are
invited to submit proposals. While proposals may involve unfunded
collaboration with industry or other agencies of the government, an
academic or research institution must be the prime research management
organization submitting the proposal.


To be considered for award, proposals submitted in response to this
solicitation must be:

(1) received by NSF no later than 5 PM July 14, 1995;
(2) postmarked no later than five (5) days prior to the deadline date; or
(3) sent via commercial overnight mail no later than two (2) days
prior to the deadline date.

The Type I award is planned for September 1995, with Type II awards to be
made shortly afterwards.


Telephone and email queries about this announcement are welcomed and should
be addressed to:

 Gary W. Strong, Program Director
 Interactive Systems
 (703) 306-1928


Proposals will be subject to review by a panel of external experts from the
scientific community. Supplemental ad hoc reviews may be solicited as
feasible and necessary to achieve a fair and accurate review of all
proposals. Some potentially successful submissions for the Type I award
(above) may receive site visits if deemed desirable in order to properly
evaluate the proposals.

Criteria by which the proposals will be judged include those published in
NSF 94-2, Grant Proposal Guide, but with special emphasis to be placed on
the impact of the proposed project on the infrastructure of science and
engineering and on the plan for becoming self-sufficient for Type I (above)

The specific impact on infrastructure to be assessed by the reviewers is
the likelihood that the language resources to be developed and the delivery
mechanisms proposed will be of the nature and quality to significantly
benefit language research and development processes. In addition, Type I
proposals will be evaluated in terms of the institution's ability to
establish a revenue mechanism that will permit it to continue to provide
access to resources significantly beyond the period of award.

NSF and ARPA will jointly make the final selection of all awards under this
initiative, considering the recommendations of all the external reviewers.
Awards to successful projects will be made through NSF from funding
provided by both agencies.


Grants and cooperative agreements are administered in accordance with the
terms and conditions of NSF Grant General Conditions (GC-1) and NSF
Cooperative Agreement Conditions (CA-1), copies of which may be requested
from the NSF Forms and Publications Unit cited below under the section
ADDITIONAL INFORMATION. More comprehensive information is contained in the
NSF Grant Policy Manual (NSF 88-47), available through a subscription
offered by the Superintendent of Documents, Government Printing Office,
Washington, DC 20402.

The Foundation provides awards for research in the sciences and
engineering. The awardee is wholly responsible for the conduct of such
research and preparation of the results for publication. The Foundation
does not assume responsibility for such findings or their interpretation.

The Foundation welcomes proposals on behalf of all qualified scientists and
engineers, and strongly encourages women, minorities and persons with
disabilities to compete fully in any of the research and research-related
programs described in this document.

In accordance with Federal statutes and regulations and NSF policies, no
person, on grounds of race, color, age, sex, national origin, or disability,
shall be excluded from participation in, denied the benefits of, or be
subject to discrimination under any program or activity receiving financial
assistance from the National Science Foundation.

The NSF has TDD (Telephonic Device for the Deaf) capability, which enables
individuals with hearing impairment to communicate with the Division of
Human Resource Management about NSF programs, employment, or general
information. This number is (703) 306-0090.


NSF information and publications are available electronically via the World
Wide Web, via Internet Gopher, via anonymous FTP, or by sending an email
request. You may also send a written request to:

 NSF Forms and Publications Unit
 Room P-15
 4201 Wilson Blvd.
 Arlington, VA 22230


These awards provide funding for special assistance or equipment to enable
persons with disabilities (investigators and other staff, including student
research assistants) to work on NSF projects. See the program announcement
or contact the program coordinator at (703) 306-1636.


The information requested on proposal forms is solicited under the
authority of the National Science Foundation Act of 1950, as amended. It
will be used in connection with the selection of qualified proposals and
may be disclosed to qualified reviewers and staff assistants as part of the
review process; to applicant institutions/grantees; to provide or obtain
data regarding the application review process, award decisions, or the
administration of awards; to government contractors, experts, volunteers,
and researchers as necessary to complete assigned work; and to other
government agencies in order to coordinate programs. See System of
Records, NSF-50, Principal Investigator/Proposal File and Associated
Records, 60 Federal Register 4449 (January 23, 1995), and NSF-51,
Reviewer/Proposal File and Associated Records, 59 Federal Register 8031
(February 17, 1994). Submission of the information is voluntary. Failure
to provide full and complete information, however, may reduce the
possibility of your receiving an award.

Public reporting burden for this collection of information is estimated to
average 120 hours per response, including the time for reviewing
instructions. Send comments regarding this burden estimate or any other
aspect of this collection of information, including suggestions for
reducing this burden, to:

Herman G. Fleming
Reports Clearance Officer
Division of Contracts, Policy, and Oversight
National Science Foundation
Arlington, VA 22230

and to:

Office of Management and Budget
OIRM-Paperwork Reduction Project (3145-0058)
Washington, DC 20503

OMB 3145-0058
P.T.: 34
K.W.: 1004144; 1004000; 0410000

Catalog of Federal Domestic Assistance No. 47.070

NSF 95-100 (New)

Gary W. Strong, Program Director, Interactive Systems
National Science Foundation, 4201 Wilson Blvd., Room 1115
Arlington, VA 22230
(703)306-1928; FAX: (703)306-0599; Email: