HUMAN LANGUAGE RESOURCES
_____________________________________________

Program Solicitation

A JOINT INITIATIVE OF:

NATIONAL SCIENCE FOUNDATION
COMPUTER AND INFORMATION SCIENCE AND ENGINEERING DIRECTORATE

and

ADVANCED RESEARCH PROJECTS AGENCY
SOFTWARE AND INTELLIGENT SYSTEMS TECHNOLOGY OFFICE

DEADLINE: JULY 14, 1995

INTRODUCTION

The Information, Robotics and Intelligent Systems Division (IRIS) and the Cross-Disciplinary Activities Office (CDA) of the Computer and Information Science and Engineering Directorate (CISE) of the National Science Foundation (NSF) and the Software and Intelligent Systems Technology Office (SISTO) of the Advanced Research Projects Agency (ARPA) plan to jointly support research and development devoted to developing linguistic resources for use in human language technology. The aim of this joint initiative between NSF and ARPA is to accelerate progress in human language technology by supporting the research and development of widely accessible and affordable language resources and closely related data resources. It is also of interest to encourage access to these resources by exploring alternative delivery mechanisms that the research community may incorporate as requested resources in their proposals.

Technical advances in spoken and written information technology have increased the number of information services and their importance to the economy. The impact of the resulting national information infrastructure on the well-being of the nation is acknowledged and strongly supported by the academic, industrial and governmental segments of our society. In order to provide full access to and full benefit from these information services, significant advances are needed in their ease of access and use.

Spoken and written language technology is a key means of bridging the gap between suppliers and consumers of information services because it is the principal way in which human-computer communication can become seamless with human-human communication. Therefore, it is critically important to develop the human language technologies required to unleash the potential benefits of future information services.

Continuing advances in computing performance and new trainable language models will allow consistent improvements in natural language processing if appropriate corpora are available. Rapid advances are being made, and computer-based human language technology research is now having an impact in application areas such as telephony and multimodal communication. Yet current capabilities fall short of what is possible and what is needed. Far more powerful language models must be created, trained, and compared against realistic language data. This creates a demand for more comprehensive national and multilingual language resources with which to train the models, and for a wider variety of contextual linguistic data, such as video and audio as an accompaniment to text and dialogue corpora.

TOPICS OF INTEREST

This initiative has three main foci: (1) the continued improvement and extension of speech, text, and closely related language resources to support research and development in human language technology and associated areas, such as interlanguage communications; (2) focused experimental research and data collection involving multimodal types of human language data resources; and (3) innovative ways to make these resources widely available to potential users for both research and education. The last two foci are described under Type II Awards below.

Type I Award. Improvement in Basic Speech and Text Data Resources

Resources of interest are those created, maintained, and distributed to provide broad training and evaluation data for basic research and technological advances in the following areas:

- Speech recognition, including the transcription of high-quality continuous speech and other contextual information from talkers unknown to the system.
- Speech understanding, in which the focus is primarily on domain-specific database query and update by voice.
- Information retrieval, in which the retrieval request is made in terms of speech, text, or other closely associated modalities.
- Machine translation, including computer-aided human translation and interlanguage dialog.

Human language data resources include, but are not limited to, annotated and unannotated corpora of speech, speech with contextual accompaniment, parallel speech and text in multiple languages, and common lexicons. Languages of interest include, but are not limited to, English, major European languages, Japanese, Chinese, and Arabic.

Resources created under this focus of the initiative must be of enduring value to a broad community of human language technology researchers. The resources must be well publicized and openly accessible to the general linguistic research community. Delivery mechanisms may involve, for example, the use of high-performance computer and communication networks or conventional compact disks, but other innovative distribution methods are encouraged. The coordination of resource development, along with consortial arrangements with other interested agencies and organizations, including those in other countries, is also of critical interest.

Type II Awards. New Approaches and Means of Data Collection and Distribution

While the primary interest of this initiative is resource support for research in speech and text recognition and understanding, related support on a smaller scale is also available for the following areas of innovation:

- Development of innovative resources. Examples include:
  - The collection and annotation of video, involving facial gestures and hand movements while speaking, to advance research on multi-modal communication using kinesics.
  - Dialogue data collection and annotation to serve as a foundation for the advancement of research on natural language understanding in realistic situations of human-to-human communication.
- Novel methods of delivery for multimedia resources to support, for example, such areas as the study of prosody, facial expression understanding, multi-agent dialogues, or others.
- Transportable software tools for speech and written language data access and analysis.
- Novel mechanisms for language data capture. Means to capture and make available samples such as contrived on-line speech understanding experiments or scenarios for public access and data collection. Experiments using such data to advance language research on speech recognition in noisy environments over telephones by ordinary users.

SCOPE OF SUPPORT

This initiative is expected to provide a total of approximately $3.5 million, depending on funding availability, to one or more awardees in the following two categories:

- One large, standard award in the broad area of data collection, archival and distribution of speech, text, and closely related modalities or supportive annotations (Type I Award above). This award may be in the form of an NSF grant or cooperative agreement, depending on the structure of the project. Funding for this award will begin in late FY95. The total budget should not exceed $2 million over a 30-month period. Its duration may depend on the proposer's method for achieving self-sufficiency.
- Several smaller grants in the range of $150K to $250K per year for up to three years toward one or more innovative approaches to language data or its delivery (Type II Awards above). Funding for these awards will be made when FY96 funds are available.

Proposers must state which of the two above categories best describes their effort and should propose a budget accordingly. A single institution may propose in both categories in separate proposals.

PREPARATION AND SUBMISSION OF PROPOSALS

All proposals should refer to this Program Solicitation by number, and should be prepared and submitted in accordance with the guidelines contained in the Grant Proposal Guide (NSF 94-2, January 1994). In addition, Type I proposals (Improvement in Basic Speech and Text Data Resources) must include, within the regular page limits, the following special sections:

- Evidence for Financial Self-Sufficiency. Proposers should provide convincing arguments that self-sufficiency can be achieved by the end of the award. The case can be made on the basis of revenues, industrial participation, memberships, or other assistance.
- Revenue Plan. A plan should be given for how fees for the use of resources will be determined and how revenues from fees charged will be allocated within the project.
- Data Offer. A statement should be included that details the types and volumes of data that the proposers could provide.

Nine (9) copies of each proposal, including one bearing original signatures, should be addressed to:

    Human Language Resources Announcement
    National Science Foundation
    Proposal Processing Unit
    4201 Wilson Blvd. Room P60
    Arlington, VA 22230

One information copy should be sent to:

    Gary W. Strong, Program Director
    Interactive Systems
    National Science Foundation
    4201 Wilson Blvd. Room 1115
    Arlington, VA 22230

WHO MAY APPLY

Academic and other not-for-profit research institutions in the United States with computer and information science research capability are invited to submit proposals. While proposals may involve unfunded collaboration with industry or other agencies of the government, an academic or research institution must be the prime research management organization submitting the proposal.

WHEN TO SUBMIT

To be considered for award, proposals submitted in response to this solicitation must be: (1) received by NSF no later than 5 PM on July 14, 1995; (2) postmarked no later than five (5) days prior to the deadline date; or (3) sent via commercial overnight mail no later than two (2) days prior to the deadline date. The Type I award is planned for September 1995, with Type II awards to be made shortly afterwards.

INQUIRIES

Telephone and email queries about this announcement are welcomed and should be addressed to:

    Gary W. Strong, Program Director
    Interactive Systems
    (703) 306-1928
    gstrong@nsf.gov

PROPOSAL EVALUATION AND AWARD

Proposals will be subject to review by a panel of external experts from the scientific community. Supplemental ad hoc reviews may be solicited as feasible and necessary to achieve a fair and accurate review of all proposals. Some potentially successful submissions for the Type I award (above) may receive site visits if deemed desirable in order to properly evaluate the proposals.

Criteria by which the proposals will be judged include those published in NSF 94-2, Grant Proposal Guide, but with special emphasis placed on the impact of the proposed project on the infrastructure of science and engineering and, for Type I proposals, on the plan for becoming self-sufficient.

The specific impact on infrastructure to be assessed by the reviewers is the likelihood that the language resources to be developed and the delivery mechanisms proposed will be of the nature and quality to significantly benefit language research and development processes. In addition, Type I proposals will be evaluated in terms of the institution's ability to establish a revenue mechanism that will permit it to continue to provide resource access significantly beyond the period of award.

NSF and ARPA will jointly make the final selection of all awards under this initiative, considering the recommendations of all the external reviewers. Awards to successful projects will be made through NSF from funding provided by both agencies.

AWARD ADMINISTRATION

Grants and cooperative agreements are administered in accordance with the terms and conditions of NSF Grant General Conditions (GC-1) and NSF Cooperative Agreement Conditions (CA-1), copies of which may be requested from the NSF Forms and Publications Unit cited below under the section ADDITIONAL INFORMATION. More comprehensive information is contained in the NSF Grant Policy Manual (NSF 88-47), available through a subscription offered by the Superintendent of Documents, Government Printing Office, Washington, DC 20402.

The Foundation provides awards for research in the sciences and engineering. The awardee is wholly responsible for the conduct of such research and preparation of the results for publication. The Foundation does not assume responsibility for such findings or their interpretation.

The Foundation welcomes proposals on behalf of all qualified scientists and engineers, and strongly encourages women, minorities and persons with disabilities to compete fully in any of the research and research-related programs described in this document.

In accordance with Federal statutes and regulations and NSF policies, no person, on grounds of race, color, age, sex, national origin, or disability shall be excluded from participation in, denied the benefits of, or be subject to discrimination under any program or activity receiving financial assistance from the National Science Foundation.

The NSF has TDD (Telephonic Device for the Deaf) capability, which enables individuals with hearing impairment to communicate with the Division of Human Resource Management about NSF programs, employment, or general information. This number is (703) 306-0090.

ADDITIONAL INFORMATION

NSF information and publications are available electronically via the World Wide Web (the URL is http://www.nsf.gov/), via Internet Gopher (on host stis.nsf.gov), via anonymous FTP (from ftp://stis.nsf.gov), or by sending an email request (to info@nsf.gov if you don't know the publication number or to pubs@nsf.gov if you do). You may also send a written request to:

    NSF Forms and Publications Unit
    Room P-15
    4201 Wilson Blvd.
    Arlington, VA 22230

FACILITATION AWARDS FOR SCIENTISTS AND ENGINEERS WITH DISABILITIES (FASED)

These awards provide funding for special assistance or equipment to enable persons with disabilities (investigators and other staff, including student research assistants) to work on NSF projects. See the program announcement or contact the program coordinator at (703) 306-1636.

PRIVACY AND PUBLIC BURDEN STATEMENTS

The information requested on proposal forms is solicited under the authority of the National Science Foundation Act of 1950, as amended. It will be used in connection with the selection of qualified proposals and may be disclosed to qualified reviewers and staff assistants as part of the review process; to applicant institutions/grantees; to provide or obtain data regarding the application review process, award decisions, or the administration of awards; to government contractors, experts, volunteers, and researchers as necessary to complete assigned work; and to other government agencies in order to coordinate programs. See System of Records, NSF-50, Principal Investigator/Proposal File and Associated Records and NSF-51, 60 Federal Register 4449 (January 23, 1995), Reviewer/Proposal File and Associated Records, 59 Federal Register 8031 (February 17, 1994). Submission of the information is voluntary. Failure to provide full and complete information, however, may reduce the possibility of your receiving an award.

Public reporting burden for this collection of information is estimated to average 120 hours per response, including the time for reviewing instructions. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to:

    Herman G. Fleming
    Reports Clearance Officer
    Division of Contracts, Policy, and Oversight
    National Science Foundation
    Arlington, VA 22230

and to:

    Office of Management and Budget
    OIRM-Paperwork Reduction Project (3145-0058)
    Washington, DC 20503

OMB 3145-0058
P.T.: 34
K.W.: 1004144; 1004000; 0410000
Catalog of Federal Domestic Assistance No. 47.070
NSF 95-100 (New)

=========================================================
Gary W. Strong, Program Director, Interactive Systems
National Science Foundation, 4201 Wilson Blvd., Room 1115
Arlington, VA 22230
(703)306-1928; FAX: (703)306-0599; Email: gstrong@nsf.gov
=========================================================