Editor for this issue: Andrea Berez <andrea@linguistlist.org>
Information Retrieval for Question Answering
Short Title: IR4QA
Date: 29-Jul-2004 - 29-Jul-2004
Location: Sheffield, United Kingdom
Contact: Rob Gaizauskas
Contact Email: R.Gaizauskas@sheffield.ac.uk
Meeting URL:
Linguistic Sub-field: General Linguistics, Computational Linguistics
Call Deadline: 07-Jun-2004

This is a session of the following conference: 27th Annual International ACM SIGIR Conference on Research and Development in IR

Meeting Description:
Open domain question answering has become a very active research area over the past few years, due in large measure to the stimulus of the TREC Question Answering track. This track addresses the task of finding *answers* to natural language (NL) questions (e.g. ``How tall is the Eiffel Tower?'' ``Who is Aaron Copland?'') from large text collections. This task stands in contrast to the more conventional IR task of retrieving *documents* relevant to a query, where the query may be simply a collection of keywords (e.g. ``Eiffel Tower'', ``American composer, born Brooklyn NY 1900, ...'').

Call for Papers
SIGIR'04 Workshop
INFORMATION RETRIEVAL FOR QUESTION ANSWERING (IR4QA)
July 29, 2004, Sheffield, UK

Finding answers requires processing texts at a level of detail that cannot be carried out at retrieval time for very large text collections. This limitation has led many researchers to propose, broadly, a two-stage approach to the QA task. In stage one, a subset of query-relevant texts is selected from the whole collection. In stage two, this subset is subjected to detailed processing for answer extraction. To date, stage one has received limited explicit attention, despite its obvious importance -- performance at stage two is bounded by performance at stage one. The goal of this workshop is to correct this situation and, hopefully, to draw the attention of IR researchers to the specific challenges raised by QA.
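The two-stage approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the stopword list, the term-overlap scoring, and the names `select_texts`, `two_stage_qa` and `extract_answer` are hypothetical and do not correspond to any particular TREC system. It shows only why stage two is bounded by stage one: the extractor never sees a text that retrieval did not return.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"how", "who", "what", "is", "was", "the", "a", "an", "of", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_texts(question, collection, top_k=2):
    """Stage one: rank texts by how many distinct question terms they share."""
    query = {t for t in tokenize(question) if t not in STOPWORDS}
    scored = []
    for doc_id, text in collection.items():
        overlap = len(query & set(tokenize(text)))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def two_stage_qa(question, collection, extract_answer, top_k=2):
    """Stage two: run the (assumed) extractor only on the stage-one subset."""
    for doc_id in select_texts(question, collection, top_k):
        answer = extract_answer(question, collection[doc_id])
        if answer is not None:
            return doc_id, answer
    return None  # Nothing retrieved, or nothing extracted from what was retrieved.
```

If `select_texts` returns an empty list, `two_stage_qa` returns `None` however good the extractor is, which is the sense in which stage-one performance caps overall performance.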
A straightforward approach to stage one is to employ a conventional IR engine, using the NL question as the query and with the collection indexed in the standard manner, to retrieve the initial set of candidate answer bearing documents for stage two. However, a number of possibilities arise to optimise this set-up for QA, including:

* preprocessing the question in creating the IR query;
* preprocessing the collection to identify significant information that can be included in the indexation for retrieval;
* adapting the similarity metric used in selecting documents;
* modifying the form of retrieval return, e.g. to deliver passages rather than whole documents.

For this workshop, we solicit papers that address any aspect of how this first, retrieval stage of QA can be adapted to improve overall system performance. Possible topics include, but are not limited to:

* parametrizations/optimizations of specific IR systems for QA
* studies of query formation strategies suited to QA
* different uses of IR for factoid vs. non-factoid questions
* utility of term matching constraints, e.g. term proximity, for QA
* analyses of passage retrieval vs full document retrieval for QA
* analyses of boolean vs ranked retrieval for QA
* impact of IR performance on overall QA performance
* named entity preprocessing of questions or collections
* corpus preprocessing to create corpus-specific thesauri for question expansion
* evaluation measures for assessing IR for QA

The workshop will include paper presentations and discussion. All those wishing to make a presentation should submit a 5-8 page position paper; other attendees may submit a short abstract on why this topic is of interest to them. The papers should describe recent work and may be preliminary in nature. The programme committee will arrange the presentations and discussion based on the quality of submissions and expressed interests of the attendees, and may invite other presentations as well.
See http://www.sigir.org/sigir2004 for further details.

Important Dates
===============

Position paper submission: June 7
Acceptance notification: June 23
Final papers due: July 6
Workshop: July 29

Submission Instructions
=======================

Position papers should be no more than 4000 words (5-8 pages). The standard ACM conference style is recommended (see: http://www.acm.org/sigs/pubs/proceed/template.html). Submissions must be sent electronically in PDF or PostScript format to:

Rob Gaizauskas
R.Gaizauskas@sheffield.ac.uk

Workshop Organizers
===================

Rob Gaizauskas (University of Sheffield)
Mark Hepple (University of Sheffield)
Mark Greenwood (University of Sheffield)

Programme Committee
===================

Shannon Bradshaw (University of Iowa)
Charles Clarke (University of Waterloo)
Sanda Harabagiu (University of Texas at Dallas)
Eduard Hovy (University of Southern California)
Jimmy Lin (Massachusetts Institute of Technology)
Christof Monz (University of Maryland)
John Prager (IBM)
Dragomir Radev (University of Michigan)
Maarten de Rijke (University of Amsterdam)
Horacio Saggion (University of Sheffield)
Karen Sparck-Jones (University of Cambridge)
Tomek Strzalkowski (State University of New York, Albany)
Ellen Voorhees (NIST)
Arabic Language Resources and Tools Conference
Date: 22-Sep-2004 - 23-Sep-2004
Location: Cairo, Egypt
Contact: Bente Maegaard
Contact Email: nemlar@cst.dk
Meeting URL: http://www.nemlar.org
Linguistic Sub-field: Computational Linguistics
Subject Language Family: Arabic, Arabic based
Call Deadline: 15-May-2004

Meeting Description:
The focus will be on Arabic language technology and on the necessary language resources and tools for both research and commercial development of language technology for Arabic. Multilingual language technology is also in focus, as well as general methodologies. Evaluation of modules and systems is another field which is closely related to language resources, because language resources are used to perform the evaluation.

Conference aims

Language Resources (LRs) are recognised as a central component of the linguistic infrastructure, necessary for the development of HLT applications and products, and therefore for industrial development. In this conference, we will focus on Arabic language technology and on the necessary language resources and tools for both research and commercial development of language technology for Arabic. Multilingual language technology is also in focus, as well as general methodologies. Evaluation of modules and systems is another field which is closely related to language resources, because language resources are used to perform the evaluation. Consequently we also invite papers in this area.

Substantial mutual benefits are achieved by addressing these issues through international collaboration. For this reason, the conference is organised at the international level.
The aim of this conference is to provide an overview of the state-of-the-art for Arabic resources and tools, discuss problems and opportunities, exchange information regarding LRs, their applications, ongoing and planned activities, industrial uses and needs, requirements coming from the new e-society, both with respect to policy issues and to technological and organisational ones.

Conference topics

- Issues in the design, construction and use of Arabic Language Resources (LRs)
- Issues in Human Language Technologies (HLT) evaluation
- Policy issues, international cooperation, strategies for the support of LR
- Exploitation of Arabic data for the development of language technologies

Please check the web site www.nemlar.org for the full Call text.

Programme

The Scientific Programme will include invited talks and oral presentations, referenced demonstrations and panels.

Abstract submission

On-line submission forms will soon be available. Please check the project and conference web pages, www.nemlar.org.

Important dates

- Submission of proposals for papers, referenced demos: 15 May 2004
- Notification of acceptance: 15 June 2004
- Final versions for the proceedings: 20 August 2004

Programme chairs

- Khalid Choukri, ELDA, Paris, France (co-chair)
- Bente Maegaard, CST, University of Copenhagen, Denmark (co-chair)

The programme committee, the scientific committee and the organising committee are found on the project and conference web site.

NEMLAR

For more information about NEMLAR (Network for Euro-Mediterranean LAnguage Resource and human language technology development and support), please contact:

Bente Maegaard (co-ordinator)
Tel: + 45 35 32 90 90
Fax: + 45 35 32 90 89
Email: nemlar@cst.dk
Web: www.nemlar.org