LINGUIST List 11.1315

Tue Jun 13 2000

FYI: Lang Resources/ELRA, Australian Ling Society

Editor for this issue: Lydia Grebenyova <lydia@linguistlist.org>


Directory

  1. Valerie Mapelli, Portuguese Corpus/Lexicon - ELRA News
  2. Valerie Mapelli, AURORA Project Database - ELRA News
  3. John Henderson, Australian Linguistic Society 1999 - Conf Proceedings

Message 1: Portuguese Corpus/Lexicon - ELRA News

Date: June 13, 2000 16:20:12 +0200
From: Valerie Mapelli <mapelli@elda.fr>
Subject: Portuguese Corpus/Lexicon - ELRA News

___________________________________________________________
				ELRA
		European Language Resources Association
			 ELRA News 
___________________________________________________________

		 *** ELRA NEW RESOURCES ***

We are happy to announce new resources available via ELRA:

ELRA-W0024 PAROLE Portuguese Corpus
ELRA-L0035 PAROLE Portuguese Lexicon

A description of each database is given below.

_______________________________________
ELRA-W0024 PAROLE Portuguese Corpus
_______________________________________

The PAROLE Portuguese Corpus contains approximately 3 million 
running words of European Portuguese, distributed by medium 
as follows:
- Newspaper: about 65%, covering 3 titles over the period 1996-1997;
- Book: about 20%, drawn from 12 titles from 3 publishing houses;
- Periodical: about 5%, drawn from 7 weekly issues of 1 title (1996);
- Miscellaneous: about 10%, comprising several files distributed across 8 titles.
The corpus was classified and encoded according to the common 
core PAROLE encoding standard. The file format of this corpus is SGML.

A subcorpus of the PAROLE Portuguese Corpus, which approximately 
reproduces the distribution by medium of the whole Corpus 
(Newspaper: about 65%, Book: about 20%, Periodical: about 5%, 
Miscellaneous: about 10%), is also available.
It contains about 250,000 words, morpho-syntactically tagged according 
to the PAROLE common tagset and morpho-syntactic annotation standards. 
Disambiguation was manually checked.

_______________________________________
ELRA-L0035 PAROLE Portuguese Lexicon
_______________________________________

The PAROLE Portuguese Lexicon consists of 20,000 entries, 
morpho-syntactically and syntactically encoded according 
to the PAROLE common encoding standards. The data is in SGML format.

=====================================
For further information, please contact:

 ELRA/ELDA	 Tel +33 01 43 13 33 33
 55-57 rue Brillat-Savarin Fax +33 01 43 13 33 30
 F-75013 Paris, France E-mail mapelli@elda.fr

or visit the online catalogue on our Web site:

 http://www.icp.grenet.fr/ELRA/home.html
 or http://www.elda.fr
===================================== 

Message 2: AURORA Project Database - ELRA News

Date: June 13, 2000 16:20:16 +0200
From: Valerie Mapelli <mapelli@elda.fr>
Subject: AURORA Project Database - ELRA News

___________________________________________________________
				ELRA
		European Language Resources Association
			 ELRA News 
___________________________________________________________

 
		 *** AURORA Project Database ***
 
ELRA is releasing two databases made within the ETSI STQ-AURORA DSR
working group.
_______________________________________
AURORA Project Database 2.0
_______________________________________
 
The Aurora project is releasing a revised version of the Noisy TI digits 
database to follow on the work of ETSI. This CD set is a replacement for
the previous set (version 1.0 consisted of 2 CDs, while version 2.0
consists of 4 CDs).

This database is intended for the evaluation of front-end
feature extraction algorithms in background noise, but may also be used
more widely by speech researchers to evaluate and compare the performance
of noise-robust speech recognition algorithms.
 
Compared to version 1.0, the changes are as follows:
1) The files are restored to the energy level of the original speech 
in the TI digits database.
2) One of the noise types added to the speech (the babble noise) has
been changed.
3) There is an additional test set in which the noises are mismatched to
those used in the training set.
4) There is a convolutional distortion test.
5) There is a clean training set.
 
The CD ROM will be used for the next round of ETSI Aurora standards
evaluation. 

_______________________________________
AURORA Project Database 3.0 - Subset of SpeechDat-Car
Finnish database
_______________________________________

This database is a subset of the Finnish SpeechDat-Car database, 
which was collected as part of the European Union-funded 
SpeechDat-Car project. It contains isolated and connected 
Finnish digits spoken in the following driving conditions inside a car:

1.	0 km/hr with the car engine on
2.	40-60 km/hr with the car windows closed
3.	40-60 km/hr with the car windows open
4.	100-120 km/hr with no music in the background
5.	100-120 km/hr with music in the background
 
The database also contains the software needed to run simulations
using Entropic's HTK, which has been adopted as the "standard" 
HMM recogniser for the Aurora standard evaluation.


=====================================
For further information, please contact:

 ELRA/ELDA	 Tel +33 01 43 13 33 33
 55-57 rue Brillat-Savarin Fax +33 01 43 13 33 30
 F-75013 Paris, France E-mail mapelli@elda.fr
 
or visit the online catalogue on our Web site:
 
 http://www.icp.grenet.fr/ELRA/home.html
 or http://www.elda.fr
===================================== 
 

Message 3: Australian Linguistic Society 1999 - Conf Proceedings

Date: Mon, 12 Jun 2000 13:20:30 +0800
From: John Henderson <john.henderson@uwa.edu.au>
Subject: Australian Linguistic Society 1999 - Conf Proceedings

The Proceedings of the 1999 Australian Linguistic Society Conference are
published at
http://www.arts.uwa.edu.au/LingWWW/als99/proceedings. All papers are
available in PDF format.

Contents:
The Lexicon and Quantity Implicatures
	Keith Allan
A Preliminary Analysis of Lebanese Arabic Intonation
	Dana Chahal
An Acoustic-Phonetic Descriptive Analysis of Pitch Realisations in
Kagoshima Japanese
	Shunichi Ishihara
Constraints on the Pre-auxiliary Position in Warlpiri and the Nature of the
Auxiliary
	Mary Laughren
Thematic Role Hierarchies and Role Engagement
	Tom Mylne
Suffix Coherence and Stress in Australian Languages
	Rob Pensalfini
"Just do it ...!" Discourse strategies for 'getting the message across' in
a factory production team
	Maria Stubbe
False witness: when historical texts fail
	Nicholas Thieberger
Set Marking Tags - 'and stuff'
	Joanne Winter and Catrin Norrby

_______________________________
Department of Linguistics,
University of Western Australia
WA 6907
Ph. (08) 9380 2870 (direct)
	(Int'l 61 8 9380 2870)
Fax (08) 9380 1154
	(Int'l 61 8 9380 1154)