LINGUIST List 5.1345

Tue 22 Nov 1994

Sum: Speech processing software

Editor for this issue: <>


Directory

  1. Papazachariou D, Sum: software

Message 1: Sum: software

Date: Mon, 21 Nov 94 19:09:10 GMT
From: Papazachariou D <papazessex.ac.uk>
Subject: Sum: software

Dear colleagues,
I am really grateful for your assistance and your information, which is
 presented below:
 A) for DOS/Windows

1. Speech Viewer
(No further information was supplied about the manufacturer, the price, or
 its function.)

2. CECIL (Computerised Extraction of Components of Intonation and Language)
Price: around $300
8-bit sampling at 8, 13, or 19.5 kHz, to DOS PC memory, via parallel port.
Software extraction and display (EGA or VGA) of various features, e.g.,
 waveform
 raw F0
 smoothed F0
 etc.
Has built-in IPA font for transcription/labelling of analysed signals.
address: "JAARS (International Computer Services, Box 248 JAARS Road, Waxhaw, NC
 28173, (USA?) (704) 843-6151, Fax: (704) 843-6200)" (Kimberly Soto).
(It seems that CECIL is the hardware component of the following package.)

3. SIL Speech Analysis System
A package that "does a very good job of tracking and displaying F0. It will
 also display the waveform and/or amplitude contour simultaneously and can do
 some rudimentary spectral analysis as well. The total cost for the hardware
 interface and software is about $300. The hardware component consists of a
 small box that connects to the parallel port of the computer and has input
 jacks for a microphone or tape recorder. The version runs under MS-DOS,
 although I'm told a Windows version is either available now or will be soon.
 The address to order from is:
International Computer Services
Attn: Customer Services
Box 248
Waxhaw, NC 28173-0248
USA
Phone: 704-843-6257
They also list an e-mail address: icsust1.jaars.sil.org. I have not
found this address to work in the past". (Rod Casali)

4. CSRE (Canadian Speech Research Environment).
The program is developed by Don Jamieson and others in Western Ontario.
It needs a 386 PC or better. The program cost $400 (in 1993), and the
 manufacturers recommended an Ariel board as hardware (which cost around 2-3
 thousand dollars in 1993).
"address:
AVAAZ Innovation Inc.
PO Box 8040
Wonderland Rd. North
London Ontario N6G 2B0
Canada
Tel: (519) 472-7944
Fax: (519) 472-7819" (Franke Ingolf)

5. DSP (Digital Signal Processing)
The manufacturer is:
 Ariel Corporation
 433 River Road
 Highland Park, NJ 08904
 (I believe in USA)
 phone: (908) 249-2900
 fax: (908) 249-2123
 DSP BBS: (908) 249-2124 (300-9600 bps)
In addition, they offer the SpeechStation, a complete speech-synthesis package.

6. SpeechStation (Sensimetrics)
(No further information was supplied about the manufacturer, which could be
 the Ariel Corporation, the price, or its function.)

7. CSpeech
"For a DOS environment, CSpeech does a great job of displaying
a waveform, a fundamental frequency contour, and an amplitude contour,
(as well as other analyses, including a spectrogram) on one screen.
For further information about CSpeech, contact:
 Paul Milenkovic
 Dept. of Electrical & Computer Engineering
 University of Wisconsin - Madison
 Madison, WI 53706 U.S.A.
 milenkovicengr.wisc.edu" (Charles Read)

8. Kay Elemetrics' Computer Speech Lab (CSL)
"very few of the commercial pitch tracers are good with noisy recordings.
Kay claim to have a new super-robust system which works with their CSL
workstation." (Linda Shockey)
"this set-up must cost around $3500-$5000 without the DOS machine itself."
(Alex Francis)
"The best PC system for general purpose speech analysis, including spectrograms
linear prediction analysis and all kinds of other things, is the CSL system from
KAY. It is a lot more expensive, but is really quite sophisticated and very
impressive. Strangely, though, the F0 tracks are not very dependable - there is
sometimes doubling and halving, and quite often you need to fiddle with the
parameters to get anything at all usable." (David Deterding)
"address:
KAY Elemetrics Corp.
12 Maple Avenue
PO Box 2025
Pine Brook, NJ 07058-2025
USA
Tel: (201) 227-2000
Fax: (201) 227-7760" (Franke Ingolf)
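
The doubling and halving that David Deterding mentions can be made concrete
with a small sketch. The Python below is only an illustration, not a method
used by CSL or any other package listed here: the function name
flag_octave_jumps, the 10% tolerance, and the convention that unvoiced frames
are stored as None are all assumptions made for the example.

```python
from typing import List, Optional


def flag_octave_jumps(f0_hz: List[Optional[float]],
                      tolerance: float = 0.1) -> List[int]:
    """Return indices of voiced frames whose F0 is roughly double or half
    the F0 of the previous voiced frame (within a relative tolerance)."""
    flagged = []
    previous = None
    for i, value in enumerate(f0_hz):
        if value is None or value <= 0:
            continue  # unvoiced frame (assumed convention): skip it
        if previous is not None:
            ratio = value / previous
            near_double = abs(ratio - 2.0) <= 2.0 * tolerance
            near_half = abs(ratio - 0.5) <= 0.5 * tolerance
            if near_double or near_half:
                flagged.append(i)
        previous = value
    return flagged


if __name__ == "__main__":
    track = [100.0, 101.0, 200.0, 102.0]
    print(flag_octave_jumps(track))
```

Note that the frame where the track returns to its true value is flagged as
well, since it is also an octave-sized jump relative to its neighbour; a real
tracker would need further logic to decide which of the two frames is wrong.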

9. Loughborough Sound Images Speech Workstation
 This program runs on a PC AT (or 286/386 based compatible), with 640k RAM,
 EGA/VGA graphics, Microsoft Mouse (or compatible), Hard disk (40 MB
 recommended), RAM disk (required for stereo recording or fast sample rates),
 DOS version 3.0 or greater. The LSI Speech Workstation can display the signal
 in a variety of ways, including black and white or full-colour spectrograms,
 waveforms, spectral slices (cross-section through a spectrogram which is
 displayed horizontally across the screen)... All of them are reasonably fast,
 especially on a 386 PC. A wide range of bandwidths is available for the
 spectrogram and the spectral slice, and the waveform can be scaled. Several of
 these can be displayed at the same time by splitting the screen. The screen
 can also be split to accommodate parts of two separate recordings. The analog
 card supplied with the Speech Workstation has two input channels, each of
 which can be connected to either a microphone or line output. Two markers are
 available, which allow you to perform cutting and pasting, copying etc. It is
 possible to play only marked sections of the signal on the screen. The maximum
 length of analysed speech was 3-5 minutes.

The manufacturers are:
Loughborough Sound Images Limited
The Technology Centre
Epinal Way
Loughborough
ENGLAND
LE11 0QE
Telephone: (0509) 231 843 Telex: 34 1409 LUFBRA G
Fax: (0509) 262 433

10. SFS
"I'll include the whole README file:

 SPEECH FILING SYSTEM
 Computer Tools For Speech Research
 Department of Phonetics and Linguistics
 University College London

Introduction

SFS provides a computing environment for conducting research into the
nature of speech. It comprises software tools, file and data formats,
subroutine libraries, graphics, standards and special programming
languages. It performs standard operations such as acquisition,
replay, waveform editing and labelling, spectrographic and formant
analysis and fundamental frequency estimation. It runs under Unix and
DOS environments and is currently running on Sun, Hewlett-Packard,
Masscomp and 486PC. SFS is copyrighted University College London, but
is currently supplied free of charge to research establishments for
non-profit use. SFS is supplied as is with no warranty or support.

Features

Operating environments:
 Unix, Protected-Mode DOS (with GNU compiler)
Supported Data Acquisition/Replay:
 Masscomp: AD12F, DA08
 Sun: SPARC-2 8-bit, SPARC-10 16-bit
 IBM-PC: Data Translation 2811, PCLX, UCL Parallel Printer DAC
 (SFS supports networked replay from Unix to PC)
Supported Graphics Devices:
 Masscomp: 6-plane colour graphics
 Sun: SPARC-2 monochrome console, SunTools
 Sun, HP: X-Windows
 PC: VGA and SVGA
 (SFS supports networked graphics from Unix to PC)
 Epson 24-bit dot matrix
 Kyocera laser printer
 Postscript laser printer
 WordPerfect graphics file output
Utilities:
 create SFS file, list SFS file, display/print SFS file,
 copy/link/remove items in SFS file, dump contents of SFS file.
Analysis programs:
 Acquisition and replay, waveform processing, Laryngographic
 processing, fundamental frequency estimation (from SP or from
 LX), formant frequency estimation, formant synthesis,
 spectrographic analysis, LPC analysis/synthesis, filterbank
 analysis/synthesis, PSOLA prosody manipulation.
File formats:
 Import from text, binary and ILS files; save multiple data items
 in SFS files and compare; standard formats for speech, Lx, Tx,
 Fx, annotations, synthesizer data, spectra, spectrograms, LPC
 coefficients, parameter tracks, etc; export to binary, text,
 ILS, HTK, etc; processing history maintained in file.
Subroutine libraries:
 Supports SFS file I/O and dynamic memory allocation for data
 sets; matrix operations; device-independent graphics.
Special purpose languages:
 SML Speech Measurement Language - interpreted language for
 measuring data in SFS files; SPC Speech Pascal - compiled
 language for waveform manipulation and analysis; C-SPAN -
 compiled language for synthetic speech stimuli generation.

Source

SFS is available by anonymous FTP from: pitch.phon.ucl.ac.uk in the
directory /pub/sfs (from August 1993). The README file gives current
version information.

Remember that we are unable to service requests for support on this
software. Bug fixes only may be sent to sfsphonetics.ucl.ac.uk; requests
for help may be ignored.

Acknowledgements

SFS has been developed from software written during the SPAR Alvey
Project involving GEC, Imperial College London, University College
London and Leeds University. The software that is distributed contains
only the UCL contribution to that project. Additional, compatible
software may be available from these partners or from other current
users of SFS, for example at York University. Please contact Mark
Huckvale for further information about ownership and other available
software.

Mark Huckvale
University College London
Gower Street
London WC1E 6BT
SFSphonetics.ucl.ac.uk" (Hannes Pirker)

11. SPECTRO 3000
"2 channel signal analyser (separate device)
This analyser has the best pitch technique (SIFT and CEPSTRUM) I
have ever seen - but is very expensive (about 50.000 DM).
address:
MEDAV
Digitale Signalverarbeitungs GmbH
Graefenberger Strasse 34
D-91080 Uttenreuth
Germany" (Franke Ingolf)

 B) for Mac
1. Voice Navigator.
The only thing I learned about this software (?) is the name of the
 manufacturer, i.e. Articulate Systems.

2. MacSpeech Lab
The manufacturer is: GW Instruments, 35 Medford St., Somerville, MA 02143
 (USA)
(617) 625-4096
(617) 625-1322 (fax)

3. MacRecorder
(No information about it)

4. DSP
(the same as for DOS/Windows)

5. Signalyze (version 3.0)
"Signalyze(TM) 3.0 is an integrated speech signal analysis application for the
 Macintosh. It does signal editing and direct signal I/O to/from a number of
 devices. Version 3.0 has a user-friendly multi-level labeling feature: each
 label is coded for a linguistic level (e.g., segment, syllable, etc.). Level
 names are determined by the user and are color-coded. Also new in Version 3.0:
 Speech slow-down and speed-up (up to five times), color/grayscale spectrograms
 right with signal, AV-Macintosh support, easy vertical zoom, and more.

Signalyze has a large number of spectral analysis tools: spectrograms (B/W, 16
 and 256 colors/grays), cepstrograms, cone kernels, LPC-grams, FFT spectra and
 cepstra, and LPC spectra. Also included are statistics, dB measurement,
 interpolated signal resampling, transformations, envelopes, zero passages, and
 filtering. The manual is 224 pages, the on-board contextual Help is in English,
 French and German, and the whole interface is switchable to English, French and
 German. The program is about 980 k at the present. It runs on any Mac from the
 MacPlus on up (4 Mb and hard disk required).

Prices effective January 31, 1994:
 Individual license: $350.
 Departmental license: $750.
 Organizational license: $1250
 Extra manuals: $25 per manual.

Shipping costs:
1. U.S., Canada and Europe: $10 priority/air mail
2. Rest of the world: $20 priority/air mail
3. 3-day shipping anywhere in the world: $50"

Also, "Here are some details on the new labeling facility in Signalyze (version
3.0, Macintosh-specific software). This may be of use to people working in
prosody.

LEVELS
The Signalyze labeling operation works by levels. For each label, you
specify a level, such as "segment", "syllable", "phrase" etc. Each level
has its own label color and its own, user-definable name. Labels are marked
for their level by a number placed in front of the label name (e.g., "4:
its", means that the label "its" is marked for the fourth level).
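
The level-prefix convention just described can be sketched in a few lines of
Python. This is only an illustration of the "number, colon, label" pattern
from the example above, not the documented Signalyze label format: the
function name split_level and the exact whitespace handling are assumptions.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: an integer level, a colon, optional spaces, then the label.
_LEVEL_PREFIX = re.compile(r"^\s*(\d+)\s*:\s*(.*)$")


def split_level(label: str) -> Tuple[Optional[int], str]:
    """Split a level-prefixed label such as '4: its' into (4, 'its').

    A label without a numeric prefix is returned with level None.
    """
    match = _LEVEL_PREFIX.match(label)
    if match is None:
        return None, label.strip()
    return int(match.group(1)), match.group(2).strip()


if __name__ == "__main__":
    print(split_level("4: its"))  # (4, 'its')
```

Keeping the level as an integer makes it straightforward to group or filter
labels by level once they have been read from a file.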

WHAT GETS LABELED
You can label either points in the signal or selected portions of the
signal. The labels for selected portions are placed at the center of the
selection and are marked by angular brackets ("<...>").

RE-EDITING, ADDING AND DELETING LABELS
Labels can be re-edited, new labels can be added anywhere in the signal,
and labels can be deleted individually or for an entire signal.

REPRODUCE LABELED SEGMENT
While the label is open for editing, you can play the selected portion of
the signal by doing COMMAND-Y. If you're labeling a selection, and if
you've set the audio to play signal selections, you'll hear the segment
which you're about to label.

ALIGN LABELS
You can choose any of nine different vertical positions for placing labels.
When you close a label with a click in the close rectangle or with RETURN,
the label automatically "snaps" to the nearest standard position.

SAVE LABELS
The label information is saved as a TAB-delimited TEXT file with the
extension ".lbl". It is stored in the same folder as the signal file.

LABEL FORMAT
The label format is in the public domain and is fully documented. It is
available on the Signalyze servers (see below).

TRANSPARENT SAVING AND OPENING
Open File operations on signal files that have accompanying ".lbl" TEXT
files cause the label information to be read into Signalyze. Save File
operations on signals with labeling information automatically save the
labels in the same folder as the signal. Save File operations for signals
without labeling information erase whatever label file may have existed in
the folder.

LABEL IN PHONETICS
You can use any phonetic or normal font in the labels. However for
phonetics, it is recommended to stick to SigPalFont (the shareware font
supplied with Signalyze). SigPalFont preserves the numbers and angular
brackets you need to indicate labeling levels, which is usually not
possible with other phonetic fonts.

SEARCH BY LABEL OR BY LEVEL
You can search for labels using either a given labeling level or the
label's name. You can specify two separate search patterns. Switch between
the two patterns with the SHIFT-LOCK key.

MORE INFO AND DEMO
Information on Signalyze Version 3.0 is available as follows:

BY FTP:
FTP MACFL4082.unil.ch or FTP 130.223.104.31
login anonymous

By Gopher Server:
Name of machine: gopher.unil.ch, find "Europe" and "Switzerland", select
"University of Lausanne", select "Autres Gophers de l'UNIL", select
"Faculte des Lettres", select "Laboratoire d'analyse informatique de la
parole (LAIP)", select "Speech Analysis and Speech Synthesis", select
"Signalyze"

Prof. Eric Keller (New email address: Eric.Kellerimm.unil.ch)

Laboratoire d'analyse informatique de la parole (LAIP)
Lettres, Universite de Lausanne
CH-1015 LAUSANNE, SWITZERLAND
FAX +41 21 6924639/+41 21 692 4510".

6. UCLA-Uppsala Analysis Package (to run on MacRecorder files)
"Write to:
Software Manager, Phonetics Lab, Linguistics Dept, UCLA, Los Angeles,
CA 90024-1543 for our order form. It's $5 for just this disk." (Peter
 Ladefoged)

7. GW Instruments Soundscope (formerly MacSpeechLab II) (MAC)
 "More able to be tailored to individual uses than
 CSL, but also a little clumsier and slower than CSL.
 I think it runs about $3000." (Alex Francis)

 C) for Unix
1. XWaves+
It costs around $5000 and the manufacturer is:
"ENTROPIC RESEARCH LABORATORY, INC.
 600 Pennsylvania Ave., SE
 Washington DC 20003
 USA" (Franke Ingolf)

2. DSP
(It has the same specifications as the DSP presented above under
 DOS/Windows.)

3. SFS
(It has the same specifications as the SFS presented above under
 DOS/Windows.)

4. Digital Ears
No more information, only the name of the manufacturer, i.e. Metaresearch

5. OGI Speech
"This is free!!! I have never used it professionally, but while working at Los
 Alamos Nat'l. Labs I did get a chance to play with it. It comes with a good
 manual, and works with a number of different sound-file types, and can be
 configured for a number of different platforms (Sun & SGI). ... It needs some
 additional hardware."
(Alex Francis).

6. Entropic SPS
Software that runs through the xwaves+ package. I do not know its price; also, I
 could not tell whether it is a more recent software improvement of xwaves+.
"address:
ENTROPIC RESEARCH LABORATORY, INC.
600 Pennsylvania Ave., SE
Washington DC 20003
USA" (Franke Ingolf)

Additional information:
 Read, Buder & Kent. (1992) 'Speech Analysis Systems: an evaluation',
 Journal of Speech and Hearing Research, 35, 314-332.

 P.C. Bagshaw, S.M. Hiller & M.A. Jack, (1993), 'Enhanced pitch tracking and
 the processing of f0 contours for computer aided intonation teaching', Proc.
 3rd European Conference on Speech Communication and Technology, pp 1003-6

Finally,
 "NATURAL LANGUAGE SOFTWARE REGISTRY

The Natural Language Software Registry is a catalogue of software
implementing core natural language processing techniques, whether
available on a commercial or non commercial basis. The current
version includes

+ speech signal processors, such as the Computerised Speech Lab
 (Kay Elemetrics)
+ morphological analysers, such as PC-KIMMO
 (Summer Institute for Linguistics)
+ parsers, such as Alveytools (University of Edinburgh)
+ knowledge representation systems, such as Rhet
 (University of Rochester)
+ multicomponent systems, such as ELU (ISSCO), PENMAN (ISI),
 Pundit (UNISYS), SNePS (SUNY Buffalo),
+ applications programs (misc.)

This document is available on-line via anonymous ftp to ftp.dfki.uni-sb.de
(directory:registry), by email to

 registrydfki.uni-sb.de,

and by physical mail to the address below. If you have developed a piece
of software for natural language processing that other researchers might
find useful, you can include it by returning the description form below.
If you are interested in the preliminary draft of the Registry, do not
hesitate to drop us an email message and we will be happy to send it to
you." (Jane Edwards)