LINGUIST List 7.371

Sat Mar 9 1996

Sum: Computerize Dialect Dictionary

Editor for this issue: T. Daniel Seely <>


  1. "J. Fix", Re: Summary of Responses - Computerise Dialect Dictionary

Message 1: Re: Summary of Responses - Computerise Dialect Dictionary

Date: Fri, 08 Mar 1996 21:41:38 GMT
From: "J. Fix" <>
Subject: Re: Summary of Responses - Computerise Dialect Dictionary

On February 18th 1996 I posted a message asking for help on how to
computerize a German dialect dictionary. Finally I found the time to put
together a short summary of all responses.

People who answered were:

John Clifton <>
Will Dowling <>
Matthias Heyn <>
Jacques Van Keymeulen <>
Alexander King <>
Nenad Koncar <>
Wilfried Kuhn <>
Andrea de Leeuw van Weenen <>
Robin Lombard <>
Kazuto Matsumura <>
Jon Mills <>
Ole Norling-Christensen <>
Elisabeth Seitz <>
George Smith <>
C. M. Sperberg-McQueen <U35395UICVM.CC.UIC.EDU>
Julie Thornton <>
Tony Vital <>
Ralf Vollmann <>

I want to thank everyone very much indeed for their time, interest, and
patience in dealing with my queries.

The nature of my question makes it virtually impossible to give a
concise summary. My apologies if I have merely collected bits and pieces
here rather than offering a coherent overview.

Quite a lot of replies recommended looking at SGML, the Standard
Generalized Markup Language. This is a kind of metalanguage which allows you
to define your own markup language. As far as I understand, you mark up the
data, either manually or automatically, and view it with an appropriate
program. HTML documents are comparable: HTML is one of the markup languages
based on SGML, and it is parsed and displayed by a WWW browser.
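
As a rough illustration of the mark-up-then-view idea, here is a minimal
sketch in Python. It assumes the entry is written in well-formed,
XML-style markup so that Python's standard XML parser can stand in for a
full SGML parser. The element names (entry, headword, pos, region, gloss)
and the sample entry are invented for the example; a real project would
define its own in a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one dialect entry (element names are assumptions).
ENTRY = """
<entry>
  <headword>Grumbeere</headword>
  <pos>noun</pos>
  <region>Palatinate</region>
  <gloss lang="de">Kartoffel</gloss>
  <gloss lang="en">potato</gloss>
</entry>
"""


def render(entry_markup: str) -> str:
    """Parse one marked-up entry and format it as a printable line."""
    entry = ET.fromstring(entry_markup)
    headword = entry.findtext("headword")
    pos = entry.findtext("pos")
    region = entry.findtext("region")
    # Each gloss carries its language in an attribute.
    glosses = ", ".join(
        f"{g.text} ({g.get('lang')})" for g in entry.findall("gloss")
    )
    return f"{headword} [{pos}; {region}]: {glosses}"


print(render(ENTRY))
# prints: Grumbeere [noun; Palatinate]: Kartoffel (de), potato (en)
```

The point is only that the same marked-up data can be displayed in any
layout the viewing program chooses, since the markup records what each
piece of text is rather than how it should look.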

A recommended web site including many pointers to other SGML resources is:

A recommended newsgroup is comp.text.sgml

Besides SGML as such, there is the Text Encoding Initiative (TEI), which
has published the so-called TEI Guidelines, intended to provide a kind of
standardized framework for text encoding in the humanities. For dictionary
people, chapter 12 on printed dictionaries is especially interesting.
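
To give an idea of what automatic markup of a printed entry might look
like, here is a small Python sketch. It assumes a very simple printed
layout, "headword (part of speech) definition", and wraps the pieces in
element names used by the TEI dictionary chapter (entry, form, orth,
gramGrp, pos, sense, def). The layout, the regular expression, and the
sample line are assumptions for illustration; real dictionary entries are
far less regular.

```python
import re
from xml.sax.saxutils import escape

# Assumed layout of a printed entry: "headword (part of speech) definition".
LINE = re.compile(r"^(?P<orth>\S+)\s+\((?P<pos>[^)]+)\)\s+(?P<def>.+)$")


def to_tei(line: str) -> str:
    """Wrap the parts of one flat entry line in TEI-style dictionary tags."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised entry layout: {line!r}")
    # Escape &, < and > so the text cannot be mistaken for markup.
    orth, pos, definition = (escape(match[g]) for g in ("orth", "pos", "def"))
    return (
        "<entry>"
        f"<form><orth>{orth}</orth></form>"
        f"<gramGrp><pos>{pos}</pos></gramGrp>"
        f"<sense><def>{definition}</def></sense>"
        "</entry>"
    )


print(to_tei("Grumbeere (noun) potato"))
```

Lines that do not fit the assumed layout raise an error instead of being
tagged wrongly, so they can be set aside for manual markup.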

TEI's web site is

Programs to use (among others, I assume) in order to turn a dictionary
(or any other document) into SGML, or to work with it once it is in
electronic form, are
- "sgmls" (free)
- "Author/Editor" (SoftQuad,
- "XGML" (the company is called Exoterica, based in Canada,
Special dictionary parsers are
- "DIPA" (used at the Danish Dictionary) and
- "LexParse" (used at the University of Tuebingen, Germany).

Other programs.
Suggested and/or used by respondents to build databases (among them
dictionaries) are:
- "the SIL program Shoebox"
	unable to comment on this one
- Access (Microsoft; for Win)
	well-known RDBMS
- FileMaker Pro (Claris; for Mac and Win)
	as well
- HyperCard (for Mac)
	one of the first hypertext tools
- AskSam (for DOS)
- World Translator (for Win and Mac)
	look at
- Folio VIEWS
	"a free-text database management tool";
	(educational price approx. 300 USD)
- MultiTerm (for Win)
	look at "a commercial product and market
	leader in the field of terminology database systems"

A suitable programming language to create a database that can
include graphics and sound seems to be LPA Win_Prolog.

Dictionary and similar projects I was referred to are:
- The New OED (
- The Danish Dictionary (email:
- Sound Database (email:
- Dictionary of Gamilaraay/Kamilaroi (put on the W3 at
- De Woordenboek van de Vlaamse Dialecten (email:
- Dictionary of the Slovene Language (no contact address)
- Atlante Linguistico del Ladino Dolomitico e Dialetti Limitrofi (ALD)

An overview of electronic dictionaries in connection with SGML is given in
- Bergenholtz & Tarp (eds.): Manual of Specialized Lexicography. John
 Benjamins Publishers. 1995 (in particular, pp. 37-46).
 ISBN (Europe): 90 272 1612 6
 ISBN (USA): 1-55619 693-8

The following book was quite useful to get a first impression of SGML:
- van Herwijnen, Eric: Practical SGML. 2nd edition. Kluwer Academic
 Publishers. 1995. (ISBN: 0-7923-9434-8)

Two interesting and pretty specialized titles for the lexicographer are:
- Frakes, William B. and Ricardo Baeza-Yates: Information Retrieval. Data
 Structures and Algorithms. Prentice Hall. 1992.
- Witten, Ian H., Alistair Moffat, and Timothy C. Bell: Managing Gigabytes.
 Compressing and Indexing Documents and Images. Van Nostrand Reinhold. 1994.

Two books were recommended in reference to MS Access, although neither
focuses on dictionaries:
- Rob, Peter and Treyton Williams: Database Design and Application
 Development with Microsoft Access 2.0. New York, London: McGraw-Hill.
 1995. (ISBN: 0070530513)
- Ortmann, Dirk: Access 2.0 fuer Datenbankentwickler. Muenchen: Hanser
 (= Hanser Programmier Praxis.) 1995. (ISBN: 3-446-18122-9) [German]

This is the first summary I have written for a mailing list. If
it is too short, too long, too imprecise, etc., please tell me. Although
I looked at several others before composing it, I am not sure whether it
fulfills its purpose.

----------------------------------------------------------------
 Jakob Fix, University of Kent at Canterbury,
----------------------------------------------------------------