LINGUIST List 7.371

Sat Mar 9 1996

Sum: Computerize Dialect Dictionary

Editor for this issue: T. Daniel Seely <dseelyemunix.emich.edu>


Directory

  1. "J. Fix", Re: Summary of Responses - Computerise Dialect Dictionary

Message 1: Re: Summary of Responses - Computerise Dialect Dictionary

Date: Fri, 08 Mar 1996 21:41:38 GMT
From: "J. Fix" <jf4ukc.ac.uk>
Subject: Re: Summary of Responses - Computerise Dialect Dictionary

On February 18th 1996 I posted a message asking for help on how to
computerize a German dialect dictionary. I have finally found the time to
put together a short summary of all responses.

People who answered were:

John Clifton <JMCliftonaol.com>
Will Dowling <willfranklin.com>
Matthias Heyn <100633.1517compuserve.com>
Jacques Van Keymeulen <Jacques.VanKeymeulenrug.ac.be>
Alexander King <adk8cdarwin.clas.virginia.edu>
Nenad Koncar <nk3doc.ic.ac.uk>
Wilfried Kuhn <100737.3261compuserve.com>
Andrea de Leeuw van Weenen <LeeuwvWRULLET.LeidenUniv.nl>
Robin Lombard <lombardlanglab.uta.edu>
Kazuto Matsumura <kmatsumtooyoo.l.u-tokyo.ac.jp>
Jon Mills <jon.millsluton.ac.uk>
Ole Norling-Christensen <olenccoco.ihi.ku.dk>
Elisabeth Seitz <elisabeth.seitzuni-tuebingen.de>
George Smith <gsmithzedat.fu-berlin.de>
C. M. Sperberg-McQueen <U35395UICVM.CC.UIC.EDU>
Julie Thornton <JTHORNTOeagle.call.gov>
Tony Vital <vitaledectlk.enet.dec.com>
Ralf Vollmann <ralfkfs.oeaw.ac.at>

I want to thank everyone very much indeed for their time, interest, and
patience in dealing with my queries.

The nature of my question makes it virtually impossible to give a
concise summary. Sorry if I have merely collected the bits and pieces
here rather than offering a homogeneous overview.


SGML, TEI.
~~~~~~~~~
Quite a lot of replies recommended looking at SGML, the Standard
Generalized Markup Language. This is a kind of metalanguage which allows you
to create your own markup language. As far as I understand, you mark up the
data, either manually or automatically, and view it with an appropriate
program (comparable to HTML documents, which are written in one of the
markup languages based on SGML and are parsed and displayed by a WWW
browser).

A recommended web site including many pointers to other SGML resources is:
http://www.sil.org/sgml/sgml.html

A recommended newsgroup is comp.text.sgml

Besides SGML as such, there is the Text Encoding Initiative (TEI), which
has published the so-called TEI Guidelines, intended to provide a
standardized framework for text encoding in the humanities. For dictionary
people, chapter 12 on printed dictionaries is especially interesting.

TEI's web site is http://www-tei.uic.edu/orgs/tei
TEI's mailing list is TEI-L at LISTSERVUICVM.CC.UIC.EDU

Programs for turning a dictionary (or any other document) into SGML, or
for working with it once it is in electronic form, include (among others,
I assume):
- "sgmls" (free)
- "Author/Editor" (SoftQuad, http://www.softquad.com)
- "XGML" (the company is called Exoterica, based in Canada,
 http://www.exoterica.com).
Special dictionary parsers are
- "DIPA" (used at the Danish Dictionary) and
- "LexParse" (used at the University of Tuebingen, Germany).

Other programs.
~~~~~~~~~~~~~~
Suggested and/or used by respondents to build databases (among them
dictionaries) are:
- "the SIL program Shoebox"
	unable to comment on this one
- Access (Microsoft; for Win)
	well-known RDBMS
- FileMaker Pro (Claris; for Mac and Win)
	likewise a well-known database system
- HyperCard (for Mac)
	one of the first hypertext tools
- AskSam (for DOS)
	DBMS
- World Translator (for Win and Mac)
	look at http://www.net-shopper.co.uk/software/ibm/trans/index.htm
- Folio VIEWS
	"a free-text database management tool"; http://www.folio.com
	(educational price approx. 300 USD)
- MultiTerm (for Win)
	look at http://www.trados.com "a commercial product and market
	leader in the field of terminology database systems"

Misc.
~~~~
A suitable programming language to create a database that can
include graphics and sound seems to be LPA Win_Prolog.


Dictionary and similar projects I was referred to are:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The New OED (http://bluebox.uwaterloo.ca/OED/index.html)
- The Danish Dictionary (email: olenccoco.ihi.ku.dk)
- Sound Database (email: ralfkfs.oeaw.ac.at)
- Dictionary of Gamilaraay/Kamilaroi (put on the W3 at
 http://coombs.anu.edu.au/WWWVLPages/AborigPages/LANG/GAMDICT/GAMDICT.HTM)
- De Woordenboek van de Vlaamse Dialecten (email:
 Jacques.VanKeymeulenrug.ac.be)
- Dictionary of the Slovene Language (no contact address)
- Atlante Linguistico del Ladino Dolomitico e Dialetti Limitrofi (ALD)
 (http://www.sbg.ac.at/rom/people/proj/ald/allgemei.htm)

Books.
~~~~~
An overview of electronic dictionaries in connection with SGML is given in
- Bergenholtz & Tarp (eds.): Manual of Specialized Lexicography. John
 Benjamins Publishers. 1995 (in particular, pp. 37-46).
 ISBN (Europe): 90 272 1612 6
 ISBN (USA): 1-55619 693-8

The following book was quite useful to get a first impression of SGML:
- van Herwijnen, Eric: Practical SGML. 2nd edition. Kluwer Academic
 Publishers. 1995. (ISBN: 0-7923-9434-8)

Two interesting and pretty specialized titles for the lexicographer are:
- Frakes, William B. and Ricardo Baeza-Yates: Information Retrieval. Data
 Structures and Algorithms. Prentice Hall. 1992.
- Witten, Ian H., Alistair Moffat, and Timothy C. Bell: Managing Gigabytes.
 Compressing and Indexing Documents and Images. Van Nostrand Reinhold. 1994.

Two books on MS Access were recommended, although neither focuses on
dictionaries:
- Rob, Peter and Treyton Williams: Database Design and Application
 Development with Microsoft Access 2.0. New York, London: McGraw-Hill.
 1995. (ISBN: 0070530513)
- Ortmann, Dirk: Access 2.0 fuer Datenbankentwickler. Muenchen: Hanser
 (= Hanser Programmier Praxis.) 1995. (ISBN: 3-446-18122-9) [German]


This is the first summary I have written to a mailing list so far. If
this one is too short, too long, too imprecise, etc., please tell me.
Although I have looked at several others before composing it, I am not sure
whether it fulfills its purpose.


-----------------------------------------------------------------
 Jakob Fix, University of Kent at Canterbury, jf4ukc.ac.uk
-----------------------------------------------------------------