Editor for this issue: T. Daniel Seely <dseely@emunix.emich.edu>
A little while ago I asked for information about grammar development environments (GDEs) as teaching tools. The particular questions I had were as follows:

1. Information on availability
2. Hardware/software requirements
3. Range of formalisms supported
4. Extent of customisability, if any
5. Ease of writing and testing grammars
6. Quality of display of analyses on screen
7. Ease of debugging grammars
8. Speed and reliability
9. General user-friendliness
10. How used in teaching, and with what kind of students

I'm very grateful to those who responded, viz.:

Melina Alexa: alexa@darmstadt.gmd.de
Charles Boisvert: charles@ccl.umist.ac.uk
Ian Crookston: I.Crookston@uk.ac.lmu
Chris Culy: cculy@edu.uiowa.weeg.vaxa
Mary Dalrymple: dalrymple@parc.xerox.com
Stephen Nightingale: night@uk.ac.ed.ling
Bilge Say: say@bilkent.edu.tr
Nancy Underwood: nancy@dk.ku.cst
Martin Volk: volk@ch.unizh.ifi
Shuly Wintner: shuly@cs.technion.ac.il

--------------------------------

Here is a summary of the systems available, and the information I received:

- A useful web address is the Natural Language Software Registry Homepage at http://cl-www.dfki.uni-sb.de/cl/registry/draft.html#top

- `Syntactica' by Richard Larson et al. is a teaching tool designed for undergraduates. See the MIT Press linguistics catalogue. It runs on PCs running NeXTStep or on NeXT stations, but a Windows 95 version is apparently promised for next year.

- Chris Culy has written a simple GDE for CFGs using HyperCard on the Mac. It's shareware (US$10). It's available through: http://www.uiowa.edu/~linguist/classes/lfr-fall96/index.html

- SIL has a parser/lexicon as part of its PC-PATR. See: http://www.sil.org/

- Linguistic Instruments in Gothenburg have written some grammar development environments for the Mac. The formalisms covered are CFG, PATR, categorial grammar and DCG. The user interface is fairly friendly, with nice graphics for structures. This is commercial software; a site licence for the whole system costs $500. Contact: lager@se.gu.ling.

- A number of LFG-specific systems are available. Here is a summary of the information I received from Mary Dalrymple:

  a. The Xerox LFG Grammar Writer's Workbench is a complete parsing implementation of the LFG syntactic formalism, including various features introduced since the original KB82 paper (functional uncertainty, functional precedence, generalization for coordination, multiple projections, etc.). Runs under DOS on PCs and on most Unix systems. Contact Ron Kaplan (kaplan@parc.xerox.com)

  b. Avery Andrews has written a small LFG system that runs on PCs (XTs, in fact) and is basically orientated towards producing small fragments to illustrate aspects of grammatical analysis in basic LFG. Contact: Avery.Andrews@anu.edu.au Available from: http://www.anu.edu.au/linguistics/software/lfg20.exe

  c. Charon is available from ftp.ims.uni-stuttgart.de in the directory /pub/Charon. Requires Unix and a Prolog which conforms to the Edinburgh syntax. Contact: Dieter Kohl (dieter@ims.uni-stuttgart.de)

  d. The Konstanz LFG Workbench is a simple 'LFG-Workbench' that is used for an introductory LFG course. It accepts syntax rules in very nearly conventional notation with simple functional equations (no boolean operators) and constraint equations (=c), and allows you to project an f-structure to a set of semantic implications via lexical semantics written in a Prolog-like notation. Contact Bruce Mayo (bruce.mayo@pan.rz.uni-konstanz.de)

- XGrammarTool (from GMD IPSI) is a Smalltalk-based toolkit and has a general top-down parser for user-specified grammars, which can be written in a BNF-like language. It hasn't really been used in teaching, though. It needs VisualWorks (Smalltalk) from ParcPlace, either 2.0 or 2.5. A short description can be found as part of a paper in Electronic Publishing vol 6, 1993, pp. 495-505. Contact rostek@de.gmd.darmstadt.

- See COLING 1996 Proceedings (vol 2, pp. 1057-60) for a description of the GATE project from Sheffield.

- Bob Carpenter's ALE is available at http://macduff.andrew.cmu.edu/ale/ Requires Quintus/SICStus Prolog, but nothing else. This provides for writing grammars using typed feature structures, in anything from PATR to HPSG and CCG. It's very reliable, and there's a very good 100-page user manual. A home page is dedicated to ALE with lots of information. The manual is available in HTML. There's also a Web site titled "Course Notes on HPSG in ALE", by Colin Matheson, which can be extremely useful if you're planning to teach HPSG. The URL is: http://www.ltg.hcrc.ed.ac.uk/projects/ledtools/ale-hpsg/index.html ALE requires a fair amount of linguistic and formal sophistication, however.

- Micro-NLP is a system specifically designed to teach grammar development and aspects of parsing. It is written by Charles Boisvert at UMIST. The answers to my questions (see above) are:

  1. Contact charles@ccl.umist.ac.uk. Also http://www.ccl.umist.ac.uk/charles/micro-nlp.html

  2. Runs on SICStus 2.1#7 on SUNs

  3. Context free unification grammars, with Prolog term or feature:value set unification. Disjunction and negation of (atomic) values. Several example grammars are included:
     - a DCG
     - a context-free grammar using features
     - a grammar with a slash feature for gap-threading
     - a lexical-based grammar with 2 non-terminals (and a head-corner generator)

  4. At a basic level, traces can be switched off and controlled (like Prolog traces). There are 2 parsers and lots of different ways to print feature structures. A Prolog programmer could re-use elements and make e.g. their own parsers, or use the feature structures for completely different purposes.

  5. Grammars are edited in a text editor and saved/consulted like programs. There is no need to compile the grammar before testing. For feature sets, I follow the syntax of Gazdar & Mellish, which is described in their textbook. Testing is similar to what you obtain from Prolog, so you can analyse strings, generate strings with given characteristics, and look at alternative parses/generated phrases.

  6. It is text rather than graphics, but there is a very robust routine for pretty printing mixtures of terms and feature structures.

  7. The tracers let you step through a top-down or a left-corner parser, which is a good way to identify bugs (if using one parser gives no clue, try the other one). The ability to generate phrases is also useful. Last, because the code is interpreted, it is easy to make small changes at a time, and debug progressively, in an edit-save-consult-test iterative process.

  8. Efficiency has not been my main concern with this system. The parsers were written to be stepped through and the grammars are interpreted, so it is not surprising if it is slow. On my toy grammars, which have on the order of 20-30 lexical entries and up to ten rules, parsing times vary from 0.1 to several seconds.

  9. Micro-NLP isn't a mouse-and-graphic system, but its flexibility is its user-friendliness: easy to write grammars, easy to run, easy to read the results and the processes, easy to relate to Prolog if required. Because of that flexibility, a user who has done a bit of Prolog would feel at home. Conversely, because of the good presentation of data structures, a user could also start with Micro-NLP, become familiar with search techniques thanks to the high-level tracing and good presentation, then move on to Prolog.

- GTU (``Grammatik Testumgebung'') is a large GDE for German written at the University of Koblenz. See:
  @InProceedings{Volk95b,
    author = "M. Volk and M. Jung and D. Richarz",
    title = "{GTU - A workbench for the development of natural language grammars}",
    booktitle = "Proc. of the Conference on Practical Applications of Prolog",
    year = 1995,
    pages = {637-660},
    address = "Paris"
  }

  Questionnaire answers:

  1. GTU can be obtained for a nominal fee from the University of Koblenz. Contact Dirk Richarz at richarz@informatik.uni-koblenz.de

  2. GTU runs on SUN Workstations. It is compiled SICStus-Prolog code.

  3. DCG (with feature structures), ID/LP (with feature structures), LFG, GPSG

  4. Many features can be switched on/off. Among them:
     - checking selectional restrictions
     - computing logical forms
     - access to 3 different lexicons
     - access to two different test suite interfaces

  5. Flexible lexicon interface (three different lexicons can be hooked to any grammar), test suite administration tool, special hypertext help

  6. Graphic display of c-structure trees, f-structures, feature structures

  7. - GTU includes static grammar checks (e.g. for circular LP-rules).
     - GTU has a tool to compare parse-trees with previously computed parse-trees.
     - The test suite can be partitioned and fed into any of the parsers.

  8. Very reliable. Very fast for small grammars and small lexicons.

  9. Generally regarded as high, although GTU has now grown into a complex system that takes some time to learn.

  10. CL students in the course ``Methods of syntax analysis'' are asked to write grammars (or compile test sentences) for special syntactic phenomena.

- LRP/C - Masaru Tomita's Parser Compiler environment. Here (abbreviated by me) are the questionnaire answers from Shuly Wintner:

  1. Not clear - should be available from CMU - or check with Alon Lavie, alavie+@f.gp.cs.cmu.edu.

  2. The system is written in Lisp. You should have a Common Lisp environment. I managed to run it on various versions of Sun machines, under various versions of Unix.

  3. LRP/C was designed with LFG in mind, but it can be used for a wide range of phrase structure grammars. It has built-in unification and a hook to Lisp.

  4. Source code available.

  5. Not bad. A simple but useful debugging tool, tracer etc.

  6. Poor - all output is linearly displayed, as (very long) lists.

  7 and 8. Performance wasn't bad, as far as I remember, but the system used to behave strangely on very complex grammars (I have written an extensive grammar for Hebrew using it). For teaching purposes I can't foresee any problem at all.

  9. So-so. There used to be a user's manual and a technical report describing the system. No on-line help of any kind.

  10. Mostly undergraduate CS students with no knowledge of linguistics.

===
Paul Bennett
UMIST