LINGUIST List 7.1633

Wed Nov 20 1996

Sum: Subject Deletion

Editor for this issue: Ljuba Veselinova <>


  1. Dawn Harvie, Sum: Subject Deletion

Message 1: Sum: Subject Deletion

Date: Sat, 16 Nov 1996 13:21:58 EST
From: Dawn Harvie <>
Subject: Sum: Subject Deletion

I recently submitted a query to Linguist List regarding (1)
references for overt subject vs null subject in English and (2)
the differences/similarities between GOLDVARB and VARBRUL.

I am grateful to the following people for all their help:

Catherine N. Ball <>
Julia Barron <>
Richard Cameron <U17819UICVM.UIC.EDU>
Sharon Cote <>
Paul Hirschbühler <>
Elsa Lattey <>
Judith Liskin-Gasparro <>
Susan Pintzuk <>
Shana Poplack <>
Patrick Schindler <>
Robert Sigley <>
Carmen Silva-Corvalan <>

I would especially like to thank Sharon Cote for sending me a
draft of her dissertation which is not yet on-line and for useful


With regard to null subjects in English, references include:

Cameron, Richard. 1996. A community-based test of a linguistic
hypothesis. Language in Society, 25, 61-111.

Cote, Sharon. 1996. Grammatical and Discourse Properties of
Null Arguments in English. Ph.D. dissertation. University of
Pennsylvania. This will soon be available at:

Haegeman, Liliane. 1990. Non-overt subjects in diary contexts.
In Mascaro, J. and M. Nespor (eds.). Grammar in Progress: GLOW
Essays for Henk van Riemsdijk. Dordrecht: Foris Publications.

Lattey, Elsa. 1980. Grammatical Systems Across Languages: A
Study of Participation in English, German and Spanish.
Ph.D. dissertation. City University of New York. University
Microfilms 8023716.

Massam, Diane. 1989. Middles, Tough and Recipe Constructions:
Licensing of Null Objects and Non-Thematic Subjects. Ms.,
University of Toronto.

Rizzi, Luigi. 1992. Early Null Subjects and Root Null
Subjects. Ms., University of Geneva.

Roberge, Yves. 1990. The Syntactic Recoverability of Null
Arguments. Montreal: McGill-Queen's University Press.

Silva-Corvalan, Carmen. 1982. Subject expression and placement
in Mexican-American Spanish. In Amastae, J. and
L. Elias-Olivares (eds.). Spanish in the United States: Sociolinguistic
Aspects. Cambridge, NY: Cambridge University Press.

Silva-Corvalan, Carmen. 1994. Language Contact and Change:
Spanish in Los Angeles. Oxford: Clarendon Press.


With regard to VARBRUL and GOLDVARB:

**Thanks to Robert Sigley for his detailed description:

There is not much difference between GoldVarb (which I've been
using) and VARBRUL 2S (which I've seen in action), mainly because
that version of VARBRUL was used to make GoldVarb.

GoldVarb has the following properties and limitations:
* comparatively user-friendly
* one-level analysis (calculates factor effects)
* step-up/step-down regression analysis (identifies significant
 factor groups)
* doesn't appear to be subject to VARBRUL's limits on data
 size. (At least, I have yet to come up against any such limit,
 and I'm dealing with 15000 tokens.)


* limited to binary rules

In practice, this should not affect you at all, as you are
dealing with a binary opposition (overt/ null subject).

* slow compared to VARBRUL
Running speed depends on model complexity (how many factor groups
you add in) and hardware.

As a sample: I have been running pathologically complicated
models (15-20 factor groups, over 100 different factors, over
3000 cells).

Mac SE - runs either don't complete or take months (literally)
 - even one-level runs can take a day or so

Mac LC - runs complete in a week or so
 - one-level runs take up to an hour

Mac LC475 - runs complete overnight (I discovered this by
accident near the end of my research )-:

Early Powermacs don't do significantly better than a plain LC.

VARBRUL 2S can do a one-level model of rules with 3 or 4 output
values, but it can't do step-up/step-down regression on these (so
you can't find out which group effects are significant, and which
are not). It is otherwise identical to GoldVarb other than in
running faster and in being harder to use. (A compromise can be
achieved by using GoldVarb to produce a cell file, and then
running this through VARBRUL.) However, the most widely available
versions have hard-coded limits on the complexity of your data
(the number of cells possible).
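
As a rough illustration of what a "cell" is in the description above, here
is a minimal Python sketch. It assumes, purely for illustration, that each
token is coded as a short string whose first character is the dependent
variable ('O' for overt, 'N' for null subject) and whose remaining
characters are one code per factor group. The codes, the function name
count_cells and the sample data are hypothetical and are not the actual
GoldVarb or VARBRUL file format.

```python
from collections import Counter

def count_cells(tokens):
    """Group coded tokens into cells.

    A cell is one distinct combination of factor codes; for each cell we
    tally how many tokens showed each value of the dependent variable.
    The one-character-per-factor-group coding is an assumption made for
    this sketch only.
    """
    cells = {}
    for tok in tokens:
        outcome, factors = tok[0], tok[1:]
        cells.setdefault(factors, Counter())[outcome] += 1
    return cells

# Hypothetical tokens: outcome, then person (1/2/3), then speaker sex (m/f).
tokens = ["N1m", "O1m", "O3f", "N1m", "O3f", "O2f"]
cells = count_cells(tokens)

for factors, counts in sorted(cells.items()):
    print(factors, dict(counts))
```

The number of cells grows with the number of factor-code combinations
actually attested in the data, which is why adding factor groups makes a
model larger and slower to run.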

VARBRUL 3 - if you can get hold of it - reportedly has the following
additional features:

* calculates effects for *continuous* as well as categorical
 factors. There is nothing mathematically complicated about this,
 as logistic regression is *supposed* to be able to handle a
 mixture of categorical and continuous factor groups, and this
 is its main advantage over log-linear modelling. I remain
 confused as to why earlier versions of VARBRUL *didn't* allow

* (possibly?) includes some sort of cluster analysis algorithm to
 identify subsets of the population who behave similarly
 (Rousseau & Sankoff's (1978) account of this is not clear)

However, it is not in wide use, perhaps because of system
requirements (it presumably runs considerably slower than 2S when
performing these functions?)
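
To make the point about mixing categorical and continuous factor groups
concrete, here is a minimal Python sketch of a logistic regression with
one categorical predictor (coded 0/1) and one continuous predictor. It
uses plain gradient ascent on the log-likelihood as a stand-in; it is not
the estimation procedure used by any version of VARBRUL, and the data,
the predictor names and the function fit_logistic are invented for
illustration.

```python
import math

def fit_logistic(rows, outcomes, steps=2000, rate=0.1):
    """Fit logistic regression weights by gradient ascent on the log-likelihood."""
    weights = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, outcomes):
            z = sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of outcome 1
            for j, xi in enumerate(x):
                grad[j] += (y - p) * xi
        weights = [w + rate * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical data: (first_person, distance, null_subject).
# first_person is categorical (0/1); distance is continuous, scaled to 0..1.
data = [
    (1, 0.1, 1), (1, 0.2, 1), (1, 0.8, 0), (1, 0.3, 1),
    (0, 0.1, 1), (0, 0.2, 0), (0, 0.7, 0), (0, 0.9, 0),
    (1, 0.6, 1), (0, 0.4, 0), (1, 0.9, 0), (0, 0.3, 1),
]
rows = [[1.0, float(fp), d] for fp, d, _ in data]  # leading 1.0 is the intercept
outcomes = [y for _, _, y in data]

weights = fit_logistic(rows, outcomes)
print("intercept, first_person, distance:", [round(w, 2) for w in weights])
```

Both predictors enter the same linear sum inside the logistic function;
the categorical one is simply a 0/1 column and the continuous one is used
as a number, so nothing in the model itself distinguishes the two kinds
of factor.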

Data entry into VARBRUL 3 must also be a much more harrowing
business than into GoldVarb. Whereas GoldVarb can accept
unformatted ASCII text (so in fact you could do all of your
coding within your raw dataset and import the lot into GoldVarb
if you wanted to), this probably isn't true of a program which
recognises continuous numerical data (at the very least you'd
have to mark off fields for each factor).

**Thanks to Susan Pintzuk <>, who wrote:

Re your inquiry to the Linguist List: VARBRUL is the term used
for any of the versions of the variable rule program first
developed, implemented, and used by David Sankoff, Bill Labov,
Pascale Rousseau, and Henrietta Cedergren, among very many
others. GOLDVARB is the Macintosh version, IVARB is the PC
version (it runs under DOS but not Windows). IVARB and GOLDVARB
do the same thing on different machines, so neither program is
more "sophisticated" than the other, although GOLDVARB is more

Below are the instructions for retrieving IVARB by anonymous ftp;
the same instructions have been made available by David Rand (of
GOLDVARB fame) over the World Wide Web:

Varbrul via anonymous ftp

The complete varbrul package for MS-DOS (executable code plus
documentation) is now available via anonymous ftp, as follows:

 Connect to
 Go to directory pub/ldc/misc_sw
 Set transfer mode to binary
 Get file varbrul.tar.Z

This is a compressed UNIX tar file. When you run the command

 uncompress varbrul.tar.Z

an uncompressed version, varbrul.tar, will be created. Then run the command

 tar xf varbrul.tar

This will create a "varbrul" directory in your current working
directory, which will contain all the files (20 of them) in the

**Judy Liskin-Gasparro <> wrote:

The person to write to re. VARBRUL is Dennis Preston at Michigan State

**Dr. Shana Poplack <> uses GOLDVARB, as
do the graduate students at the University of Ottawa. It is
considered more user friendly than VARBRUL. Dr. David Sankoff
and Dr. David Rand (e-mail: CRMCC.UMontreal.CA - I don't know if
this is a current address) have used VARBRUL. Dr. Sankoff has a
1987 article on variable rules, with a brief discussion of
GOLDVARB and VARBRUL, in Sociolinguistics: An International
Handbook of the Science of Language and Society, edited by Ulrich
Ammon, Norbert Dittmar and Klaus J. Mattheier.

Thank you all!

Dawn Harvie,
University of Ottawa,