Summary Details


Query:   GoldVarb (addendum)
Author:  Robert Sigley
Linguistic Field(s):   Linguistic Theories

Summary:   In writing to Mario, I referred to (and included a copy of) Ch.7 of my
PhD thesis [Sigley, R. 1997. Choosing Your Relatives: Relative Clauses
in New Zealand English. PhD thesis, Victoria University of Wellington,
New Zealand.] This chapter compares logistic/Varbrul analysis with
more ordinary chi-squared tests on crosstabulated data; it's intended
as a practical guide to interpreting the GoldVarb output.

My email to Mario was a summary of that material, with additional
speculations, one of which was certainly wrong as stated (see below).
I write now so that anyone wishing to discuss details with me can do so
directly (email: Sigley@ic.daito.ac.jp).

(i) The number of degrees of freedom in a logistic or loglinear model =
(the number of independently estimated parameters - the number of fixed
parameters).

Question: Is this equal to (number of factors) - (number of factor groups),
as Avila states, or to (number of factors + 1) - (number of factor groups)?
In other words, does the 'input weight' (which is also iteratively
estimated) count?
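
As a rough illustration, the two candidate counts in the question can be
computed side by side. This is only a sketch: the group sizes are
hypothetical and the function name is invented for the example, not taken
from GoldVarb. If the input weight is treated like the intercept of an
ordinary logistic regression (an assumption here), the second count is the
one that includes it.

```python
def parameter_counts(group_sizes):
    """Return both candidate parameter counts for independent factor groups.

    group_sizes: number of factors in each factor group (hypothetical values).
    """
    n_factors = sum(group_sizes)
    n_groups = len(group_sizes)
    without_input = n_factors - n_groups      # input weight not counted
    with_input = n_factors + 1 - n_groups     # input weight counted as one more parameter
    return without_input, with_input


# Hypothetical model: three factor groups with 2, 3 and 4 factors.
without_input, with_input = parameter_counts([2, 3, 4])
print(without_input, with_input)  # 6 7
```

Whichever convention is used, the constant term cancels out when two models
with the same input term are compared, because only the difference in
parameter counts enters the test.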

(ii) The comment I made in parentheses below is inaccurate.

>It is possible to use this method to incorporate several interaction
>effects into the model -- but it quickly becomes rather cumbersome, as you
>will often have to collapse distinctions in order to include the
>crossproduct factor group, and things get really messy when you need to
>consider several interactions involving the same factor group. (I think the
>best way to treat these is stepwise: if the most significant interaction is
>between groups 1 and 2, and you suspect there's also an interaction between
>groups 1 and 3, you can only approach it indirectly by comparing models
>containing 1*2, 3, 4,...n and 1*2*3, 4,...n. By contrast, if you try
>constructing a model containing 1*2, 1*3, 4,...n then you've effectively
>encoded the distinctions from group 1 twice, which means your model has
>redundant parameters and could produce unreliable results.)

Here I was trying to reconcile differences between what I know in theory
and what seems to work in practice, and managed a rather garbled account; a
fuller explanation follows.

Suppose we're comparing the models:

(a) 1*2, 3, 4, ... , n (a model containing the interaction effect between
groups 1 and 2, but treating every other factor group as independent)

(b) 1*2, 1*3, 4, ... , n (a model containing independent interactions
between groups 1 and 2, and groups 1 and 3)

(c) 1*2, 1*3, 2*3, 4, ... , n (containing independent 2-way interactions
for groups 1 and 2, 1 and 3, 2 and 3)

(d) 1*2*3, 4, ... , n (containing the 3-way interaction for groups 1, 2 and 3)

In theory:

To test the significance of adding the 1*3 interaction to a model
containing the 1*2 interaction, you should compare models (a) and (b).

To test the significance of further adding the 2*3 interaction, you should
compare models (b) and (c).

To test the significance of the 3-way 1*2*3 interaction, you should compare
models (c) and (d).
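
Each of these comparisons can be sketched as a likelihood-ratio test on the
log-likelihoods of the two fitted models. The code below is a minimal
sketch, assuming that this is the test being applied; the log-likelihoods
and parameter counts in the example are invented, and the function names
are hypothetical. The chi-squared tail probability is computed with a plain
power series, which is adequate for illustration but loses precision for
very large statistics.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    half_x = x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15 and n < 10000:
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_small, loglik_large, params_small, params_large):
    """Compare two nested models; return (statistic, df, p-value)."""
    df = params_large - params_small
    if df <= 0:
        # The "larger" model must really have more free parameters.
        raise ValueError("larger model must have more free parameters")
    g2 = 2.0 * (loglik_large - loglik_small)
    return g2, df, chi2_sf(g2, df)


# Hypothetical figures for models (a) and (b).
g2, df, p = likelihood_ratio_test(-520.4, -514.1, params_small=9, params_large=12)
print(round(g2, 2), df, p)
```

The explicit check on the degrees of freedom matters: the test is only
defined when the more complex model really does estimate more parameters
than the simpler one.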

These models show increasing complexity, and an increasing number of
independently-estimated parameters, in the order (a) < (b) < (c) < (d).
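
The ordering can be checked on a small example. The sketch below assumes
the usual hierarchical parameterisation of a loglinear or logistic model,
with every cell occupied and no knockouts; the group sizes (2, 3 and 4
factors) are hypothetical, and groups 4 to n are left out because they add
the same constant to every model.

```python
from itertools import combinations


def count_parameters(group_sizes, terms):
    """Count free parameters in a hierarchical model.

    group_sizes: dict mapping group label -> number of factors (hypothetical).
    terms: generating terms as tuples of group labels,
           e.g. (1, 2) for the 1*2 interaction, (3,) for group 3 alone.
    """
    effects = {()}  # the empty tuple stands for the constant (input) term
    for term in terms:
        ordered = tuple(sorted(term))
        for r in range(1, len(ordered) + 1):
            effects.update(combinations(ordered, r))
    total = 0
    for effect in effects:
        n = 1
        for group in effect:
            n *= group_sizes[group] - 1
        total += n
    return total


sizes = {1: 2, 2: 3, 3: 4}
models = {
    "a": [(1, 2), (3,)],
    "b": [(1, 2), (1, 3)],
    "c": [(1, 2), (1, 3), (2, 3)],
    "d": [(1, 2, 3)],
}
counts = {name: count_parameters(sizes, terms) for name, terms in models.items()}
print(counts)  # {'a': 9, 'b': 12, 'c': 18, 'd': 24}
```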

In practice: this doesn't always work, for several reasons.

* Crossproducts often contain many apparently categorical environments
('knockouts') -- mostly because of low cell occupancy, but also because
of systematic gaps -- which must be excluded or collapsed for analysis.
Performing these simplifications sometimes produces nonsensical results.
I've often found that a model containing a 3-way interaction contains
*fewer* independently-estimated parameters than the supposedly
'simpler' model containing the three 2-way interactions -- once
knockouts are excluded. Thus *in some cases* you won't be able to use
the recommended model test, and some more indirect approach will be
necessary.

* Crossproducts often contain a large number of factors. This may mean that
the overall model has a higher number of parameters than is justified by
the number of tokens in the dataset. Thus, accidental redundancy (where
several combinations of factors describe the same set of tokens) may
result. This is particularly likely when you include two factor groups
based partly on the same distinctions (e.g. the 1*2, 1*3 crossproducts,
which will both partition the dataset along the divisions from the
original group 1). I must emphasise that including such crossproducts of
shared factor groups does not necessarily result in redundancy (in contrast
to what my original statement implied) -- but it does make it more likely.
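
Both problems can be screened for before fitting. The following is a
minimal sketch with invented data and hypothetical function names; it is
not GoldVarb's own procedure. The first function lists knockout cells
(combinations where the variant always or never applies, or which are
empty), and the second lists factors from different groups that pick out
exactly the same tokens.

```python
from collections import defaultdict


def find_knockouts(cells):
    """cells: dict mapping a factor combination -> (applications, non_applications)."""
    return sorted(combo for combo, (apps, nonapps) in cells.items()
                  if apps == 0 or nonapps == 0)


def find_redundant_factors(tokens, groups):
    """Return lists of factors that select exactly the same set of tokens.

    tokens: list of dicts, one per token, mapping group name -> factor.
    groups: names of the factor groups to inspect.
    """
    selected = defaultdict(set)
    for index, token in enumerate(tokens):
        for group in groups:
            selected[(group, token[group])].add(index)
    by_token_set = defaultdict(list)
    for factor, indices in selected.items():
        by_token_set[frozenset(indices)].append(factor)
    return [sorted(factors) for factors in by_token_set.values() if len(factors) > 1]


# Hypothetical 1*2 crossproduct with two categorical cells.
cells = {("a", "x"): (12, 30), ("a", "y"): (0, 7),
         ("b", "x"): (25, 25), ("b", "y"): (4, 0)}
knockouts = find_knockouts(cells)
print(knockouts)                    # [('a', 'y'), ('b', 'y')]
print(len(cells) - len(knockouts))  # 2 cells left to estimate

# Hypothetical tokens coded for the 1*2 and 1*3 crossproduct groups.
tokens = [
    {"1*2": "a.x", "1*3": "a.p"},
    {"1*2": "a.x", "1*3": "a.p"},
    {"1*2": "b.x", "1*3": "b.q"},
    {"1*2": "b.y", "1*3": "b.q"},
]
redundant = find_redundant_factors(tokens, ["1*2", "1*3"])
print(redundant)  # [[('1*2', 'a.x'), ('1*3', 'a.p')]]
```

A check like the second one only catches the simplest kind of redundancy,
where two single factors coincide; combinations of several factors that
jointly describe the same tokens would need a rank check on the full design.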

Cheers,
Robert Sigley.
+-----------------------------------------------+
| Robert Sigley, Foreign Languages Dept |
| (English Division), Daito Bunka University, |
| 1-9-1 Takashimadaira, Itabashi-ku, Tokyo 175 |
+-----------------------------------------------+

LL Issue: 9.1476
Date Posted: 22-Oct-1998