LINGUIST List 14.2302

Tue Sep 2 2003

Qs: Markup Examples; Copula/Non-canonical Subjects

Editor for this issue: Naomi Fox

We'd like to remind readers that the responses to queries are usually best posted to the individual asking the question. That individual is then strongly encouraged to post a summary to the list. This policy was instituted to help control the huge volume of mail on LINGUIST; so we would appreciate your cooperating with it whenever it seems appropriate. In addition to posting a summary, we'd like to remind people that it is usually a good idea to personally thank those individuals who have taken the trouble to respond to the query. To post to LINGUIST, use our convenient web form at


  1. peetm, Markup Examples
  2. J-C Khalifa, Qs: non-canonical subjects in corpora (2), BE

Message 1: Markup Examples

Date: Mon, 1 Sep 2003 19:24:47 +0100
From: peetm
Subject: Markup Examples


I'm really interested in seeing alternative mark-ups of the following sentence:

"Time flies like an arrow whereas fruit flies like a banana"

I know that 'accurate' is entirely subjective, and down to the tagger,
but I'd like to see samples of mark-ups produced for this
sentence, 'accurate' or not (preferably with an explanation of the
mark-up used: methodology/tag set, or with links to the same).

Many thanks,



addr: Computational Linguistics Group
 University of Oxford
 The Clarendon Institute
 Walton Street
 OX1 2HG


Message 2: Qs: non-canonical subjects in corpora (2), BE

Date: Tue, 02 Sep 2003 13:48:15 +0200
From: J-C Khalifa
Subject: Qs: non-canonical subjects in corpora (2), BE

It is well known in the logico-linguistic literature that the copula BE
may take at least three values: identification (e.g. Venus is the morning
star), membership of a set (e.g. Venus is a planet) and inclusion of
one set in another (e.g. dogs are mammals). My question is: are there
any languages that have three different morphemes more or less covering
these three values? Spanish has two, ser and estar, but I don't know of any
language with three distinct "BE" verbs. Can anyone direct me to
references on this?

I'm also repeating, if I may, a question I sent early in the summer,
which, to my surprise and dismay, went unanswered. Maybe too
few people got to read it on account of the holidays, or maybe it was
answered after all but the replies bounced; I had quite a lot
of computer/server woes during the summer. So I'm trying again on the
off chance. Here it is:

>I'm starting a piece of work on non-canonical subjects in English (i.e. 
>mainly finite & non-finite clauses, PPs and the like). I was wondering 
>whether anyone could direct me to published or unpublished studies on the 
>frequency of such subjects in corpora. I'm quite sure there must have been 
>some work on the relative frequency of subjects by type (pronouns, NPs, 
>complex NPs, and hopefully non-canonical ones), but I must admit I'm quite 
>lost in the spate of corpus linguistics studies that have been published 
>in the past few years. Any tips on this? I'll post a summary if that 
>proves useful.

All the very best,

 Jean-Charles Khalifa 