Software Details

Title: Final Second HAREM Resources available
Submitter: Hugo Gonçalo Oliveira
Description: [Portuguese below]

Dear colleagues,

We are happy to announce that the resources created in the scope of the
Second HAREM (www.linguateca.pt/HAREM/), a joint evaluation contest for
named entity recognition in Portuguese, are now available at
http://www.linguateca.pt/HAREM/PacoteRecursosSegundoHAREM.zip, and include:

- The Second HAREM collection and its metadata (1,040 documents in
Portuguese, from Brazil and Portugal)

- The three golden collections created

Second HAREM GC: 129 documents from the HAREM collection whose 7,747 named
entities were manually annotated according to HAREM guidelines (10 categories)

TEMPO GC (a subset of Second HAREM GC): 30 documents with 1,490 NEs that,
in addition to the Second HAREM GC information, have also been manually
annotated according to the TEMPO guidelines for finer analysis and temporal
normalization

ReRelEM GC (a subset of TEMPO GC): 12 documents, whose 572 NEs, in addition
to the two types of annotation just mentioned, have also been manually
annotated with semantic relations between named entities, according to the
ReRelEM guidelines

- The evaluation programs developed

- The runs by the participating systems

All these resources are available at the HAREM website, and they can be
used with the SA(H)ARA web service (http://www.linguateca.pt/HAREM; click
on 'Avaliador'), which allows the remote evaluation of new runs.
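
For readers who want to inspect the package before unpacking it, here is a minimal sketch in Python, using only the standard library. The URL is the one given in this announcement and may no longer resolve; the helper names `fetch` and `list_archive` are illustrative and not part of any HAREM tooling, and nothing is assumed about the archive's internal layout.

```python
import io
import urllib.request
import zipfile

# URL copied from the announcement; availability is not guaranteed.
RESOURCE_URL = "http://www.linguateca.pt/HAREM/PacoteRecursosSegundoHAREM.zip"


def fetch(url: str = RESOURCE_URL, timeout: float = 30.0) -> bytes:
    """Download the archive and return its raw bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def list_archive(data: bytes) -> list[str]:
    """Return the member names of a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

Calling `list_archive(fetch())` would return the file names in the package, which is one way to locate the collections, evaluation programs and runs listed above before extracting anything.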

Your feedback is welcome!

The Second HAREM organization

Diana Santos, Cláudia Freitas, Hugo Oliveira, Paula Carvalho and Cristina Mota

--------------------

[Dear colleagues,

It is with great pleasure that we announce the release of Lâmpada, the
package of final resources created within the scope of the Second HAREM, the
second edition of the joint evaluation of named entity recognition in
Portuguese (http://www.linguateca.pt/HAREM).

Lâmpada, available from
http://www.linguateca.pt/HAREM/PacoteRecursosSegundoHAREM.zip, comprises:

A - the HAREM Collection and its metadata, consisting of 1,040
documents

B - the three golden collections (subsets of the HAREM Collection),
namely:

1) the classic HAREM golden collection, with 129 documents and 7,747
NEs, manually annotated according to the HAREM guidelines (on a grid
of 10 categories and their respective types and subtypes)

2) the TEMPO golden collection, a subset of the previous GC, with 30
documents and 1,490 NEs, which, in addition to the attributes of the classic
HAREM GC, also carry information on temporal normalization and other
finer-grained temporal attributes, manually annotated according to the TEMPO
guidelines

3) the ReRelEM golden collection, a subset of the previous GC, with 12
documents and 572 NEs, which, in addition to the attributes of the GCs
mentioned above, are annotated with the relations that the different NEs can
establish among themselves, according to the ReRelEM guidelines

C - the evaluation programs developed for the Second HAREM

D - the runs produced by the participating systems

All these resources are naturally available on the HAREM website,
together with the SA(H)ARA service (http://www.linguateca.pt/HAREM -
choose 'Avaliador'), which allows the remote evaluation of new submissions.

We thank you in advance for any feedback you can give us!

The Second HAREM organization]
Linguistic Field(s): Computational Linguistics

LL Issue: 19.3587
Date Posted: 23-Nov-2008
