Query Details


Query Subject:   Index of synthesis data
Author:   Hugo Cesar de Castro Carneiro

Linguistic Field(s):   Morphology; Syntax

Query:   My M.Sc. thesis is titled "The function of the index of synthesis of
the languages in part-of-speech tagging with weightless artificial neural
networks".

My motivation in this thesis is the contrast between English "like" and
Portuguese "gostam" ("they like"). English "like" is ambiguous in its part
of speech: it can be a preposition, a conjunction, a verb or some other
part of speech, so readers need an adjacent word such as "they" to know
that it is a verb in this context. Portuguese "gostam", on the other hand,
is always a verb, because the "-am" suffix tells the reader so.

I am therefore testing a system I have developed on 5 languages: Mandarin
Chinese, English, Portuguese, German and Turkish (ordered from the most
isolating to the most synthetic). Once I have the information I need from
these 5 languages, I will test the system on 4 others: Thai (more synthetic
than Mandarin Chinese and more isolating than English), Japanese (more
synthetic than English and more isolating than Portuguese), Italian (more
synthetic than Portuguese and more isolating than German) and Russian (more
synthetic than German and more isolating than Turkish).

But I have one problem: I estimated the indices of synthesis of these
languages myself, and even their order may be somewhat wrong (is Portuguese
or German the more synthetic?).

Can anyone help me find an index of synthesis for each of these languages?
Or tell me where I can get a text in each of these languages in which every
word is segmented into its morphemes?
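
To make the request concrete: assuming the index of synthesis is taken as
the average number of morphemes per word, and assuming a segmented text in
which words are separated by whitespace and morphemes by hyphens (a
hypothetical format; the function name and the sample segmentation are
illustrative only), the computation might be sketched as:

```python
def index_of_synthesis(segmented_text, morpheme_sep="-"):
    """Return the average number of morphemes per word.

    Assumes whitespace-separated words whose morphemes are joined by
    `morpheme_sep` (a hypothetical annotation format).
    """
    words = segmented_text.split()
    if not words:
        raise ValueError("text contains no words")
    # Count the non-empty pieces of each word after splitting on the separator.
    morphemes = sum(
        len([m for m in word.split(morpheme_sep) if m]) for word in words
    )
    return morphemes / len(words)


# Illustrative segmentation only, not a linguistic analysis.
sample = "eles gost-am de livro-s"
print(index_of_synthesis(sample))  # 6 morphemes / 4 words = 1.5
```

A purely isolating text with no separators at all would score 1.0 under
this measure, and higher values would indicate a more synthetic language.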

I am finishing my master's studies this year, but I need to submit a paper
to a journal before I receive my M.Sc. degree in Computer Science.
LL Issue: 22.4036
Date posted: 15-Oct-2011


