Discussion Details

Title: Language Immersion for Chrome and Alternatives
Submitter: Ziyuan Yao
Description: Google's "Language Immersion for Chrome"

Recently a Chrome browser extension called "Language Immersion for
Chrome" has been much publicized. Developed by "Use All Five Inc."
on behalf of Google, the extension translates certain words and
phrases on the Web page you're browsing into a foreign language via
Google Translate, to help you learn that foreign language while
browsing the Web.

I have been researching this kind of thing for years, and one of my
main positions is that machine translation shouldn't be used in
serious language learning because it is error-prone: it takes a
learner great effort to memorize a piece of erroneous knowledge,
another great effort to "unlearn" it, and yet another great effort to
"relearn" the correct knowledge.

But I do understand that online machine translation services like
Google Translate and Bing Translator are so readily available that
using them directly for the translation minimizes development costs.
Upon seeing this news, I asked myself: "Can we use some kind of freely
available, manually prepared data, instead of machine translation, to
do this better?" And the answer is YES!

A Better Idea

Imagine we have a database of manually translated bilingual sentence
pairs (such as those found in the multilingual movie subtitle files
on subtitle websites), e.g.

(German) Er ist ein guter Schüler.
(English) He is a good student.

Now suppose a German speaker wants to learn English and happens to be
browsing a German Web page that contains the German word "Schüler"
(student), and the computer finds that this word also occurs in a
bilingual sentence pair like the one above. The computer can then
teach the English for this German word by inserting that sentence
pair into the Web page, like an embedded advertisement. This way, the
reader will learn the English word "student" and, better yet, learn
it in a bilingual sentence pair! This means he will learn not only
the word "student" itself, but also its syntax, semantics and
pragmatics, all implied by the example sentence. As for phonetics,
the computer can use text-to-speech to read the English sentence
aloud, or display some kind of pronunciation guide above or alongside
the English sentence (see my recent project "Phonetically Intuitive
English" for such a pronunciation aid).
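
The lookup step described here could be sketched roughly as follows.
This is only an illustration in Python, not the planned extension
itself: the sample data and the names PAIRS, tokenize, build_index and
find_examples are all made up for the example. It matches exact word
forms only, so it would miss inflected forms, and a real version would
probably want to skip very common function words such as "ein".

```python
import re
from collections import defaultdict

# Hypothetical sample data; a real database would hold many
# manually translated (source sentence, target sentence) pairs.
PAIRS = [
    ("Er ist ein guter Schüler.", "He is a good student."),
    ("Das Wetter ist heute schön.", "The weather is nice today."),
]

def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())

def build_index(pairs):
    """Map each source-language word to the indices of the pairs containing it."""
    index = defaultdict(set)
    for i, (source, _target) in enumerate(pairs):
        for word in tokenize(source):
            index[word].add(i)
    return index

def find_examples(page_text, pairs, index):
    """Return {page word: [sentence pairs]} for page words found in the database."""
    matches = {}
    for word in tokenize(page_text):
        if word in index and word not in matches:
            matches[word] = [pairs[i] for i in sorted(index[word])]
    return matches

# Example: look up the words of one sentence taken from a German Web page.
index = build_index(PAIRS)
examples = find_examples("Jeder Schüler braucht ein Heft.", PAIRS, index)
```

Each entry in "examples" is a sentence pair that could be shown next
to the matching word on the page.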

That's the basic idea. But of course we can refine it further. For
example, if there are multiple bilingual sentence pairs containing
"Schüler", the computer can prefer a pair that also contains words
appearing near "Schüler" on the Web page (i.e. context words). This
would be very useful if the word in question is ambiguous.
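
One simple way to do this preference is to count shared words. The
sketch below is an assumption about how it might work, not a finished
design: the function names, the window radius of 5 and the "Bank"
example (bench vs. financial bank) are invented for illustration, and
plain word overlap is just one possible scoring method.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())

def context_window(page_tokens, position, radius=5):
    """Tokens within `radius` positions of the target word, excluding the target."""
    start = max(0, position - radius)
    before = page_tokens[start:position]
    after = page_tokens[position + 1:position + 1 + radius]
    return set(before + after)

def best_pair(candidates, context):
    """Pick the pair whose source sentence shares the most words with the context.

    Ties go to the earliest candidate.
    """
    return max(candidates, key=lambda pair: len(context & set(tokenize(pair[0]))))

# Hypothetical candidates for the ambiguous German word "Bank".
candidates = [
    ("Ich sitze auf der Bank im Park.", "I am sitting on the bench in the park."),
    ("Ich habe ein Konto bei der Bank.", "I have an account at the bank."),
]

page_tokens = tokenize("Er eröffnet ein Konto bei einer Bank in Berlin.")
position = page_tokens.index("bank")
chosen = best_pair(candidates, context_window(page_tokens, position))
```

Here the page mentions "Konto" (account) near "Bank", so the second
pair shares more context words and is the one chosen.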

Besides bilingual sentence pairs, we may also explore multilingual data
from Wiktionary and Wikipedia, although their usage may not be as
straightforward as the model discussed above. I leave this as
homework for the reader.

I also intend to develop a Chrome extension based on the idea
discussed above :-). I would be interested in hearing others'
viewpoints and perspectives on this concept and its development.

Best Regards,
Ziyuan Yao
Date Posted: 16-May-2012
Linguistic Field(s): Computational Linguistics; Language Acquisition
LL Issue: 23.2347
Posted: 16-May-2012
