LINGUIST List 21.1623

Sun Apr 04 2010

Jobs: English & Computational Linguistics: Scientist, Intelius, Inc.

Editor for this issue: Erin Smith <erin@linguistlist.org>

The LINGUIST List strongly encourages employers to use non-discriminatory standards in hiring policy. In particular we urge that employers do not discriminate on the grounds of race, ethnicity, nationality, age, religion, gender, or sexual orientation. However, we have no means of enforcing these standards.

Job seekers should pay special attention to language in ads regarding employment requirements and are encouraged to consult our international employment page at http://linguistlist.org/jobs/jobnet.html. This page has been set up so that people can report on the employment standards of various countries.

To post to LINGUIST, use our convenient web form at http://linguistlist.org/posttolinguist.cfm
        1.    Andrew Borthwick, English & Computational Linguistics: Scientist, Intelius, Inc., Bellevue, Washington, USA

Message 1: English & Computational Linguistics: Scientist, Intelius, Inc., Bellevue, Washington, USA
Date: 02-Apr-2010
From: Andrew Borthwick <aborthwick@gmail.com>
Subject: English & Computational Linguistics: Scientist, Intelius, Inc., Bellevue, Washington, USA

University or Organization: Intelius, Inc.
Department: Data Research
Job Location: Washington, USA
Web Address: http://www.intelius.com
Job Rank: Scientist

Specialty Areas: Computational Linguistics; Record Linkage

Required Language(s): English (eng)


Record Linkage and Information Extraction Engineer/Scientist

Your first task will be to develop and deploy advanced algorithms for
determining which profiles in Intelius' database of over 300 million
records refer to the same individual. For instance, which of the over 3,000
'Richard Jones' profiles on our site are about the same person? This
challenging problem is variously known as person matching, record linkage,
data deduplication, and entity resolution.

Secondly, you will work to enhance and extend Intelius' system for doing
information extraction from biographical paragraphs. An example problem
would be to extract education, job title, image, and descriptive snippets
for each individual from a page such as

This is an exciting opportunity to work with a very large database, a large
crawl of the web, ample hardware, and a great team of engineers focused on
these problems. The position will be at our headquarters in Bellevue,
Washington and will offer a competitive salary and great benefits package.

Research and develop massively scalable algorithms to reduce the degree of
duplication in Intelius' profile database while minimizing false positives.
Enhance and extend Intelius' person information extraction algorithms. This
task requires high-precision solutions in a range of natural language
processing tasks including named entity identification, attribute
extraction, and intra-document coreference resolution. The candidate will
deploy this system to run on a massive web crawl.

Required Skills:
* Master's degree in computer science
* Experience in working with text data
* Strong hands-on skills in Java or Python
* Experience with complex regular expressions
* Detail-oriented
* Track record of making sound assumptions in order to constrain and solve
ill-defined and complex problems

Desired Skills:
* Ph.D. in computer science or computational linguistics
* Experience in entity resolution, a.k.a. record linkage, cross-document
coreference, duplicate record detection
* Machine learning
* Experience with GATE or other open source NLP toolkits
* Prior startup experience
* Experience with Hadoop and web crawlers

We offer:
* Competitive Compensation
* 401K with Employer Match
* PPO Medical, Dental and Vision insurance plan for you & dependents
* Paid Vacation, Sick Leave and Holidays

Application Deadline:

Web Address for Applications: http://bit.ly/a8EQxy
Contact Information:
Metina Lidnin
Email: mlidnin@intelius.com


