LINGUIST List 21.207
Wed Jan 13 2010
Internships: Comp Ling: Information Extraction, Intelius, California, USA
Editor for this issue: Matthew Lahrman
The LINGUIST List strongly encourages employers to use non-discriminatory standards in hiring policy. In particular we urge that employers do not discriminate on the grounds of race, ethnicity, nationality, age, religion, gender, or sexual orientation. However, we have no means of enforcing these standards.
To submit an internship, use our convenient web form at http://linguistlist.org/internship
Computational Linguistics: Information Extraction Graduate Summer Intern, Intelius, Redwood City, California, USA
Message 1: Computational Linguistics: Information Extraction Graduate Summer Intern, Intelius, Redwood City, California, USA
From: Andrew Borthwick <aborthwickintelius.com>
Subject: Computational Linguistics: Information Extraction Graduate Summer Intern, Intelius, Redwood City, California, USA
University or Organization: Intelius, Inc.
Department: Data Research Team
Web Address: http://www.intelius.com
Type of Work: NLP
Duration: 01-Jun-2010 to 31-Aug-2010
Compensation: Paid: Competitive
Internship Location: Redwood City, California, USA
Minimum Education Level: BA
Special Qualifications: Current graduate student
The successful candidate will work to enhance and extend Intelius' system for information extraction from biographical data. An example problem is to extract the education, job title, image, and descriptive snippets for each individual listed on a page such as http://www.google.com/intl/en/corporate/execs.html. This task is similar to the WePS-2 attribute extraction task (see http://nlp.uned.es/weps/weps2/papers/weps2-ae-task-description.pdf).
This is an exciting opportunity to work with a large crawl of the web, ample hardware, and a team of engineers focused on the problem. The internship will be at our Silicon Valley office in Redwood City, California and will offer a competitive salary.
* The primary goal is to enhance and extend Intelius' information extraction algorithms. This work encompasses high-precision named entity identification, attribute extraction, intra-document coreference resolution, tokenization, and sentence boundary detection. The candidate will then test these algorithms by running experiments on a massive scale.
* Graduate student working toward an M.S. or Ph.D. in computer science, computational linguistics, or a related field
* Thesis focus on entity resolution or information extraction preferred
* Strong hands-on skills in Java or Python
* Experience with complex regular expressions
* Familiarity with entity resolution literature
* Experience with GATE or other NLP toolkits
* Experience with Hadoop
Application Deadline: Open until filled.
Web Address for Applications: http://tbe.taleo.net/NA6/ats/careers/requisition.jsp?org=INTELIUSCORP&cws=1&rid=31
Andrew Borthwick, Ph.D.
While the LINGUIST List makes every effort to ensure the linguistic relevance of sites listed
on its pages, it cannot vouch for their contents.