LINGUIST List 29.1876

Thu May 03 2018

Support: Computational Linguistics; Discourse Analysis: PhD, Université de Lorraine

Editor for this issue: Becca Morris

Date: 02-May-2018
From: Mathilde Dargnat
Subject: Computational Linguistics; Discourse Analysis: PhD, Université de Lorraine, France

Institution/Organization: Université de Lorraine - France
Department: Computational Linguistics

Level: PhD

Duties: Research

Specialty Areas: Computational Linguistics; Discourse Analysis


Text mining is widely used in many different domains to classify opinions, to
analyse sentiments, to acquire and represent knowledge, or to understand
complex processes. One feature common to these approaches is that they should
be applicable to large amounts of real, raw text. At present, we may
distinguish two main types of approaches. The first mainly concerns the
classification of texts into categories, typically identifying whether a text
corresponds to a positive or a negative opinion. The second concerns the
extraction of knowledge from texts. It usually requires several steps, such as
information extraction (identification of domain entities and the relations
between them), followed by a conceptualisation step that organises the
information into knowledge units (data mining tools).

There is one very challenging dimension that has always been neglected in text
mining: the discourse level. We claim that this is the next step towards
properly understanding the content of documents. So what does “discourse
level” mean, and what could it be used for? There exist several discourse
theories in computational linguistics, but for the sake of simplicity we will
consider here that the discourse level relates some parts of a text (discourse
units) to other parts of the same text, making explicit the kind of relation
between them: one sentence may elaborate on the previous one, another may give
the cause of a previous event, and so on. In other words, discourse structure
is what makes a text different from a simple juxtaposition of sentences.
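
As a rough illustration only, the simplified view above (discourse units linked
by typed relations) can be sketched as a small data structure. The example
review, the relation labels "Cause" and "Result", and the names
DiscourseRelation and relations_on are assumptions made for this sketch; they
are not taken from any particular discourse theory or from the project itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscourseRelation:
    label: str   # illustrative relation name, e.g. "Cause" or "Result"
    source: int  # index of the unit that carries the relation
    target: int  # index of the unit it attaches to


# Three discourse units from an invented product review (hypothetical data).
units = [
    "The phone stopped charging after a week.",
    "The charging port was loose.",
    "I sent it back to the seller.",
]

# Hand-written relations between the units above; a real system would have
# to predict these from the text.
relations = [
    DiscourseRelation("Cause", source=1, target=0),
    DiscourseRelation("Result", source=2, target=0),
]


def relations_on(target, label):
    """Return the text of every unit linked to `target` by relation `label`."""
    return [
        units[r.source]
        for r in relations
        if r.target == target and r.label == label
    ]


causes = relations_on(0, "Cause")
print(causes)
```

Querying the structure by relation label is what distinguishes this view from a
flat list of sentences: the reason behind an event can be retrieved directly
instead of being inferred from sentence order.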

Discourse relations can thus be used to better understand causes,
consequences, the temporal order of events, and so on. Today, many companies
crawl the web to collect reviews of products or services. While sentiment
analysis or opinion mining currently assigns a positive or a negative flag and
provides some keywords to explain the result, discourse may explain what the
main arguments are, what the sequence of events is, or what the main reasons
are that make the customer positive or negative. In a scientific domain (e.g.
the medical domain), discourse structure enables a better understanding of the
temporal
order of symptoms or the onset of diseases, the effects and side effects of a

Recent research advances in linguistics, natural language processing,
classification, graph mining and (deep) learning all contribute to defining a
new paradigm for mining texts at the discourse level. The thesis should thus
explore different formalisms, specify the goal of discourse mining, and
combine methods coming from several domains. Indeed, discourse representations
are complex structures that can be compared (to extract similar parts of
texts) or classified.

Application Deadline: 15-May-2018

Web Address for Applications:

Contact Information:
Pr Yannick Toussaint

Page Updated: 03-May-2018