LINGUIST List 32.1193

Fri Apr 02 2021

FYI: Call for Participation - ACM Multimedia 2021 Grand Challenge

Editor for this issue: Everett Green

Date: 01-Apr-2021
From: Dominike Thomas
Subject: Call for Participation - ACM Multimedia 2021 Grand Challenge

== MultiMediate: Multi-modal Group Behaviour Analysis for Artificial Mediation ==
Artificial mediators are a promising approach to support group conversations, but at present, their abilities are limited by insufficient progress in group behaviour sensing and analysis. The MultiMediate challenge is designed to work towards the vision of effective artificial mediators by facilitating and measuring progress on key group behaviour sensing and analysis tasks. This year, the challenge focuses on eye contact detection and next speaker prediction.

== Eye contact detection sub-challenge ==
Eye contact detection is a fundamental task in group behaviour analysis because eye contact is related to many important aspects of behaviour, including turn-taking, interpretation of emotional expressions, engagement, and dominance. This sub-challenge focuses on eye contact detection in group interactions from multiple ambient RGB cameras. We define eye contact as a discrete indication of whether a participant is looking at another participant's face and, if so, which participant that is. Video and audio recordings over a 10-second context window will be provided as input to give temporal context for the classification decision. Eye contact has to be detected for the last frame of this context window.
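
The task definition above can be illustrated with a minimal sketch. It is not the official challenge code: the frame rate, the function names, and the label scheme (0 for no eye contact, k + 1 for participant k) are assumptions made only for this example.

```python
from typing import Optional


def context_window_frames(target_frame: int, fps: int, window_seconds: int = 10) -> range:
    """Return the frame indices of the context window ending at target_frame.

    The target frame is the last frame of the window, as in the task
    description. fps is an assumed parameter; the announcement does not
    state the recording frame rate.
    """
    n_frames = fps * window_seconds
    start = target_frame - n_frames + 1
    if start < 0:
        raise ValueError("not enough preceding frames for a full context window")
    return range(start, target_frame + 1)


def encode_eye_contact(looked_at: Optional[int], observer: int, n_participants: int) -> int:
    """Map an annotation to a discrete class label (hypothetical scheme).

    0 means the observer is not looking at any participant's face;
    k + 1 means the observer is looking at the face of participant k.
    """
    if looked_at is None:
        return 0
    if looked_at == observer or not 0 <= looked_at < n_participants:
        raise ValueError("looked_at must be another valid participant index")
    return looked_at + 1


# Example: at an assumed 30 fps, the window for target frame 299 spans frames 0..299.
window = context_window_frames(target_frame=299, fps=30)
label = encode_eye_contact(looked_at=2, observer=0, n_participants=4)
```

A single label per observed participant keeps the problem a plain multi-class classification in this sketch; other encodings would be equally consistent with the definition.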

== Next speaker prediction sub-challenge ==
The ability to predict who will speak next will allow artificial mediators to engage in seamless turn-taking behaviour and even intervene proactively to influence the conversation. In this sub-challenge, competitors need to predict which members of the group will be speaking at a future point in time. As in the eye contact detection sub-challenge, video and audio recordings over a 10-second context window will be provided as input. Based on this information, approaches need to predict the speaking status of each participant one second after the end of the context window.
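
As a rough illustration of the prediction target described above, the following sketch looks up each participant's speaking status one second after the context window ends. The frame rate, the per-frame boolean speaking tracks, and all names are assumptions for this example and are not taken from the challenge materials.

```python
from typing import Dict, List


def prediction_frame(window_end_frame: int, fps: int, horizon_seconds: float = 1.0) -> int:
    """Index of the frame horizon_seconds after the last frame of the context window."""
    return window_end_frame + round(fps * horizon_seconds)


def speaking_targets(
    speaking: Dict[str, List[bool]], window_end_frame: int, fps: int
) -> Dict[str, bool]:
    """Return the speaking status of every participant at the prediction frame.

    speaking maps a hypothetical participant id to a per-frame boolean track.
    Several participants may be speaking (or silent) at the same time, so
    the target is one boolean per participant rather than a single class.
    """
    t = prediction_frame(window_end_frame, fps)
    return {pid: track[t] for pid, track in speaking.items()}


# Example with an assumed 30 fps: participant A starts speaking at frame 320.
tracks = {
    "A": [i >= 320 for i in range(400)],
    "B": [False] * 400,
}
targets = speaking_targets(tracks, window_end_frame=299, fps=30)
```

Treating the target as one boolean per participant reflects the wording "speaking status of each participant"; it leaves room for overlapping speech and for silence.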

== Dataset ==
For training and evaluation, MultiMediate uses the MPIIGroupInteraction dataset, which consists of 22 three- to four-person discussions, together with an unpublished test set of six additional discussions. The dataset consists of frame-synchronised video recordings of all participants as well as audio recordings of the interactions. Participants will upload their code to an evaluation server, where their methods will be evaluated on the unpublished test set.

== How to Participate ==
Instructions are available at
Data for the eye contact detection sub-challenge is already available and data for next speaker prediction will be released soon.
Paper submission deadline: 11 July 2021

== Organisers ==
Philipp Müller (DFKI GmbH)
Dominik Schiller (Augsburg University)
Dominike Thomas (University of Stuttgart)
Guanhua Zhang (University of Stuttgart)
Michael Dietz (Augsburg University)
Patrick Gebhard (DFKI GmbH)
Elisabeth André (Augsburg University)
Andreas Bulling (University of Stuttgart)

Linguistic Field(s): Computational Linguistics; Discourse Analysis

Subject Language(s): German (deu)

Page Updated: 02-Apr-2021