About

I am a PhD student at the laboratory of signals and information processing, Telecom-ParisTech. I work in a small research group of about ten members on the Greta virtual agent and the Nao humanoid robotic agent. The research is directed by Professor Catherine Pelachaud. I am a computer scientist, and my specialties include data mining, AI, and human-robot interaction (HRI).

My PhD research focuses on expressive behaviors, especially expressive gestures for storytelling humanoid agents (i.e. Greta and Nao). The work is part of the GVLEX project, which aims to design and test a storytelling humanoid robot. Ideally, the robot would be able to automatically process a given tale or short story and perform it for an audience of children. Such a project is by nature interdisciplinary and involves text analysis (discourse, expression, characters), expressive text-to-speech synthesis (particularly expressive prosodic synthesis), expressive posture and gesture synthesis, and the coordination of all these levels and aspects. The robot used is Nao, a medium-scale autonomous humanoid robot, shown in the figure below.

The robot's behavior is controlled through the real-time platform GRETA, which was designed to control the multimodal behavior of embodied conversational agents. It follows the SAIBA flow (Kopp et al., 2006): it takes as input what the robot or agent aims to communicate and outputs the corresponding multimodal nonverbal behaviors. The input text is augmented with communicative and emotional information encoded in FML (Function Markup Language) (Heylen et al., 2008), while the output behavior is represented in BML (Behavior Markup Language) (Vilhjálmsson et al., 2007). Both FML and BML are XML languages. BML is body-independent, i.e. it is not constrained by a particular body type or set of animation parameters. We use BML to represent the behavior of both the Nao robot and the virtual agent.

The flow of our algorithm is therefore as follows. It takes as input the story to be told, augmented with communicative functions and prosodic tags. The GRETA system then computes the synchronized nonverbal behavior to be played by the robot or by the virtual agent. For the same FML input, the behavior of the virtual agent and that of the robot should convey similar meanings.

The difficulty is that the Nao robot and the Greta virtual agent do not have the same modalities. For example, Nao has no facial expressions, limited gaze abilities (Mutlu et al., 2006) and almost no fingers, while Greta does not walk. As a consequence, several BML tags output by the system cannot be displayed by one or the other, which could result in the robot and the agent animations conveying different meanings. Our solution is to use two lexicons, one for the robot and one for the agent, whose respective entries should convey similar meanings, as illustrated in the figure below. To build these lexicons, we rely on the notions of gesture variant and gesture family introduced by Calbris (1990). A gesture family encompasses several instances of behaviors that may differ in shape but convey similar meanings. Thus the corresponding entries in the lexicons of the robot and of the virtual agent belong to the same gesture family, even if they differ in shape.
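To make the two-lexicon idea more concrete, here is a minimal Python sketch. It is not the GRETA implementation: the gesture names, the modality sets and the helper functions `select_variant` and `plan_gestures` are all hypothetical and only illustrate how one gesture family can map to different variants depending on what each body can do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GestureVariant:
    """One concrete realization of a gesture family (hypothetical structure)."""
    name: str
    modalities: frozenset  # body parts this variant needs


# Assumed capabilities, loosely based on the description above:
# Nao has no facial expression and almost no fingers, Greta does not walk.
CAPABILITIES = {
    "greta": frozenset({"face", "gaze", "head", "arms", "fingers"}),
    "nao": frozenset({"head", "arms", "legs"}),
}

# One lexicon per agent, keyed by gesture family (the shared meaning).
# All entries are invented for illustration.
LEXICONS = {
    "greta": {
        "greeting": GestureVariant("smile_and_wave", frozenset({"face", "arms"})),
        "negation": GestureVariant("index_finger_wag", frozenset({"arms", "fingers"})),
    },
    "nao": {
        "greeting": GestureVariant("raised_arm_wave", frozenset({"arms"})),
        "negation": GestureVariant("open_hand_sweep", frozenset({"arms"})),
    },
}


def select_variant(agent: str, family: str) -> GestureVariant:
    """Return the agent-specific variant for a gesture family."""
    variant = LEXICONS[agent][family]
    missing = variant.modalities - CAPABILITIES[agent]
    if missing:
        # A lexicon entry should only use modalities the body actually has.
        raise ValueError(f"{agent} cannot perform {variant.name}: missing {sorted(missing)}")
    return variant


def plan_gestures(agent: str, families: list) -> list:
    """Map a sequence of gesture families to the agent's own variants."""
    return [select_variant(agent, family).name for family in families]


if __name__ == "__main__":
    story_families = ["greeting", "negation"]
    for agent in ("greta", "nao"):
        print(agent, plan_gestures(agent, story_families))
```

The same sequence of families yields different gesture shapes for each agent, while the intended meaning stays the same, which is the property the two lexicons are meant to guarantee.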

The diagram below presents an overview of the proposed system. More detail can be found in the article “Expressive Gesture for Storytelling Humanoid Agent”.

Below is a video of a very early result:

 
