“Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction”

Sponsored by:

Keynote Speaker: Carlos Busso (University of Texas, Dallas)

Generating data-driven human-like behaviors for conversational agents


Nonverbal behaviors externalized through head, face, and body movements play an important role for conversational agents (CAs) in human-computer interaction (HCI). To be believable, the movements of a CA have to be both meaningful and natural. Previous studies have mainly relied on rule-based or speech-driven approaches. This presentation will discuss our efforts to bridge the gap between these two approaches, overcoming their limitations. Our models, implemented with dynamic Bayesian networks (DBNs), recurrent deep neural networks (DNNs), or conditional generative adversarial networks (GANs), open opportunities to generate characteristic behaviors associated with a given discourse class, learning the rules from data. These models explicitly capture the temporal relationships and dependencies between speech and gestures. The presentation will discuss effective ways to create behaviors targeted to discourse classes or to prototypical gestures such as head nods. Advances in this area will lead to CAs that can express meaningful, human-like gestures tightly synchronized with speech, enabling novel avenues for artificial agents in human-machine interaction.
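As a rough illustration of the speech-driven, discourse-conditioned family of models described above, the sketch below uses a toy recurrent network in plain NumPy. All specifics are assumptions for illustration only: the feature dimensions, the discourse-class encoding, the head-rotation outputs, and the untrained random weights (which merely stand in for parameters that would be learned from data); this is not the speaker's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 13 MFCCs + F0 per speech frame, 4 discourse
# classes, 3 head-rotation angles (pitch, yaw, roll) per output frame.
N_SPEECH, N_CLASS, N_HIDDEN, N_OUT = 14, 4, 32, 3

# Randomly initialized weights stand in for parameters learned from data.
W_in = rng.normal(scale=0.1, size=(N_HIDDEN, N_SPEECH + N_CLASS))
W_rec = rng.normal(scale=0.1, size=(N_HIDDEN, N_HIDDEN))
W_out = rng.normal(scale=0.1, size=(N_OUT, N_HIDDEN))

def generate_head_motion(speech_feats, discourse_class):
    """Map a (T, 14) sequence of speech features to (T, 3) head angles,
    conditioned on a discourse class (e.g., question vs. statement)."""
    onehot = np.eye(N_CLASS)[discourse_class]  # class label fed at every step
    h = np.zeros(N_HIDDEN)
    out = []
    for frame in speech_feats:
        x = np.concatenate([frame, onehot])
        h = np.tanh(W_in @ x + W_rec @ h)  # recurrent state carries timing context
        out.append(W_out @ h)
    return np.stack(out)

T = 100  # number of speech frames
motion = generate_head_motion(rng.normal(size=(T, N_SPEECH)), discourse_class=1)
print(motion.shape)  # (100, 3)
```

The recurrent state is what lets such a model keep the generated motion synchronized with the evolving speech signal, while the class input biases the output toward behaviors characteristic of that discourse function.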


Carlos Busso received his BS (2000) and MS (2003) degrees with high honors in electrical engineering from the University of Chile, Santiago, Chile, and his PhD degree (2008) in electrical engineering from the University of Southern California (USC), Los Angeles. He is an associate professor in the Electrical Engineering Department of The University of Texas at Dallas (UTD). He was selected by the School of Engineering of Chile as the best electrical engineer to graduate in 2003 across Chilean universities. At USC, he received a provost doctoral fellowship from 2003 to 2005 and a fellowship in Digital Scholarship from 2007 to 2008. At UTD, he leads the Multimodal Signal Processing (MSP) laboratory [http://msp.utdallas.edu]. He is a recipient of an NSF CAREER Award. In 2014, he received the ICMI Ten-Year Technical Impact Award. In 2015, his student (N. Li) received the third prize of the IEEE ITSS Best Dissertation Award. He also received the Hewlett Packard Best Paper Award at IEEE ICME 2011 (with J. Jain) and the Best Paper Award at AAAC ACII 2017 (with Yannakakis and Cowie). He is a co-author of the winning paper of the Classifier Sub-Challenge event at the Interspeech 2009 emotion challenge. His research interests include digital signal processing, speech and video processing, and multimodal interfaces. His current research covers the broad areas of affective computing, multimodal human-machine interfaces, modeling and synthesis of verbal and nonverbal behaviors, sensing human interaction, in-vehicle active safety systems, and machine learning methods for multimodal processing. He was the general chair of ACII 2017. He is a member of ISCA, AAAC, and ACM, and a senior member of the IEEE.

By Taken by Jesse Varner. Modified by AzaToth. - Self-made photo.Originally uploaded on 2006-04-19 by Molas. Uploaded edit 2007-12-23 by AzaToth., CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=3267545

Satellite workshop of ICMI 2018

MA³HMI 2018
