L. Kessous, G. Castellano, G. Caridakis
Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis
Journal on Multimodal User Interfaces, 3(1), 33-48, Springer, DOI 10.1007/s12193-009-0025-5
ABSTRACT
In this paper, a study on multimodal automatic emotion recognition during speech-based interaction is presented. A database was constructed consisting of people pronouncing a sentence in a scenario where they interacted with an agent using speech. Ten people pronounced a sentence corresponding to a command while making eight different emotional expressions. Gender was equally represented, with speakers of several different native languages including French, German, Greek and Italian. Facial expressions, body gestures and acoustic analysis of speech were used to extract features relevant to emotion. For the automatic classification of unimodal, bimodal and multimodal data, a system based on a Bayesian classifier was used. After performing an automatic classification of each modality, the different modalities were combined using a multimodal approach. Fusion of the modalities at the feature level (before running the classifier) and at the results level (combining the outputs of the per-modality classifiers) was compared. Fusing the multimodal data resulted in a large increase in the recognition rates in comparison to the unimodal systems: the multimodal approach increased the recognition rate by more than 10% when compared to the most successful unimodal system. Bimodal emotion recognition based on all combinations of the modalities (i.e., "face-gesture", "face-speech" and "gesture-speech") was also investigated. The results show that the best pairing is "gesture-speech". Using all three modalities resulted in a 3.3% classification improvement over the best bimodal results.
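
The sketch below illustrates the two fusion strategies compared in the abstract. It is not the authors' implementation: it uses scikit-learn's GaussianNB as a stand-in for the Bayesian classifier, synthetic stand-ins for the face, gesture and speech feature vectors, and a product rule for combining per-modality posteriors at the results level; all feature dimensions and names are hypothetical.

# Minimal sketch of feature-level vs. results-level fusion with a
# naive Bayes classifier. Data, dimensions and the posterior-combination
# rule are assumptions, not taken from the paper.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_classes = 400, 8          # 8 emotion classes, as in the study
y = rng.integers(0, n_classes, n_samples)

# Hypothetical per-modality feature vectors (dimensions chosen arbitrarily).
face    = rng.normal(y[:, None], 1.0, (n_samples, 10))
gesture = rng.normal(y[:, None], 1.5, (n_samples, 6))
speech  = rng.normal(y[:, None], 1.2, (n_samples, 12))

idx_train, idx_test = train_test_split(np.arange(n_samples), random_state=0)

# Feature-level fusion: concatenate all modalities, train one classifier.
X = np.hstack([face, gesture, speech])
early = GaussianNB().fit(X[idx_train], y[idx_train])
print("feature-level fusion accuracy:", early.score(X[idx_test], y[idx_test]))

# Results-level fusion: one classifier per modality, then combine the
# class posteriors (here by multiplying them, one common rule).
posteriors = []
for Xm in (face, gesture, speech):
    clf = GaussianNB().fit(Xm[idx_train], y[idx_train])
    posteriors.append(clf.predict_proba(Xm[idx_test]))
fused = np.prod(posteriors, axis=0)
acc = (fused.argmax(axis=1) == y[idx_test]).mean()
print("results-level fusion accuracy:", acc)

Dropping one modality from the loop above gives the bimodal variants ("face-gesture", "face-speech", "gesture-speech") that the paper compares against the trimodal system.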
04 July, 2010
L. Kessous, G. Castellano, G. Caridakis, "Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis", Journal on Multimodal User Interfaces, 3(1), 33-48, Springer, DOI 10.1007/s12193-009-0025-5
