Abstract of the CSCC'99 paper: 'Comparing Template-based, Feature-based and Supervised Classification of Facial Expressions from Static Images'

CSCC'99 PAPER ABSTRACT

"Comparing Template-based, Feature-based and Supervised Classification of Facial Expressions from Static Images"

W. A. Fellenz, J. G. Taylor (UK), N. Tsapatsoulis and S. Kollias (Greece)

Abstract: We compare the performance and generalization capabilities of different low-dimensional representations for facial emotion classification from static face images showing happy, angry, sad, and neutral expressions. Three general strategies are compared. The first uses the average face of each class as a generic template and assigns each facial expression to the class of the best-matching template. The second uses a multi-layered perceptron trained with the error backpropagation algorithm on a subset of the facial expressions and subsequently tested on unseen face images. The third adds a preprocessing stage before the perceptron learns its internal representation: a feature extraction stage computes the oriented responses of six odd-symmetric and six even-symmetric Gabor filters at each pixel position in the image. The template-based approach reached up to 75% correct classification, which corresponds to the correct recognition of three out of four expressions. However, its generalization performance reached only about 50%. The multi-layered perceptron trained on the raw face images almost always reached a classification performance of 100% on the training set, but its generalization performance on new images varied from 40% to 80% correct recognition, depending on the choice of test images. The preprocessing stage did not improve the generalization performance and slowed down learning by a factor of ten. We conclude that a template-based approach to emotion classification from static images has only very limited recognition and generalization capabilities. This poor performance can be attributed to the smoothing of facial detail caused by small misalignments of the faces and to the large inter-personal differences in facial expressions present in the data set. Although the nonlinear extraction of appropriate key features from facial expressions by the multi-layered perceptron is able to maximize classification performance, the generalization performance usually reaches only 60%.
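
The template-based strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the match between a face and a template is measured, so normalized correlation is assumed here, and the function names (`build_templates`, `normalized_correlation`, `classify`), the 16x16 image size, and the random toy data are all hypothetical stand-ins for real, aligned face images.

```python
import numpy as np

CLASSES = ("happy", "angry", "sad", "neutral")

def build_templates(images, labels):
    """Average all training images of each class into one template per class."""
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    return {str(c): images[labels == c].mean(axis=0) for c in np.unique(labels)}

def normalized_correlation(a, b):
    """Similarity in [-1, 1]; an assumed match measure, not taken from the paper."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def classify(image, templates):
    """Return the label of the template that best matches the image."""
    image = np.asarray(image, dtype=float)
    return max(templates, key=lambda c: normalized_correlation(image, templates[c]))

# Toy data (assumption): one random "prototype" per class plus pixel noise.
rng = np.random.default_rng(0)
prototypes = {c: rng.normal(size=(16, 16)) for c in CLASSES}
train_images = [prototypes[c] + 0.3 * rng.normal(size=(16, 16))
                for c in CLASSES for _ in range(5)]
train_labels = [c for c in CLASSES for _ in range(5)]

templates = build_templates(train_images, train_labels)

# Classify a new noisy sample drawn from the "sad" prototype.
probe = prototypes["sad"] + 0.3 * rng.normal(size=(16, 16))
predicted = classify(probe, templates)
print(predicted)
```

Averaging is also where the weakness named in the conclusion enters: if the training faces are slightly misaligned, the per-class mean blurs fine facial detail, so the templates of different expressions become harder to tell apart.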

Key-Words: facial analysis, emotion recognition, static face images, MLP
CSCC'99 Proc. Pages: 5331-5336