Efficient human-computer interaction requires that the machine has relevant information about the human subject. As an increasing number of tasks is being automated, the need for automatic emotion recognition is becoming more apparent. In this work, we focus on solutions that can measure human emotions from a distance using a camera. We developed one that classifies emotions from detected facial expressions. It employs OpenFace to measure the facial action units that are present, which are in turn used as input to a Support Vector Machine classifier. We tested our method's performance against Noldus FaceReader using precision, recall, accuracy, and F1 score as metrics. For testing, we used two multi-modal datasets with emotion-annotated video files: (1) the Bahcesehir University Multimodal Face Database of Affective and Mental States (BAUM-1, which has two parts – BAUM-1s and BAUM-1a) and (2) the Geneva Multimodal Emotion Portrayals Core Set (GEMEP); we also added a dataset of emotion-annotated images that we compiled by manual selection from the BAUM-1s database. While the GEMEP and BAUM-1a databases contain acted emotional expressions, BAUM-1s is composed solely of videos with spontaneous emotional expressions. We compared the classifiers' performance under different interpretations of their probabilistic classifications, analyzed how the differences depend on the input dataset, and discussed the factors responsible for the measured outcomes. While FaceReader seemed to perform more consistently, our classification method achieved a higher mean F1 score, although the difference was not statistically significant. The pros and cons of each classifier and possible classifier upgrades are discussed. Sufficiently high performance of such emotion recognition systems would enable the development of useful applications for various purposes, such as security, healthcare, administration, and marketing, among many others.
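As a minimal sketch of the described pipeline (not the authors' implementation), the snippet below feeds OpenFace action-unit intensity features into a scikit-learn SVM with probability estimates and reports precision, recall, F1, and accuracy. The file path, the emotion label column, and the train/test split are hypothetical assumptions for illustration; only the "_r" suffix for AU intensity columns follows OpenFace's CSV naming.

```python
# Sketch: emotion classification from OpenFace AU intensities with an SVM.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Hypothetical inputs: an OpenFace per-frame CSV plus a matching label column.
df = pd.read_csv("openface_output.csv")                        # assumed path
au_cols = [c for c in df.columns if c.strip().endswith("_r")]  # AU intensities, e.g. "AU06_r"
X = df[au_cols].to_numpy()
y = df["emotion"].to_numpy()                                   # assumed annotation column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# RBF-kernel SVM with probability estimates, so outputs can be read either as
# the arg-max class or as per-class probabilities.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X_train, y_train)

# Per-class precision, recall, and F1, plus overall accuracy.
print(classification_report(y_test, clf.predict(X_test)))
proba = clf.predict_proba(X_test)  # probabilistic classifications
```

The probability outputs are what allow comparing "different interpretations of probabilistic classifications", for example thresholding a class probability instead of always taking the arg-max label.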