Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality


We present Affect2MM, a learning method for time-series emotion prediction for multimedia content. Our goal is to automatically capture the varying emotions depicted by characters in real-life, human-centric situations and behaviors. We draw on ideas from emotion causation theories to computationally model and determine the emotional state evoked in movie clips. Affect2MM explicitly models temporal causality using attention-based methods and Granger causality. We use a variety of components, including facial features of the actors involved, scene understanding, visual aesthetics, action/situation descriptions, and the movie script, to obtain an affect-rich representation for understanding and perceiving the scene. We use an LSTM-based learning model for emotion perception. To evaluate our method, we analyze and compare its performance on three datasets (SENDv1, MovieGraphs, and LIRIS-ACCEDE) and observe an average performance increase of 10-15% over state-of-the-art (SOTA) methods on all three.
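
For readers unfamiliar with Granger causality, the idea can be illustrated with a small, self-contained sketch. This is not the Affect2MM implementation: it is the textbook linear, bivariate test, and the names `lagged_matrix` and `granger_f_stat`, the lag count, and the synthetic signals are all assumptions made for illustration. A series `x` is said to Granger-cause `y` if adding past values of `x` to a regression on past values of `y` reduces the prediction error, which the F-statistic below measures.

```python
import numpy as np


def lagged_matrix(series, lags):
    """Columns are series[t-1], ..., series[t-lags] for t = lags .. len(series)-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])


def granger_f_stat(x, y, lags=2):
    """F-statistic for the hypothesis 'x Granger-causes y' (hypothetical helper).

    Compares a restricted model (y predicted from its own past) against an
    unrestricted one (y predicted from its own past plus the past of x).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[lags:]
    intercept = np.ones((len(target), 1))
    restricted = np.hstack([intercept, lagged_matrix(y, lags)])
    unrestricted = np.hstack([restricted, lagged_matrix(x, lags)])

    def rss(design):
        # Ordinary least squares fit, then residual sum of squares.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(target) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic example: y follows x with a one-step delay, so x should
# Granger-cause y, but not the other way around.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=499)

f_xy = granger_f_stat(x, y)  # large: past x helps predict y
f_yx = granger_f_stat(y, x)  # small: past y does not help predict x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.2f}")
```

In a multimodal setting, one could imagine running such a test between a per-modality feature series and the emotion label series, but how Affect2MM actually combines Granger causality with attention is described in the paper, not in this sketch.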



Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality.
Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha

Please cite our work if you find it useful in your research:

  title={M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues},
  author={Mittal, Trisha and Bhattacharya, Uttaran and Chandra, Rohan and Bera, Aniket and Manocha, Dinesh},
  journal={arXiv preprint arXiv:1911.05659},