CLC number: TP391

On-line Access: 2018-08-06

Received: 2017-04-19

Revision Accepted: 2017-08-31

Crosschecked: 2018-06-12


Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.6 P.783-795


Affective rating ranking based on face images in arousal-valence dimensional space

Author(s):  Guo-peng Xu, Hai-tang Lu, Fei-fei Zhang, Qi-rong MAO

Affiliation(s):  School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China

Corresponding email(s):   gpxu@ujs.edu.cn, 1406404872@qq.com, susanzhang1231@sina.com, mao_qr@mail.ujs.edu.cn

Key Words:  Ordinal ranking, Dimensional affect recognition, Valence, Arousal, Facial image processing

Guo-peng Xu, Hai-tang Lu, Fei-fei Zhang, Qi-rong MAO. Affective rating ranking based on face images in arousal-valence dimensional space[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(6): 783-795.

Publisher: Zhejiang University Press & Springer
ISSN: 2095-9184
DOI: 10.1631/FITEE.1700270

In dimensional affect recognition, the machine learning methods used to model and predict affect are mostly classification and regression. However, annotations in the dimensional affect space usually take the form of continuous real values, which have an ordinal property. Classification and regression methods do not take advantage of this important information. We therefore propose an affective rating ranking framework for affect recognition based on face images in the valence and arousal dimensional space. Our approach exploits the ordinal information among affective ratings, which are generated by discretizing the continuous annotations. Specifically, we first train a series of basic cost-sensitive binary classifiers, each of which is associated with a given rank and uses all samples, relabeled according to the comparison between each sample's rating and that rank. We then obtain the final affective ratings by aggregating the outputs of the binary classifiers. Comparing our experimental results with those of baseline and deep-learning-based classification and regression methods on the benchmark database of the AVEC 2015 Challenge and a selected subset of the SEMAINE database, we find that our ordinal ranking method is effective in both the arousal and valence dimensions.
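The pipeline described in the abstract (discretize, relabel per rank, train cost-sensitive binary classifiers, aggregate) can be illustrated with a minimal Python sketch. The abstract does not state the discretization scheme, the cost definition, the base classifier, or the face features, so everything concrete here is an assumption made for illustration: equal-width bins over [-1, 1], costs that grow with the distance from the rank boundary, a weighted decision stump standing in for the binary classifier, and toy two-dimensional feature vectors. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def discretize(value: float, n_ranks: int, lo: float = -1.0, hi: float = 1.0) -> int:
    """Map a continuous annotation in [lo, hi] to a rating 1..n_ranks (assumed equal-width bins)."""
    width = (hi - lo) / n_ranks
    rank = int((value - lo) / width) + 1
    return min(max(rank, 1), n_ranks)


def relabel(ratings: Sequence[int], k: int) -> Tuple[List[int], List[float]]:
    """Binary labels and costs for the classifier that asks "is the rating greater than k?"."""
    labels = [1 if r > k else 0 for r in ratings]
    # Assumed cost: distance from the boundary between ranks k and k+1,
    # so misplacing a far-away sample is penalized more heavily.
    costs = [float(r - k) if r > k else float(k - r + 1) for r in ratings]
    return labels, costs


def predict_stump(stump: Stump, x: Sequence[float]) -> int:
    j, threshold, polarity = stump
    return 1 if polarity * (x[j] - threshold) > 0 else 0


def train_stump(X: Sequence[Sequence[float]], labels: Sequence[int],
                costs: Sequence[float]) -> Stump:
    """Stand-in base learner: the stump with the lowest total cost of misclassified samples."""
    best, best_cost = (0, 0.0, 1), float("inf")
    for j in range(len(X[0])):
        vals = sorted({x[j] for x in X})
        cuts = [vals[0] - 1.0] + [(a + b) / 2 for a, b in zip(vals, vals[1:])] + [vals[-1] + 1.0]
        for threshold in cuts:
            for polarity in (1, -1):
                stump = (j, threshold, polarity)
                cost = sum(c for x, y, c in zip(X, labels, costs)
                           if predict_stump(stump, x) != y)
                if cost < best_cost:
                    best, best_cost = stump, cost
    return best


def train_ranker(X: Sequence[Sequence[float]], ratings: Sequence[int],
                 n_ranks: int) -> List[Stump]:
    """Train one binary classifier per rank k = 1..n_ranks-1, each on all relabeled samples."""
    models = []
    for k in range(1, n_ranks):
        labels, costs = relabel(ratings, k)
        models.append(train_stump(X, labels, costs))
    return models


def predict_rating(models: Sequence[Stump], x: Sequence[float]) -> int:
    """Aggregate: the rating is 1 plus the number of classifiers answering "greater than k"."""
    return 1 + sum(predict_stump(m, x) for m in models)


# Toy data: hypothetical 2-D features and continuous annotations in [-1, 1].
X = [[0.1, 5.0], [0.2, 1.0], [0.35, 4.0], [0.4, 2.0],
     [0.6, 5.0], [0.7, 1.0], [0.8, 3.0], [0.95, 2.0]]
annotations = [-0.9, -0.6, -0.4, -0.1, 0.1, 0.4, 0.6, 0.9]
ratings = [discretize(a, 4) for a in annotations]  # [1, 1, 2, 2, 3, 3, 4, 4]
models = train_ranker(X, ratings, 4)
print(predict_rating(models, [0.65, 3.0]))  # prints 3
```

Counting positive answers is one simple aggregation rule; it yields a valid rating even when the binary outputs are not perfectly consistent with each other, which is one reason it is a common choice in ordinal ranking sketches like this one.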


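Summary: In the field of dimensional affect recognition, classification and regression are the machine learning methods commonly used to model and predict affect. However, in the dimensional affect space, an affect annotation is usually a continuous real value with an ordinal property. These two methods neither consider nor exploit this important information. We therefore propose an affective rating ranking framework based on face images in the arousal-valence dimensional space. Our method obtains affective ratings by discretizing the continuous affect annotations and appropriately exploits the ordinal information among them. Specifically, we first train a series of basic cost-sensitive binary classifiers, each of which uses all samples after binary relabeling. The binary labels are assigned according to the result of comparing each sample's affective rating with the rating associated with the given binary classifier. The final predicted affective rating of a sample is then obtained by aggregating the outputs of all binary classifiers. The proposed method is compared with baseline and deep-learning-based classification and regression methods on the benchmark dataset of the AVEC 2015 Challenge and on a subset of the SEMAINE dataset. The experimental results show that the proposed ranking-based affect recognition method is effective in both the arousal and valence dimensions.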




