CLC number: TN919.8
On-line Access: 2024-08-27
Crosschecked: 2009-10-18
Hua ZHANG, Xiang TIAN, Yao-wu CHEN. A video structural similarity quality metric based on a joint spatial-temporal visual attention model[J]. Journal of Zhejiang University Science A, 2009, 10(12): 1696-1704.
@article{Zhang2009,
title="A video structural similarity quality metric based on a joint spatial-temporal visual attention model",
author="Hua ZHANG, Xiang TIAN, Yao-wu CHEN",
journal="Journal of Zhejiang University Science A",
volume="10",
number="12",
pages="1696-1704",
year="2009",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A0920035"
}
%0 Journal Article
%T A video structural similarity quality metric based on a joint spatial-temporal visual attention model
%A Hua ZHANG
%A Xiang TIAN
%A Yao-wu CHEN
%J Journal of Zhejiang University SCIENCE A
%V 10
%N 12
%P 1696-1704
%@ 1673-565X
%D 2009
%I Zhejiang University Press & Springer
%R 10.1631/jzus.A0920035
TY - JOUR
T1 - A video structural similarity quality metric based on a joint spatial-temporal visual attention model
A1 - Hua ZHANG
A1 - Xiang TIAN
A1 - Yao-wu CHEN
JO - Journal of Zhejiang University Science A
VL - 10
IS - 12
SP - 1696
EP - 1704
SN - 1673-565X
Y1 - 2009
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A0920035
ER -
Abstract: Objective video quality assessment plays an important role in multimedia signal processing. Existing extensions of the structural similarity (SSIM) index cannot predict the quality of a video sequence effectively. In this paper, we propose a structural similarity quality metric for video based on a joint spatial-temporal visual attention model. The model locates the motion-attended region and the distortion-attended region by computing motion features and distortion contrast, respectively. It mimics the shifting of visual attention between the two attended regions, and accounts for error bursts by introducing non-linear weighting functions that assign much higher weights to severely damaged frames. The metric built on this model yields a final objective quality rating for the whole video sequence, and is validated on the 50 Hz video sequences of the Video Quality Experts Group (VQEG) Phase I test database.
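To make the pooling strategy described in the abstract concrete, the following is a minimal Python/NumPy sketch of attention-weighted SSIM pooling of this general kind; it is not the authors' implementation. The frame-difference motion proxy, the hinge-shaped frame weighting, and all names and parameters (attention_map, alpha, tau, gamma) are illustrative assumptions, not values from the paper.

# Minimal sketch (not the paper's code) of attention-weighted SSIM pooling.
# Frames are assumed to be grayscale float NumPy arrays scaled to [0, 1].
import numpy as np
from scipy.ndimage import uniform_filter

def ssim_map(ref, dist, c1=0.01 ** 2, c2=0.03 ** 2, win=8):
    """Local SSIM map between a reference frame and a distorted frame."""
    mu_r = uniform_filter(ref, win)
    mu_d = uniform_filter(dist, win)
    var_r = uniform_filter(ref * ref, win) - mu_r ** 2
    var_d = uniform_filter(dist * dist, win) - mu_d ** 2
    cov = uniform_filter(ref * dist, win) - mu_r * mu_d
    return ((2 * mu_r * mu_d + c1) * (2 * cov + c2)) / (
        (mu_r ** 2 + mu_d ** 2 + c1) * (var_r + var_d + c2))

def attention_map(ref_prev, ref_cur, s_map, alpha=0.5):
    """Blend a motion-attended map (frame difference used here as a cheap
    stand-in for true motion features) with a distortion-attended map
    (local SSIM loss as a distortion-contrast proxy)."""
    motion = np.abs(ref_cur - ref_prev)
    distortion = 1.0 - s_map
    def norm(m):
        return m / m.max() if m.max() > 0 else m
    # alpha shifts attention between the two attended regions
    return alpha * norm(motion) + (1.0 - alpha) * norm(distortion)

def video_quality(ref_frames, dist_frames, tau=0.7, gamma=4.0):
    """Pool attention-weighted frame scores over the sequence; a non-linear
    weight boosts the influence of extremely damaged frames (per-frame SSIM
    below tau) to model error bursts."""
    scores, weights = [], []
    for t in range(1, len(ref_frames)):
        s = ssim_map(ref_frames[t], dist_frames[t])
        a = attention_map(ref_frames[t - 1], ref_frames[t], s)
        frame_score = np.average(s, weights=a + 1e-8)
        # heavier weight for badly damaged frames (squared hinge, assumed form)
        w = 1.0 + gamma * max(0.0, tau - frame_score) ** 2
        scores.append(frame_score)
        weights.append(w)
    return float(np.average(scores, weights=weights))

# Example: q = video_quality(ref_frames, dist_frames) returns a scalar score,
# higher meaning better quality, for two aligned lists of grayscale frames.

The squared-hinge weight is only one plausible way to realize the abstract's "much higher weighting factor" for extremely damaged frames; any monotone penalty below the threshold tau would serve the same pooling role.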
[1] Aziz, M.Z., Mertsching, B., 2008. Fast and robust generation of feature maps for region-based visual attention. IEEE Trans. Image Process., 17(5):633-644.
[2] Brooks, A.C., Zhao, X.N., Pappas, T.N., 2008. Structural similarity quality metrics in a coding context: exploring the space of realistic distortions. IEEE Trans. Image Process., 17(8):1261-1273.
[3] Chen, Q.Q., Chen, Z.B., Gu, X.D., Wang, C., 2007. Attention-based adaptive intra refresh for error-prone video transmission. IEEE Commun. Mag., 45(1):52-60.
[4] Grill-Spector, K., Malach, R., 2004. The human visual cortex. Ann. Rev. Neurosci., 27:649-677.
[5] Itti, L., 2005. Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Vis. Cogn., 12(6):1093-1123.
[6] Itti, L., Baldi, P., 2005. A Principled Approach to Detecting Surprising Events in Video. IEEE Conf. on Computer Vision and Pattern Recognition, p.631-637.
[7] Lu, Z.K., Liu, W.S., Yang, X.K., Ong, E.P., Yao, S.S., 2005. Modeling visual attention’s modulatory aftereffects on visual sensitivity and quality evaluation. IEEE Trans. Image Process., 14(11):1928-1942.
[8] Martinez-Rach, M., Lopez, O., Pinol, P., Malumbres, M.P., Oliver, J., 2006. A Study of Objective Quality Assessment Metrics for Video Codec Design and Evaluation. Eighth IEEE Int. Symp. on Multimedia, p.517-524.
[9] Seshadrinathan, K., Bovik, A.C., 2007. A Structural Similarity Metric for Video Based on Motion Models. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, p.I-869-I-872.
[10] Sheikh, H.R., Sabir, M.F., Bovik, A.C., 2006. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process., 15(11):3440-3451.
[11] Tang, C.W., 2007. Spatiotemporal visual considerations for video coding. IEEE Trans. Multim., 9(2):231-238.
[12] VQEG, 2000. Final Report from the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment. Video Quality Experts Group. Available from http://www.vqeg.org [Accessed on Aug. 22, 2008].
[13] Wang, Z., Li, Q., 2007. Video quality assessment using a statistical model of human visual speed perception. J. Opt. Soc. Am. A, 24:B61-B69.
[14] Wang, Z., Simoncelli, E.P., 2005. Translation Insensitive Image Similarity in Complex Wavelet Domain. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, p.573-576.
[15] Wang, Z., Sheikh, H.R., Bovik, A.C., 2003. Objective Video Quality Assessment. In: Furht, B., Marques, O. (Eds.), The Handbook of Video Databases: Design and Applications. CRC Press, Florida, USA, p.1041-1078.
[16] Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., 2004a. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process., 13(4):600-612.
[17] Wang, Z., Lu, L., Bovik, A.C., 2004b. Video quality assessment based on structural distortion measurement. Signal Process.: Image Commun., 19(2):121-132.
[18] Zheng, Y.Y., 2008. Research on H.264 Region-of-Interest Coding Based on Visual Perception. PhD Thesis, Zhejiang University, Hangzhou, China (in Chinese).