On-line Access: 2023-12-01

Received: 2022-12-12

Revision Accepted: 2023-06-27


Frontiers of Information Technology & Electronic Engineering (formerly Journal of Zhejiang University SCIENCE C)


FaSRnet: feature and semantics refinement network for human pose estimation

Author(s):  Yuanhong ZHONG, Qianfeng XU, Daidi ZHONG, Xun YANG, Shanshan WANG

Affiliation(s):  The School of Microelectronics and Communication Engineering, Chongqing University

Corresponding email(s):   zhongyh@cqu.edu.cn

Key Words:  Human pose estimation, Multi-frame refinement, Heatmap and offset estimation, Feature alignment, Multi-person

Yuanhong ZHONG, Qianfeng XU, Daidi ZHONG, Xun YANG, Shanshan WANG. FaSRnet: feature and semantics refinement network for human pose estimation[J]. Frontiers of Information Technology & Electronic Engineering. DOI: 10.1631/FITEE.2200639

@article{title="FaSRnet: feature and semantics refinement network for human pose estimation",
author="Yuanhong ZHONG and Qianfeng XU and Daidi ZHONG and Xun YANG and Shanshan WANG",
journal="Frontiers of Information Technology & Electronic Engineering",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200639"
}

%0 Journal Article
%T FaSRnet: feature and semantics refinement network for human pose estimation
%A Yuanhong ZHONG
%A Qianfeng XU
%A Daidi ZHONG
%A Xun YANG
%A Shanshan WANG
%J Frontiers of Information Technology & Electronic Engineering
%@ 2095-9184
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2200639


Multi-frame human pose estimation is challenging because of motion blur, video defocus, and occlusion. Exploiting temporal consistency between consecutive frames is an effective way to address these problems. Most current methods exploit temporal consistency by refining the final heatmaps. Heatmaps carry the semantic information of key points, so refining them improves detection quality to some extent. However, heatmaps are generated from features, and feature-level refinement is rarely considered. In this paper, we propose a human pose estimation framework that performs refinement at both the feature level and the semantic level. At the feature level, we align auxiliary features with the features of the current frame to reduce the loss caused by differing feature distributions, and then use an attention mechanism to fuse the aligned auxiliary features with the current features. At the semantic level, we use the differences between adjacent heatmaps as auxiliary features to refine the current heatmaps. The method is validated on the large-scale benchmark datasets PoseTrack2017 and PoseTrack2018, and the results demonstrate its effectiveness.
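
The two refinement ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `fuse_features` and `heatmap_differences`, the dot-product similarity used as the attention score, and the tensor layouts are all assumptions made for illustration. The alignment step is assumed to have been applied already, and the network that consumes the heatmap differences is not shown.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_features(current, auxiliary):
    """Attention-weighted fusion of current and auxiliary features (hypothetical).

    current:   (C, H, W) features of the current frame.
    auxiliary: (N, C, H, W) features of N neighbouring frames, assumed to be
               already aligned to the current frame.
    Returns a fused (C, H, W) feature map.
    """
    frames = np.concatenate([current[None], auxiliary], axis=0)  # (N+1, C, H, W)
    # Assumed attention score: scaled per-pixel dot product with the current features.
    scores = (frames * current[None]).sum(axis=1) / np.sqrt(current.shape[0])
    weights = softmax(scores, axis=0)                            # (N+1, H, W), sums to 1 over frames
    return (weights[:, None] * frames).sum(axis=0)

def heatmap_differences(prev_hm, cur_hm, next_hm):
    """Stack differences between adjacent (K, H, W) heatmaps into (2K, H, W).

    The result is the kind of auxiliary signal that a refinement module
    could consume; the refinement module itself is not modelled here.
    """
    return np.concatenate([cur_hm - prev_hm, next_hm - cur_hm], axis=0)
```

When every auxiliary frame equals the current frame, the attention weights are uniform and the fused output reduces to the current features, which is a convenient sanity check for a fusion step of this kind.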


Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE