JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

RFPose-OT: RF-based 3D human pose estimation via optimal transport theory

Author(s): Cong YU, Dongheng ZHANG, Zhi WU, Zhi LU, Chunyang XIE, Yang HU, Yan CHEN
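Affiliation(s): School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230026, China; School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China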
Corresponding email(s): congyu@std.uestc.edu.cn, eecyan@ustc.edu.cn
Key Words: Radio frequency sensing; Human pose estimation; Optimal transport; Deep learning


Cong YU, Dongheng ZHANG, Zhi WU, Zhi LU, Chunyang XIE, Yang HU, Yan CHEN. RFPose-OT: RF-based 3D human pose estimation via optimal transport theory[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2200550

@article{title="RFPose-OT: RF-based 3D human pose estimation via optimal transport theory",
author="Cong YU, Dongheng ZHANG, Zhi WU, Zhi LU, Chunyang XIE, Yang HU, Yan CHEN",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.2200550"
}

%0 Journal Article
%T RFPose-OT: RF-based 3D human pose estimation via optimal transport theory
%A Cong YU
%A Dongheng ZHANG
%A Zhi WU
%A Zhi LU
%A Chunyang XIE
%A Yang HU
%A Yan CHEN
%J Frontiers of Information Technology & Electronic Engineering
%P 1445-1457
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R https://doi.org/10.1631/FITEE.2200550

TY - JOUR
T1 - RFPose-OT: RF-based 3D human pose estimation via optimal transport theory
A1 - Cong YU
A1 - Dongheng ZHANG
A1 - Zhi WU
A1 - Zhi LU
A1 - Chunyang XIE
A1 - Yang HU
A1 - Yan CHEN
JO - Frontiers of Information Technology & Electronic Engineering
SP - 1445
EP - 1457
SN - 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
DO - https://doi.org/10.1631/FITEE.2200550
ER -


Abstract: This paper introduces RFPose-OT, a novel framework for three-dimensional (3D) human pose estimation from radio frequency (RF) signals. Unlike existing methods that predict human poses directly from RF signals at the signal level, we account for the structural difference between RF signals and human poses: we transform the RF signals to the pose domain at the feature level based on optimal transport (OT) theory, and then generate human poses from the transformed features. To evaluate RFPose-OT, we build a radio system and a multi-view camera system to acquire the RF signal data and the ground-truth human poses. Experimental results in a basic indoor environment, an occluded indoor environment, and an outdoor environment demonstrate that RFPose-OT predicts 3D human poses with higher precision than state-of-the-art methods.

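RFPose-OT: RF-based 3D human pose estimation via optimal transport theory

Cong YU¹, Dongheng ZHANG², Zhi WU², Zhi LU², Chunyang XIE¹, Yang HU³, Yan CHEN²
¹School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
²School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230026, China
³School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
Abstract: This paper proposes a novel model framework, RFPose-OT, for estimating 3D human poses from RF signals. Unlike existing methods that predict human poses directly from RF signals, this paper considers the structural feature differences between RF signals and human poses, and proposes, based on optimal transport theory, to transform the RF signals into the human pose domain in feature space and then predict human poses from the transformed features. To evaluate the RFPose-OT model, a radio system and a multi-view camera system are built to acquire the radio signal data and the ground-truth human pose labels. Experimental results in a basic indoor environment, an occluded indoor environment, and an outdoor environment show that the RFPose-OT model estimates 3D human poses accurately and outperforms existing methods.

Key words: Radio frequency sensing; Human pose estimation; Optimal transport; Deep learning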


