On-line Access: 2024-11-05
Received: 2024-05-12
Revision Accepted: 2024-09-18
Deng LI, Peng LI, Aming WU, Yahong HAN. Prototype-guided cross-task knowledge distillation[J]. Frontiers of Information Technology & Electronic Engineering, 1998, -1(-1): .
@article{title="Prototype-guided cross-task knowledge distillation",
author="Deng LI, Peng LI, Aming WU, Yahong HAN",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="-1",
number="-1",
pages="",
year="1998",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2400383"
}
%0 Journal Article
%T Prototype-guided cross-task knowledge distillation
%A Deng LI
%A Peng LI
%A Aming WU
%A Yahong HAN
%J Journal of Zhejiang University SCIENCE C
%V -1
%N -1
%P
%@ 2095-9184
%D 1998
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2400383
TY - JOUR
T1 - Prototype-guided cross-task knowledge distillation
A1 - Deng LI
A1 - Peng LI
A1 - Aming WU
A1 - Yahong HAN
J0 - Journal of Zhejiang University Science C
VL - -1
IS - -1
SP -
EP -
SN - 2095-9184
Y1 - 1998
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2400383
ER -
Abstract: Large-scale pretrained models have recently shown their benefits across a variety of tasks. However, their enormous computational complexity and storage demands make them difficult to deploy in real-world scenarios. Most existing knowledge distillation methods require the teacher model and the student model to share the same label space, which restricts their application in practice. To relax this label-space constraint, we propose a prototype-guided cross-task knowledge distillation (ProC-KD) method that transfers the intrinsic local-level object knowledge of the teacher network to various task scenarios. First, to better learn generalized knowledge in cross-task scenarios, we present a prototype learning module that learns invariant intrinsic local representations of objects from the teacher network. Second, for diverse downstream tasks, we propose a task-adaptive feature augmentation module that enhances the student network's features with the learned generalized prototype representations and guides the student network's learning to improve its generalization ability. Experimental results on various visual tasks demonstrate the effectiveness of our approach in cross-task knowledge distillation scenarios.