On-line Access: 2021-03-29

Received: 2020-08-17

Revision Accepted: 2021-02-14

Journal of Zhejiang University SCIENCE C 1998 Vol.-1 No.-1 P.

http://doi.org/10.1631/FITEE.2000417


One-against-all-based Hellinger distance decision tree for multi-class imbalanced learning


Author(s):  Minggang DONG, Ming LIU, Chao JING

Affiliation(s):  School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China

Corresponding email(s):   jingchao@glut.edu.cn

Key Words:  Decision trees, Multi-class imbalanced learning, Node splitting criterion, Hellinger distance, One-against-all scheme


Minggang DONG, Ming LIU, Chao JING. One-against-all-based Hellinger distance decision tree for multi-class imbalanced learning[J]. Frontiers of Information Technology & Electronic Engineering, 1998, -1(-1): .

@article{DONG_FITEE2000417,
title="One-against-all-based Hellinger distance decision tree for multi-class imbalanced learning",
author="Minggang DONG and Ming LIU and Chao JING",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="-1",
number="-1",
pages="",
year="1998",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2000417"
}

%0 Journal Article
%T One-against-all-based Hellinger distance decision tree for multi-class imbalanced learning
%A Minggang DONG
%A Ming LIU
%A Chao JING
%J Journal of Zhejiang University SCIENCE C
%V -1
%N -1
%P
%@ 2095-9184
%D 1998
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2000417

TY  - JOUR
T1  - One-against-all-based Hellinger distance decision tree for multi-class imbalanced learning
A1  - Minggang DONG
A1  - Ming LIU
A1  - Chao JING
JO  - Journal of Zhejiang University Science C
VL  - -1
IS  - -1
SP  - 
EP  - 
SN  - 2095-9184
Y1  - 1998
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2000417
ER  - 


Abstract: 
Traditional machine learning methods are sensitive to skewed class distributions and do not account for the characteristics of multiclass imbalance problems, so skewed multiclass data pose a major challenge for learning algorithms. To tackle this issue, we propose a new splitting criterion for decision trees, the one-against-all-based Hellinger distance (OAHD). OAHD has two crucial elements. First, the one-against-all scheme is integrated into the computation of the Hellinger distance, thereby extending the Hellinger distance decision tree to the multiclass imbalance problem. Second, a modified Gini index is designed that takes into account the class distribution and the number of distinct classes. We also give theoretical proofs of the properties of OAHD, including skew insensitivity and the ability to seek purer nodes in the decision tree. Finally, we collect 20 public real-world imbalanced data sets from the Knowledge Extraction based on Evolutionary Learning (KEEL) and University of California, Irvine (UCI) repositories. Experimental results show that OAHD performs significantly better than five other well-known decision trees in terms of precision, F-measure, and multiclass area under the receiver operating characteristic curve. The Friedman and Nemenyi tests are used to verify the statistical significance of this performance advantage.
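
To make the first element concrete, the following is a minimal sketch of how a one-against-all Hellinger score for a candidate split might be computed. It is not the authors' implementation: the function names `hellinger_one_vs_rest` and `oahd_score` are hypothetical, the per-class distance follows the standard two-class Hellinger distance over branch proportions, and summing the per-class distances is an assumed aggregation, since the abstract does not state how they are combined. The modified Gini index is not specified in the abstract and is left out.

```python
from collections import Counter
from math import sqrt


def hellinger_one_vs_rest(labels, branches, positive):
    """Hellinger distance between the branch distribution of `positive`
    and the branch distribution of all remaining classes pooled together."""
    pos_total = sum(1 for y in labels if y == positive)
    neg_total = len(labels) - pos_total
    if pos_total == 0 or neg_total == 0:
        return 0.0  # only one side present: no distance to measure
    pos_counts = Counter(b for y, b in zip(labels, branches) if y == positive)
    neg_counts = Counter(b for y, b in zip(labels, branches) if y != positive)
    total = 0.0
    for b in set(branches):
        # Compare within-class proportions, not raw counts.
        diff = sqrt(pos_counts[b] / pos_total) - sqrt(neg_counts[b] / neg_total)
        total += diff * diff
    return sqrt(total)


def oahd_score(labels, branches):
    """Assumed aggregation: sum the one-against-all distances over classes."""
    return sum(hellinger_one_vs_rest(labels, branches, c)
               for c in sorted(set(labels)))


if __name__ == "__main__":
    labels = ["a", "a", "b", "b", "c", "c"]
    branches = [0, 0, 1, 1, 1, 0]  # branch index each sample falls into
    print(oahd_score(labels, branches))
```

Under these assumptions a tree builder would evaluate `oahd_score` for every candidate split and keep the one with the largest value. Because each distance is built from within-class proportions, replicating the samples of one class in a two-class problem leaves the score unchanged, which is the intuition behind the skew insensitivity mentioned above.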

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - Journal of Zhejiang University-SCIENCE