CLC number: TP391

On-line Access: 2018-07-02

Received: 2017-01-04

Revision Accepted: 2017-08-21

Crosschecked: 2018-05-10

ORCID: Ke Guo, http://orcid.org/0000-0002-9278-4046

Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.5 P.639-650

http://doi.org/10.1631/FITEE.1700007


A new constrained maximum margin approach to discriminative learning of Bayesian classifiers


Author(s):  Ke Guo, Xia-bi Liu, Lun-hao Guo, Zong-jie Li, Zeng-min Geng

Affiliation(s):  Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing 100081, China

Corresponding email(s):   guoke@bit.edu.cn, liuxiabi@bit.edu.cn, guolunhao@bit.edu.cn, leezongjie@163.com, jsjgzm@bift.edu.cn

Key Words:  Discriminative learning, Statistical modeling, Bayesian pattern classifiers, Gaussian mixture models, UCI datasets


Ke Guo, Xia-bi Liu, Lun-hao Guo, Zong-jie Li, Zeng-min Geng. A new constrained maximum margin approach to discriminative learning of Bayesian classifiers[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(5): 639-650.

@article{guo2018cmm,
title="A new constrained maximum margin approach to discriminative learning of Bayesian classifiers",
author="Ke Guo and Xia-bi Liu and Lun-hao Guo and Zong-jie Li and Zeng-min Geng",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="19",
number="5",
pages="639-650",
year="2018",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1700007"
}

%0 Journal Article
%T A new constrained maximum margin approach to discriminative learning of Bayesian classifiers
%A Ke Guo
%A Xia-bi Liu
%A Lun-hao Guo
%A Zong-jie Li
%A Zeng-min Geng
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 5
%P 639-650
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1700007

TY  - JOUR
T1  - A new constrained maximum margin approach to discriminative learning of Bayesian classifiers
A1  - Ke Guo
A1  - Xia-bi Liu
A1  - Lun-hao Guo
A1  - Zong-jie Li
A1  - Zeng-min Geng
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 19
IS  - 5
SP  - 639
EP  - 650
SN  - 2095-9184
Y1  - 2018
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1700007
ER  -


Abstract: 
We propose a novel discriminative learning approach for Bayesian pattern classification, called ‘constrained maximum margin (CMM)’. We define the margin between two classes as the difference between the minimum decision value for positive samples and the maximum decision value for negative samples. The learning problem is to maximize the margin under the constraint that each training pattern is classified correctly. This nonlinear programming problem is solved using the sequential unconstrained minimization technique. We applied the proposed CMM approach to learn Bayesian classifiers based on Gaussian mixture models, and conducted experiments on 10 UCI datasets. The performance of our approach was compared with that of the expectation-maximization algorithm, the support vector machine, and other state-of-the-art approaches. The experimental results demonstrated the effectiveness of our approach.
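The margin definition and the constrained problem in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the decision threshold of 0, the quadratic exterior penalty, and the fixed penalty weight are all assumptions made for illustration. The abstract states only that the margin is maximized subject to correct classification and that the problem is solved with the sequential unconstrained minimization technique, which in general solves a sequence of such unconstrained problems with a changing penalty weight.

```python
import numpy as np


def cmm_margin(decision_values, labels):
    """Margin as described in the abstract: the minimum decision value over
    positive samples minus the maximum decision value over negative samples.
    Labels equal to 1 are treated as positive, anything else as negative."""
    f = np.asarray(decision_values, dtype=float)
    y = np.asarray(labels)
    return f[y == 1].min() - f[y != 1].max()


def penalized_objective(decision_values, labels, threshold=0.0, penalty=10.0):
    """Hypothetical unconstrained surrogate for one penalty step.

    Assumed constraints (not taken from the paper): f(x) >= threshold for
    positives and f(x) <= threshold for negatives. Violations are penalized
    quadratically and added to the negated margin, so minimizing this value
    trades margin size against misclassification."""
    f = np.asarray(decision_values, dtype=float)
    sign = np.where(np.asarray(labels) == 1, 1.0, -1.0)
    violation = np.maximum(0.0, -sign * (f - threshold))
    return -cmm_margin(f, labels) + penalty * np.sum(violation ** 2)


if __name__ == "__main__":
    values = [0.9, 0.7, 0.2, -0.4]   # toy decision values, not real data
    labels = [1, 1, 0, 0]
    print(cmm_margin(values, labels))
    print(penalized_objective(values, labels))
```

In the toy data the third sample is a negative with a decision value above the assumed threshold, so it contributes a penalty term. In a full learner the decision values would come from the Gaussian mixture model parameters, and the objective would be minimized over those parameters for an increasing sequence of penalty weights.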

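A discriminative learning method for Bayesian classifiers based on constrained maximum margin

Abstract: We propose a new discriminative learning method for Bayesian pattern classification, called the ‘constrained maximum margin (CMM)’ method. The margin between classes is defined as the difference between the minimum decision value of the positive samples and the maximum decision value of the negative samples. Based on this margin and the constraint of correct classification, the learning problem is cast as one of maximizing the class margin. This nonlinear programming problem is solved using the sequential unconstrained minimization technique. The CMM method is applied to learn Bayesian classifiers based on Gaussian mixture models, and experiments are conducted on 10 UCI datasets. The results show that classifiers learned with CMM perform significantly better than those obtained with expectation-maximization (EM), a representative generative learning method, and than the support vector machine (SVM), a representative discriminative learning method, and that they improve on the previous best results on several datasets. The classification experiments and classifier comparison experiments show that the CMM method is effective and has application potential.

Key words: Discriminative learning; Statistical modeling; Bayesian classifiers; Gaussian mixture models; UCI datasets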



