CLC number: O235; N93
On-line Access: 2020-06-12
Received: 2019-10-31
Revision Accepted: 2020-01-13
Crosschecked: 2020-03-31
Yi-fei Pu, Jian Wang. Fractional-order global optimal backpropagation machine trained by an improved fractional-order steepest descent method. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1900593
Fractional-order global optimal backpropagation machine trained by an improved fractional-order steepest descent method

1 College of Computer Science, Sichuan University, Chengdu 610065, China
2 College of Science, China University of Petroleum (East China), Qingdao 266580, China

Abstract: This paper introduces a fractional-order global optimal backpropagation machine trained by an improved fractional-order steepest descent method (FSDM). The backpropagation machine is a fractional-order backpropagation neural network (FBPNN), an advanced fractional-order branch of the large family of backpropagation neural networks (BPNNs), which differs from the great majority of classical first-order BPNNs trained by the traditional first-order steepest descent method. The reverse incremental search of the proposed FBPNN proceeds in the negative direction of the approximate fractional-order partial derivative of its mean square error. First, the theoretical concept of an FBPNN trained by the improved FSDM is described mathematically. Then, a mathematical proof of the fractional-order global optimal convergence of the FBPNN is given in detail, and the construction of its neural network structure and the fractional-order multi-scale global optimization problem are analyzed. Finally, experiments compare the performance of the FBPNN with that of classical first-order BPNNs, covering function approximation, fractional-order multi-scale global optimization, and a comparison of global search and error-fitting capabilities on real data. The principal advantage of the FBPNN over classical first-order BPNNs is its more efficient global optimization capability and its ability to determine the global optimal solution.
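The abstract states that the FBPNN's reverse incremental search follows the negative approximate fractional-order partial derivative of the mean square error. As a rough one-dimensional illustration (not the authors' algorithm), a fractional-order steepest descent step can be sketched with the common first-term Caputo approximation D^α f(x) ≈ f′(x)·|x − c|^(1−α)/Γ(2−α), where the order α, the lower terminal c, and the learning rate below are all illustrative assumptions:

```python
import math

def frac_grad_step(f_prime, x, c=0.0, alpha=0.9, lr=0.1):
    """One fractional-order steepest descent step (illustrative sketch).

    Uses the first-term Caputo approximation
        D^alpha f(x) ~= f'(x) * |x - c|**(1 - alpha) / Gamma(2 - alpha)
    for order 0 < alpha < 1 with lower terminal c (hypothetical choices,
    not the improved FSDM of the paper).
    """
    frac_grad = f_prime(x) * abs(x - c) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    return x - lr * frac_grad

# Minimize f(x) = (x - 3)^2; with this approximation the fractional-order
# stationary point coincides with the classical minimizer x* = 3.
x = 1.0
for _ in range(200):
    x = frac_grad_step(lambda t: 2.0 * (t - 3.0), x)
```

For 0 < α < 1 the factor |x − c|^(1−α)/Γ(2−α) rescales the classical gradient, which is one intuition for the multi-scale search behavior the paper attributes to fractional-order training; the paper's actual FSDM and its convergence proof are considerably more involved.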