CLC number: TP391

On-line Access: 2013-11-06

Received: 2013-04-25

Revision Accepted: 2013-09-16

Crosschecked: 2013-10-15

Journal of Zhejiang University SCIENCE C 2013 Vol.14 No.11 P.845-858

http://doi.org/10.1631/jzus.C1300109


A mixture of HMM, GA, and Elman network for load prediction in cloud-oriented data centers


Author(s):  Da-yu Xu, Shan-lin Yang, Ren-ping Liu

Affiliation(s):  MOE Key Laboratory of Process Optimization and Intelligent Decision-Making, Hefei University of Technology, Hefei 230009, China

Corresponding email(s):   xdyhfut@163.com

Key Words:  Cloud computing, Load prediction, Hidden Markov model, Genetic algorithm, Elman network


Da-yu Xu, Shan-lin Yang, Ren-ping Liu. A mixture of HMM, GA, and Elman network for load prediction in cloud-oriented data centers[J]. Journal of Zhejiang University Science C, 2013, 14(11): 845-858.

@article{Xu2013,
title="A mixture of HMM, GA, and Elman network for load prediction in cloud-oriented data centers",
author="Da-yu Xu and Shan-lin Yang and Ren-ping Liu",
journal="Journal of Zhejiang University Science C",
volume="14",
number="11",
pages="845-858",
year="2013",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1300109"
}

%0 Journal Article
%T A mixture of HMM, GA, and Elman network for load prediction in cloud-oriented data centers
%A Da-yu Xu
%A Shan-lin Yang
%A Ren-ping Liu
%J Journal of Zhejiang University SCIENCE C
%V 14
%N 11
%P 845-858
%@ 1869-1951
%D 2013
%I Zhejiang University Press & Springer
%R 10.1631/jzus.C1300109

TY  - JOUR
T1  - A mixture of HMM, GA, and Elman network for load prediction in cloud-oriented data centers
A1  - Da-yu Xu
A1  - Shan-lin Yang
A1  - Ren-ping Liu
JO  - Journal of Zhejiang University Science C
VL  - 14
IS  - 11
SP  - 845
EP  - 858
SN  - 1869-1951
Y1  - 2013
PB  - Zhejiang University Press & Springer
DO  - 10.1631/jzus.C1300109
ER  -


Abstract: 
The rapid growth in demand for computational power from scientific, business, and Web applications has led to the emergence of cloud-oriented data centers. These centers use pay-as-you-go execution environments that scale transparently to the user. Load prediction is an important means of achieving cost-optimal resource allocation and energy savings in a cloud computing environment. Traditional linear or nonlinear prediction models that forecast future load directly from historical information appear less effective, and classifying the load before prediction is necessary to improve prediction accuracy. In this paper, a novel approach is proposed to forecast the future load of cloud-oriented data centers. First, a hidden Markov model (HMM) based data clustering method is adopted to classify the cloud load. The Bayesian information criterion (BIC) and Akaike information criterion (AIC) are employed to automatically determine the optimal HMM size and the number of clusters. The trained HMMs are then used to identify the most appropriate cluster, namely the one with the maximum likelihood for the current load. With the data from this cluster, a genetic algorithm (GA) optimized Elman network is used to forecast future load. Experimental results show that our algorithm outperforms other approaches reported in previous works.
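
The model-selection and cluster-identification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the load has been quantized into discrete symbols, it uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, and it counts free parameters for a discrete-emission HMM. The abstract does not say how the two criteria are combined or which emission model is used, so those choices, along with every function name and all toy parameters below, are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def n_hmm_params(n_states, n_symbols):
    """Free parameters of a discrete HMM (assumed emission model):
    initial distribution + transition rows + emission rows."""
    return (n_states - 1) + n_states * (n_states - 1) + n_states * (n_symbols - 1)


def select_n_states(candidates, n_symbols, n_obs, criterion="bic"):
    """Pick the HMM size whose fitted log-likelihood minimizes AIC or BIC.
    `candidates` maps n_states -> log-likelihood of an already trained HMM."""
    def score(n_states):
        k = n_hmm_params(n_states, n_symbols)
        if criterion == "aic":
            return aic(candidates[n_states], k)
        return bic(candidates[n_states], k, n_obs)
    return min(candidates, key=score)


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (scaled forward pass)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik


def most_likely_cluster(obs, cluster_hmms):
    """Return the cluster whose HMM gives `obs` the highest log-likelihood."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in cluster_hmms.items()
    }
    return max(scores, key=scores.get), scores


# Toy cluster models (hypothetical): symbol 0 = low load, symbol 1 = high load.
# Each entry is (start, transition matrix, emission matrix).
CLUSTER_HMMS = {
    "low": ([0.8, 0.2], [[0.9, 0.1], [0.5, 0.5]], [[0.9, 0.1], [0.6, 0.4]]),
    "high": ([0.2, 0.8], [[0.5, 0.5], [0.1, 0.9]], [[0.4, 0.6], [0.1, 0.9]]),
}

if __name__ == "__main__":
    best, scores = most_likely_cluster([1, 1, 1, 1, 1], CLUSTER_HMMS)
    print(best, scores)
    print(select_n_states({2: -120.0, 3: -100.0, 4: -99.0}, n_symbols=3, n_obs=200))
```

In the pipeline the abstract outlines, the data belonging to the selected cluster would then be passed to the GA-optimized Elman network for forecasting; that stage is outside this sketch.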

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - Journal of Zhejiang University-SCIENCE