CLC number: TP338.6

On-line Access: 2016-11-07

Received: 2016-06-15

Revision Accepted: 2016-10-05

Crosschecked: 2016-10-25

ORCID: Wei Hu, http://orcid.org/0000-0002-8839-7748

Frontiers of Information Technology & Electronic Engineering  2016 Vol.17 No.11 P.1154-1175

http://doi.org/10.1631/FITEE.1601336


Storage wall for exascale supercomputing


Author(s):  Wei Hu, Guang-ming Liu, Qiong Li, Yan-huang Jiang, Gui-lin Cai

Affiliation(s):  College of Computer, National University of Defense Technology, Changsha 410073, China

Corresponding email(s):   huwei@nscc-tj.gov.cn, liugm@nscc-tj.gov.cn, qiong_joan_li@aliyun.com, yhjiang@nudt.edu.cn, cc_cai@163.com

Key Words:  Storage-bounded speedup, Storage wall, High performance computing, Exascale computing


Wei Hu, Guang-ming Liu, Qiong Li, Yan-huang Jiang, Gui-lin Cai. Storage wall for exascale supercomputing[J]. Frontiers of Information Technology & Electronic Engineering, 2016, 17(11): 1154-1175.

@article{title="Storage wall for exascale supercomputing",
author="Wei Hu, Guang-ming Liu, Qiong Li, Yan-huang Jiang, Gui-lin Cai",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="17",
number="11",
pages="1154-1175",
year="2016",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601336"
}

%0 Journal Article
%T Storage wall for exascale supercomputing
%A Wei Hu
%A Guang-ming Liu
%A Qiong Li
%A Yan-huang Jiang
%A Gui-lin Cai
%J Frontiers of Information Technology & Electronic Engineering
%V 17
%N 11
%P 1154-1175
%@ 2095-9184
%D 2016
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1601336

TY - JOUR
T1 - Storage wall for exascale supercomputing
A1 - Wei Hu
A1 - Guang-ming Liu
A1 - Qiong Li
A1 - Yan-huang Jiang
A1 - Gui-lin Cai
JO - Frontiers of Information Technology & Electronic Engineering
VL - 17
IS - 11
SP - 1154
EP - 1175
SN - 2095-9184
Y1 - 2016
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1601336
ER -


Abstract:
The mismatch between compute performance and I/O performance has long been a stumbling block as supercomputers evolve from petaflops to exaflops. Many parallel applications are now I/O intensive, and their overall running times are typically limited by I/O performance. To quantify the I/O performance bottleneck and highlight the significance of achieving scalable performance in peta/exascale supercomputing, we introduce, for the first time, a formal definition of the ‘storage wall’ from the perspective of parallel application scalability. We quantify the effects of the storage bottleneck by providing a storage-bounded speedup, defining the storage wall quantitatively, presenting existence theorems for the storage wall, and classifying system architectures according to I/O performance variation. We analyze and extrapolate the existence of the storage wall through experiments on Tianhe-1A and case studies on Jaguar. These results provide insights into how to alleviate the storage wall bottleneck in system design and achieve hardware/software optimizations in peta/exascale supercomputing.

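Research on the storage wall problem for exascale high-performance computing

Summary: The mismatch between I/O performance and compute performance is one of the main obstacles as supercomputers evolve from petascale to exascale. I/O-intensive applications are increasing rapidly, and their running efficiency is constrained by I/O performance. To further quantify the I/O performance bottleneck and analyze its significant impact on the scalability of large-scale parallel applications, this paper defines and analyzes a storage-bounded speedup and, on that basis, proposes for the first time a definition of the "storage wall" from the perspective of parallel application scalability. It quantifies and discusses the existence of the storage wall and, based on storage wall characteristics, proposes a new classification method and a scalability analysis method for existing high-performance computing system architectures. Through experiments on the Tianhe-1A supercomputer and case studies on the Jaguar supercomputer, the paper verifies the existence of the storage wall and analyzes how it varies. These experimental and analytical results offer important guidance for alleviating and eliminating the storage wall bottleneck in future high-performance computers and for achieving balanced hardware/software design.

Keywords: Storage-bounded speedup; Storage wall; High-performance computing; Exascale computing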


