CLC number: TP303

On-line Access: 2022-04-22

Received: 2018-07-11

Revision Accepted: 2018-09-07

Crosschecked: 2018-10-10

 ORCID:

Xiang-hui Xie

http://orcid.org/0000-0002-2661-0179

Frontiers of Information Technology & Electronic Engineering 

Accepted manuscript available online (unedited version)


Exploring high-performance processor architecture beyond the exascale


Author(s):  Xiang-hui Xie, Xun Jia

Affiliation(s):  State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China

Corresponding email(s):  xie.xianghui@meac-skl.cn, jia.xun@meac-skl.cn

Key Words:  High-performance computing, Beyond the exascale, Processor architecture, Application-customized hardware, Distributed computational resources


Xiang-hui Xie, Xun Jia. Exploring high-performance processor architecture beyond the exascale[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.1800424

@article{title="Exploring high-performance processor architecture beyond the exascale",
author="Xiang-hui Xie, Xun Jia",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.1800424"
}

%0 Journal Article
%T Exploring high-performance processor architecture beyond the exascale
%A Xiang-hui Xie
%A Xun Jia
%J Frontiers of Information Technology & Electronic Engineering
%P 1224-1229
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
doi="https://doi.org/10.1631/FITEE.1800424"

TY - JOUR
T1 - Exploring high-performance processor architecture beyond the exascale
A1 - Xiang-hui Xie
A1 - Xun Jia
J0 - Frontiers of Information Technology & Electronic Engineering
SP - 1224
EP - 1229
%@ 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
ER -
doi="https://doi.org/10.1631/FITEE.1800424"


Abstract: 
The ever-increasing demand for performance in scientific computing and engineering applications will push high-performance computing beyond the exascale. As an integral part of a supercomputing system, the high-performance processor and its architecture design are crucial to improving system performance. This paper introduces three architecture design goals for high-performance processors beyond the exascale: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. It then proposes Massa, a high-performance many-core processor architecture with scalar processing and application-specific acceleration, which aims to achieve these three goals by employing distributed computational resources and application-customized hardware. Finally, some future research directions for the Massa architecture are discussed.

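Exploring high-performance processor architecture in the post-exascale era

Abstract: The growing demand for high performance in scientific computing and engineering applications will drive high-performance computing into the post-exascale era. As the core component of a supercomputing system, the high-performance processor and its architecture design are crucial to improving system performance. First, three design goals for high-performance processor architecture in the post-exascale era are introduced: effective performance scaling, efficient resource utilization, and adaptation to diverse applications. Second, the Massa processor architecture is proposed, in which a scalar-processing many-core master chip is connected to application-acceleration slave chips; by combining distributed computational resources with application-customized hardware, it meets the design goals for high-performance processor architecture in the post-exascale era. Finally, several issues that future research on the Massa architecture needs to address are discussed.

Key words: High-performance computing; Post-exascale; Processor architecture; Application-customized hardware; Distributed computational resources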
