CLC number: TP393.03

On-line Access: 2026-03-23

Received: 2025-08-29

Revision Accepted: 2026-02-02

Crosschecked: 2026-03-23

 ORCID:

Shuaikang HOU

https://orcid.org/0009-0000-4973-0563

Qinrang LIU

https://orcid.org/0000-0002-9957-7365

Wenbo ZHANG

https://orcid.org/0009-0000-6542-9797

Ping LV

https://orcid.org/0009-0008-1608-6597

Peijie LI

https://orcid.org/0009-0002-6280-7857

Wei GUO

https://orcid.org/0000-0002-1023-7277

ENGINEERING Information Technology & Electronic Engineering  2026 Vol.27 No.3 P.1-20

http://doi.org/10.1631/ENG.ITEE.2025.0005


FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks


Author(s):  Shuaikang HOU, Qinrang LIU, Wenbo ZHANG, Ping LV, Peijie LI, Wei GUO

Affiliation(s):  1. Information Engineering University, Zhengzhou 450001, China

Corresponding email(s):   qinrangliu@sina.com

Key Words:  Wafer-scale system, Fault-tolerant, Hamiltonian path, Odd–even turn model, Load balancing


Shuaikang HOU, Qinrang LIU, Wenbo ZHANG, Ping LV, Peijie LI, Wei GUO. FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks[J]. Journal of Zhejiang University Science C, 2026, 27(3): 1-20.

@article{title="FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks",
author="Shuaikang HOU, Qinrang LIU, Wenbo ZHANG, Ping LV, Peijie LI, Wei GUO",
journal="Journal of Zhejiang University Science C",
volume="27",
number="3",
pages="1-20",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/ENG.ITEE.2025.0005"
}

%0 Journal Article
%T FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks
%A Shuaikang HOU
%A Qinrang LIU
%A Wenbo ZHANG
%A Ping LV
%A Peijie LI
%A Wei GUO
%J Frontiers of Information Technology & Electronic Engineering
%V 27
%N 3
%P 1-20
%@ 1869-1951
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/ENG.ITEE.2025.0005

TY - JOUR
T1 - FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks
A1 - Shuaikang HOU
A1 - Qinrang LIU
A1 - Wenbo ZHANG
A1 - Ping LV
A1 - Peijie LI
A1 - Wei GUO
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 27
IS - 3
SP - 1
EP - 20
%@ 1869-1951
Y1 - 2026
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/ENG.ITEE.2025.0005


Abstract: 
As application scenarios grow in complexity, wafer-scale systems impose increasingly stringent requirements on the reliability of interconnection networks. Because process-induced manufacturing defects and environmental disturbances are inevitable, node and link faults occur frequently in wafer-scale interconnection networks, making fault tolerance a key factor in overall system reliability. To address chiplet node faults and link faults in these networks, this paper proposes FTHOE, a load-balancing fault-tolerant routing algorithm that requires no virtual channels. The algorithm is based on a Hamiltonian routing strategy and the odd–even turn model. By exploiting local fault vector information at the current node, FTHOE dynamically adjusts the output port selection priority, which shortens detour paths around faulty regions and reduces the probability of packets being trapped in fault neighborhoods. FTHOE also retains the adaptiveness of Hamiltonian-based routing under fault conditions and so preserves a relatively high degree of minimal path diversity, which improves network load balancing and overall communication performance. Simulation results show that, compared with existing fault-tolerant routing algorithms, FTHOE significantly reduces average network latency and improves throughput, and exhibits robust fault tolerance and load balancing under complex fault scenarios.
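
The two building blocks named in the abstract can be illustrated with a small sketch. This is not the FTHOE algorithm itself: it assumes a 2D mesh with a snake-like Hamiltonian labeling, applies the classic odd–even turn rules (east-to-north and east-to-south turns forbidden in even columns, north-to-west and south-to-west turns forbidden in odd columns), and uses a hypothetical `rank_ports` helper with a hypothetical `fault_vector` dictionary to show how local fault information could filter output ports. All function and parameter names are illustrative assumptions.

```python
from typing import Dict, List, Optional


def hamiltonian_label(x: int, y: int, cols: int) -> int:
    """Snake-like Hamiltonian label of node (x, y) in a mesh with `cols` columns.

    Even rows are numbered left to right, odd rows right to left, so that
    consecutive labels always belong to neighboring nodes.
    """
    return y * cols + (x if y % 2 == 0 else cols - 1 - x)


def turn_allowed(x: int, heading: Optional[str], next_dir: str) -> bool:
    """Classic odd-even turn check for a packet located in column x.

    `heading` is the direction the packet was traveling (None at the source);
    `next_dir` is the candidate output direction. 180-degree turns are not
    modeled in this sketch.
    """
    if x % 2 == 0 and heading == "E" and next_dir in ("N", "S"):
        return False  # EN and ES turns are forbidden in even columns
    if x % 2 == 1 and heading in ("N", "S") and next_dir == "W":
        return False  # NW and SW turns are forbidden in odd columns
    return True


def rank_ports(x: int, heading: Optional[str], candidates: List[str],
               fault_vector: Dict[str, bool]) -> List[str]:
    """Hypothetical port filter: keep candidates, in their given priority
    order, whose neighbor is not marked faulty and whose turn is legal."""
    return [d for d in candidates
            if not fault_vector.get(d, False) and turn_allowed(x, heading, d)]


if __name__ == "__main__":
    # Labels of the first two rows of a 4-column mesh.
    print([hamiltonian_label(x, 0, 4) for x in range(4)])
    print([hamiltonian_label(x, 1, 4) for x in range(4)])
    # Eastbound packet in an odd column whose east neighbor is faulty.
    print(rank_ports(1, "E", ["N", "E"], {"E": True}))
```

In this sketch the fault vector only removes ports; a fault-aware router such as the one described above would additionally reorder the surviving ports, which is left out here because the abstract does not specify the priority rule.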

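FTHOE: a Hamiltonian-driven fault-tolerant routing algorithm for wafer-scale interconnection networks

Shuaikang HOU1, Qinrang LIU2, Wenbo ZHANG1, Ping LV1, Peijie LI1, Wei GUO1
1Information Engineering University, Zhengzhou 450001, China
2Institute of Big Data, Fudan University, Shanghai 200433, China
Abstract: As application scenarios become increasingly complex, wafer-scale systems place ever stricter demands on the reliability of interconnection networks. Under unavoidable manufacturing process defects and environmental interference, node and link faults occur frequently in wafer-scale interconnection networks, which makes fault tolerance a key factor in improving overall system reliability. Targeting chiplet node faults and link faults in wafer-scale interconnection networks, this paper proposes a load-balancing fault-tolerant routing algorithm without virtual channels, named FTHOE. The algorithm is based on a Hamiltonian routing strategy and the odd–even turn model. Using the local fault vector information of the current node, it dynamically adjusts the output port selection priority, which shortens detour paths when bypassing faulty regions and effectively lowers the probability that packets become trapped in fault neighborhoods. In addition, FTHOE retains the adaptiveness of Hamiltonian routing under fault conditions and maintains high shortest-path diversity, which strengthens network load balancing and overall communication performance. Simulation results show that, compared with existing fault-tolerant routing algorithms, FTHOE significantly reduces average network latency and increases throughput, and demonstrates robust fault tolerance and load balancing in complex fault scenarios.

Key words: Wafer-scale system; Fault tolerance; Hamiltonian path; Odd–even turn model; Load balancing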

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2026 Journal of Zhejiang University-SCIENCE