
CLC number: TP393.02
On-line Access: 2026-03-23
Received: 2025-12-19
Revision Accepted: 2026-01-23
Crosschecked: 2026-03-23
https://orcid.org/0000-0001-7456-8377
Rui ZHENG, Jianliang SHEN, Fan ZHANG, Ping LV, Peijie LI, Yu SHAO, Zhengbin ZHU. De-blocking adaptive feedback control design for shared-buffer CIOQ switching architecture[J]. Frontiers of Information Technology & Electronic Engineering, 2026, 27(3): 1-18.
@article{zheng2026deblocking,
title="De-blocking adaptive feedback control design for shared-buffer CIOQ switching architecture",
author="Rui ZHENG and Jianliang SHEN and Fan ZHANG and Ping LV and Peijie LI and Yu SHAO and Zhengbin ZHU",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="27",
number="3",
pages="1-18",
year="2026",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/ENG.ITEE.2025.0180"
}
%0 Journal Article
%T De-blocking adaptive feedback control design for shared-buffer CIOQ switching architecture
%A Rui ZHENG
%A Jianliang SHEN
%A Fan ZHANG
%A Ping LV
%A Peijie LI
%A Yu SHAO
%A Zhengbin ZHU
%J Frontiers of Information Technology & Electronic Engineering
%V 27
%N 3
%P 1-18
%@ 1869-1951
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/ENG.ITEE.2025.0180
TY  - JOUR
T1  - De-blocking adaptive feedback control design for shared-buffer CIOQ switching architecture
A1  - Rui ZHENG
A1  - Jianliang SHEN
A1  - Fan ZHANG
A1  - Ping LV
A1  - Peijie LI
A1  - Yu SHAO
A1  - Zhengbin ZHU
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 27
IS  - 3
SP  - 1
EP  - 18
SN  - 1869-1951
Y1  - 2026
PB  - Zhejiang University Press & Springer
DO  - 10.1631/ENG.ITEE.2025.0180
ER  - 
Abstract: The shared-buffer-based combined input and output queued (CIOQ) switching architecture suffers from head-of-line (HOL) blocking at the virtual output queue (VOQ) level, packet loss, and congestion spreading caused by buffer overflow. To address these issues and to improve performance and stability, we propose a de-blocking adaptive feedback control (AFC) design. A credit timeout detection mechanism (CTDM) enables the CIOQ switch to reach a theoretically 100% non-blocking state, eliminating the impact of HOL blocking. The proposed VOQ dynamic regulation algorithm (VDRA) and threshold dynamic adaptive algorithm (TDAA) work together to reduce the risk of congestion spreading caused by buffer overflow, which improves overall system performance. Theoretical analysis and experimental results both show that, under typical traffic conditions, the proposed design achieves a maximum throughput of 1499.66 Gb/s and a minimum latency of 83 ns. The effective throughput ratio reaches 96.94%, with a data link layer packet (DLLP) loss ratio of 0.61% and a packet loss rate as low as 0.6%. Compared with traditional CIOQ and input queued (IQ) switch architectures, the proposed design improves throughput by 15.12% and 20.55% and reduces forwarding latency by 26.9% and 54.7%, respectively. It is also more stable and can meet the demands of data exchange in complex scenarios.