CLC number: TP302.1
On-line Access: 2020-10-14
Received: 2020-02-26
Revision Accepted: 2020-05-13
Crosschecked: 2020-09-29
Jing Wang, Wei-wei Liang, Yue-hua Niu, Lan Gao, Wei-gong Zhang. Multi-dimensional optimization for approximate near-threshold computing[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(10): 1426-1441.
@article{Wang2020,
title="Multi-dimensional optimization for approximate near-threshold computing",
author="Jing Wang and Wei-wei Liang and Yue-hua Niu and Lan Gao and Wei-gong Zhang",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="21",
number="10",
pages="1426--1441",
year="2020",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2000089"
}
%0 Journal Article
%T Multi-dimensional optimization for approximate near-threshold computing
%A Jing Wang
%A Wei-wei Liang
%A Yue-hua Niu
%A Lan Gao
%A Wei-gong Zhang
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 10
%P 1426-1441
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2000089
TY  - JOUR
T1  - Multi-dimensional optimization for approximate near-threshold computing
A1  - Jing Wang
A1  - Wei-wei Liang
A1  - Yue-hua Niu
A1  - Lan Gao
A1  - Wei-gong Zhang
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 21
IS  - 10
SP  - 1426
EP  - 1441
SN  - 2095-9184
Y1  - 2020
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2000089
ER  - 
Abstract: The end of Dennard scaling has created both power wall and utilization wall challenges for computer systems. Because transistors operating in the near-threshold region offer flexible trade-offs between power and performance, near-threshold computing (NTC) is regarded as an alternative solution to the scaling challenge. Reducing the supply voltage, however, introduces significant reliability challenges, and maintaining an error-free system incurs high costs in both performance and energy consumption. The main purpose of research on computer architecture has therefore shifted from performance improvement to complex multi-objective optimization. In this paper, we propose a three-dimensional optimization approach that identifies the system configuration that best balances performance, energy, and reliability. We use a dynamic programming algorithm to determine the proper voltage and approximation level based on three predictors: system performance, energy consumption, and output quality. The output quality predictor uses a hardware/software co-design fault injection platform to evaluate the impact of errors on output quality under NTC. Evaluation results demonstrate that our approach can achieve a 28% improvement in output quality with a 10% drop in overall energy efficiency; this translates to an average improvement of approximately 20% across accuracy, power, and performance.
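The abstract does not give the dynamic programming formulation, so the following Python sketch only illustrates the general idea under stated assumptions; it is not the authors' algorithm. It assumes a program split into phases, that each phase has a few candidate (voltage, approximation level) settings, and that the three predictors have already produced a time, an energy, and an integer quality-loss estimate for each setting. The names `Config` and `choose_configs`, the additive cost `energy + time_weight * time`, and all numbers are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Config(NamedTuple):
    voltage: float      # supply voltage in volts (illustrative values only)
    approx_level: int   # 0 = exact; higher = more aggressive approximation
    time: float         # predicted execution time (arbitrary units)
    energy: float       # predicted energy (arbitrary units)
    quality_loss: int   # predicted output-quality loss, in integer units


def choose_configs(
    phases: List[List[Config]], loss_budget: int, time_weight: float = 1.0
) -> Optional[Tuple[float, List[Config]]]:
    """Pick one Config per phase, minimizing sum(energy + time_weight * time)
    subject to sum(quality_loss) <= loss_budget.

    Returns (total_cost, chosen_configs), or None if no choice fits the budget.
    """
    # best[b] = cheapest (cost, choices) for the phases seen so far
    # with total quality loss at most b. With no phases, the cost is 0.
    best: List[Optional[Tuple[float, List[Config]]]] = [
        (0.0, []) for _ in range(loss_budget + 1)
    ]
    for options in phases:
        nxt: List[Optional[Tuple[float, List[Config]]]] = [None] * (loss_budget + 1)
        for b in range(loss_budget + 1):
            for cfg in options:
                if cfg.quality_loss > b:
                    continue  # this setting alone exceeds the remaining budget
                prev = best[b - cfg.quality_loss]
                if prev is None:
                    continue  # earlier phases cannot fit in what is left
                cost = prev[0] + cfg.energy + time_weight * cfg.time
                if nxt[b] is None or cost < nxt[b][0]:
                    nxt[b] = (cost, prev[1] + [cfg])
        best = nxt
    return best[loss_budget]


# Hypothetical two-phase program: a nominal-voltage exact setting and a
# near-threshold approximate setting per phase.
phases = [
    [Config(0.9, 0, 1.0, 10.0, 0), Config(0.5, 1, 2.0, 4.0, 2)],
    [Config(0.9, 0, 1.0, 8.0, 0), Config(0.5, 2, 1.5, 3.0, 3)],
]
result = choose_configs(phases, loss_budget=3)
if result is not None:
    cost, picks = result
    print(cost, [(c.voltage, c.approx_level) for c in picks])
```

With a loss budget of 3 units, the sketch runs the first phase near threshold and keeps the second phase exact, since lowering the voltage in both phases would exceed the quality budget. Folding time and energy into one additive cost is a simplification chosen so that the recurrence stays a standard multiple-choice knapsack; a different objective would need a different state.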