CLC number: TP36; TN47

On-line Access: 2015-01-29

Received: 2014-07-05

Revision Accepted: 2014-10-22

Crosschecked: 2014-12-30


 ORCID:

Kai Huang

http://orcid.org/0000-0002-5034-7171

Si-wen Xiu

http://orcid.org/0000-0003-0400-8037

Frontiers of Information Technology & Electronic Engineering  2015 Vol.16 No.2 P.135-151

http://doi.org/10.1631/FITEE.1400239


Profiling and annotation combined method for multimedia application specific MPSoC performance estimation


Author(s):  Kai Huang, Xiao-xu Zhang, Si-wen Xiu, Dan-dan Zheng, Min Yu, De Ma, Kai Huang, Gang Chen, Xiao-lang Yan

Affiliation(s):  Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   huangk@vlsi.zju.edu.cn, xiusw@vlsi.zju.edu.cn

Key Words:  MPSoC, Gradual refinement, Native simulation, Performance estimation, Profiling, Annotation, Gcov


Kai Huang, Xiao-xu Zhang, Si-wen Xiu, Dan-dan Zheng, Min Yu, De Ma, Kai Huang, Gang Chen, Xiao-lang Yan. Profiling and annotation combined method for multimedia application specific MPSoC performance estimation[J]. Frontiers of Information Technology & Electronic Engineering, 2015, 16(2): 135-151.

@article{FITEE.1400239,
title="Profiling and annotation combined method for multimedia application specific MPSoC performance estimation",
author="Kai Huang and Xiao-xu Zhang and Si-wen Xiu and Dan-dan Zheng and Min Yu and De Ma and Kai Huang and Gang Chen and Xiao-lang Yan",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="16",
number="2",
pages="135-151",
year="2015",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1400239"
}

%0 Journal Article
%T Profiling and annotation combined method for multimedia application specific MPSoC performance estimation
%A Kai Huang
%A Xiao-xu Zhang
%A Si-wen Xiu
%A Dan-dan Zheng
%A Min Yu
%A De Ma
%A Kai Huang
%A Gang Chen
%A Xiao-lang Yan
%J Frontiers of Information Technology & Electronic Engineering
%V 16
%N 2
%P 135-151
%@ 2095-9184
%D 2015
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.1400239

TY - JOUR
T1 - Profiling and annotation combined method for multimedia application specific MPSoC performance estimation
A1 - Kai Huang
A1 - Xiao-xu Zhang
A1 - Si-wen Xiu
A1 - Dan-dan Zheng
A1 - Min Yu
A1 - De Ma
A1 - Kai Huang
A1 - Gang Chen
A1 - Xiao-lang Yan
JO - Frontiers of Information Technology & Electronic Engineering
VL - 16
IS - 2
SP - 135
EP - 151
SN - 2095-9184
Y1 - 2015
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1400239
ER -


Abstract: 
Accurate and fast performance estimation is necessary to drive design space exploration and thus support important design decisions. Current techniques are either too time-consuming or not accurate enough. In this paper, we address both problems with a hybrid method for multimedia multiprocessor system-on-chip (MPSoC) performance estimation. GNU gcov, a general-purpose coverage analysis tool, is used to profile execution statistics during native simulation. To tackle the complexity and keep analysis and simulation manageable, the communication and computation parts are orthogonalized, that is, estimated separately. The estimate for the computation part is annotated onto a transaction-accurate model for further analysis, which supports a gradual refinement of the MPSoC performance estimate. The implementation and its experimental results demonstrate the feasibility and efficiency of the proposed method.
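As a rough illustration of the profiling step, the sketch below reads gcov's annotated-source output (lines of the form `count:lineno:source`) and weights each line's execution count by a per-line cycle cost. The sample listing, the cost table, and the function names `parse_gcov` and `estimate_cycles` are assumptions made for this example; the abstract does not say how the paper maps execution counts to target-processor cycles.

```python
import re

# One line of gcov annotated-source output: "<count>:<lineno>:<source>".
GCOV_LINE = re.compile(r"^\s*([^:]+):\s*(\d+):(.*)$")

# Hypothetical gcov listing for a small summation routine (illustrative data only).
SAMPLE_GCOV = """\
        -:    0:Source:sum.c
        -:    1:int sum(int *a, int n) {
        1:    2:    int s = 0;
      101:    3:    for (int i = 0; i < n; i++)
      100:    4:        s += a[i];
        1:    5:    if (n < 0)
    #####:    6:        s = -1;
        1:    7:    return s;
        -:    8:}
"""


def parse_gcov(text):
    """Return {line_number: execution_count} for the executable lines."""
    counts = {}
    for raw in text.splitlines():
        match = GCOV_LINE.match(raw)
        if not match:
            continue
        count = match.group(1).strip()
        lineno = int(match.group(2))
        if lineno == 0 or count == "-":
            continue  # header lines and lines with no executable code
        if count in ("#####", "====="):
            counts[lineno] = 0  # executable, but never executed
        else:
            counts[lineno] = int(count.rstrip("*"))
    return counts


def estimate_cycles(counts, cycles_per_line):
    """Sum count * cost over all profiled lines; lines without a cost contribute 0."""
    return sum(n * cycles_per_line.get(line, 0) for line, n in counts.items())


# Assumed per-line costs on the target processor, in cycles (hypothetical values).
ASSUMED_COSTS = {2: 1, 3: 3, 4: 4, 5: 2, 6: 1, 7: 2}

if __name__ == "__main__":
    print(estimate_cycles(parse_gcov(SAMPLE_GCOV), ASSUMED_COSTS))
```

In practice the listing would come from compiling with GCC's `--coverage` option, running the program natively, and then running `gcov` on the source file; the cost table would have to be derived for the target processor.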

This paper describes an approach to fast timing estimation of software running on an MPSoC platform. The paper is well written, and early design space exploration is important for tackling ever-increasing development costs. The approach is meaningful, even though it is based on existing tools and relies on existing technology.

A combined profiling and annotation method for performance estimation of multimedia application-specific MPSoCs

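Objective: Performance estimation has become a very important and challenging task in heterogeneous MPSoC design. Accurate and fast performance estimation early in the design process is essential for design space exploration. This paper uses GCC profiling and code annotation, together with the layered abstraction of MPSoC models, to study how gradually refined performance estimation can be applied to MPSoC architecture exploration.
Innovation: For performance estimation of multimedia-oriented MPSoCs, a combined profiling and annotation flow from the VA level to the TA level is proposed, so that the performance estimate can be refined effectively level by level.
Methods: Using the GNU gcov tool, execution statistics of the given application code are profiled during native simulation, with support for real-time performance analysis, to estimate the computation load of the VA model quickly and accurately. The computation load results obtained from the VA model are annotated onto the TA model, whose timing-accurate SystemC simulation then yields the communication delay. This makes performance estimation of the TA model more efficient and completes the overall MPSoC performance estimate.
Conclusions: An MPSoC performance estimation method and flow that combine profiling and annotation are studied. Accurate computation load figures are obtained at the VA level and annotated to the TA level; at the TA level, annotation-based simulation refines the communication delay, making performance estimation more efficient. Experiments on two typical video multimedia applications, M-JPEG and MPEG-2, show that the method is efficient, fast, and accurate.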
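The annotation step can be pictured with a small event-driven sketch: computation is not simulated at instruction level but consumed as an annotated delay, while communication is timed by a simple model of a shared interconnect. Everything here is an assumption made for illustration (the function `simulate`, the first-come-first-served bus, and the `bus_setup` and `cycles_per_word` parameters); the paper itself uses a SystemC transaction-accurate model, which this sketch does not reproduce.

```python
import heapq


def simulate(programs, bus_setup=2, cycles_per_word=1):
    """Run annotated programs on CPUs that share one bus.

    programs maps a CPU name to a list of steps, each either
    ("compute", cycles) with an annotated cycle count, or
    ("transfer", words) sent over the shared bus.
    Returns {cpu_name: finish_time_in_cycles}.
    """
    bus_free_at = 0
    finish = {}
    # Event queue of (time, cpu, index of next step); ties break on CPU name.
    queue = [(0, cpu, 0) for cpu in sorted(programs)]
    heapq.heapify(queue)
    while queue:
        now, cpu, index = heapq.heappop(queue)
        steps = programs[cpu]
        if index == len(steps):
            finish[cpu] = now
            continue
        kind, amount = steps[index]
        if kind == "compute":
            # Annotated delay: advance time by the profiled cycle count.
            done = now + amount
        elif kind == "transfer":
            # Wait until the bus is free, then occupy it for the whole transfer.
            start = max(now, bus_free_at)
            done = start + bus_setup + amount * cycles_per_word
            bus_free_at = done
        else:
            raise ValueError(f"unknown step kind: {kind!r}")
        heapq.heappush(queue, (done, cpu, index + 1))
    return finish


if __name__ == "__main__":
    # Two CPUs with identical annotated workloads contend for the bus.
    demo = {
        "cpu0": [("compute", 100), ("transfer", 8)],
        "cpu1": [("compute", 100), ("transfer", 8)],
    }
    print(simulate(demo))
```

Because events are processed in time order, a transfer requested while the bus is busy is delayed until the earlier transfer completes, so contention shows up in the finish times even though the computation itself is only a number.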

Key words: MPSoC; Gradual refinement; Native simulation; Performance estimation; Profiling; Annotation; Gcov

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - Journal of Zhejiang University-SCIENCE