Full Text:   <2157>

Summary:  <1633>

CLC number: TP311

On-line Access: 2018-08-06

Received: 2016-12-06

Revision Accepted: 2017-07-12

Crosschecked: 2018-06-12

Cited: 0

Clicked: 5730

Citations:  Bibtex RefMan EndNote GB/T7714

 ORCID:

Deng Chen

http://orcid.org/0000-0001-6359-801X

-   Go to

Article info.
Open peer comments

Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.6 P.737-754

http://doi.org/10.1631/FITEE.1601783


An oversampling approach for mining program specifications


Author(s):  Deng Chen, Yan-duo Zhang, Wei Wei, Rong-cun Wang, Xiao-lin Li, Wei Liu, Shi-xun Wang, Rui Zhu

Affiliation(s):  Hubei Provincial Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan 430205, China; more

Corresponding email(s):   dchen@wit.edu.cn

Key Words:  Object usage scenario, API protocol mining, Program temporal specification mining, Oversampling


Deng Chen, Yan-duo Zhang, Wei Wei, Rong-cun Wang, Xiao-lin Li, Wei Liu, Shi-xun Wang, Rui Zhu. An oversampling approach for mining program specifications[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(6): 737-754.

@article{title="An oversampling approach for mining program specifications",
author="Deng Chen, Yan-duo Zhang, Wei Wei, Rong-cun Wang, Xiao-lin Li, Wei Liu, Shi-xun Wang, Rui Zhu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="6",
pages="737-754",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601783"
}

%0 Journal Article
%T An oversampling approach for mining program specifications
%A Deng Chen
%A Yan-duo Zhang
%A Wei Wei
%A Rong-cun Wang
%A Xiao-lin Li
%A Wei Liu
%A Shi-xun Wang
%A Rui Zhu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 6
%P 737-754
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601783

TY - JOUR
T1 - An oversampling approach for mining program specifications
A1 - Deng Chen
A1 - Yan-duo Zhang
A1 - Wei Wei
A1 - Rong-cun Wang
A1 - Xiao-lin Li
A1 - Wei Liu
A1 - Shi-xun Wang
A1 - Rui Zhu
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 6
SP - 737
EP - 754
%@ 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.1601783


Abstract: 
Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, just as with any data-mining technique, this approach requires sufficient training data (object usage scenarios). Existing approaches resolve the problem by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for object usage scenarios (OUSs). Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, generally, the OUSs that can be collected from a run of p are not more than the objects used during the run. With our technique, a maximum of n times more OUSs can be achieved, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpecMiner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. Additionally, accurate and complete API protocols are more likely to be achieved. Furthermore, our technique can mine API protocols for classes never even used in programs, which are valuable for validating software architectures, program documentation, and understanding. Although our technique will introduce some runtime overhead, it is trivial and acceptable.

一种用于程序约束挖掘的过采样方法

概要:自动协议挖掘是获取精确而完备的API使用协议的有效方法。然而,与其它数据挖掘应用类似,自动协议挖掘方法需要足够多训练数据(即对象使用场景)作为输入。虽然通过增加程序的规模可提取更多数量的对象使用场景,但这会导致程序分析运行时开销较大。本文针对面向对象程序提出一种基于继承关系的对象使用场景过采样方法。给定一个面向对象程序p,一般情况下,执行p所能获得的对象使用场景数不超过运行时实例化的对象数。而本文方法可获得多达上述n倍的对象使用场景,其中n为程序p中一般对象使用场景的平均父类数。为了验证效果,在前期API使用协议动态挖掘原型工具ISpecMiner中集成上述方法并开展实验研究。实验采用扩展后的ISpecMiner从多个实际的程序中挖掘API使用协议。结果显示,采用本文方法获得的对象使用场景数是一般化方法的1.95倍。不仅如此,对比实验结果表明本文方法有利于挖掘更加精确而完备的API使用协议。特别值得关注的是,本文方法适用于无法实例化的类并挖掘出其API使用协议。这类API使用协议对于验证软件架构、程序说明和理解具有重要意义。虽然本文方法会增加一定的运行开销,但其仍在可接受范围内。

关键词:对象使用场景;API协议挖掘;程序时序约束挖掘;过采样

Darkslateblue:Affiliate; Royal Blue:Author; Turquoise:Article

Reference

[1]Alur R, Černý P, Madhusudan P, et al., 2005. Synthesis of interface specifications for Java classes. ACM SIGPLAN Not, 40(1):98-109.

[2]Ammons G, Bodík R, Larus JR, 2002. Mining specifications. ACM SIGPLAN Not, 37(1):4-16.

[3]Bruce KB, Wegner P, 1986. An algebraic model of sybtypes in object-oriented languages (draft). ACM SIGPLAN Not, 21(10):163-172.

[4]Caserta P, Zendra O, 2014. JBInsTrace: a tracer of Java and JRE classes at basic-block granularity by dynamically instrumenting bytecode. Sci Comput Program, 79:116-125.

[5]Chang RY, Podgurski A, Yang J, 2007. Finding what’s not there: a new approach to revealing neglected conditions in software. Proc Int Symp on Software Testing and Analysis, p.163-173.

[6]Chen D, Huang RB, Qu BB, et al., 2014. Improving static analysis performance using rule-filtering technique. Proc 26th Int Conf on Software Engineering and Knowledge Engineering, p.19-24.

[7]Chen D, Huang RB, Qu BB, et al., 2015a. Mining class temporal specification dynamically based on extended Markov model. Int J Softw Eng Knowl Eng, 25(3):573-604.

[8]Chen D, Zhang YD, Wang RC, et al., 2015b. Extracting more object usage scenarios for API protocol mining. Proc 27th Int Conf on Software Engineering and Knowledge Engineering, p.607-612.

[9]Chen D, Zhang YD, Wei W, et al., 2017. Efficient vulnerability detection based on an optimized rule-checking static analysis technique. Front Inform Technol Electron Eng, 18(3):332-345.

[10]Cook JE, Wolf AL, 1998. Discovering models of software processes from event-based data. ACM Trans Softw Eng Methodol, 7(3):215-249.

[11]Dai ZY, Mao XG, Lei Y, et al., 2014. Compositional mining of multiple object API protocols through state abstraction. Sci World J, Article 171 647.

[12]Dallmeier V, Lindig C, Wasylkowski A, et al., 2006. Mining object behavior with ADABU. Proc Int Workshop on Dynamic Systems Analysis, p.17-24.

[13]Dallmeier V, Knopp N, Mallon C, et al., 2012. Automatically generating test cases for specification mining. IEEE Trans Softw Eng, 38(2):243-257.

[14]Engler D, Chen DY, Hallem S, et al., 2001. Bugs as deviant behavior: a general approach to inferring errors in systems code. ACM SIGOPS Oper Syst Rev, 35(5):57-72.

[15]Ernst MD, Perkins JH, Guo PJ, et al., 2007. The Daikon system for dynamic detection of likely invariants. Sci Comput Program, 69(1-3):35-45.

[16]Kernighan BW, Ritchie DM, 1988. The C Programming Language (2nd Ed.). Prentice Hall, Englewood Cliffs, NJ.

[17]Li ZM, Zhou YY, 2005. PR-miner: automatically extracting implicit programming rules and detecting violations in large software codes. ACM SIGSOFT Softw Eng Not, 30(5):306-315.

[18]Liskov B, 1988. Keynote address—data abstraction and hierarchy. ACM SIGPLAN Not, 23(5):17-34.

[19]Lorenzoli D, Mariani L, Pezzè M, 2008. Automatic generation of software behavioral models. Proc 30th Int Conf on Software Engineering, p.501-510.

[20]Pradel M, Gross TR, 2012. Leveraging test generation and specification mining for automated bug detection without false positives. Proc 34th Int Conf on Software Engineering, p.288-298.

[21]Pradel M, Jaspan C, Aldrich J, et al., 2012. Statically checking API protocol conformance with mined multi-object specifications. Proc 34th Int Conf on Software Engineering, p.925-935.

[22]Ramanathan MK, Grama A, Jagannathan S, 2007. Static specification inference using predicate mining. ACM SIGPLAN Not, 42(6):123-134.

[23]Shoham S, Yahav E, Fink S, et al., 2007. Static specification mining using automata-based abstractions. Proc Int Symp on Software Testing and Analysis, p.174-184.

[24]Skala V, Petruska R, 2014. A new approach to hash function construction for textual data: a comparison. Proc 4th World Congress on Information and Communication Technologies, p.39-44.

[25]Tatsubori M, Sasaki T, Chiba S, et al., 2001. A bytecode translator for distributed execution of “legacy” Java software. Proc 15th European Conf on Object-Oriented Programming, p.236-255.

[26]Thummalapenta S, Xie T, 2011. Alattin: mining alternative patterns for defect detection. Autom Softw Eng, 18(3-4): 293-323.

[27]Wasylkowski A, 2007. Mining object usage models. Proc 29th Int Conf on Software Engineering, p.93-94.

[28]Wasylkowski A, Zeller A, Lindig C, 2007. Detecting object usage anomalies. Proc 6th Joint Meeting of the European Software Engineering Conf and the ACM SIGSOFT Symp on Foundations of Software Engineering, p.35-44.

Open peer comments: Debate/Discuss/Question/Opinion

<1>

Please provide your name, email address and a comment





Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE