CLC number: TP311
On-line Access: 2018-08-06
Received: 2016-12-06
Revision Accepted: 2017-07-12
Crosschecked: 2018-06-12
Cited: 0
Clicked: 5730
Deng Chen, Yan-duo Zhang, Wei Wei, Rong-cun Wang, Xiao-lin Li, Wei Liu, Shi-xun Wang, Rui Zhu. An oversampling approach for mining program specifications[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(6): 737-754.
@article{title="An oversampling approach for mining program specifications",
author="Deng Chen, Yan-duo Zhang, Wei Wei, Rong-cun Wang, Xiao-lin Li, Wei Liu, Shi-xun Wang, Rui Zhu",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="19",
number="6",
pages="737-754",
year="2018",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1601783"
}
%0 Journal Article
%T An oversampling approach for mining program specifications
%A Deng Chen
%A Yan-duo Zhang
%A Wei Wei
%A Rong-cun Wang
%A Xiao-lin Li
%A Wei Liu
%A Shi-xun Wang
%A Rui Zhu
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 6
%P 737-754
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1601783
TY - JOUR
T1 - An oversampling approach for mining program specifications
A1 - Deng Chen
A1 - Yan-duo Zhang
A1 - Wei Wei
A1 - Rong-cun Wang
A1 - Xiao-lin Li
A1 - Wei Liu
A1 - Shi-xun Wang
A1 - Rui Zhu
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 19
IS - 6
SP - 737
EP - 754
%@ 2095-9184
Y1 - 2018
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/FITEE.1601783
Abstract: Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, just as with any data-mining technique, this approach requires sufficient training data (object usage scenarios). Existing approaches resolve the problem by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for object usage scenarios (OUSs). Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, generally, the OUSs that can be collected from a run of p are not more than the objects used during the run. With our technique, a maximum of n times more OUSs can be achieved, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpecMiner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. Additionally, accurate and complete API protocols are more likely to be achieved. Furthermore, our technique can mine API protocols for classes never even used in programs, which are valuable for validating software architectures, program documentation, and understanding. Although our technique will introduce some runtime overhead, it is trivial and acceptable.
[1]Alur R, Černý P, Madhusudan P, et al., 2005. Synthesis of interface specifications for Java classes. ACM SIGPLAN Not, 40(1):98-109.
[2]Ammons G, Bodík R, Larus JR, 2002. Mining specifications. ACM SIGPLAN Not, 37(1):4-16.
[3]Bruce KB, Wegner P, 1986. An algebraic model of sybtypes in object-oriented languages (draft). ACM SIGPLAN Not, 21(10):163-172.
[4]Caserta P, Zendra O, 2014. JBInsTrace: a tracer of Java and JRE classes at basic-block granularity by dynamically instrumenting bytecode. Sci Comput Program, 79:116-125.
[5]Chang RY, Podgurski A, Yang J, 2007. Finding what’s not there: a new approach to revealing neglected conditions in software. Proc Int Symp on Software Testing and Analysis, p.163-173.
[6]Chen D, Huang RB, Qu BB, et al., 2014. Improving static analysis performance using rule-filtering technique. Proc 26th Int Conf on Software Engineering and Knowledge Engineering, p.19-24.
[7]Chen D, Huang RB, Qu BB, et al., 2015a. Mining class temporal specification dynamically based on extended Markov model. Int J Softw Eng Knowl Eng, 25(3):573-604.
[8]Chen D, Zhang YD, Wang RC, et al., 2015b. Extracting more object usage scenarios for API protocol mining. Proc 27th Int Conf on Software Engineering and Knowledge Engineering, p.607-612.
[9]Chen D, Zhang YD, Wei W, et al., 2017. Efficient vulnerability detection based on an optimized rule-checking static analysis technique. Front Inform Technol Electron Eng, 18(3):332-345.
[10]Cook JE, Wolf AL, 1998. Discovering models of software processes from event-based data. ACM Trans Softw Eng Methodol, 7(3):215-249.
[11]Dai ZY, Mao XG, Lei Y, et al., 2014. Compositional mining of multiple object API protocols through state abstraction. Sci World J, Article 171 647.
[12]Dallmeier V, Lindig C, Wasylkowski A, et al., 2006. Mining object behavior with ADABU. Proc Int Workshop on Dynamic Systems Analysis, p.17-24.
[13]Dallmeier V, Knopp N, Mallon C, et al., 2012. Automatically generating test cases for specification mining. IEEE Trans Softw Eng, 38(2):243-257.
[14]Engler D, Chen DY, Hallem S, et al., 2001. Bugs as deviant behavior: a general approach to inferring errors in systems code. ACM SIGOPS Oper Syst Rev, 35(5):57-72.
[15]Ernst MD, Perkins JH, Guo PJ, et al., 2007. The Daikon system for dynamic detection of likely invariants. Sci Comput Program, 69(1-3):35-45.
[16]Kernighan BW, Ritchie DM, 1988. The C Programming Language (2nd Ed.). Prentice Hall, Englewood Cliffs, NJ.
[17]Li ZM, Zhou YY, 2005. PR-miner: automatically extracting implicit programming rules and detecting violations in large software codes. ACM SIGSOFT Softw Eng Not, 30(5):306-315.
[18]Liskov B, 1988. Keynote address—data abstraction and hierarchy. ACM SIGPLAN Not, 23(5):17-34.
[19]Lorenzoli D, Mariani L, Pezzè M, 2008. Automatic generation of software behavioral models. Proc 30th Int Conf on Software Engineering, p.501-510.
[20]Pradel M, Gross TR, 2012. Leveraging test generation and specification mining for automated bug detection without false positives. Proc 34th Int Conf on Software Engineering, p.288-298.
[21]Pradel M, Jaspan C, Aldrich J, et al., 2012. Statically checking API protocol conformance with mined multi-object specifications. Proc 34th Int Conf on Software Engineering, p.925-935.
[22]Ramanathan MK, Grama A, Jagannathan S, 2007. Static specification inference using predicate mining. ACM SIGPLAN Not, 42(6):123-134.
[23]Shoham S, Yahav E, Fink S, et al., 2007. Static specification mining using automata-based abstractions. Proc Int Symp on Software Testing and Analysis, p.174-184.
[24]Skala V, Petruska R, 2014. A new approach to hash function construction for textual data: a comparison. Proc 4th World Congress on Information and Communication Technologies, p.39-44.
[25]Tatsubori M, Sasaki T, Chiba S, et al., 2001. A bytecode translator for distributed execution of “legacy” Java software. Proc 15th European Conf on Object-Oriented Programming, p.236-255.
[26]Thummalapenta S, Xie T, 2011. Alattin: mining alternative patterns for defect detection. Autom Softw Eng, 18(3-4): 293-323.
[27]Wasylkowski A, 2007. Mining object usage models. Proc 29th Int Conf on Software Engineering, p.93-94.
[28]Wasylkowski A, Zeller A, Lindig C, 2007. Detecting object usage anomalies. Proc 6th Joint Meeting of the European Software Engineering Conf and the ACM SIGSOFT Symp on Foundations of Software Engineering, p.35-44.
Open peer comments: Debate/Discuss/Question/Opinion
<1>