CLC number: TP311

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-08-29

ORCID:

Zhiqiu HUANG

https://orcid.org/0000-0001-6843-1892

Yichao SHAO

https://orcid.org/0000-0002-4553-5602

Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.8 P.1205-1216

http://doi.org/10.1631/FITEE.2100379


Fast code recommendation via approximate sub-tree matching


Author(s):  Yichao SHAO, Zhiqiu HUANG, Weiwei LI, Yaoshen YU

Affiliation(s):  School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China

Corresponding email(s):   shaoyichao@nuaa.edu.cn, zqhuang@nuaa.edu.cn

Key Words:  Code reuse, Code recommendation, Tree similarity, Structure information



Abstract: 
Software developers often write code whose functionality is similar to that of existing code segments. A code recommendation tool that helps developers reuse these code fragments can significantly improve their efficiency. Several methods have been proposed in recent years. Some use sequence matching algorithms to find related recommendations; most of these are time-consuming and can leverage only low-level textual information from the code. Others extract features from code and compute similarity between numerical feature vectors. However, the similarity of feature vectors is often not equivalent to the similarity of the original code, because structural information is lost when abstract syntax trees are transformed into vectors. To solve this problem, we propose a method based on approximate sub-tree matching. Unlike existing tree-based approaches that match feature vectors, it retains the tree structure of the query code during matching to find the code fragments that best match the current query. It uses a fast approximate sub-tree matching algorithm that transforms the sub-tree matching problem into the problem of matching a tree against a list. In this way, structural information can be used in code recommendation tasks with strict time requirements. We have constructed several real-world code databases covering different languages and granularities to evaluate the effectiveness of our method. The results show that our method outperforms the two compared methods, SENSORY and Aroma, in terms of recall on all the datasets, and can be applied to large datasets.
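The abstract does not give the details of the matching algorithm, so the following Python sketch is only an illustration of the general idea of matching a query tree against a flattened list. It is not the authors' method. All names (`Node`, `flatten`, `match_score`, `recommend`) and the scoring rule (the fraction of the query's parent-child edges found in the candidate's edge list) are assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """A minimal syntax-tree node (hypothetical; not the paper's representation)."""
    label: str
    children: list["Node"] = field(default_factory=list)


def flatten(tree: Node) -> list[tuple[str, str]]:
    """Flatten a candidate tree into a list of (parent_label, child_label) edges."""
    edges = []
    stack = [tree]
    while stack:
        node = stack.pop()
        for child in node.children:
            edges.append((node.label, child.label))
            stack.append(child)
    return edges


def match_score(query: Node, candidate_edges: list[tuple[str, str]]) -> float:
    """Walk the query tree and count how many of its edges occur in the list.

    The query keeps its tree form; the candidate is only a list. Each
    candidate edge can be consumed at most once. This scoring rule is an
    assumption for illustration, not the measure used in the paper.
    """
    available = Counter(candidate_edges)
    matched = 0
    total = 0
    stack = [query]
    while stack:
        node = stack.pop()
        for child in node.children:
            total += 1
            edge = (node.label, child.label)
            if available[edge] > 0:
                available[edge] -= 1
                matched += 1
            stack.append(child)
    return matched / total if total else 0.0


def recommend(query: Node, corpus: dict[str, list[tuple[str, str]]], top_k: int = 3) -> list[str]:
    """Return the names of the top_k candidates, best score first (ties by name)."""
    scored = [(match_score(query, edges), name) for name, edges in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]
```

In this sketch, each candidate is flattened once when the corpus is built, so answering a query costs a single pass over the query tree per candidate. That is one plausible way a tree-versus-list formulation could keep matching fast, under the assumptions stated above.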

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2025 Journal of Zhejiang University-SCIENCE