Indexed by:
Abstract:
Context: Code readability, which correlates strongly with software quality, plays a critical role in software maintenance and evolvement. Although existing deep learning-based code readability models have reached a rather high classification accuracy, only structural features are utilized which inevitably limits their model performance. Objective: To address this problem, we propose to extract readability-related features from visual, semantic, and structural aspects from source code in an attempt to further improve code readability classification. Method: First, we convert a code snippet into a RGB matrix (for visual feature extraction), a token sequence (for semantic feature extraction) and a character matrix (for structural feature extraction). Then, we input them into a hybrid neural network that is composed of BERT, CNN, and BiLSTM for feature extraction. Finally, the extracted features are concatenated and input into a classifier to make a code readability classification. Result: A series of experiments are conducted to evaluate our method. The results show that the average accuracy could reach 85.3%, which outperforms all existing models. Conclusion: As an innovative work of extracting readability-related features automatically from visual, semantic, and structural aspects, our method is proved to be effective for the task of code readability classification. © 2022 Elsevier Inc.
Keyword:
Reprint Author's Address:
Email:
Source :
Journal of Systems and Software
ISSN: 0164-1212
Year: 2022
Volume: 193
3 . 5
JCR@2022
3 . 5 0 0
JCR@2022
ESI Discipline: COMPUTER SCIENCE;
ESI HC Threshold:46
JCR Journal Grade:2
CAS Journal Grade:2
Cited Count:
SCOPUS Cited Count: 9
ESI Highly Cited Papers on the List: 0 Unfold All
WanFang Cited Count:
Chinese Cited Count:
30 Days PV: 7
Affiliated Colleges: