Automatic Linking of Issue Reports and Commits in Software Repositories
Rostami Mazraeh, Pooya | 2021
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 54004 (19)
- University: Sharif University of Technology
- Department: Computer Engineering
- Advisor(s): Heydarnoori, Abbas
- Abstract:
- An issue report documents the discussions around required changes in issue-tracking systems, while a commit contains the change itself in the version control system. Recovering links between issues and commits can facilitate many software evolution tasks such as bug localization, defect prediction, software quality measurement, and software documentation. A previous study of over half a million GitHub issues reports that only about 42.2% of issues are manually linked by developers to their pertinent commits. Automating the linking of commit-issue pairs can improve these tasks by increasing the coverage of commit-issue links.
Current state-of-the-art approaches for automated commit-issue linking suffer from low precision, leading to unreliable results, sometimes to the point of requiring human supervision of the predicted links. Performance degrades further when textual information is lacking in either the commits or the issues. Current approaches have also proven computationally expensive.
In this work, we propose Hybrid-Linker, an enhanced approach with higher performance that overcomes these limitations by employing two components: a non-textual classifier, which operates on non-textual, automatically recorded information of the commit-issue pairs to predict a link, and a textual classifier, which does the same using the textual information of the commit-issue pairs. In this context, textual data includes the issue’s title, the issue’s description, the commit’s description, and the code diff. Non-textual data consists of the issue’s submission date, the commit’s committer time, the issue’s creator, the commit’s committer, etc. Hybrid-Linker then combines the results of the two classifiers to make the final prediction. We show that whenever one component falls short in predicting a link, the other component fills the gap and improves the results.
We evaluate Hybrid-Linker against two competing approaches, FRLink and DeepLink, on a dataset of 12 different projects. Hybrid-Linker achieves 90.1% recall, 87.8% precision, and an F1-score of 88.9%. It outperforms FRLink and DeepLink by 31.3% and 41.3%, respectively, in F1-score. Finally, we demonstrate that the proposed approach also requires substantially fewer computational resources.
- Keywords:
- Issue Reports ; Machine Learning ; Software Maintenance ; Link Recovery ; Mining Software Repositories ; Commit ; Ensemble Learning
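The abstract says that Hybrid-Linker combines the outputs of a textual and a non-textual classifier, but it does not state the combination rule. The sketch below is only an illustration under assumptions: it supposes each component emits a link probability and that the two are merged by a weighted average with a decision threshold. The names `HybridScore`, `combine`, `predict_link`, `alpha`, and `threshold` are hypothetical and are not taken from the thesis. The `f1` helper uses the standard F1 definition to check the reported recall and precision against the reported F1-score.

```python
from dataclasses import dataclass


@dataclass
class HybridScore:
    """Link probabilities for one commit-issue pair (hypothetical container)."""
    textual: float      # from the textual classifier (titles, descriptions, diff)
    non_textual: float  # from the non-textual classifier (dates, authors, etc.)


def combine(score: HybridScore, alpha: float = 0.5) -> float:
    """Weighted average of the two probabilities.

    ASSUMPTION: the weighted average and the weight `alpha` are illustrative;
    the abstract does not specify how the two results are combined.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * score.textual + (1.0 - alpha) * score.non_textual


def predict_link(score: HybridScore, alpha: float = 0.5,
                 threshold: float = 0.5) -> bool:
    """Predict a link when the combined probability reaches the threshold."""
    return combine(score, alpha) >= threshold


def f1(precision: float, recall: float) -> float:
    """Standard F1-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A pair with little textual evidence but strong non-textual evidence.
    sparse_text_pair = HybridScore(textual=0.2, non_textual=0.9)
    print(predict_link(sparse_text_pair))

    # Consistency check on the figures reported in the abstract.
    print(round(f1(precision=0.878, recall=0.901), 3))
```

In this sketch, a pair with sparse text can still be linked when the non-textual score is high, which mirrors the abstract's claim that one component compensates when the other falls short.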