Amit Kumar
2021
ArchiNet: A Concept-token based Approach for Determining Architectural Change Categories
Amit Kumar,
Amit Kumar
Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering
Causes of software architectural change are classified as perfective, preventive, corrective, and adaptive. Change classification is used to promote common approaches for addressing similar changes, produce appropriate design documentation for a release, construct a developer’s profile, form a balanced team, support code review, and so on. However, automated architectural change classification techniques are in their infancy, perhaps due to the lack of a benchmark dataset and the need for extensive human involvement. To address these shortcomings, we present a benchmark dataset and a text classifier for determining the architectural change rationale from commit descriptions. First, we explored source code properties for change classification independent of project activity descriptions and found poor outcomes. Next, through extensive analysis, we identified the challenges of classifying architectural change from text and proposed a new classifier that uses concept tokens derived from the concept analysis of change samples. We also studied the sensitivity of change classification to the various types of tokens present in commit messages. The experimental outcomes, obtained with 10-fold and cross-project validation on five popular open-source systems, show that the F1 score of our proposed classifier is around 70%. Its precision and recall are mostly consistent across all categories of change and more promising than those of competing text classification methods.