Abstract:
Architectural decay can manifest as the evolution of architectural smells, degrading integrity and increasing maintenance costs. Existing techniques detect smells post hoc or predict them at the component level, acting either too late or at too coarse a granularity. We investigate whether the risk of introducing architectural smells can be predicted as early as the moment issues are opened. To this end, we propose an issue-level prediction approach that utilizes the semantic representations of Large Language Models (LLMs). To enable training and evaluation, we construct a dataset from three GitLab-hosted projects by linking issues to smells via smell-inducing changes. On this dataset, we train classifiers to identify high-risk issues and conduct an empirical study comparing seven different representations and nine classifiers. Our best-performing classifier (SVM with OpenAI embeddings) achieves F1-scores of up to 0.506, with a recall of about 0.74. This means that our approach can identify approximately 74% of smell-inducing issues before implementation begins, when design alternatives are still being considered. Our approach provides early warnings of potential architectural risks.
This work shifts from reactive remediation to proactive quality assurance, raising awareness of potential architectural risks.
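
As a reading aid for the reported metrics, the following is a minimal sketch of how recall, precision, and F1 are computed for a binary "smell-inducing issue" classifier. The labels are toy values chosen for illustration, not data from the study, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, where 1 = smell-inducing issue."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy labels (assumption, not study data): 4 smell-inducing issues out of 10.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical classifier output: flags 3 of the 4 true cases plus 2 false alarms.
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Recall is the share of truly smell-inducing issues that are flagged, which is why a recall of about 0.74 reads as "approximately 74% of smell-inducing issues identified". F1 additionally penalizes false alarms through precision, so it can be considerably lower than recall.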