
Enabling Architecture Traceability by LLM-based Architecture Component Name Extraction

Fuchß, Dominik; Liu, Haoyu; Hey, Tobias; Keim, Jan; Koziolek, Anne

Abstract (English):

Traceability Link Recovery (TLR) is an enabler for various software engineering tasks.
One important task is the recovery of trace links between Software Architecture Documentation (SAD) and source code.
Here, the main challenge is the semantic gap between the two artifact types.
Recent research has shown that this semantic gap can be bridged by using Software Architecture Models (SAMs) as intermediates.
However, the creation of SAMs is a manual and time-consuming task.
This paper investigates the use of Large Language Models (LLMs) to extract component names as simple SAMs for TLR based on SAD and source code.
By doing so, we aim to bridge the semantic gap between SAD and source code without the need for manual SAM creation.
We compare our approach to the state-of-the-art TLR approaches TransArC and ArDoCode.
TransArC is the currently best-performing approach for TLR between SAD and source code, but it requires SAMs as an additional artifact.
Our evaluation shows that our approach performs comparably to TransArC (weighted average F1 with GPT-4o: 0.86 vs. TransArC's 0.87), while needing only the SAD and source code.
Moreover, our approach significantly outperforms the best baseline that does not need SAMs (weighted average F1 with GPT-4o: 0.86 vs. ...


Postprint
DOI: 10.5445/IR/1000179830
Published on 07 March 2025
Original publication
DOI: 10.1109/ICSA65012.2025.00011
Scopus
Citations: 4
Affiliated institution(s) at KIT: Institut für Informationssicherheit und Verlässlichkeit (KASTEL)
Publication type: Proceedings paper
Publication date: 31 March 2025
Language: English
Identifier: ISBN 979-8-3315-2091-5
KITopen-ID: 1000179830
HGF program: 46.23.01 (POF IV, LK 01) Methods for Engineering Secure Systems
Published in: 2025 IEEE 22nd International Conference on Software Architecture (ICSA), Odense, Denmark, 31 March 2025 - 04 April 2025
Event: 22nd IEEE International Conference on Software Architecture (ICSA 2025), Odense, Denmark, 31 March 2025 – 04 April 2025
Pages: 1–12
Project information: SFB 1608/1, 501798263 (DFG, DFG KOORD, SFB 1608)
Keywords: Traceability Link Recovery, Large Language Models, Software Architecture, Model Extraction
Indexed in: OpenAlex
Scopus