ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs

Fayyazi, Reza; Trueba, Stella Hoyos; Zuzak, Michael; Yang, Shanchieh Jay

doi:10.1109/ACCESS.2025.3638251

Fayyazi, Reza; Trueba, Stella Hoyos ¹; Zuzak, Michael; Yang, Shanchieh Jay
¹ Institut für Angewandte Informatik (IAI), Karlsruher Institut für Technologie (KIT)

Abstract:

In cybersecurity, security analysts constantly face the challenge of mitigating newly discovered vulnerabilities in real time, with over 300,000 vulnerabilities identified since 1999. The sheer volume of known vulnerabilities complicates the detection of patterns for unknown threats. While LLMs can assist, they often hallucinate and lack alignment with recent threats. Over 40,000 vulnerabilities were identified in 2024 alone, all introduced after the training data cutoff of most popular LLMs (e.g., GPT-5). This poses a major challenge for leveraging LLMs in cybersecurity, where accuracy and up-to-date information are paramount. Therefore, we aim to improve the adaptation of LLMs to vulnerability analysis by mimicking how an analyst performs such tasks. We propose ProveRAG, an LLM-powered system designed to assist in rapidly analyzing vulnerabilities with automated retrieval augmentation of web data while self-evaluating its responses with verifiable evidence. ProveRAG incorporates a self-critique mechanism to help alleviate the omission and hallucination common in the output of LLMs applied in cybersecurity applications. The system cross-references data from verifiable sources (NVD and CWE), giving analysts confidence in the actionable insights provided. ...

KITopen download

Publisher's version

DOI: 10.5445/IR/1000189172

Published on 19 December 2025

External links

Original publication
DOI: 10.1109/ACCESS.2025.3638251

Scopus
Citations: 2

Web of Science
Citations: 1

Dimensions
Citations: 8

Statistics

Page views: 123
since 19 December 2025

Downloads: 55
since 22 December 2025

Affiliated institution(s) at KIT	Institut für Angewandte Informatik (IAI)
Publication type	Journal article
Publication year	2025
Language	English
Identifier	ISSN: 2169-3536; KITopen ID: 1000189172
Published in	IEEE Access
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Volume	13
Pages	212815–212826
Keywords	ProveRAG, LLM, Provenance, CVE, CWE, RAG, Vulnerability, Self-Critique
Indexed in	Dimensions, OpenAlex, Web of Science, Scopus

Repository KITopen
