
Leveraging Data Shapes in Large Language Model Contexts for Question Answering on Public and Private Knowledge Graphs

Wardenga, Jan G. 1; Käfer, Tobias 1
1 Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB), Karlsruher Institut für Technologie (KIT)

Abstract:

Knowledge Graph Question Answering aims to make structured semantic data accessible through natural language interfaces. While recent Large Language Models can generate SPARQL queries from natural language questions, their effectiveness is limited by their reliance on prior exposure to vocabularies and datasets during pretraining. This work explores how Knowledge Graph-specific schema information, so-called shape constraints, can be used to provide Large Language Models with context about the dataset to be queried, enabling more accurate and generalizable query generation. We show that ShEx-augmented prompting achieves an F1-score of 0.28 on unseen knowledge graphs, compared to a baseline score of 0.00, demonstrating its ability to generalize beyond the training distribution. A pipeline is developed that integrates knowledge graph data shape extraction, prompt construction, and automatic SPARQL validation. Experiments were conducted on benchmarks with public and proprietary datasets using different state-of-the-art Large Language Models and demonstrate that shape-informed prompting improves the execution accuracy of generated queries. ...


Publisher's version
DOI: 10.5445/IR/1000190296
Published on: 6 February 2026
Affiliated institution(s) at KIT: Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Publication type: Proceedings contribution
Publication year: 2025
Language: English
Identifier: ISSN: 1613-0073
KITopen-ID: 1000190296
Published in: TEXT2SPARQL 2025 : First International TEXT2SPARQL Challenge 2025 : Proceedings of the First International TEXT2SPARQL Challenge co-Located with Text2KG at ESWC25. Ed.: S. Trump
Event: 1st / 22nd International TEXT2SPARQL Challenge, Co-Located with Text2KG at ESWC25 (2025), Portorož, Slovenia, 1 June 2025
Publisher: RWTH Aachen
Pages: 1-19
Series: CEUR Workshop Proceedings ; 4094
Project information: FOR 5339; TP F (DFG, DFG KOORD, KA 5635/1-1)
External relations: Abstract/Full text
Keywords: Data Shapes; Domain Adaptation; Knowledge Graphs; Large Language Models; Prompt Engineering; Question Answering; Retrieval-Augmented Generation; Semantic Web; SPARQL Generation
Indexed in: Scopus