
ConExion: Concept Extraction with Large Language Models

Norouzi, Ebrahim ¹; Hertling, Sven; Sack, Harald ¹
¹ Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology (KIT)

Abstract:

This paper presents an approach to concept extraction from documents using pre-trained large language models (LLMs). Conventional methods extract keyphrases that summarize the important information in a document. Our approach tackles the more challenging task of extracting all concepts present in the document that relate to a specific domain, not just the important ones. In comprehensive evaluations on two widely used benchmark datasets, our method improves the F1 score over state-of-the-art techniques. We also explore the potential of prompting these models for unsupervised concept extraction. The extracted concepts are intended to support domain coverage evaluation of ontologies and to facilitate ontology learning, and the results highlight the effectiveness of LLMs in concept extraction tasks. Our source code and datasets are publicly available at: https://github.com/ISE-FIZKarlsruhe/concept_extraction
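The abstract reports results as F1 scores over extracted concepts. As a rough illustration of how such a score can be computed for one document, the following minimal sketch compares a set of predicted concepts with a gold set. The function names, the example phrases, and the matching rule (exact match after lowercasing and whitespace normalization) are assumptions made for illustration; this record does not state the paper's exact matching protocol.

```python
def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace (assumed matching rule, not the paper's)."""
    return " ".join(phrase.lower().split())


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for one document."""
    pred = {normalize(p) for p in predicted}
    ref = {normalize(g) for g in gold}
    true_pos = len(pred & ref)  # concepts found in both sets
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the paper's benchmarks.
predicted = ["Large Language Models", "ontology learning", "benchmark"]
gold = ["large language models", "ontology learning",
        "concept extraction", "domain coverage"]

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Because the task targets all present concepts rather than only the important ones, the gold set is typically larger than in keyphrase extraction, so recall tends to weigh more heavily on the final F1.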


Full text
DOI: 10.5445/IR/1000184360
Published on 29.08.2025
Original publication
DOI: 10.48550/arXiv.2504.12915
Citations (Dimensions): 2
Affiliated institution(s) at KIT: Institute of Applied Informatics and Formal Description Methods (AIFB)
Publication type: Research report/preprint
Publication date: 16.06.2025
Language: English
Identifier: KITopen-ID: 1000184360
Keywords: Concept Extraction, Present Keyphrase Extraction, Large Language Models
Indexed in: arXiv, Dimensions, OpenAlex