Characteristic sets profile features: Estimation and application to SPARQL query planning

Heling, Lars 1; Acosta, Maribel; Kirrane, Sabrina [Ed.]; Ngonga Ngomo, Axel-Cyrille [Ed.]
1 Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB), Karlsruher Institut für Technologie (KIT)

Abstract:

RDF dataset profiling is the task of extracting a formal representation of a dataset’s features. Such features may cover various aspects of the RDF dataset, ranging from information on licensing and provenance to statistical descriptors of the data distribution and its semantics. In this work, we focus on the characteristic sets profile feature, which captures both structural and semantic information of an RDF dataset, making it a valuable resource for different downstream applications. While previous research has demonstrated the benefits of characteristic sets in centralized and federated query processing, access to these fine-grained statistics is taken for granted. However, especially in federated query processing, computing this profile feature is challenging, as it can be difficult or costly to access and process the entire data from all federation members. We address this shortcoming by introducing the concept of profile feature estimation and propose a sampling-based approach to generate estimations for the characteristic sets profile feature. In addition, we showcase the applicability of these feature estimations in federated querying by proposing a query planning approach that is specifically designed to leverage them. ...


Publisher's version
DOI: 10.5445/IR/1000160516
Published on 13 July 2023
Original publication
DOI: 10.3233/SW-222903
Scopus citations: 1
Dimensions citations: 3
Affiliated institution(s) at KIT: Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB)
Publication type: Journal article
Publication date: 5 April 2023
Language: English
Identifier: ISSN: 2210-4968, 1570-0844
KITopen-ID: 1000160516
Published in: Semantic Web
Publisher: IOS Press
Volume: 14
Issue: 3
Pages: 491–526
Indexed in: Web of Science, Dimensions, Scopus