FAIR Digital Object Concept for Composing Machine Learning Training Data

Blumenröhr, Nicolas 1; Jejkal, Thomas 1; Stotzka, Rainer 1
1 Scientific Computing Center (SCC), Karlsruher Institut für Technologie (KIT)

Abstract:

This poster introduces how the FAIR¹ Digital Object (FAIR DO) concept can simplify the composition of training data sets for Machine Learning (ML). Training data sets from heterogeneous sources usually use different label terms, so combining them for use in ML requires laborious relabeling. The FAIR DO concept can be applied to automate this process. A FAIR DO is an informative representation of scientific data, e.g. a training data set, that makes the data interpretable and actionable for computer systems. To be applicable in the context of ML, a FAIR DO requires at least a globally unique Persistent Identifier (PID), mandatory metadata, and a data type. The self-contained structure of a FAIR DO makes the associated label information accessible. Here, we show this structure and explain how it facilitates access to label information. In addition, specialized clients and tools are needed to act on FAIR DOs and perform relabeling fully automatically. Used this way, FAIR DOs could also address other laborious steps in ML training data composition, such as feature or file reformatting. The described work is based on the results of the RDA IG FAIR Digital Object Fabric².
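
The relabeling idea in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration only: the class `FairDigitalObject`, its field names, the `label_terms` mapping, the functions `relabel` and `compose`, and the placeholder PIDs are hypothetical and do not reproduce the record structure shown on the poster or any FAIR DO specification. In a real setting the record would be resolved from its PID by a client rather than constructed by hand.

```python
from dataclasses import dataclass


@dataclass
class FairDigitalObject:
    """Hypothetical, simplified FAIR DO record for one training data set."""
    pid: str                     # globally unique Persistent Identifier (placeholder values below)
    data_type: str               # type of the referenced data
    metadata: dict[str, str]     # mandatory metadata; the keys used here are assumptions
    label_terms: dict[str, str]  # assumed mapping: local label term -> shared label term


def relabel(labels: list[str], fdo: FairDigitalObject) -> list[str]:
    """Translate a data set's local labels into the shared terms recorded in its FAIR DO."""
    missing = sorted(set(labels) - set(fdo.label_terms))
    if missing:
        raise KeyError(f"{fdo.pid} has no mapping for labels: {missing}")
    return [fdo.label_terms[label] for label in labels]


def compose(datasets: list[tuple[FairDigitalObject, list[str]]]) -> list[str]:
    """Relabel each data set via its FAIR DO and concatenate the results."""
    combined: list[str] = []
    for fdo, labels in datasets:
        combined.extend(relabel(labels, fdo))
    return combined


# Two hypothetical data sets that use different terms for the same classes.
set_a = FairDigitalObject(
    pid="example-prefix/dataset-a",
    data_type="image/png",
    metadata={"license": "CC-BY-4.0"},
    label_terms={"kitty": "cat", "doggo": "dog"},
)
set_b = FairDigitalObject(
    pid="example-prefix/dataset-b",
    data_type="image/png",
    metadata={"license": "CC-BY-4.0"},
    label_terms={"Felis catus": "cat", "Canis familiaris": "dog"},
)

composed = compose([
    (set_a, ["kitty", "doggo"]),
    (set_b, ["Canis familiaris", "Felis catus"]),
])
```

The point of the sketch is that the label mapping travels with the data set's record, so a composing tool needs only the records, not manual knowledge of each source's vocabulary. Raising an error on unmapped labels is a design choice of this sketch, made so that incomplete mappings are not silently passed into training data.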


DOI: 10.5445/IR/1000146754
Published on: 30.05.2022
Affiliated institution(s) at KIT: Scientific Computing Center (SCC)
Publication type: Poster
Publication date: 20.06.2022
Language: English
Identifier: KITopen-ID: 1000146754
HGF program: 46.21.05 (POF IV, LK 01) HMC
Event: 19th International Data Week : A Festival of Data / Plenary meeting (RDA 2022), Seoul, South Korea, 20.06.2022 – 23.06.2022
Keywords: FAIR Digital Objects, Machine Learning, Metadata Management