Evaluation of Application Possibilities for Packaging Technologies in Canonical Workflows

Jejkal, Thomas 1; Chelbi, Sabrine 1; Pfeil, Andreas 1; Wittenburg, Peter
1 Scientific Computing Center (SCC), Karlsruher Institut für Technologie (KIT)

Abstract (English):

In the Canonical Workflow Framework for Research (CWFR), “packages” are relevant in two different respects. In data science, workflows are generally executed on a set of files that have been aggregated for a specific purpose, such as training a deep learning model. We call this type of “package” a data collection; its aggregation and metadata description are motivated by research interests. The other type of “package” relevant for CWFR is meant to represent workflows in a self-describing and self-contained way for later execution. In this paper, we review different packaging technologies and investigate their usability in the context of CWFR. For this purpose, we draw on an exemplary use case and show how packaging technologies can support its realization. We conclude that packaging technologies of different flavors help to provide inputs and outputs for workflow steps in a machine-readable way, and to represent a workflow and all its artifacts in a self-describing and self-contained way.

Published on: 5 July 2022
Publication date: 1 April 2022
Language: English
Published in: Data Intelligence
Publisher: Massachusetts Institute of Technology Press (MIT Press)
Volume: 4
Issue: 2
Pages: 372–385
Keywords: Canonical Workflow Framework for Research; Packaging technologies; Research data collections; Packaging formats
