
HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction

Mozaffari, Amirpasha; Langguth, Michael; Gong, Bing; Ahring, Jessica; Campos, Adrian Rojas; Nieters, Pascal; Escobar, Otoniel José Campos; Wittenbrink, Martin 1; Baumann, Peter; Schultz, Martin G.
1 Institut für Meteorologie und Klimaforschung – Atmosphärische Umweltforschung (IMK-IFU), Karlsruher Institut für Technologie (KIT)

Abstract (English):

Machine learning (ML) applications in weather and climate are gaining momentum as big data and the immense increase in high-performance computing (HPC) power pave the way. Ensuring FAIR data and reproducible ML practices is a significant challenge for Earth system researchers. Even though the FAIR principles are well known to many scientists, research communities are slow to adopt them. The Canonical Workflow Framework for Research (CWFR) provides a platform to ensure the FAIRness and reproducibility of these practices without overwhelming researchers. This conceptual paper envisions a holistic CWFR approach towards ML applications in weather and climate, focusing on HPC and big data. Specifically, we discuss FAIR Digital Objects (FDO) and Research Objects (RO) in the DeepRain project to achieve granular reproducibility. DeepRain is a project that aims to improve precipitation forecasts in Germany by using ML. Our concept envisages the raster datacube to provide data harmonization and fast and scalable data access. We suggest the Jupyter notebook as a single reproducible experiment. In addition, we envision JupyterHub as a scalable and distributed central platform that connects all these elements and the HPC resources to the researchers via an easy-to-use graphical interface.


Publisher's version
DOI: 10.5445/IR/1000149399
Published on 3 August 2022
Original publication
DOI: 10.1162/dint_a_00131
Citations (Scopus): 4
Citations (Dimensions): 6
Affiliated institution(s) at KIT: Institut für Meteorologie und Klimaforschung – Atmosphärische Umweltforschung (IMK-IFU)
Publication type: Journal article
Publication date: 1 April 2022
Language: English
ISSN: 2641-435X
KITopen-ID: 1000149399
Published in: Data Intelligence
Publisher: MIT Press
Volume: 4
Issue: 2
Pages: 271–285
Indexed in: Dimensions, Scopus