
PAL – parallel active learning for machine-learned potentials

Zhou, Chen 1,2; Neubert, Marlen 1,2; Koide, Yuri 1,2; Zhang, Yumeng 1,2; Vuong, Van-Quan 1,2,3; Schlöder, Tobias 1,2; Dehnen, Stefanie 2; Friederich, Pascal 1,2
1 Institut für Theoretische Informatik (ITI), Karlsruher Institut für Technologie (KIT)
2 Institut für Nanotechnologie (INT), Karlsruher Institut für Technologie (KIT)
3 Institut für Physikalische Chemie (IPC), Karlsruher Institut für Technologie (KIT)

Abstract:

Constructing datasets representative of the target domain is essential for training effective machine learning models. Active learning (AL) is a promising method that iteratively extends training data to enhance model performance while minimizing data acquisition costs. However, current AL workflows often require human intervention and lack parallelism, leading to inefficiencies and underutilization of modern computational resources. In this work, we introduce PAL, an automated, modular, and parallel active learning library that integrates AL tasks and manages their execution and communication on shared- and distributed-memory systems using the Message Passing Interface (MPI). PAL provides users with the flexibility to design and customize all components of their active learning scenarios, including machine learning models with uncertainty estimation, oracles for ground truth labeling, and strategies for exploring the target space. We demonstrate that PAL significantly reduces computational overhead and improves scalability, achieving substantial speed-ups through asynchronous parallelization on CPU and GPU hardware. Applications of PAL to several real-world scenarios – including ground-state reactions in biomolecular systems, excited-state dynamics of molecules, simulations of inorganic clusters, and thermo-fluid dynamics – illustrate its effectiveness in accelerating the development of machine learning models.
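The three components named in the abstract (a model with uncertainty estimation, an oracle, and an exploration strategy) can be illustrated with a toy loop. The following is a minimal sketch only and is not PAL's API: every function name, the 1-nearest-neighbour committee, the `sin` oracle, and the threshold value are assumptions chosen for illustration. It runs synchronously in one process, with a thread pool standing in for the fact that oracle calls are independent; PAL itself is described as using asynchronous MPI-based parallelization.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor
from math import sin


def oracle(x):
    """Hypothetical ground-truth labeler (stands in for an expensive calculation)."""
    return sin(3.0 * x)


def train_committee(data, n_members=4, rng=None):
    """Build a committee; each member holds a bootstrap resample of the labeled data."""
    rng = rng or random.Random(0)
    return [[rng.choice(data) for _ in data] for _ in range(n_members)]


def predict(member, x):
    """1-nearest-neighbour prediction from one committee member's points."""
    return min(member, key=lambda point: abs(point[0] - x))[1]


def committee_predict(committee, x):
    """Return the mean prediction and the committee disagreement (uncertainty proxy)."""
    predictions = [predict(member, x) for member in committee]
    return statistics.fmean(predictions), statistics.pstdev(predictions)


def active_learning(n_rounds=5, n_candidates=50, threshold=0.05, seed=0):
    """Toy active learning loop; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    # Small initial labeled set.
    data = [(x, oracle(x)) for x in (0.0, 1.0, 2.0)]
    history = [len(data)]
    for _ in range(n_rounds):
        committee = train_committee(data, rng=rng)
        # Exploration: propose candidate inputs from the target domain [0, 2].
        candidates = [rng.uniform(0.0, 2.0) for _ in range(n_candidates)]
        # Selection: keep only candidates on which the committee disagrees.
        uncertain = [x for x in candidates
                     if committee_predict(committee, x)[1] > threshold]
        # Labeling: oracle calls are independent, so they can run concurrently.
        with ThreadPoolExecutor(max_workers=4) as pool:
            labels = list(pool.map(oracle, uncertain))
        data.extend(zip(uncertain, labels))
        history.append(len(data))
    return data, history


if __name__ == "__main__":
    data, history = active_learning()
    print("labeled points per round:", history)
```

In this sketch only the candidates with high committee disagreement are sent to the oracle, which is the mechanism by which active learning limits data acquisition cost.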


Publisher's version
DOI: 10.5445/IR/1000184424
Published on 2 September 2025
Original publication
DOI: 10.1039/d5dd00073d
Citations (Scopus): 1
Citations (Dimensions): 1
Affiliated institution(s) at KIT: Institut für Nanotechnologie (INT); Institut für Physikalische Chemie (IPC); Institut für Theoretische Informatik (ITI)
Publication type: Journal article
Publication date: 1 July 2025
Language: English
Identifiers: ISSN: 2635-098X; KITopen-ID: 1000184424
HGF program: 43.31.01 (POF IV, LK 01) Multifunctionality Molecular Design & Material Architecture
Published in: Digital Discovery
Publisher: Royal Society of Chemistry (RSC)
Volume: 4
Issue: 7
Pages: 1901–1911
Advance online publication: 22 June 2025
Indexed in: OpenAlex, Dimensions, Scopus