
Active job monitoring in pilots

Kuehn, E.; Fischer, M.; Giffels, M.; Jung, C.; Petzold, A.

Abstract:
Recent developments in high energy physics (HEP), including multi-core jobs and multi-core pilots, require data centres to gain a deep understanding of their systems in order to monitor, design, and upgrade computing clusters. Networking is a critical component. In particular, the increased usage of data federations, for example in diskless computing centres or as a fall-back solution, relies on WAN connectivity and availability. The specific demands of different experiments and communities, as well as the need to identify misbehaving batch jobs, require active monitoring. Existing monitoring tools cannot measure fine-grained information at the batch job level, which complicates network-aware scheduling and optimisation. In addition, pilots add another layer of abstraction: they behave like batch systems themselves by managing and executing job payloads internally. The number of real jobs being executed is unknown, as the original batch system has no access to internal information about the scheduling process inside the pilots. Therefore, the comparability of jobs and pilots for predicting run-time behaviour or network performance cannot be ensured. ...

Open Access


Full text
DOI: 10.5445/IR/110104358
Original publication
DOI: 10.1088/1742-6596/664/5/052019
Scopus
Citations: 1
Affiliated institution(s) at KIT: Steinbuch Centre for Computing (SCC)
Institut für Kernphysik (IKP)
Publication type: Journal article
Year: 2015
Language: English
Identifier: ISSN: 1742-6588, 1742-6596
urn:nbn:de:swb:90-AAA1101043589
KITopen-ID: 110104358
HGF programme: 51.01.01 (POF III, LK 01)
Published in: Journal of Physics: Conference Series
Volume: 664
Issue: 5
Pages: 052019/1-8
Indexed in: Scopus