Analyzing data flows of WLCG jobs at batch job level

Kuehn, E.; Fischer, M.; Giffels, M.; Jung, C.; Petzold, A.

With the introduction of federated data access to the workflows of WLCG, it is becoming increasingly important for data centers to understand specific data flows regarding storage element accesses, firewall configurations, and the scheduling of batch jobs themselves. As existing batch system monitoring and related system monitoring tools do not support measurements at batch job level, a new tool has been developed and put into operation at the GridKa Tier 1 center for monitoring continuous data streams and characteristics of WLCG jobs and pilots. Long-term measurements and data collection are in progress. These measurements have already proven useful for analyzing misbehavior and various issues. Therefore, we aim for an automated, real-time approach to anomaly detection. As a prerequisite, prototypes for standard workflows have to be examined. Based on measurements over several months, different features of HEP jobs are evaluated regarding their effectiveness for data mining approaches to identify these common workflows. The paper introduces the measurement approach and statistics as well as the general concept and first results of classifying different HEP job workflows derived from the measurements at GridKa.

DOI: 10.5445/IR/110103537
DOI: 10.1088/1742-6596/608/1/012017
Affiliated institution(s) at KIT: Institut für Kernphysik (IKP)
Steinbuch Centre for Computing (SCC)
Publication type: Journal article
Publication year: 2015
Language: English
Identifier ISSN: 1742-6588
KITopen-ID: 110103537
HGF program: 53.52.02 (POF III, LK 02) GridKa
Published in: Journal of Physics: Conference Series
Publisher: IOP Publishing
Volume: 608
Pages: 1-6
Indexed in: Scopus