
Summarizing Industrial Log Data with Latent Dirichlet Allocation

Siddharthan, Shunmuga Prabhu; Dix, Marcel; Sprick, Barbara; Klöpper, Benjamin

Industrial systems and equipment produce large log files that record their activities and possible problems. These data are often used for troubleshooting and root cause analysis, but raw log data is poorly suited to direct human analysis. Existing approaches based on data mining and machine learning likewise focus on troubleshooting and root cause analysis. However, if a good summary of industrial log files were available, the files could also be used to monitor equipment and industrial processes and to act more proactively on problems. This contribution shows how a topic modeling approach based on Latent Dirichlet Allocation (LDA) helps to understand, organize, and summarize industrial log files. The approach was tested on a real-world industrial dataset and evaluated quantitatively by direct annotation.


Publisher's version
DOI: 10.5445/KSP/1000098011/14
Published on 6 October 2020
Affiliated institution(s) at KIT: Institut für Wirtschaftsinformatik und Marketing (IISM)
Publication type: Journal article
Publication year: 2020
Language: English
Identifier: ISSN 2363-9881
KITopen-ID: 1000124284
Published in: Archives of Data Science, Series A
Volume: 6
Issue: 1
Pages: P14, 18 pp., online