
Classifying Music Genres Using Image Classification Neural Networks

Hassen, Alan Kai; Janßen, Hilko; Assenmacher, Dennis; Preuss, Mike; Vatolkin, Igor

Domain-tailored convolutional neural networks (CNNs) have been applied to music genre classification using spectrograms as a visual representation of audio. It is currently unclear whether domain-tailored CNN architectures are superior to network architectures used in image classification. This question arises because image classification architectures have strongly influenced the design of domain-tailored architectures. We examine whether CNN architectures transferred from image classification can achieve performance similar to that of domain-tailored CNN architectures used in genre classification. We compare domain-tailored and image classification networks by testing their performance on two datasets: the frequently used benchmark dataset GTZAN and a newly created, much larger dataset. Our results show that the tested image classification network requires significantly fewer resources and outperforms the domain-specific network in our settings, which means that expert effort need not be spent on designing the network.
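The abstract treats a spectrogram as an image that a CNN can classify. As a rough illustration of that input representation only, the sketch below computes a log-magnitude spectrogram from a raw signal using the Python standard library. The function names, frame length, hop size, and the synthetic test tone are assumptions chosen for illustration; the paper's actual preprocessing is not described in this record, and real pipelines would use an FFT library rather than this naive DFT.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of time frames, each a list of log-magnitudes per frequency bin.

    Uses a naive O(n^2) DFT per frame: fine for illustration, too slow for real audio.
    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to one overlapping frame of the signal.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # log1p compresses the dynamic range, as is common for spectrogram images.
            row.append(math.log1p(abs(acc)))
        frames.append(row)
    return frames


# Hypothetical input: a pure tone with 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
spec = log_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of "image" that either a domain-tailored or a general image classification network could take as input.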

Affiliated institution(s) at KIT: Institut für Informationswirtschaft und Marketing (IISM)
Publication type: Journal article
Publication year: 2018
Language: English
Identifier: ISSN: 2363-9881
KITopen-ID: 1000118105
Published in: Archives of Data Science, Series A (Online First)
Volume: 5
Issue: 1
Pages: A20, 18 pp., online