A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

Younesi, Abolfazl; Ansari, Mohsen; Fazli, Mohammadamin; Ejlali, Alireza; Shafique, Muhammad; Henkel, Jörg¹
¹ Institut für Technische Informatik (ITEC), Karlsruher Institut für Technologie (KIT)

Abstract:

Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for computer vision tasks such as image classification, object detection, and image segmentation. Numerous types of CNNs have been designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, and depthwise convolutions, and neural architecture search (NAS), among others. Each type has its own structure and characteristics, making it suitable for specific tasks. A thorough understanding and a comparative analysis of these CNN types are crucial for identifying their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type can aid the development of new and improved architectures. We also examine, from various perspectives, the platforms and frameworks that researchers use for their research or development. Additionally, we explore the main research fields of CNNs, such as 6D vision, generative models, and meta-learning. This survey provides a comprehensive examination and comparison of CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.


Publisher's version
DOI: 10.5445/IR/1000169984
Published on: 22 April 2024
Affiliated institution(s) at KIT: Institut für Technische Informatik (ITEC)
Publication type: Journal article
Publication date: 18 March 2024
Language: English
ISSN: 2169-3536
KITopen-ID: 1000169984
Published in: IEEE Access
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Volume: 12
Pages: 41180–41218
Keywords: Deep learning, DNN, CNN, machine learning, vision transformers, GAN, attention, computer vision, LLM, large language model, transformer, dilated convolution, depthwise, NAS, NAT, object detection, 6D vision, vision language model
Indexed in: Web of Science, Scopus, Dimensions