Open access publication

Article, 2024

Optimized CNN Architectures Benchmarking in Hardware-Constrained Edge Devices in IoT Environments

IEEE Internet of Things Journal, ISSN 2327-4662, 2372-2541, Volume 11, Issue 11, Pages 20357-20366, DOI: 10.1109/JIOT.2024.3369607

Contributors

  • Rosero-Montalvo, Paul D. (ORCID 0000-0003-1995-400X) [1]
  • Tözün, Pinar (ORCID 0000-0001-6838-4854) [1]
  • Hernandez, Wilmar (ORCID 0000-0003-4643-8377) (Corresponding author) [2]

Affiliations

  1. [1] University of Copenhagen
     [NORA names: KU University of Copenhagen; University; Denmark; Europe, EU; Nordic; OECD]
  2. [2] Universidad de Las Américas
     [NORA names: Ecuador; America, South]

Abstract

Internet of Things (IoT) and edge devices have seen their application fields grow thanks to machine learning (ML) models and their capacity to classify images into previously known labels while working close to the end user. However, a model can be trained with any of several convolutional neural network (CNN) architectures, and this choice affects its performance when the model is deployed in hardware-constrained environments such as edge devices. In addition, recent training trends suggest using transfer learning techniques to take an excellent feature extractor obtained in one domain and reuse it in a new domain that does not have enough images to train the whole model. In light of these trends, this work benchmarks the most representative CNN architectures on emerging edge devices, some of which have hardware accelerators. The ML models were trained and optimized with transfer learning, using a small set of images obtained in IoT environments. Our results show that unfreezing up to the last 20 layers of the model’s architecture allows it to be fine-tuned correctly to the new set of IoT images, depending on the CNN architecture. Additionally, quantization is a suitable optimization technique to shrink the model by $2\times$ or $3\times$, leading to a lighter memory footprint, lower execution time, and lower battery consumption. Finally, the Coral Dev Board can speed up the inference process by $100\times$, and the EfficientNet model architecture keeps the same classification accuracy even when the model is adapted to a hardware-constrained environment.
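The quantization step summarized in the abstract can be illustrated with a minimal, framework-free sketch of symmetric int8 weight quantization. This is not the paper's actual toolchain (edge deployments like the Coral Dev Board typically go through TensorFlow Lite, which also quantizes activations and uses per-channel calibration); the function names below are hypothetical, chosen for illustration only.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric int8 post-training quantization of a weight tensor.

    Illustrative sketch only: production toolchains also quantize
    activations and calibrate scales per channel.
    """
    scale = float(np.abs(weights).max()) / 127.0  # map max |w| to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float32 weights from the int8 codes.
    return q.astype(np.float32) * scale

# Fake "layer weights" standing in for a CNN kernel.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(3, 3, 64)).astype(np.float32)

q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32; the paper reports a 2x-3x
# end-to-end shrink because model files carry metadata and some
# tensors stay in higher precision.
print(w.nbytes // q.nbytes)  # 4

# The round-trip error is bounded by half a quantization step,
# which is why classification accuracy can survive quantization.
err = float(np.abs(w - dequantize(q, scale)).max())
print(err <= scale / 2 + 1e-7)  # True
```

The same intuition applies to the fine-tuning result: freezing all but the last ~20 layers keeps the pretrained feature extractor intact while letting the task-specific layers adapt to the small IoT image set.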

Keywords

EfficientNet, Internet of Things (IoT), IoT environments, edge devices, machine learning (ML) models, convolutional neural network (CNN) architectures, architecture benchmarks, transfer learning, feature extractors, unfreezing layers, quantization, optimization techniques, hardware accelerators, Coral Dev Board, inference process, execution time, memory footprint, battery consumption, classification accuracy

Funders

  • Danish Agency for Science and Higher Education

Data Provider: Digital Science