open access publication

Article, 2024

Gesture Recognition of Filipino Sign Language Using Convolutional and Long Short-Term Memory Deep Neural Networks

Knowledge, ISSN 1075-5470, 1552-8545, 2673-9585, Volume 4, Issue 3, Pages 358-381, DOI 10.3390/knowledge4030020

Contributors

Cayme, Karl Jensen [1]; Retutal, Vince Andrei [1]; Salubre, Miguel Edwin [1]; Astillo, Philip Virgil Berrer 0000-0001-9611-1036 [1]; Canete, Luis Gerardo S 0000-0003-3574-0976 [1]; Choudhary, Gaurav 0000-0003-3378-2945 [2]

Affiliations

  1. [1] University of San Carlos
     [NORA names: Philippines; Asia, South]
  2. [2] University of Southern Denmark
     [NORA names: SDU University of Southern Denmark; University; Denmark; Europe, EU; Nordic; OECD]

Abstract

In response to the recent formalization of Filipino Sign Language (FSL) and the lack of comprehensive studies, this paper introduces a real-time FSL gesture recognition system. Unlike existing systems, which are often limited to static signs and asynchronous recognition, it offers dynamic gesture capturing and recognition of 10 common expressions and five transactional inquiries. To this end, the system sequentially applies cropping, contrast adjustment, grayscale conversion, resizing, and normalization to input image streams. These steps serve to extract the region of interest, reduce the computational load, ensure uniform input size, and maintain a consistent pixel value distribution. Subsequently, a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) model is employed to recognize nuances of real-time FSL gestures. The results demonstrate the superiority of the proposed technique over existing FSL recognition systems, achieving an average accuracy, recall, and precision of 98%, marking an 11.3% improvement in accuracy. Furthermore, the article explores lightweight conversion methods, including post-quantization and quantization-aware training, to facilitate the deployment of the model on resource-constrained platforms. The lightweight models show a significant reduction in model size and memory utilization with respect to the base model when executed on a Raspberry Pi minicomputer. Lastly, the lightweight model trained with the quantization-aware technique (99%) outperforms the post-quantization approach (97%), an improvement of 2 percentage points in accuracy.
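The five preprocessing steps named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function names, the region-of-interest tuple, the contrast gain `alpha`, the 64x64 output size, the nearest-neighbor resize, and the BT.601 grayscale weights are all assumptions chosen for illustration, and a real system would more likely use an image library.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Frame = List[List[Pixel]]      # rows of RGB pixels
Gray = List[List[float]]       # rows of single-channel values


def crop(frame: Frame, top: int, left: int, height: int, width: int) -> Frame:
    """Keep only the region of interest (hypothetical ROI coordinates)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def adjust_contrast(frame: Frame, alpha: float = 1.5, beta: float = 0.0) -> Frame:
    """Linear contrast stretch: clamp(alpha * value + beta) per channel."""
    def clamp(v: float) -> int:
        return max(0, min(255, int(round(v))))
    return [[tuple(clamp(alpha * c + beta) for c in px) for px in row]
            for row in frame]


def to_grayscale(frame: Frame) -> Gray:
    """Collapse RGB to one channel using ITU-R BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def resize_nearest(img: Gray, out_h: int, out_w: int) -> Gray:
    """Nearest-neighbor resize so every frame has the same input size."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def normalize(img: Gray) -> Gray:
    """Scale pixel values into [0, 1], clamping any rounding overshoot."""
    return [[min(1.0, max(0.0, v / 255.0)) for v in row] for row in img]


def preprocess(frame: Frame,
               roi: Tuple[int, int, int, int],
               size: Tuple[int, int] = (64, 64),
               alpha: float = 1.5) -> Gray:
    """Apply the steps in the order the abstract lists them."""
    top, left, height, width = roi
    out = crop(frame, top, left, height, width)
    out = adjust_contrast(out, alpha)
    gray = to_grayscale(out)
    gray = resize_nearest(gray, size[0], size[1])
    return normalize(gray)
```

Calling `preprocess(frame, roi=(top, left, height, width))` on each frame of a clip would yield a sequence of equally sized, normalized grayscale images, which is the kind of input a CNN-LSTM model consumes frame by frame.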

Keywords

CNN-LSTM, Filipino Sign Language, Long-Short, Raspberry, Raspberry Pi minicomputer, accuracy, adjustment, approach, average accuracy, base, base model, capture, comprehensive study, computational load, contrast, contrast adjustment, conversion, conversion method, convolution, convolutional neural network, crop, deep neural networks, deployment, distribution, expression, formalism, gesture capture, gesture recognition, gesture recognition system, gestures, grayscale, grayscale conversion, image stream, improvement, input image stream, input size, inquiry, lack, lack of comprehensive studies, language, lightweight, lightweight model, load, long short-term memory, long short-term memory deep neural network, longer, memory, memory utilization, method, minicomputer, model, model size, network, neural network, normalization, nuances, platform, precision, precision rate, quantization-aware training, rate, recall, recognition, recognition system, reduction, resizing, resource-constrained platforms, response, results, short-term memory deep neural network, sign language, signs, size, static signs, stream, study, superiority, system, technique, term memory, training, transactional inquiry, utilization

Data Provider: Digital Science