Conference Paper, 2024

Channel-Configurable Deep Wireless Speech Transmission

2024 IEEE Wireless Communications and Networking Conference (WCNC), ISBN 979-8-3503-0358-2, Volume 00, Pages 1-6, 10.1109/wcnc57260.2024.10570880

Contributors

  • Bokaei, Mohammad [1]
  • Jensen, Jesper (ORCID 0000-0003-1478-622X) [1] [2]
  • Doclo, Simon (ORCID 0000-0002-3392-2381) [3]
  • Østergaard, Jan (ORCID 0000-0002-3724-6114) [1]

Affiliations

  [1] Aalborg University [NORA names: AAU Aalborg University; University; Denmark; Europe, EU; Nordic; OECD]
  [2] Oticon (Denmark) [NORA names: Oticon; Private Research; Denmark; Europe, EU; Nordic; OECD]
  [3] Carl von Ossietzky University of Oldenburg [NORA names: Germany; Europe, EU; OECD]

Abstract

The proliferation of edge-based wireless speech applications necessitates the development of resource-efficient, low-latency speech communication systems capable of functioning across diverse communication channel conditions. Ensuring intelligible speech communication under constrained resources and low latency is a challenging problem within the domain of speech transmission. In this paper, we introduce a very low-latency configurable speech transmission system leveraging joint source-channel coding and deep neural networks (DNNs). Our proposed system is a unified deep neural network system engineered to operate effectively across a wide range of wireless communication channel scenarios. The system encompasses both a joint source-channel encoder and a joint source-channel decoder, each with access to channel state information (CSI). In this context, CSI signifies the type of fading in the wireless channel. Notably, our system has a total latency of 2 ms. Through extensive simulations, we empirically demonstrate that the proposed configurable system closely approximates the performance of ideal systems specifically tailored to individual wireless channel scenarios. Our evaluation is based on instrumental measures of speech quality and intelligibility, affirming the efficacy of our system in diverse and resource-constrained communication contexts.
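
As a rough illustration of what "CSI signifies the type of fading" could mean in a simulation, the sketch below encodes a fading type as a one-hot vector and passes complex symbols through a matching flat-fading channel with additive noise. It is not the authors' implementation. The set of fading types (AWGN, Rayleigh, Rician), the one-hot encoding, the block-fading assumption (one gain per frame), the Rician K-factor, and all function names are assumptions made for this example; the abstract specifies none of them.

```python
import math
import random

# Hypothetical fading labels; the abstract does not say which types are covered.
FADING_TYPES = ("awgn", "rayleigh", "rician")


def csi_one_hot(fading):
    """Encode the fading type as a one-hot list (an assumed CSI format)."""
    if fading not in FADING_TYPES:
        raise ValueError(f"unknown fading type: {fading}")
    return [1.0 if name == fading else 0.0 for name in FADING_TYPES]


def _complex_gauss(rng):
    """Draw one circularly symmetric complex Gaussian sample with unit variance."""
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2)


def fading_gain(fading, rng, k_factor=4.0):
    """Return one complex channel gain with unit average power."""
    if fading == "awgn":
        return complex(1.0, 0.0)  # no fading, noise only
    if fading == "rayleigh":
        return _complex_gauss(rng)  # scattered paths only
    if fading == "rician":
        # Line-of-sight component plus scattered component (assumed K-factor).
        los = math.sqrt(k_factor / (k_factor + 1.0))
        nlos = math.sqrt(1.0 / (k_factor + 1.0))
        return los + nlos * _complex_gauss(rng)
    raise ValueError(f"unknown fading type: {fading}")


def transmit(symbols, fading, snr_db, rng):
    """Apply one block-fading gain and additive noise to unit-power symbols."""
    gain = fading_gain(fading, rng)  # assumed constant over the frame
    noise_std = math.sqrt(10.0 ** (-snr_db / 10.0))
    return [gain * s + noise_std * _complex_gauss(rng) for s in symbols]


if __name__ == "__main__":
    rng = random.Random(0)
    frame = [complex(1.0, 0.0)] * 4  # placeholder channel symbols
    for fading in FADING_TYPES:
        received = transmit(frame, fading, snr_db=10.0, rng=rng)
        print(fading, csi_one_hot(fading), received[0])
```

In a configurable system of the kind the abstract describes, a vector such as the one returned by `csi_one_hot` would be supplied to both the encoder and the decoder, so that a single network can adapt to the channel scenario instead of requiring one trained model per fading type.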

Keywords

applications, assessment, channel, channel conditions, channel scenarios, channel state information, communication, communication channel conditions, communication context, communication systems, conditions, configuration system, constrained resources, context, decoding, deep neural network system, deep neural networks, development, development of resource-efficient, domain, efficacy, encoding, evaluation, extensive simulations, fading, ideal system, information, intelligence, joint source-channel decoding, latency, low-latency, measures of speech quality, network, network system, neural network, neural network system, performance, problem, proliferation, quality, resource-efficient, resources, scenarios, simulation, source-channel decoding, speech applications, speech communication, speech communication systems, speech quality, speech transmission, speech transmission system, state information, system, transmission, transmission system, wireless channel

Funders

  • European Commission

Data Provider: Digital Science