Conference Paper
Channel-Configurable Deep Wireless Speech Transmission
Affiliations
- [1] Aalborg University [NORA names: AAU Aalborg University; University; Denmark; Europe, EU; Nordic; OECD];
- [2] Oticon (Denmark) [NORA names: Oticon; Private Research; Denmark; Europe, EU; Nordic; OECD];
- [3] Carl von Ossietzky University of Oldenburg [NORA names: Germany; Europe, EU; OECD]
Abstract
The proliferation of edge-based wireless speech applications calls for resource-efficient, low-latency speech communication systems that function across diverse channel conditions. Delivering intelligible speech under constrained resources and low latency is a challenging problem in speech transmission. In this paper, we introduce a very low-latency, configurable speech transmission system based on joint source-channel coding and deep neural networks (DNNs). The proposed system is a single, unified DNN designed to operate effectively across a wide range of wireless channel scenarios. It comprises a joint source-channel encoder and a joint source-channel decoder, each with access to channel state information (CSI). In this context, CSI denotes the type of fading in the wireless channel. Notably, the system has a total latency of 2 ms. Through extensive simulations, we show that the proposed configurable system closely approximates the performance of ideal systems tailored to individual wireless channel scenarios. The evaluation uses instrumental measures of speech quality and intelligibility, and the results support the effectiveness of the system in diverse, resource-constrained communication settings.
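To make the notion of CSI as a fading-type label concrete, the following is a minimal sketch of how a simulated wireless channel and its CSI label might be set up. It is not the authors' implementation. The fading labels (`awgn`, `rayleigh`, `rician`), the one-hot CSI format, the block-fading model y = h·x + n, and all function names are assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import random

# Hypothetical fading labels; the abstract does not list the scenarios used.
FADING_TYPES = ("awgn", "rayleigh", "rician")


def csi_one_hot(fading):
    """Encode the fading type as a one-hot vector (an assumed CSI format)."""
    if fading not in FADING_TYPES:
        raise ValueError(f"unknown fading type: {fading}")
    return [1.0 if name == fading else 0.0 for name in FADING_TYPES]


def fading_gain(fading, rng, k_factor=4.0):
    """Draw one complex channel gain h with unit average power, E[|h|^2] = 1."""
    if fading == "awgn":
        return complex(1.0, 0.0)  # no fading, only additive noise
    # Zero-mean complex Gaussian scatter component with unit power.
    std = math.sqrt(0.5)
    scatter = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
    if fading == "rayleigh":
        return scatter
    if fading == "rician":
        # Line-of-sight part plus scaled scatter; k_factor is an assumed value.
        los = math.sqrt(k_factor / (k_factor + 1.0))
        return los + scatter * math.sqrt(1.0 / (k_factor + 1.0))
    raise ValueError(f"unknown fading type: {fading}")


def transmit(symbols, fading, snr_db, rng):
    """Pass complex symbols through y = h * x + n with one gain per block."""
    if not symbols:
        raise ValueError("symbols must be non-empty")
    power = sum(abs(x) ** 2 for x in symbols) / len(symbols)
    noise_var = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_var / 2.0)  # per real/imaginary component
    h = fading_gain(fading, rng)
    received = [
        h * x + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for x in symbols
    ]
    return received, h


if __name__ == "__main__":
    rng = random.Random(0)
    tx = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)]
    rx, h = transmit(tx, "rayleigh", snr_db=10.0, rng=rng)
    print("CSI label:", csi_one_hot("rayleigh"))
    print("received symbols:", len(rx))
```

In a setup like this, the label returned by `csi_one_hot` would be supplied to both the encoder and the decoder alongside their usual inputs, so that a single network can adapt its behavior to the channel type rather than requiring one trained network per scenario.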