Benchmarking MLCommons Tiny Audio Denoising with Deployability Constraints

Speech enhancement is a critical field in audio signal processing because it is essential for overcoming the obstacles posed by noisy and damaged speech signals. Given the revolutionary capabilities of Deep Learning (DL) models, there has been significant interest in benchmarking them and studying their suitability for tiny embedded systems. In this paper, we thoroughly examine the growing field of speech enhancement, with a specific emphasis on the DL-based techniques under consideration for standardization by MLCommons. In particular, the Legendre Memory Unit (LMU) model achieves an average Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) of 8.613 while occupying 627 KiB of flash memory and requiring only 7 ms per inference run, which makes it deployable on tiny microcontrollers.
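For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how SI-SDR is commonly computed: the estimate is projected onto the reference, and the energy of that projection is compared with the energy of the residual. This is the standard textbook definition, not the authors' evaluation code; the function name `si_sdr` and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences.

    Hypothetical helper for illustration; `eps` only guards against
    division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")

    # Optimal scaling of the reference: <estimate, reference> / ||reference||^2
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)

    # Split the estimate into a scaled-target part and a residual part.
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))
```

Because the reference is rescaled before the comparison, multiplying the estimate by a constant gain leaves the score unchanged, which is why the metric is called scale-invariant.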

Benchmarking MLCommons Tiny Audio Denoising with Deployability Constraints / Mazinani, Armin; Pau, Danilo Pietro; Davoli, Luca; Ferrari, Gianluigi. - (2024), pp. 1-4. (2024 IEEE Gaming, Entertainment, and Media Conference (GEM)) [10.1109/gem61861.2024.10585695].

Mazinani, Armin; Pau, Danilo Pietro; Davoli, Luca; Ferrari, Gianluigi

2024-01-01

Year: 2024

Appears in the types: 4.1b Conference proceedings (volume)

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2991073
