Federated Learning With Highly Imbalanced Audio Data

Green, Marc C.; Plumbley, Mark D.


arXiv:2105.08550 (cs)

[Submitted on 18 May 2021]



Abstract: Federated learning (FL) is a privacy-preserving machine learning method that has been proposed to allow training of models using data from many different clients, without these clients having to transfer all their data to a central server. There has as yet been relatively little consideration of FL or other privacy-preserving methods in audio. In this paper, we investigate using FL for a sound event detection task using audio from the FSD50K dataset. Audio is split into clients based on uploader metadata. This results in highly imbalanced subsets of data between clients, noted as a key issue in FL scenarios. A series of models is trained using 'high-volume' clients that contribute 100 audio clips or more, testing the effects of varying FL parameters, followed by an additional model trained using all clients with no minimum audio contribution. It is shown that FL models trained using the high-volume clients can perform similarly to a centrally-trained model, though there is much more noise in results than would typically be expected for a centrally-trained model. The FL model trained using all clients has a considerably reduced performance compared to the centrally-trained model.
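
The client split and the 'high-volume' threshold described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The function names, the toy metadata, and the use of FedAvg-style weighted averaging are assumptions made for illustration, since the abstract does not state which aggregation rule was used. Only the grouping by uploader and the 100-clip minimum come from the abstract.

```python
from collections import defaultdict

MIN_CLIPS = 100  # 'high-volume' threshold stated in the abstract


def split_by_uploader(clips):
    """Group clip ids by uploader; `clips` holds (clip_id, uploader) pairs."""
    clients = defaultdict(list)
    for clip_id, uploader in clips:
        clients[uploader].append(clip_id)
    return dict(clients)


def high_volume(clients, min_clips=MIN_CLIPS):
    """Keep only clients that contribute at least `min_clips` clips."""
    return {u: c for u, c in clients.items() if len(c) >= min_clips}


def weighted_average(updates):
    """Average client weight vectors, weighting each by its clip count.

    `updates` is a list of (weights, n_clips) pairs. This is FedAvg-style
    aggregation, assumed here; the abstract does not name the rule used.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Hypothetical toy metadata: one prolific uploader and one occasional one.
clips = [(f"a{i}", "alice") for i in range(150)]
clips += [(f"b{i}", "bob") for i in range(3)]

clients = split_by_uploader(clips)
selected = high_volume(clients)
print(sorted(selected))  # ['alice']

# Two hypothetical clients with 150 and 50 clips and 2-element weight vectors.
print(weighted_average([([1.0, 0.0], 150), ([0.0, 1.0], 50)]))  # [0.75, 0.25]
```

In this toy case the uploader with 3 clips is dropped by the threshold, which mirrors the imbalance the abstract describes: most of the data sits with a small number of clients, and weighting by clip count lets those clients dominate the averaged model.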

Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2105.08550 [cs.SD]
	(or arXiv:2105.08550v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2105.08550

Submission history

From: Marc Green
[v1] Tue, 18 May 2021 14:35:55 UTC (189 KB)
