How more data can hurt: Instability and regularization in next-generation reservoir computing

Zhang, Yuanzhao; Santos, Edmilson Roque dos; Cornelius, Sean P.


arXiv:2407.08641 (cs)

[Submitted on 11 Jul 2024 (v1), last revised 25 Jan 2025 (this version, v2)]

Abstract: It has been found recently that more data can, counter-intuitively, hurt the performance of deep neural networks. Here, we show that a more extreme version of the phenomenon occurs in data-driven models of dynamical systems. To elucidate the underlying mechanism, we focus on next-generation reservoir computing (NGRC), a popular framework for learning dynamics from data. We find that, despite learning a better representation of the flow map with more training data, NGRC can adopt an ill-conditioned "integrator" and lose stability. We link this data-induced instability to the auxiliary dimensions created by the delayed states in NGRC. Based on these findings, we propose simple strategies to mitigate the instability, either by increasing regularization strength in tandem with data size, or by carefully introducing noise during training. Our results highlight the importance of proper regularization in data-driven modeling of dynamical systems.
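
As a rough illustration of the two mitigation strategies named in the abstract, the following is a minimal sketch of an NGRC-style model: polynomial features of the current and delayed states, fitted by ridge regression to one-step increments. It is not the authors' implementation. The function names, the quadratic feature library, the linear scaling of the ridge parameter with the number of samples, the Gaussian input noise, and the toy rotation data are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def ngrc_features(X, k=2):
    """Constant, linear, and quadratic monomials of k delayed states.
    X has shape (T, d); the result has T - k + 1 rows."""
    T, d = X.shape
    # Each row stacks the current state first, then its k - 1 delayed copies.
    lin = np.hstack([X[k - 1 - i : T - i] for i in range(k)])
    iu = np.triu_indices(lin.shape[1])
    quad = (lin[:, :, None] * lin[:, None, :])[:, iu[0], iu[1]]
    return np.hstack([np.ones((lin.shape[0], 1)), lin, quad])

def fit_ngrc(X, k=2, lam=1e-6, scale_with_data=False, noise_std=0.0, seed=0):
    """Ridge fit of the increments x_{t+1} - x_t (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X_in = X
    if noise_std > 0:
        # Assumption: noise is added to the inputs only, not to the targets.
        X_in = X + noise_std * rng.standard_normal(X.shape)
    Phi = ngrc_features(X_in[:-1], k)
    Y = X[k:] - X[k - 1 : -1]
    if scale_with_data:
        # Assumption: grow the ridge parameter linearly with the sample count,
        # since Phi.T @ Phi itself grows roughly linearly with it.
        lam = lam * Phi.shape[0]
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y)

def rollout(W, x_init, steps, k=2):
    """Iterate the fitted model autonomously from the last k rows of x_init."""
    hist = [row for row in x_init[-k:]]
    for _ in range(steps):
        phi = ngrc_features(np.array(hist[-k:]), k)
        hist.append(hist[-1] + (phi @ W)[0])
    return np.array(hist[k:])

# Toy data (an assumption for illustration): a point rotating on the unit circle.
theta = 0.1 * np.arange(300)
X = np.column_stack([np.cos(theta), np.sin(theta)])

W_fixed = fit_ngrc(X, lam=1e-6)                         # fixed regularization
W_scaled = fit_ngrc(X, lam=1e-6, scale_with_data=True)  # scaled with data size
W_noisy = fit_ngrc(X, lam=1e-6, noise_std=1e-3)         # noise during training

# One-step training error and a short autonomous prediction.
Phi = ngrc_features(X[:-1])
one_step_err = np.max(np.abs(X[1:-1] + Phi @ W_scaled - X[2:]))
pred = rollout(W_scaled, X[:2], steps=50)
```

With a fixed `lam`, the Gram matrix `Phi.T @ Phi` grows with the number of samples while the penalty does not, so the effective regularization weakens as data is added. Scaling `lam` with the sample count is one simple way to keep the two in proportion.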

Comments:	16 pages, 12 figures. Figures added, References added; Comments welcome
Subjects:	Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Dynamical Systems (math.DS); Adaptation and Self-Organizing Systems (nlin.AO)
Cite as:	arXiv:2407.08641 [cs.LG]
	(or arXiv:2407.08641v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.08641

Submission history

From: Edmilson Roque Dos Santos
[v1] Thu, 11 Jul 2024 16:22:13 UTC (3,667 KB)
[v2] Sat, 25 Jan 2025 20:21:29 UTC (4,321 KB)
