Working Memory Connections for LSTM

Landi, Federico; Baraldi, Lorenzo; Cornia, Marcella; Cucchiara, Rita

doi:10.1016/j.neunet.2021.08.030

arXiv:2109.00020 (cs)

[Submitted on 31 Aug 2021]

Abstract: Recurrent Neural Networks with Long Short-Term Memory (LSTM) use gating mechanisms to mitigate exploding and vanishing gradients when learning long-term dependencies. For this reason, LSTMs and other gated RNNs are widely adopted and are the de facto standard for many sequence modeling tasks. Although the memory cell inside the LSTM contains essential information, it is not allowed to influence the gating mechanism directly. In this work, we improve the gate potential by including information coming from the internal cell state. The proposed modification, named Working Memory Connection, consists of adding a learnable nonlinear projection of the cell content into the network gates. This modification can fit into the classical LSTM gates without any assumption on the underlying task, and is particularly effective when dealing with longer sequences. Previous research efforts in this direction, which go back to the early 2000s, did not bring a consistent improvement over vanilla LSTM. As part of this paper, we identify a key issue tied to previous connections that heavily limits their effectiveness, preventing a successful integration of the knowledge coming from the internal cell state. We show through extensive experimental evaluation that Working Memory Connections consistently improve the performance of LSTMs on a variety of tasks. Numerical results suggest that the cell state contains useful information that is worth including in the gate structure.
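
The abstract describes the modification only in words, so the following NumPy sketch is one possible reading of it rather than the authors' implementation. It assumes tanh as the nonlinearity, a dense learnable projection matrix per gate, and an output gate that reads the freshly updated cell state. The function names (`init_params`, `wmc_lstm_step`) and the parameter names (`W_ci`, `W_cf`, `W_co`, and so on) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_size, hidden_size, rng):
    """Small random weights for one LSTM cell (all names are hypothetical)."""
    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))

    p = {}
    for gate in ("i", "f", "o", "g"):
        p["W_x" + gate] = mat(hidden_size, input_size)   # input weights
        p["W_h" + gate] = mat(hidden_size, hidden_size)  # recurrent weights
        p["b_" + gate] = np.zeros(hidden_size)
    for gate in ("i", "f", "o"):
        # Assumed form of the extra connection: one projection of the cell per gate.
        p["W_c" + gate] = mat(hidden_size, hidden_size)
    return p


def wmc_lstm_step(x, h_prev, c_prev, p):
    """One time step of an LSTM whose gates also see the cell state."""
    def pre(gate):
        # Standard LSTM pre-activation from the input and previous hidden state.
        return p["W_x" + gate] @ x + p["W_h" + gate] @ h_prev + p["b_" + gate]

    # Assumption: the cell content enters each gate through tanh(W_c @ c),
    # which keeps the added term bounded in [-1, 1].
    i = sigmoid(pre("i") + np.tanh(p["W_ci"] @ c_prev))
    f = sigmoid(pre("f") + np.tanh(p["W_cf"] @ c_prev))
    g = np.tanh(pre("g"))              # candidate cell update, unchanged
    c = f * c_prev + i * g             # standard cell update
    # Assumption: the output gate uses the updated cell state c.
    o = sigmoid(pre("o") + np.tanh(p["W_co"] @ c))
    h = o * np.tanh(c)
    return h, c


rng = np.random.default_rng(0)
params = init_params(input_size=3, hidden_size=4, rng=rng)
h, c = np.zeros(4), np.zeros(4)
for x in rng.normal(size=(5, 3)):      # a toy sequence of five inputs
    h, c = wmc_lstm_step(x, h, c, params)
```

Setting the three `W_c*` matrices to zero makes the added terms vanish and recovers a vanilla LSTM step, which is a convenient baseline for comparison.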

Comments:	Accepted for publication in Neural Networks
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:2109.00020 [cs.LG]
	(or arXiv:2109.00020v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2109.00020
Related DOI:	https://doi.org/10.1016/j.neunet.2021.08.030

Submission history

From: Federico Landi
[v1] Tue, 31 Aug 2021 18:01:30 UTC (302 KB)
