SlowFast Rolling-Unrolling LSTMs for Action Anticipation in Egocentric Videos

Osman, Nada; Camporese, Guglielmo; Coscia, Pasquale; Ballan, Lamberto


arXiv:2109.00829 (cs)

[Submitted on 2 Sep 2021]


Abstract: Action anticipation in egocentric videos is a difficult task due to the inherently multi-modal nature of human actions. Additionally, some actions happen faster or slower than others depending on the actor or the surrounding context, which can vary each time and lead to different predictions. Based on this idea, we build upon the RULSTM architecture, which is specifically designed for anticipating human actions, and propose a novel attention-based technique to simultaneously evaluate slow and fast features extracted from three different modalities, namely RGB, optical flow, and extracted objects. Two branches process information at different time scales, i.e., frame rates, and several fusion schemes are considered to improve prediction accuracy. We perform extensive experiments on the EpicKitchens-55 and EGTEA Gaze+ datasets, and demonstrate that our technique systematically improves the results of the RULSTM architecture on the Top-5 accuracy metric at different anticipation times.
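The abstract names three ingredients without giving implementation details: features read at two frame rates, an attention-based fusion of branch outputs, and Top-5 accuracy as the metric. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. All function names (`subsample`, `fuse_scores`, `top_k_accuracy`), the stride values, and the use of a softmax over per-branch attention logits are assumptions made for illustration; the paper considers several fusion schemes that are not reproduced here.

```python
import math


def subsample(frames, stride):
    """Keep every `stride`-th frame, mimicking a lower frame rate (assumed scheme)."""
    return frames[::stride]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(branch_scores, attention_logits):
    """Attention-weighted sum of per-branch class scores.

    branch_scores: one list of class scores per branch (e.g. slow, fast).
    attention_logits: one logit per branch; hypothetical stand-in for a
    learned attention module.
    """
    weights = softmax(attention_logits)
    num_classes = len(branch_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, branch_scores))
        for c in range(num_classes)
    ]


def top_k_accuracy(score_rows, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)
```

As a usage example, `subsample(list(range(12)), 4)` would give a slow-branch view of a 12-frame clip, while a stride of 1 keeps every frame for the fast branch. The fused scores from `fuse_scores` can then be passed row by row to `top_k_accuracy` with `k=5` to compute the metric the abstract reports.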

Comments:	Accepted to EPIC@ICCV 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2109.00829 [cs.CV]
	(or arXiv:2109.00829v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2109.00829

Submission history

From: Guglielmo Camporese
[v1] Thu, 2 Sep 2021 10:20:18 UTC (5,357 KB)
