The Boombox: Visual Reconstruction from Acoustic Vibrations

Chen, Boyuan; Chiquier, Mia; Lipson, Hod; Vondrick, Carl

arXiv:2105.08052 (cs)

[Submitted on 17 May 2021 (v1), last revised 23 Oct 2021 (this version, v2)]

Abstract: Interacting with bins and containers is a fundamental task in robotics, making state estimation of the objects inside the bin critical. While robots often use cameras for state estimation, the visual modality is not always ideal due to occlusions and poor illumination. We introduce The Boombox, a container that uses sound to estimate the state of the contents inside a box. Based on the observation that the collision between an object and its container will cause an acoustic vibration, we present a convolutional network for learning to reconstruct visual scenes. Although we use low-cost and low-power contact microphones to detect the vibrations, our results show that learning from multimodal data enables state estimation from affordable audio sensors. Due to the many ways that robots use containers, we believe the box will have a number of applications in robotics. Our project website is at: this http URL
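
The abstract does not spell out the network architecture or its input features. Purely as intuition for why vibrations recorded at different points on a container can carry information about where an object struck it, the sketch below estimates the arrival-time offset between two synthetic microphone signals with a brute-force cross-correlation. This is not the authors' method: the signal model, the function names `impulse` and `estimate_delay`, and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def impulse(n_samples, onset, decay=0.05, freq=0.3):
    """Synthetic 'impact': a decaying sinusoid that starts at sample `onset`.

    Hypothetical signal model, used only to have something to correlate.
    """
    out = [0.0] * n_samples
    for i in range(onset, n_samples):
        t = i - onset
        out[i] = math.exp(-decay * t) * math.sin(freq * t + 0.5)
    return out


def estimate_delay(a, b, max_lag):
    """Return the lag (in samples) by which signal `b` trails signal `a`.

    Tries every lag in [-max_lag, max_lag] and keeps the one with the
    largest cross-correlation score. A negative result means `b` leads `a`.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(a)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += a[i] * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Assumed scenario: the same impact reaches microphone B seven samples
# later than microphone A because B is farther from the point of contact.
mic_a = impulse(256, onset=40)
mic_b = impulse(256, onset=47)
delay = estimate_delay(mic_a, mic_b, max_lag=20)
print("estimated delay in samples:", delay)
```

In this toy setting the estimated offset should match the seven-sample shift built into the synthetic signals. A learned model, such as the convolutional network the abstract mentions, would not need such a cue to be computed by hand; the example only illustrates that the raw recordings plausibly contain it.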

Comments:	CoRL 2021. Website: this http URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Robotics (cs.RO); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2105.08052 [cs.CV]
	(or arXiv:2105.08052v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2105.08052

Submission history

From: Boyuan Chen
[v1] Mon, 17 May 2021 17:58:41 UTC (19,996 KB)
[v2] Sat, 23 Oct 2021 15:27:10 UTC (20,441 KB)
