Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs

Islam, Md Amirul; Kowal, Matthew; Jia, Sen; Derpanis, Konstantinos G.; Bruce, Neil D. B.


arXiv:2108.07884 (cs)

[Submitted on 17 Aug 2021]


Abstract: In this paper, we challenge the common assumption that collapsing the spatial dimensions of a 3D (spatial-channel) tensor in a convolutional neural network (CNN) into a vector via global pooling removes all spatial information. Specifically, we demonstrate that positional information is encoded based on the ordering of the channel dimensions, while semantic information is largely not. Following this demonstration, we show the real-world impact of these findings through two applications. First, we propose a simple yet effective data augmentation strategy and loss function that improve the translation invariance of a CNN's output. Second, we propose a method to efficiently determine which channels in the latent representation are responsible for encoding (i) overall position information or (ii) region-specific positions. We first show that semantic segmentation relies significantly on the overall position channels to make predictions. We then show, for the first time, that it is possible to perform a 'region-specific' attack and degrade a network's performance in a particular part of the input. We believe our findings and demonstrated applications will benefit research areas concerned with understanding the characteristics of CNNs.
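
To make the central claim concrete, the following is a minimal toy sketch, not the authors' method or code. It assumes a hypothetical two-channel feature extractor (`toy_features`) whose channels are hand-built to respond to the left and right halves of the input; `make_image` and `global_average_pool` are likewise illustrative names. It shows only that, after global average pooling has removed the spatial axes, which channel is active can still reveal where the object was.

```python
def global_average_pool(features):
    """Collapse a C x H x W nested list into a length-C list of spatial means."""
    pooled = []
    for channel in features:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        pooled.append(total / count)
    return pooled


def toy_features(image):
    """Hypothetical two-channel extractor (an assumption for illustration only):
    channel 0 keeps the left half of the image, channel 1 keeps the right half."""
    half = len(image[0]) // 2
    left = [[v if x < half else 0.0 for x, v in enumerate(row)] for row in image]
    right = [[v if x >= half else 0.0 for x, v in enumerate(row)] for row in image]
    return [left, right]


def make_image(col, size=8):
    """Blank size x size image with a 2 x 2 blob whose left edge is at column `col`."""
    image = [[0.0] * size for _ in range(size)]
    for y in (3, 4):
        for x in (col, col + 1):
            image[y][x] = 1.0
    return image


# Same object, two different horizontal positions.
pooled_left = global_average_pool(toy_features(make_image(col=1)))
pooled_right = global_average_pool(toy_features(make_image(col=5)))

# The pooled vectors have no spatial axes, yet the index of the active
# channel differs with the blob's position.
print(pooled_left, pooled_right)  # [0.0625, 0.0] [0.0, 0.0625]
```

In this sketch the position dependence is built in by hand. The abstract's claim concerns learned CNN representations, where such channel-wise structure is reported to emerge rather than being designed.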

Comments:	ICCV 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2108.07884 [cs.CV]
	(or arXiv:2108.07884v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2108.07884

Submission history

From: Md Amirul Islam
[v1] Tue, 17 Aug 2021 21:27:30 UTC (1,653 KB)
