Towards Robust Visual Transformer Networks via K-Sparse Attention
Amini, S. (Sharif University of Technology), 2022
- Type of Document: Article
- DOI: 10.1109/ICASSP43922.2022.9746916
- Publisher: Institute of Electrical and Electronics Engineers Inc., 2022
- Abstract:
- Transformer networks, originally developed by the machine translation community to eliminate the sequential nature of recurrent neural networks, have shown impressive results in other natural language processing and machine vision tasks. Self-attention, the core module behind visual transformers, globally mixes image information. This module drastically reduces the intrinsic inductive bias imposed by CNNs, such as locality, but suffers from insufficient robustness against some adversarial attacks. In this paper, we introduce K-sparse attention to preserve the low inductive bias while robustifying transformers against adversarial attacks. We show that standard transformers attend to values with a dense set of weights, whereas sparse attention weights, selected automatically by an optimization algorithm, can preserve the generalization performance of the transformer while improving its robustness. © 2022 IEEE (See the illustrative sketch below.)
- Keywords:
- Adversarial robustness ; Self-attention ; Sparse ; Visual transformer ; Computer vision ; Natural language processing systems ; Image information ; Inductive bias ; Language processing ; Machine translation ; Machine vision ; Natural languages ; Recurrent neural networks
- Source: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), 23-27 May 2022; Volume 2022-May, 2022, Pages 4053-4057; ISSN: 1520-6149; ISBN: 9781665405409
- URL: https://ieeexplore.ieee.org/document/9746916
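The record above describes the mechanism only at a high level. As a rough illustration, the sketch below implements one common form of k-sparse attention in NumPy, in which each query keeps only its k highest-scoring keys and the remaining logits are masked out before the softmax. This hard top-k rule is an assumption made here for illustration: per the abstract, the paper selects the sparse weights automatically via an optimization algorithm whose details are not given in this record, and the names below (`topk_sparse_attention`, `Q`, `K`, `V`, `k`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; masked (-inf) entries get exactly zero weight."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def topk_sparse_attention(Q, K, V, k):
    """Scaled dot-product attention where each query keeps only its k
    highest-scoring keys; all other logits are masked before the softmax,
    so every row of the attention matrix has at most k nonzero weights.

    Q: (n_q, d), K: (n_k, d), V: (n_k, d_v).
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                            # (n_q, n_k) attention logits
    top_idx = np.argpartition(scores, -k, axis=-1)[:, -k:]   # top-k key indices per query
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, top_idx, 0.0, axis=-1)           # 0 where kept, -inf elsewhere
    weights = softmax(scores + mask, axis=-1)                # k-sparse attention rows
    return weights @ V, weights

# Example: 4 queries over 6 keys; each query attends to its 2 best keys.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 16))
out, w = topk_sparse_attention(Q, K, V, k=2)
assert np.all((w > 0).sum(axis=-1) == 2)  # exactly 2 active weights per query
```

Setting k equal to the number of keys recovers standard dense attention; shrinking k sparsifies each attention row, which is the property the abstract links to improved adversarial robustness.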