Towards Robust Visual Transformer Networks via K-Sparse Attention
Amini, S. (Sharif University of Technology), 2022
- Type of Document: Article
- DOI: 10.1109/ICASSP43922.2022.9746916
- Publisher: Institute of Electrical and Electronics Engineers Inc., 2022
- Abstract:
- Transformer networks, originally developed by the machine translation community to eliminate the sequential nature of recurrent neural networks, have shown impressive results in other natural language processing and machine vision tasks. Self-attention, the core module behind visual transformers, globally mixes image information. This module drastically reduces the intrinsic inductive bias imposed by CNNs, such as locality, but suffers from insufficient robustness against some adversarial attacks. In this paper, we introduce K-sparse attention to preserve the low inductive bias while robustifying transformers against adversarial attacks. We show that standard transformers attend to values with a dense set of weights, whereas sparse attention weights, selected automatically by an optimization algorithm, can preserve the generalization performance of the transformer while improving its robustness. © 2022 IEEE (See the illustrative sketch below.)
- Keywords:
- Adversarial robustness ; Self-attention ; Sparse ; Visual transformer ; Computer vision ; Natural language processing systems ; Image information ; Inductive bias ; Language processing ; Machine translation ; Machine vision ; Natural languages ; Recurrent neural networks
- Source: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), 23-27 May 2022; Volume 2022-May, 2022, Pages 4053-4057; ISSN: 1520-6149; ISBN: 9781665405409
- URL: https://ieeexplore.ieee.org/document/9746916
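The record above describes the mechanism only at a high level. As a rough illustration, the sketch below implements one common form of k-sparse attention in NumPy, in which each query keeps only its k highest-scoring keys and the remaining logits are masked out before the softmax. This hard top-k rule is an assumption made here for illustration: per the abstract, the paper selects the sparse weights automatically via an optimization algorithm whose details are not given in this record, and the names below (`topk_sparse_attention`, `Q`, `K`, `V`, `k`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; masked (-inf) entries get exactly zero weight."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def topk_sparse_attention(Q, K, V, k):
    """Scaled dot-product attention where each query keeps only its k
    highest-scoring keys; all other logits are masked before the softmax,
    so every row of the attention matrix has at most k nonzero weights.

    Q: (n_q, d), K: (n_k, d), V: (n_k, d_v).
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                            # (n_q, n_k) attention logits
    top_idx = np.argpartition(scores, -k, axis=-1)[:, -k:]   # top-k key indices per query
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, top_idx, 0.0, axis=-1)           # 0 where kept, -inf elsewhere
    weights = softmax(scores + mask, axis=-1)                # k-sparse attention rows
    return weights @ V, weights

# Example: 4 queries over 6 keys; each query attends to its 2 best keys.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 16))
out, w = topk_sparse_attention(Q, K, V, k=2)
assert np.all((w > 0).sum(axis=-1) == 2)  # exactly 2 active weights per query
```

Setting k equal to the number of keys recovers standard dense attention; shrinking k sparsifies each attention row, which is the property the abstract links to improved adversarial robustness.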