
Improving the Performance of Non-Volatile Memory based CNN Accelerators

Jasemi, Masoumeh Sadat | 2020

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 53358 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Hessabi, Shaahin; Bagherzadeh, Nader
  7. Abstract:
     Today, convolutional neural networks (CNNs) are very popular due to their high accuracy and robustness. As the size and complexity of CNNs grow, the demand for larger on-chip memories also increases. Given the high cost of off-chip memory accesses, one solution is to enlarge on-chip caches by employing emerging multi-level cell (MLC) STT-RAMs. This memory provides higher capacity at the cost of lower reliability, the root cause of which is the failure of read and write operations. However, MLC STT-RAM and CNNs are a perfect match: on the one hand, MLC STT-RAM provides higher capacity; on the other, CNNs can tolerate a moderate level of inaccuracy and low reliability. Additionally, the asymmetric reliability of MLC STT-RAM cells can be exploited by CNNs to achieve better overall reliability. General matrix-matrix (GEMM) and matrix-vector (GEMV) multiplications are the most fundamental operations in CNNs. Existing acceleration schemes for these operations suffer from lack of precision, analog-to-digital conversion overhead, limits on array sizes, and high energy consumption on interconnects. Moreover, some prior work could not fully utilize MLC STT-RAM due to its low reliability.
     In this thesis, we use MLC STT-RAM to relieve the memory-system bottleneck of CNN computation platforms. Features in CNNs are represented in different number systems, such as floating point and fixed point. We propose four schemes to address the low reliability of MLC STT-RAM for CNNs: the first two operate on the fixed-point representation, and the last two cope with reliability in the context of the floating-point representation. In the first scheme, called drop, a few less important bits are dropped from a feature to make space for more important bits. In the second scheme, the bits are rearranged so that the asymmetric reliability of a cell matches the importance of each bit in the representation; this guarantees that important bits are written to more reliable cells and thus enhances the overall reliability of the system (a minimal sketch of this idea appears after the keyword list below). In the third scheme, since the normalization layers in contemporary CNNs constrain values to lie between -1 and 1, the second bit of each weight remains unused; hence, it can be used to back up the sign bit, which is the most important bit in the representation (also sketched below). Finally, in the fourth scheme, we use simple operators such as NOT and shifts to rearrange blocks in the memory cells and further enhance reliability.
     We use various platforms to evaluate the proposed solutions. We implement all CNN models in TensorFlow, which in turn uses Keras. To inject faults, we use TensorFI, which enables us to inject faults into all weights and parameters of a network. To obtain the MLC STT-RAM characteristics, we use NVSim, and to evaluate the systems in terms of bandwidth, we use SCALE-Sim. The evaluation results indicate that the first two schemes provide 2X more bandwidth without loss of prediction accuracy. The third and fourth schemes together improve the read and write energy by 9% and 6%, respectively, without any degradation in prediction accuracy, and achieve 2.1X more bandwidth.
  8. Keywords: Convolutional Neural Networks ; Deep Neural Networks ; Fixed Point ; Floating Point ; Spin-Transfer Torque Magnetic RAM (STT-MRAM) ; Multi-Level Cell (MLC) Memory ; Multi-Level Cell STT-RAM (MLC STT-RAM)
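As a concrete illustration of the second scheme, the following minimal Python sketch shows importance-aware bit rearrangement. It assumes a 2-bit MLC cell whose first position is the more reliable one and an 8-bit fixed-point feature; the pairing rule, widths, and function names are illustrative assumptions, not taken from the dissertation.

    def rearrange_bits(value, width=8):
        """Pack a fixed-point value into 2-bit MLC cells so that each cell
        pairs one high-order (important) bit with one low-order bit, and
        the important bit lands in the cell's more reliable position."""
        bits = [(value >> (width - 1 - i)) & 1 for i in range(width)]  # MSB first
        half = width // 2
        # cell i = (reliable slot, weaker slot) = (bits[i], bits[half + i])
        return [(bits[i], bits[half + i]) for i in range(half)]

    def restore_bits(cells, width=8):
        """Invert rearrange_bits: rebuild the original bit order."""
        half = width // 2
        bits = [0] * width
        for i, (reliable, weak) in enumerate(cells):
            bits[i], bits[half + i] = reliable, weak
        value = 0
        for b in bits:
            value = (value << 1) | b
        return value

    assert restore_bits(rearrange_bits(0b10110010)) == 0b10110010

Under this mapping, a failure in a weaker cell position can only flip one of the four low-order bits, which bounds the worst-case numerical deviation of a feature.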
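Similarly, here is a minimal sketch of the third scheme's sign-bit backup, assuming IEEE-754 single-precision weights: for weights in [-1, 1] the most significant exponent bit (bit 30) is always 0, so a copy of the sign bit (bit 31) can be stored there. The function names and the recovery policy (trusting the backup copy) are illustrative assumptions.

    import struct

    def backup_sign(weight):
        """Encode a normalized weight (|weight| <= 1), duplicating the
        sign bit (bit 31) into the always-zero exponent MSB (bit 30)."""
        raw = struct.unpack('<I', struct.pack('<f', weight))[0]
        sign = (raw >> 31) & 1
        return raw | (sign << 30)

    def recover_weight(raw):
        """Decode, restoring the sign from its backup and clearing bit 30."""
        sign = (raw >> 30) & 1            # read the backup copy of the sign
        raw = (raw & ~(3 << 30)) | (sign << 31)
        return struct.unpack('<f', struct.pack('<I', raw))[0]

    assert recover_weight(backup_sign(-0.625)) == -0.625

A real memory controller would apply such an encoding at the array interface rather than in software; the sketch only demonstrates the bit-level invariant the scheme relies on.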
