Hardware Acceleration of Convolutional Neural Networks by Computational Prediction

Sajjadi, Pegahsadat; Bayatsarmadi, Siavash

Sajjadi, Pegahsadat | 2022

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 55823 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Bayatsarmadi, Siavash
Abstract:
Convolutional neural networks (CNNs) have recently become widely used in many artificial intelligence applications, such as image processing, speech processing, and robotics. Their superior accuracy comes at the cost of high computational complexity. Recent studies show that these computations can be performed in parallel. Therefore, because graphics processing units (GPUs) offer the best performance in terms of computational power and throughput, they are widely used to implement and accelerate neural networks. Nevertheless, the high price and power consumption of these processors have drawn more attention to field-programmable gate arrays (FPGAs). To improve resource utilization and throughput in FPGA accelerators, methods such as pruning and quantization are used. Predicting the result of multiply-and-accumulate (MAC) operations is an emerging method for reducing the number of operations in neural networks. This project aims to improve the throughput of hardware accelerators by using sign prediction to skip ineffectual computations. To this end, a Python tool was first developed to investigate different sign prediction methods and their effect on the final accuracy of the model. Then, based on these evaluations, three sign prediction methods with high accuracy and low hardware overhead are proposed. The evaluation results show that, with these prediction methods, the speed of operations in the LeNet network can ideally be improved by up to 3.7 and 1.42 times at the cost of a 3% and 0% drop in accuracy, respectively. Moreover, the speed of operations in AlexNet can ideally be improved by up to 5.98 times at the cost of a 1.5% accuracy loss. Lastly, the proposed prediction methods are implemented on FPGA and ASIC platforms. The implementation results show that the proposed architectures consume 824 slices on the FPGA platform.
Keywords:
Reconfigurable Devices ; Convolutional Neural Network ; Accelerators ; Application Specific Integrated Circuit (ASIC) ; Field Programmable Gate Array (FPGA) ; Sign Prediction ; Computational Prediction Models ; Hardware Accelerator
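
Illustrative note: the abstract does not describe the three proposed predictors, so the following is only a minimal sketch of the general idea of sign prediction, under stated assumptions. It assumes the MAC result feeds a ReLU (so a negative sum becomes zero and need not be computed exactly), that activations are non-negative integers, and that weights are fixed integers. The predictor used here (evaluate the largest-magnitude weights first and predict "negative" if that partial sum falls below a threshold) is a hypothetical choice, and the names `relu_dot_with_sign_prediction`, `probe_fraction`, `threshold`, and `ideal_speedup` are invented for this example.

```python
from typing import List, Tuple


def relu_dot_with_sign_prediction(
    activations: List[int],
    weights: List[int],
    probe_fraction: float = 0.25,
    threshold: int = 0,
) -> Tuple[int, int]:
    """Return (ReLU output, number of MACs actually executed).

    Hypothetical predictor, not the thesis's method: evaluate the MACs whose
    weights have the largest magnitude first; if that partial sum is below
    `threshold`, predict a negative result, output 0, and skip the rest.
    """
    n = len(weights)
    # Visit indices in order of decreasing |weight|. This order can be
    # prepared offline because weights are fixed after training.
    order = sorted(range(n), key=lambda i: -abs(weights[i]))
    probe = max(1, int(n * probe_fraction))

    partial = 0
    for i in order[:probe]:
        partial += activations[i] * weights[i]
    if partial < threshold:
        # Predicted negative: ReLU would output 0, so skip remaining MACs.
        return 0, probe

    # Predicted non-negative: finish the exact computation.
    for i in order[probe:]:
        partial += activations[i] * weights[i]
    return max(partial, 0), n


def ideal_speedup(total_macs: int, executed_macs: int) -> float:
    """Ideal speedup if run time were proportional to executed MACs only."""
    return total_macs / executed_macs


acts = [3, 1, 4, 1, 5, 9, 2, 6]

# Exact sum is -45, so skipping after 2 of 8 MACs is a correct prediction.
correct = relu_dot_with_sign_prediction(acts, [-8, 1, -7, 2, 1, -1, 1, 1])

# Exact sum is +3, but the probe sum is -4: a misprediction that outputs 0.
# Errors like this are the source of the accuracy drop.
wrong = relu_dot_with_sign_prediction(acts, [8, 1, -7, 2, 1, -1, 1, 1])

# Probe sum is positive, so all 8 MACs are executed.
full = relu_dot_with_sign_prediction(acts, [8, 1, 7, 2, 1, -1, 1, 1])
```

The second call shows the trade-off the abstract reports: a more aggressive predictor skips more MACs but occasionally zeroes an output that should have been positive. The speedup computed by `ideal_speedup` is "ideal" in the same sense as in the abstract, since it ignores the cost of the predictor itself and of any control overhead.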
