Deep Neural Networks: Tradeoff Between Compression and Communication Rates

Najafiaghdam, Kossar; Motahari, Abolfazl

Najafiaghdam, Kossar | 2023

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 56681 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Motahari, Abolfazl
Abstract:
In recent years, the use of deep neural networks to solve various problems has grown considerably. With their large number of parameters, these networks can reconstruct complex functions and relations from large amounts of data, and they have achieved the best results in a wide range of problems. However, using these models comes with its own problems. These networks typically require considerable resources to run, which makes it inefficient or impossible to use them on systems with limited processing capabilities, e.g. mobile phones. The existing approaches, e.g. deploying the model on a powerful server or compressing the network, have their own drawbacks, which make them unusable in some use cases. It seems a hybrid solution might yield better results. In this research, a hybrid solution is presented by formulating and attempting to solve a trade-off between the aforementioned solutions, i.e. deploying a bulky model on the server side and deploying a smaller model on the client side. This solution proposes a new method for network compression that uses the representational power of auto-encoders and introduces a new loss function for training the compressed model. Furthermore, a query distribution policy is introduced to ensure maximum exploitation of the resources available client-side and minimum usage of server resources. The proposed method reduces the load on the server by more than 60% while maintaining almost the same level of accuracy and performance in the overall system. The accuracy of the final system almost equals that of the large model, i.e. 90%. Furthermore, requests that are handled client-side are answered very quickly, owing to the simple architecture and small size of the model.
Keywords:
Deep Neural Networks ; Compression ; Trade-off Theory ; Natural Language Processing ; Language Model ; Query Distribution
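
The abstract does not spell out the query distribution policy, so the following is only a minimal sketch of one common way such a client/server split could work: answer on the client when the small model is confident, and defer to the large server model otherwise. The function name `route_query`, the `threshold` value, and the model callables are all hypothetical and are not taken from the thesis.

```python
from typing import Callable, Sequence, Tuple

# A model is assumed here to map a query string to class probabilities.
Model = Callable[[str], Sequence[float]]


def route_query(
    query: str,
    client_model: Model,
    server_model: Model,
    threshold: float = 0.8,  # hypothetical confidence cutoff, not from the thesis
) -> Tuple[int, str]:
    """Return (predicted label, where the query was answered)."""
    probs = client_model(query)
    # Index of the most probable class according to the small model.
    label = max(range(len(probs)), key=lambda i: probs[i])
    if probs[label] >= threshold:
        # The small model is confident enough: no server call is made.
        return label, "client"
    # Otherwise fall back to the large model on the server.
    server_probs = server_model(query)
    server_label = max(range(len(server_probs)), key=lambda i: server_probs[i])
    return server_label, "server"
```

Under this assumed policy, raising `threshold` sends more queries to the server (higher accuracy, higher server load), while lowering it keeps more queries on the client, which is the kind of trade-off the abstract describes.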
