
Edge AI as a Service

Nasrolahi, Mohammad Javad | 2024

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 57313 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Hossein Khalaj, Babak
  7. Abstract: Edge AI is an emerging approach that aims to provide AI services using computing resources at the edge of the network rather than relying on cloud servers, offering advantages such as reduced latency, improved efficiency, privacy preservation, and resilience. Model inference at the edge is a crucial area in this domain, where a trained model is deployed to an edge server or user device. Deploying models on resource-constrained edge devices poses key challenges, including optimizing and compressing the models to fit within limited computational and memory budgets; ensuring that the deployed model can perform inference within the required time constraints is equally critical. Research efforts in this area aim to accelerate inference on edge devices while maintaining the model's accuracy. The deployment of transformers with an encoder-decoder architecture is examined as a case study. Transformers, initially developed for natural language processing tasks because of their ability to process sequential data effectively, have since been adopted in various fields; however, their implementation on user devices is challenging due to their large number of parameters. In this thesis, a method is proposed to partition model computation, with part of the computation performed on the user device and the rest on the edge server. The proposed approach implements the encoder part of the transformer on the edge server and the decoder part on the user device, with the goal of reducing the computational load on the edge server while maintaining model accuracy. Results demonstrate that this method achieves lower latency than implementing both encoder and decoder on the edge server, especially when multiple users request inference simultaneously.
  9. Keywords: Artificial Intelligence; Edge Model Inference; Edge Artificial Intelligence as a Service; Edge Artificial Intelligence; Encoder-Decoder
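The partitioning described in the abstract, where the encoder runs once on the edge server and the autoregressive decoder runs locally on the user device, can be sketched roughly as follows. This is a minimal illustration only: the encoder and decoder are placeholder functions (toy projections over random weights), and all names here are hypothetical, not the thesis's actual implementation.

```python
import numpy as np

def edge_server_encoder(tokens, d_model=8):
    """Stand-in for the encoder on the edge server.

    Runs once per request; its per-token hidden states ("memory")
    are sent to the user device in a single round-trip.
    """
    rng = np.random.default_rng(0)
    embed = rng.standard_normal((100, d_model))  # toy embedding table
    return embed[np.asarray(tokens)]             # shape: (seq_len, d_model)

def device_decoder_step(memory, generated):
    """Stand-in for one decoder step on the user device.

    Uses only the cached encoder memory (and, in a real decoder, the
    tokens generated so far); no further calls to the server are needed.
    """
    rng = np.random.default_rng(1)
    w = rng.standard_normal((memory.shape[1], 100))
    # Toy "cross-attention": mean-pool the memory, project to vocab logits.
    logits = memory.mean(axis=0) @ w
    return int(np.argmax(logits))

def split_inference(tokens, max_len=5, eos=0):
    """Encoder on the server, decoder loop on the device."""
    memory = edge_server_encoder(tokens)   # one server round-trip
    out = []
    for _ in range(max_len):               # all later steps are local
        nxt = device_decoder_step(memory, out)
        out.append(nxt)
        if nxt == eos:
            break
    return out
```

The point of the split is visible in the control flow: the server is involved only in producing `memory`, so its load per request is one encoder pass regardless of output length, while the latency-sensitive per-token decoding loop stays on the device.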
