Improving Distributed SVM Learning Algorithm in MapReduce Framework Using Coding

Hosseini, Pejman; Jafari, Mahdi

Hosseini, Pejman | 2019

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 52141 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Jafari, Mahdi
Abstract:
With the rise of “Big Data”, both data volumes and processing times have grown, creating a need for new methods of processing and computing on such data. Analytical and computational methods in Machine Learning are among the most important applications of Big Data processing. The Machine Learning field offers many methods of data analysis, each requiring extensive processing when applied to Big Data. One approach to working with Big Data is to use distributed systems. MapReduce is one of the most popular frameworks for distributed computation because it makes distributed processing of big data easier and faster. However, a number of bottlenecks have been identified in MapReduce that slow down processing in some cases. The most important are slow and unreliable computational nodes and the shuffling phase, which carries data between mapper and reducer nodes over the network. It has recently been shown that ideas from information theory and network coding can mitigate these weaknesses. In this research, we attempt to reduce the processing time of a distributed implementation of a Machine Learning method, the Support Vector Machine (SVM), by applying a coding scheme called Polynomial Codes and devising suitable strategies for reaching the optimal coding.
Keywords:
Distributed Computing; MapReduce Processing; Performance; Machine Learning; Support Vector Machine (SVM); Network Coding; Polynomial Codes

Table of Contents

Introduction
- Problem definition
- Significance of the topic
- Research objectives
- Thesis structure
Preliminaries
- Distributed systems
  - MapReduce
- Types of coding
  - Erasure codes
  - Repetition codes
  - Maximum distance separable (MDS) codes
  - The sequential codes problem
- Machine learning methods
  - Definition of the support vector machine
  - Solving the support vector machine
- Summary
Related work
- Dealing with slow nodes
- Using coding in the computation phase
- Using coding in the data shuffling phase
- Implementing the support vector machine
- Summary
Proposed method
- How codes are used in implementing matrix multiplication
  - Adding redundancy to only one matrix
  - Adding redundancy to both matrices
- Polynomial codes
  - Optimality of the recovery threshold
- Interpolation
  - Fast Fourier transform interpolation
- Summary
Experimental results
- Message Passing Interface
- Designed experiments
  - Matrix multiplication
  - Support vector machine optimization
Conclusion
