Integrated Guidance and Control of a Hexacopter Equipped with Robotic Arm using Reinforcement Learning

Shobeiry, Mohammad Mahdi; Emami Khansari, Mohammad Ali

Shobeiry, Mohammad Mahdi | 2025

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 58386 (45)
University: Sharif University of Technology
Department: Aerospace Engineering
Advisor(s): Emami Khansari, Mohammad Ali
Abstract:
The primary objective of this research is the integrated guidance and control of a hexacopter equipped with a robotic arm, using deep reinforcement learning algorithms in an end-to-end, model-free framework. The system has 6 degrees of freedom for the hexacopter and 2 degrees of freedom for the robotic arm, which consists of rigid links attached to the underside of the hexacopter. The dynamic model of the system is developed in an integrated manner so that the coupling effects between the equations of the hexacopter and the arm are properly accounted for in the analyses. After the equations describing the system's behavior are developed, the goal is to control all of its degrees of freedom using reinforcement learning. The mission defined for this system is a point-tracking mission: the system must reach the location of an external object while maintaining stability, then use its 2-DoF robotic arm to pick up this object, whose characteristics are unknown, from a surface, and finally deliver the object to the desired position while maintaining the stability of the combined hexacopter, robotic arm, and external object. A multi-objective reward function is essential for the agent to succeed in this mission. This function includes terms for guidance of the hexacopter's flight path, guidance of the robotic arm's path, control of the system's states, and the system's control inputs. To evaluate the robustness of the implemented reinforcement learning algorithms, the effects of model uncertainty, disturbances, actuator fault, and noise on the system's performance and mission success are studied. The reinforcement learning algorithms used in this research are model-free, which increases the complexity of the design process, given that they must succeed in controlling a complex dynamic system with significant coupling effects between the hexacopter and the arm.
In this research, a comparison of three reinforcement learning algorithms (SAC, TD3, and PPO) with a classical PID controller shows that the reinforcement learning algorithms are superior in handling the coupling effects between the hexacopter and the arm, in completing the mission successfully in a shorter time, and in compensating for the effects of uncertainties, disturbances, and failures. Among the reinforcement learning algorithms, TD3 has the highest mission execution speed, while PPO is slower but has the highest robustness and requires less control effort. SAC also completes the mission at a good speed but is less robust in challenging conditions than TD3 and PPO. The outcome of this research paves the way for designing integrated guidance and control structures for complex dynamic systems in the field of aerial manipulation.
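The multi-objective reward described in the abstract can be illustrated with a minimal sketch. This is not the thesis's reward function: the abstract names only the four ingredients (hexacopter path guidance, arm path guidance, state control, and control inputs), so the function name `multi_objective_reward`, the `RewardWeights` values, the choice of Euclidean norms, and the weighted-sum form are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not give the actual values.
    position: float = 1.0   # hexacopter path guidance
    arm: float = 0.5        # robotic arm (end-effector) path guidance
    state: float = 0.1      # attitude and angular-rate regulation
    effort: float = 0.01    # control-input penalty


DEFAULT_WEIGHTS = RewardWeights()


def norm(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def diff(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def multi_objective_reward(uav_pos, uav_target, ee_pos, ee_target,
                           attitude, rates, action, w=DEFAULT_WEIGHTS):
    """Weighted sum of four penalty terms (assumed form, for illustration)."""
    r_pos = -w.position * norm(diff(uav_pos, uav_target))  # vehicle position error
    r_arm = -w.arm * norm(diff(ee_pos, ee_target))          # end-effector position error
    r_state = -w.state * (norm(attitude) + norm(rates))     # keep attitude and rates small
    r_effort = -w.effort * sum(a * a for a in action)       # penalize large control inputs
    return r_pos + r_arm + r_state + r_effort


# Example call: 1 m below the target, level attitude, moderate control inputs.
reward = multi_objective_reward(
    uav_pos=[0.0, 0.0, 1.0], uav_target=[0.0, 0.0, 2.0],
    ee_pos=[0.0, 0.0, 0.5], ee_target=[0.0, 0.0, 0.5],
    attitude=[0.0, 0.0, 0.0], rates=[0.0, 0.0, 0.0],
    action=[0.5] * 8,  # assumed layout: 6 rotor commands + 2 arm joint commands
)
```

In a sketch like this, every term is a penalty, so the reward is at most zero and reaches zero only when both position errors, the attitude, the rates, and the control inputs all vanish. The relative weights decide how the agent trades mission speed against control effort, which is the kind of trade-off the abstract reports between TD3 and PPO.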
Keywords:
Deep Reinforcement Learning ; End-to-End Learning ; Integrated Guidance and Control (IGC) ; Multirotor Control System ; Waypoint Tracking ; End-to-End Reinforcement Learning ; Multirotor with Robotic Arm ; Aerial Manipulation

Contents:
- 1 Introduction
  - 1.1 Research Motivation
  - 1.2 Problem Definition
  - 1.3 Literature Review
  - 1.4 Objectives and Contributions
  - 1.5 Report Outline
- 2 Mathematical Modeling
  - 2.1 Frames, Points, and Reference Systems
  - 2.2 Kinematic Model
    - 2.2.1 Transformation Matrices
    - 2.2.2 Position Analysis
    - 2.2.3 Linear Velocity Analysis
    - 2.2.4 Angular Velocity Analysis
    - 2.2.5 Kinematic Relations
  - 2.3 Dynamic Model
    - 2.3.1 Kinetic and Potential Energy
    - 2.3.2 Generalized Input Forces and Torques
    - 2.3.3 Generalized External Forces and Torques
  - 2.4 Vehicle Specifications
  - 2.5 Dynamic Model Validation
    - 2.5.1 Hover Test
    - 2.5.2 Roll Torque Test
    - 2.5.3 Pitch Torque Test
    - 2.5.4 Arm Actuation Test
- 3 Reinforcement Learning Algorithms and the Classical Approach
  - 3.1 Introduction to and Literature Review of Reinforcement Learning
  - 3.2 The SAC Reinforcement Learning Algorithm
  - 3.3 The TD3 Reinforcement Learning Algorithm
  - 3.4 The PPO Reinforcement Learning Algorithm
  - 3.5 PID-Based Guidance and Control Algorithm
    - 3.5.1 Hexacopter Guidance and Control
    - 3.5.2 Guidance and Control of the Robotic Arm End-Effector
- 4 Implementation of Reinforcement-Learning-Based Integrated Guidance and Control
  - 4.1 Reinforcement Learning Training Environment
  - 4.2 Neural Network Architecture
  - 4.3 Reinforcement Learning Tuning Parameters
    - 4.3.1 Tuning Parameters of the SAC Algorithm
    - 4.3.2 Tuning Parameters of the TD3 Algorithm
    - 4.3.3 Tuning Parameters of the PPO Algorithm
  - 4.4 Training-Phase Results of the Reinforcement Learning Algorithms
- 5 Simulation Results
  - 5.1 Ideal Conditions
  - 5.2 Presence of Disturbances and Uncertainties
  - 5.3 Presence of Actuator Fault and Noise
  - 5.4 Monte Carlo Simulation
- 6 Conclusion
  - 6.1 Thesis Contributions
  - 6.2 Suggestions for Future Work
- References
