Language-informed Sequential Decision-making

Hashemi Dijujin, Negin; Soleymani Baghshah, Mahdieh

Hashemi Dijujin, Negin | 2022

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 55649 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Soleymani Baghshah, Mahdieh
Abstract:
Sample efficiency and systematic generalization are two long-standing challenges in sequential decision-making problems, especially in reinforcement learning settings. It is hypothesized that using natural language alongside other observation modalities in decision-making environments can improve generalization, owing to the compositional and open-ended nature of language, and sample efficiency, owing to the concise information packed into relatively short linguistic units. By exploiting this information and the compositional structure of language, one can achieve an abstract and factored understanding of the environment and the task at hand. Doing so requires finding the proper grounding between meaningful data components from the different modalities present in the input, e.g. visual and linguistic. In this project, we examine architecture-level inductive biases, based on Decision Transformers and Neural Production Systems, that can help improve language-informed reinforcement learning on these two criteria. In our experiments in BabyAI environments, the proposed models achieve higher sample efficiency and compositional generalization than baseline models.
Keywords:
Reinforcement Learning ; Transformers ; Sequential Decision Making ; Language Informed Agents ; Neural Production Systems

Contents:
Introduction
Preliminaries
- Sequential decision-making
  - Decision Transformer
  - Asymptotic policy optimization
- Attention mechanism
- Summary
Related Work
- Language-based studies
  - Language-based studies in reinforcement learning
  - Language-based studies in imitation learning
  - Cognitive science studies
- Language-based environments
- Work based on the language of logic
- Representation learning based on inductive biases
  - Vector quantization in autoencoders
  - FiLM
  - Slot attention
  - Production systems
- Summary
Proposed Method
- Decision Transformer-based approach
  - Architecture of the method and proposed variants
  - Summary
- Neural Production Systems-based approach
  - Overview of the method
  - Observation processing in the policy function
    - Base case
    - Instruction-conditioned case
  - How the slots are determined
  - Summary
Experiments and Results
- Description of the experimental environments
- Evaluation metrics
- Decision Transformer results
- Neural production network results
  - Architecture of the experimental models
  - Sample-efficiency experiments
  - Compositional generalization experiments
    - Interpretability
- Summary
Conclusion and Future Work
Remaining Figures
