Performance Improvement of Reinforcement Learning in Non-stationary Environments Using Predictions About Abrupt Environment Changes

Pourshamsaei Dargahi, Hossein; Nobakhti, Amin

Pourshamsaei Dargahi, Hossein | 2025

Type of Document: Ph.D. Dissertation
Language: Farsi
Document No: 57762 (05)
University: Sharif University of Technology
Department: Electrical Engineering
Advisor(s): Nobakhti, Amin
Abstract:
Reinforcement learning is a machine learning paradigm in which an agent seeks the optimal policy by interacting with an environment and receiving rewards according to the actions it selects. It has many applications, including robotics, control of dynamic systems, and industrial automation. Many reinforcement learning problems assume that the environment model does not change over time, yet in some problems the reward function and the state transition probability function are not necessarily stationary. These are referred to as reinforcement learning problems in non-stationary environments. In some real-life applications, predictions about future abrupt environment changes are available. For instance, weather changes induce non-stationarity in problems such as automatic irrigation and renewable energy production, and weather forecasts of adequate precision usually exist and can be used to improve policy performance. None of the existing studies provides a framework for utilizing such predictions, even though they can be used before a change occurs so that the agent enters the new environment model from a better initial condition and thereby maximizes the total reward achieved. In this thesis, following a review of the existing literature on reinforcement learning in non-stationary environments, novel algorithms are presented that use predictions about environment changes. Theoretical results are developed, and the algorithms are compared with existing methods on several problems, such as reference tracking of the cart in an inverted pendulum system. It is shown that the developed algorithms outperform previous ones and also outperform applying the individually optimal policy of each observed environment model without using predictions.
Keywords:
Reinforcement Learning ; Non-Stationary Environments ; Predictive Policy ; Predictive Reinforcement Learning ; Environment Changes Prediction
