Performance Improvement of Reinforcement Learning in Non-stationary Environments Using Predictions About Abrupt Environment Changes

Pourshamsaei Dargahi, Hossein | 2025

  1. Type of Document: Ph.D. Dissertation
  2. Language: Farsi
  3. Document No: 57762 (05)
  4. University: Sharif University of Technology
  5. Department: Electrical Engineering
  6. Advisor(s): Nobakhti, Amin
  7. Abstract: Reinforcement learning is one of the machine learning paradigms in which an agent seeks the optimal policy by interacting with the environment and receiving rewards according to the actions it selects. Reinforcement learning has many applications, such as robotics, control of dynamic systems, and industrial automation. Many reinforcement learning problems assume that the environment model does not change over time, whereas in some problems the reward and state transition probability functions are not necessarily stationary. These are referred to as reinforcement learning problems in non-stationary environments. In some real-life applications, predictions about future abrupt environment changes are available. For instance, weather changes induce non-stationarity in problems such as automatic irrigation or renewable energy production, yet weather forecasts of adequate precision usually exist and can be used to improve policy performance. None of the existing studies provides a framework for utilizing such predictions, although they can be used before a change occurs so that the agent enters the new environment model from a better initial condition and thereby maximizes the total reward achieved. In this thesis, following a review of the existing literature on reinforcement learning in non-stationary environments, novel algorithms are presented that use predictions about environment changes. Alongside theoretical results, the algorithms are compared with existing methods on several problems, such as reference tracking of the cart in an inverted pendulum system. It is shown that the developed algorithms outperform previous ones and also outperform applying the individual optimal policy of each observed environment model without utilizing predictions.
  8. Keywords: Reinforcement Learning; Non-Stationary Environments; Predictive Policy; Predictive Reinforcement Learning; Environment Changes Prediction
