Autonomous Skill Acquisition in Reinforcement Learning Based on Graph Clustering

Taghizadeh, Nasrin; Beigy, Hamid

Taghizadeh, Nasrin | 2011

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 42492 (19)
University: Sharif University of Technology
Department: Computer Engineering
Advisor(s): Beigy, Hamid
Abstract:
Reinforcement learning (RL) is a branch of machine learning in which an agent improves its behaviour by interacting with an environment and receiving a reinforcement signal. As the size of the environment grows, decision-making becomes more difficult and learning time increases. One of the main approaches to reducing learning complexity is to define skills. A skill is a behavioural unit composed of primitive actions. Humans learn and use many skills in their lives: walking, eating, passing through a door to reach the kitchen, and going to the airport to travel are examples of skills that humans use in daily activities. An agent can learn a skill once and then reuse it in other tasks, which improves learning speed. Hierarchical RL (HRL) was proposed to define and represent skills formally. Creating skills manually is hard, and infeasible in large and unfamiliar environments, so automatic skill acquisition is one of the main challenges in HRL. Several approaches to skill acquisition have been proposed. In the divide-and-conquer approach, key states called subgoals are discovered and skills are then created to reach them. This thesis proposes a new graph-based method for discovering subgoals and creating skills. The idea of graph-based methods is to construct a transition graph from the history of the agent's interactions and to discover subgoal nodes in it. Three different methods for discovering subgoals are examined. The first is adapted from spectral clustering and uses the eigenvectors of the Laplacian matrix of the transition graph. The results show that this method finds subgoals well but has some weaknesses. To address them, a second method is provided that uses the HITS algorithm. HITS is widely used in search engines but has some shortcomings for this problem.
Given the weaknesses of the two previous methods, a third method is proposed that uses the eigenvector centrality measure. In the proposed method, eigenvector centrality values are calculated for all nodes of the transition graph, and the graph is clustered based on these values. States that lie between clusters are considered subgoals. In the next stage, skills for reaching the subgoals are created using the options framework. For every task, redundant options are eliminated and the final set of useful options is added to the actions available to the agent. The proposed algorithm was tested in different simulated environments and compared with other skill acquisition methods. The experimental results show that the method discovers accurate subgoals and creates useful skills that improve learning speed.
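
As an illustration of the eigenvector centrality step, the following is a minimal sketch in Python and is not the thesis's implementation. It assumes an undirected, unweighted transition graph stored as an adjacency dictionary. The toy graph, the function names `eigenvector_centrality` and `local_minima`, and the rule of flagging nodes whose centrality is lower than that of all their neighbours are hypothetical stand-ins, since the abstract does not specify how the graph is clustered on the centrality values.

```python
import math


def eigenvector_centrality(adj, iters=1000, tol=1e-12):
    """Power iteration on (A + I) for an undirected graph {node: [neighbours]}.

    Adding the identity keeps the iteration from oscillating on bipartite
    graphs and leaves the eigenvectors of A unchanged.
    """
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        # One multiplication by (A + I): own value plus the neighbours' values.
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in nxt.values()))
        nxt = {v: val / norm for v, val in nxt.items()}
        if max(abs(nxt[v] - x[v]) for v in adj) < tol:
            return nxt
        x = nxt
    return x


def local_minima(adj, score, margin=1e-9):
    """Hypothetical heuristic (not from the thesis): nodes whose score is
    strictly lower than the score of every neighbour."""
    return [v for v in adj if all(score[v] < score[u] - margin for u in adj[v])]


# Toy transition graph (an assumption): two fully connected "rooms",
# {0, 1, 2} and {4, 5, 6}, joined only through the "doorway" state 3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}

scores = eigenvector_centrality(graph)
candidates = local_minima(graph, scores)

for node in sorted(graph):
    print(node, round(scores[node], 3))
print("candidate subgoal states:", candidates)
```

In this toy graph the doorway state has lower centrality than both of its neighbours, so a boundary between the two rooms shows up as a dip in the centrality values. A real transition graph built from the agent's interaction history would typically be weighted and possibly directed, which this sketch does not handle.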
Keywords:
Reinforcement Learning ; Skill Acquisition ; Subgoal Discovery ; Graph Clustering ; Eigenvector Centrality ; Options Framework
