Search for: deep-reinforcement-learning

    Algorithmic Trading Using Deep Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Majidi, Naseh (Author) ; Marvasti, Farohk (Supervisor)
    Price movement prediction has always been one of traders’ concerns in the field of financial market prediction. To increase the profit of their trades, traders can process historical data and predict the movement. The large size of the data and the complex relations within it lead us to use algorithmic trading and artificial intelligence. The stock and cryptocurrency markets are two common markets attracting traders. This thesis aims to offer an approach using Twin-Delayed DDPG (TD3) and daily close price in order to achieve a trading strategy. Unlike previous studies using discrete-action-space reinforcement learning algorithms, TD3 is a continuous one offering both...

    An Application of Deep Reinforcement Learning for Ambulance Allocation to Emergency Departments under Overcrowding Situation

    , M.Sc. Thesis Sharif University of Technology Taher Gandomabadi, Mohammad Mahdi (Author) ; Akhavan Niaki, Taghi (Supervisor)
    In the last decade, emergency department (ED) overcrowding has become a national crisis for the US healthcare system. Increasing mortality rates, decreasing quality of care, financial losses due to walkouts, and ambulance diversion are some of the consequences of ED overcrowding. Given the increasing demand for ambulances, of which the COVID-19 pandemic is one instance, being able to allocate service requests to EDs efficiently becomes a key function of emergency medical services. In this investigation, a deep reinforcement learning algorithm called deep Q-learning is used to address this problem and to assign ambulances to EDs appropriately. Under...

    Meta Reinforcement Learning for Domain Generalization

    , M.Sc. Thesis Sharif University of Technology Riyahi Madvar, Maryam (Author) ; Rohban, Mohammad Hossein (Supervisor)
    Deep reinforcement learning has achieved better cumulative rewards than humans in many environments, such as Atari. One drawback of these methods is their data inefficiency, which makes training time-consuming; in some cases, obtaining this amount of data is infeasible. Meta reinforcement learning can use past experience to enable agents to adapt to new tasks faster and lets neural networks train in a short amount of time. One of the methods in meta reinforcement learning is task inference, which helps the exploitation policy perform well in new tasks. There’s a need to improve the exploration policy as well as the exploitation policy by gaining informative transitions about the new task....

    Distributed Cache Management Using Reinforcement Learning based Strategies

    , M.Sc. Thesis Sharif University of Technology Yousefi Ramandi, Amir Hossein (Author) ; Mir Mohseni, Mahtab (Supervisor) ; Maddah Ali, Mohammad Ali (Supervisor)
    Nowadays, video on demand is causing a drastic increase in network traffic, which is expected to surpass 45 exabytes per month by 2022; consequently, utilizing distributed memories, known as caches, across the network to alleviate the communication load during peak hours is inevitable. Coded caching is a promising approach to mitigate and smooth traffic during peak hours in the communication network: it creates coded multicasting opportunities in addition to delivering content to users locally. However, it suffers from the delay imposed by coding, which makes this approach infeasible for delay-sensitive content, namely video streaming applications. So...

    Deep Reinforcement Learning for Building Climate Control Using Weather Forecast Data

    , M.Sc. Thesis Sharif University of Technology Honari Latifpour, Ehsan (Author) ; Rezaeizadeh, Amin (Supervisor)
    Buildings account for more than 30% of the world’s total energy consumption. Among building end-uses, air conditioning and in particular cooling systems have a major share of more than 50%. Therefore, the design of optimal controllers for AC systems has become increasingly important. Classical and model-free control methods typically lack the ability to optimize energy consumption. On the other hand, model-based optimal control methods rely on precise modeling, which is difficult to acquire due to the complexity of the AC system dynamics. In recent years, deep reinforcement learning has become a popular choice for optimal control of systems with complex dynamics. In this thesis, a deep...

    Optimal Process Planning for Automated Robotic Assembly of Mechanical Assembles based on Reinforcement Learning Method

    , M.Sc. Thesis Sharif University of Technology Raisi, Mehran (Author) ; Khodaygan, Saeed (Supervisor)
    Nowadays, the assembly process is planned by an expert, which requires knowledge and is time-consuming. The flexibility and optimality of the assembly plan depend on the knowledge and creativity of the expert, so expertise is an important parameter in developing the assembly plan. Therefore, the use of intelligent methods to plan the assembly process has been considered by many researchers. The reinforcement learning approach has the potential to solve complex problems because it uses experience gained from interacting with the environment, and it has been successfully implemented in controlling many robotic tasks. However, due to the inherent complexity of the assembly, as well as...

    Learning-based Control System Design for the Bipedal Running Robot and Development of a Two-layer Framework for Generating the Optimal Paths in Various Movement Maneuvers

    , M.Sc. Thesis Sharif University of Technology Amiri, Aref (Author) ; Salarieh, Hassan (Supervisor)
    Legged locomotion is one of the most powerful and adaptable methods of movement in nature. Inspired by humans, the most intelligent creatures on earth, bipedal robots have many uses. In this research, a control method for a running bipedal robot has been designed. In the simulation part, the equations of motion of the five-link robot model for running and walking at different levels are derived by the Lagrange method. In path generation, using the two-layer optimization method and holonomic and dynamic constraints, optimal paths are produced that are kinematically and dynamically feasible. Additionally, path generation is facilitated by an invariant impact constraint to ensure the...

    Robotic Arm Manipulation Learning from Demonstration based on Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Noohian, Amir Hossein (Author) ; Khodaygan, Saeed (Supervisor)
    Learning from demonstration is the field in which researchers seek to create methods by which a robot can learn and reproduce a skill simply from a demonstration of that skill. One of the main drawbacks of learning from demonstration methods is their inability to improve the learned skills. To address this drawback, reinforcement learning can be used. The reinforcement learning approach has the potential to improve the initial skill by using the experience of interacting with the environment. In this project, the dynamic movement primitives algorithm is considered as the learning from demonstration method. The research approach is that first, the dynamic...

    Brain Inspired Meta Reinforcement Learning Using Brain-Inspired Networks

    , M.Sc. Thesis Sharif University of Technology Razavi Rohani, Roozbeh (Author) ; Soleymani Baghshahi, Mahdih (Supervisor)
    Reinforcement learning is one of the most well-known learning paradigms in biological agents and one of the most widely used for solving many problems. One of the reasons for this widespread use is the low demand for supervisory signals. However, the sparsity of the reward signal increases the sample complexity needed for learning new tasks. This issue is especially troublesome in multi-task settings. One of the most promising approaches to learning new tasks with limited interaction with the environment is meta reinforcement learning, an approach in which fast adaptation becomes possible by limiting the hypothesis space and creating inductive biases through learning meta-parameters....

    Adaptive Maneuvers for Aircraft Conflict Resolution Using Learning Theory

    , M.Sc. Thesis Sharif University of Technology Mamizadeh, Zahra (Author) ; Malaek, Mohammad Bagher (Supervisor)
    The problem of detecting and resolving aircraft collisions is very important due to the increasing demand for flights. Many algorithms have been developed in the past to increase automation in air traffic management and reduce the workload of air traffic controllers. These algorithms either have difficulty generalizing to real problems or have high computational costs and do not reflect the actual maneuvering characteristics of aircraft. The aim of the present study is to obtain dynamic maneuvers that are consistent with reality and also optimal in terms of utilizing the capacity of flight sectors, so we propose Deep Reinforcement Learning (DRL) based on...

    Optimal Control of a Quadcopter in Fast Descending Maneuvers Based on Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Azadi, Majid (Author) ; Fallah Rajabzadeh, Famida (Supervisor) ; Zohoor, Hassan (Supervisor) ; Nejat Pishkenari, Hossein (Co-Supervisor)
    Quadrotors are limited in performing fast descent maneuvers by the Vortex Ring State (VRS) region, which makes the quadrotor unstable. To avoid entering the VRS, a velocity constraint is considered, which must be satisfied during the maneuver to guarantee a safe and stable fast descent. The purpose of this thesis is to overcome the limitation in the quadrotor's speed space in order to reduce the time of fast descending maneuvers by using reinforcement learning techniques. A new cascade controller is proposed, which uses PID in the inner loop as a low-level controller and DDPG, a reinforcement learning technique, in the outer loop as a high-level controller in order to...

    Design of a HEV’s Controller Using Learning-based Methods

    , M.Sc. Thesis Sharif University of Technology Zare, Aramchehr (Author) ; Boroushaki, Mehrdad (Supervisor)
    Hybrid electric vehicles (HEVs) are proving to be one of the most promising innovations in advanced transportation systems for reducing air pollution and fossil fuel consumption. The EMS is one of the most vital aspects of the HEV powertrain system. This research aims to design an optimal EMS that meets the goals of drivability control, fuel consumption reduction, and battery charge stability. The current EMS is based on the classical rule-based method derived from fuzzy logic, which leads to a suboptimal solution in episodic driving cycles. Previous attempts at implementing Reinforcement Learning (RL) suffer from slow convergence, instability in tracking the driving...

    Designing IoT-based Video/Audio Processing Systems

    , M.Sc. Thesis Sharif University of Technology Golmohammadi, Zahra (Author) ; Gholampour, Iman (Supervisor) ; Haj Sadeghi, Khosrou (Supervisor)
    The use of IoT-based technologies is expanding in many areas today. As processing power has increased and processing costs have fallen, audio and video processing in IoT systems has been used as an alternative to human operators. Due to the large volume of audio and video data and bandwidth limitations, complete data transfer to cloud processing servers is not cost-effective in terms of efficiency and energy consumption. As a result, the solution that has provided good results is to offload these device tasks to the available clouds. In other words, the capacity of resources in the environment can be used to optimize the total latency of the system and energy consumption. In this dissertation, we...

    A Novel Resource Allocation Algorithm in Edge Computing with Deep Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Rahmati, Iman (Author) ; Movaghar, Ali (Supervisor)
    With the explosion of mobile smart devices, many computation-intensive applications have emerged, such as interactive gaming and augmented reality. Mobile edge computing (EC) is put forward, as an extension of cloud computing, to meet the low-latency requirements of the applications. In mobile edge computing systems, an edge node may have a high load when a large number of mobile devices offload their tasks to it. Those offloaded tasks may experience large processing delay or even be dropped when their deadlines expire. Due to the uncertain load dynamics at the edge nodes, it is challenging for each device to determine its offloading decision (i.e., whether to offload or not, and which...

    A Reinforcement Learning Framework for Portfolio Management Problem Leveraging Stocks Historical Data And Their Correlation

    , M.Sc. Thesis Sharif University of Technology Taherkhani, Hamed (Author) ; Fazli, Mohammad Amin (Supervisor)
    Over the past few years, deep reinforcement learning (DRL) has received a lot of attention in finance for portfolio management. With the help of experts’ signals and historical price data, we have developed a new reinforcement learning (RL) method. Experts’ signals have been used in tandem with DRL before in finance, but we believe this is the first time this method has been used to solve the financial portfolio management problem. As our agent, we used the Proximal Policy Optimization (PPO) algorithm to process the reward and take actions in the environment. Our framework comprises a convolutional network to aggregate signals, a convolutional network for historical price data, and...

    An Application of Deep Reinforcement Learning in Novel Supply Chain Management Approaches for Inventory Control and Management of Perishable Supply Chain Network

    , M.Sc. Thesis Sharif University of Technology Mohammadi, Navid (Author) ; Akhavan Niaki, Taghi (Supervisor)
    This study proposes a deep reinforcement learning approach to solve a perishable inventory allocation problem in a two-echelon supply chain. The inventory allocation problem is studied considering the stochastic nature of demand and supply. The examined supply chain includes two retailers and one distribution center (DC) under a vendor-managed inventory (VMI) system. This research aims to minimize the waste and shortages occurring at the retailers' sites in the examined supply chain. Because the action space in the considered inventory allocation problem is continuous, the Advantage Actor-Critic algorithm is implemented to solve the problem. Numerical experiments are conducted on...

    Data-driven Methods for Cooperative Control of Wheeled Mobile Robots

    , M.Sc. Thesis Sharif University of Technology Qahremani, Sina (Author) ; Sadati, Nasser (Supervisor)
    The use of wheeled mobile robots is growing in industry, transportation, space, defense, and many other fields. These robots are used to carry out distinct operations and tasks such as exploring the surface of the earth and other planets, serving in public places, assisting in natural disasters, warehousing, and so forth. In some cases, a single robot may not be able to perform the assigned mission as intended. In this case, several robots work together to execute a particular mission. Research topics currently under investigation include how robots interact as a multi-agent system in order to perform the...

    Design and Implementation of an Intelligent Control System Based-on Deep Reinforcement Learning for a Lower-limb Hybrid Exoskeleton Robot

    , M.Sc. Thesis Sharif University of Technology Koushki, Amir Reza (Author) ; Vossoughi, Gholamreza (Supervisor) ; Boroushaki, Mehrdad (Supervisor)
    Hybrid exoskeletons refer to the simultaneous use of wearable robots and functional electrical stimulation technology. Hybrid exoskeletons have many advantages compared to the separate application of each of these technologies, such as reducing the robot’s energy consumption and the need for lighter and cheaper actuators for the robot, using human muscle power, and reducing muscle fatigue. As a result, these robots have recently attracted a lot of interest in rehabilitation applications for patients suffering from mobility impairment. Control in hybrid exoskeletons is more complicated than control in traditional exoskeletons because, in addition to robot and functional electrical stimulation...

    Design and Implementation of a Collision Avoidance Module in Dynamic Environment with Deep Reinforcement Learning on Arash Social Robot

    , M.Sc. Thesis Sharif University of Technology Norouzi, Mostafa (Author) ; Meghdari, Ali (Supervisor) ; Taheri, Alireza (Supervisor) ; Soleymani, Mahdieh (Co-Supervisor)
    Nowadays, one of the challenges in social robotics is to navigate the robot in social environments with moving elements such as humans. The purpose of this study is to navigate the Arash 2 social robot in a dynamic environment autonomously without colliding with moving obstacles (humans). The Arash 2 robot was first simulated in the Gazebo simulator environment in this research. The simultaneous localization and mapping (SLAM) technique was implemented on the robot using a lidar sensor to obtain an environment map. Then, using the deep reinforcement learning approach, the neural network developed in the simulation environment was trained and implemented on the robot in the real environment. The...

    Using a Deep Reinforcement Learning Agent for Lane Direction Control

    , M.Sc. Thesis Sharif University of Technology Zare Hadesh, Ashkan (Author) ; Nasiri, Habibollah (Supervisor)
    In recent years, with the progress of technology in different areas, the production of self-driving cars has become feasible. We can expect that traffic in transportation networks will consist of both self-driving and regular cars in the future. In this research, a new method is proposed for urban transportation networks to change the direction of reversible lanes according to the network's state. These reversible lanes are exclusive to self-driving cars. Human drivers are not allowed to enter these reversible lanes, considering the limitations of human ability compared to a computer in analyzing data and making decisions about moving direction. To achieve this goal, reinforcement...