Search for: deep-reinforcement-learning
Total 28 records

    Active learning of causal structures with deep reinforcement learning

    , Article Neural Networks ; Volume 154 , 2022 , Pages 22-30 ; 08936080 (ISSN) Amirinezhad, A ; Salehkaleybar, S ; Hashemi, M ; Sharif University of Technology
    Elsevier Ltd  2022
    Abstract
    We study the problem of experiment design to learn causal structures from interventional data. We consider an active learning setting in which the experimenter decides to intervene on one of the variables in the system in each step and uses the results of the intervention to recover further causal relationships among the variables. The goal is to fully identify the causal structures with the minimum number of interventions. We present the first deep reinforcement learning based solution for the problem of experiment design. In the proposed method, we embed input graphs into vectors using a graph neural network and feed them to another neural network which outputs a variable for performing... 

    Computation offloading strategy for autonomous vehicles

    , Article 27th International Computer Conference, Computer Society of Iran, CSICC 2022, 23 February 2022 through 24 February 2022 ; 2022 ; 9781665480277 (ISBN) Farimani, M. K ; Karimian Aliabadi, S ; Entezari Maleki, R ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2022
    Abstract
    Vehicular edge computing is a progressing technology which provides processing resources to the internet of vehicles using the edge servers deployed at roadside units. Vehicles take advantage by offloading their computation-intensive tasks to this infrastructure. However, concerning time-sensitive applications and the high mobility of vehicles, cost-efficient task offloading is still a challenge. This paper establishes a computation offloading strategy based on a deep Q-learning algorithm for vehicular edge computing networks. To jointly minimize the system cost including offloading failure rate and the total energy consumption of the offloading process, the vehicle tasks offloading problem is... 

    Virtual hardware-in-the-loop FMU co-simulation based digital twins for heating, ventilation, and air-conditioning (HVAC) systems

    , Article IEEE Transactions on Emerging Topics in Computational Intelligence ; 2022 , Pages 1-11 ; 2471285X (ISSN) Abrazeh, S ; Mohseni, S ; Zeitouni, M. J ; Parvaresh, A ; Fathollahi, A ; Gheisarnejad, M ; Khooban, M ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2022
    Abstract
    In this paper, a novel self-adaptive control method based on a digital twin is developed and investigated for a multi-input multi-output (MIMO) nonlinear system, which is a heating, ventilation, and air-conditioning system. For this purpose, hardware-in-loop (HIL) and software-in-loop (SIL) are integrated to develop the digital twin control concept in a straightforward manner. A nonlinear integral backstepping (NIB) model-free control technique is integrated with the HIL (implemented as a physical controller) and SIL (implemented as a virtual controller) controllers to control the HVAC system without the need for dynamic feature identification. The main goal is to design the virtual... 

    Safe Path Planning for Cooperative Mobile Robots Based on Deep Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Kazemi Tameh, Ehsan (Author) ; Khodaygan, Saeed (Supervisor)
    Abstract
    Nowadays, with the remarkable development of the robotics industry, there is an increasing demand for mobile robots. Mobile robots can be deployed individually or in groups for various tasks such as autonomous warehouses, search and rescue operations, firefighting operations, and maintenance and repairs. It is evident that performing certain tasks, such as moving large and long objects or firefighting operations, is more efficient when robots are deployed cooperatively, and in some cases, these tasks cannot be accomplished by a single robot alone. Therefore, in recent years, the issue of path planning for cooperative robots has received significant attention. By cooperation, we mean that... 

    Multimodal Image Registration using Reinforcement Learning-based Methods

    , M.Sc. Thesis Sharif University of Technology Sabour, Amir Hossein (Author) ; Fatemizadeh, Emadeddin (Supervisor)
    Abstract
    Image registration is the process of estimating and applying a spatial transformation to a moving image with the aim of spatially aligning it with a fixed image. This allows for the combination of images with complementary information, such as images with different modalities, acquisition times, and even coming from separate individuals, with the purpose of producing more information-rich results. Image registration is a crucial step in many medical applications, such as analyzing the growth and changes of tissue and tumors, preoperative planning, image-guided surgery, radiation therapy planning and various segmentation tasks. Reinforcement learning is a science and mathematical paradigm for... 

    A Reinforcement Learning Framework for Portfolio Management Problem Leveraging Stocks Historical Data And Their Correlation

    , M.Sc. Thesis Sharif University of Technology Taherkhani, Hamed (Author) ; Fazli, Mohammad Amin (Supervisor)
    Abstract
    Over the past few years, deep reinforcement learning (DRL) has been given a lot of attention in finance for portfolio management. With the help of experts’ signals and historical price data, we have developed a new reinforcement learning (RL) method. The use of experts’ signals in tandem with DRL has been used before in finance, but we believe this is the first time this method has been used to solve the financial portfolio management problem. As our agent, we used the Proximal Policy Optimization (PPO) algorithm to process the reward and take actions in the environment. Our framework comprises a convolutional network to aggregate signals, a convolutional network for historical price data, and... 

    A Stock Portfolio Management Algorithm Based on Fundamental Market Data for Tehran’s Stock Exchange – Case Study on Mining & Metal Industries

    , M.Sc. Thesis Sharif University of Technology Zarei, Mohammad (Author) ; Habibi, Moslem (Supervisor)
    Abstract
    The aim of this research is to develop and implement a deep reinforcement learning algorithm for portfolio management in the Tehran stock market, which is considered an emerging market with distinct patterns compared to the stock markets of developed countries. In this study, in addition to the market price data extensively used in previous research, we leverage fundamental ratio data extracted from company financial reports, which have received less attention. Furthermore, the research scope is limited to stocks in the mining and metal industries to enable the utilization of specific industry features, such as susceptibility to global prices of a key commodity. The portfolio management... 

    Learning Methods in Predicting the Outcome of Repeated Games

    , Ph.D. Dissertation Sharif University of Technology Vazifedan, Afrooz (Author) ; Izadi, Mohammad (Supervisor)
    Abstract
    The main goal of this research is to investigate different types of learning methods used on repeated games. A repeated game is a model for describing all kinds of frequent activities among humans or intelligent machines. Applying learning models to repeated games is the intersection of game theory and machine learning fields. In the field of game theory, this research is explored under the title of behavioral game theory, where the purpose is to predict the behavior of human beings in repeated games in static environments (without states) and study how they select their actions. In the field of machine learning, repeated games are referred to as multi-agent systems and include problems... 

    Algorithmic Trading Using Deep Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Majidi, Naseh (Author) ; Marvasti, Farohk (Supervisor)
    Abstract
    Price movement prediction has always been one of the traders’ concerns in the field of financial market prediction. In order to increase the profit of the trades, the traders can process the historical data and predict the movement. The large size of the data and complex relations between them lead us to use algorithmic trading and artificial intelligence. The stock and cryptocurrency markets are two common markets attracting traders. This thesis aims to offer an approach using Twin-Delayed DDPG (TD3) and daily close price in order to achieve a trading strategy. Unlike the previous studies using a discrete action space reinforcement learning algorithm, TD3 is a continuous one offering both... 

    Adaptive Maneuvers for Aircraft Conflict Resolution Using Learning Theory

    , M.Sc. Thesis Sharif University of Technology Mamizadeh, Zahra (Author) ; Malaek, Mohammad Bagher (Supervisor)
    Abstract
    The problem of detection and resolution of aircraft collisions is very important due to the increasing demand for flights. Many algorithms have been developed in the past to increase automation in air traffic management and reduce the workload of air traffic controllers. These algorithms either have difficulty in generalizing to real problems or have high computational costs and do not correspond to the reality of the actual maneuvering characteristics of the aircraft performance. The aim of the present study is to obtain dynamic maneuvers that are adaptive with reality and also optimal in terms of utilizing the capacity of flight sectors, so we propose Deep Reinforcement Learning (DRL) based on... 

    Meta Reinforcement Learning for Domain Generalization

    , M.Sc. Thesis Sharif University of Technology Riyahi Madvar, Maryam (Author) ; Rohban, Mohammad Hossein (Supervisor)
    Abstract
    Deep reinforcement learning has achieved better cumulative rewards than humans in many environments like Atari. One drawback of these methods is their data inefficiency, which makes training time-consuming, and in some cases having this amount of data is infeasible. Meta reinforcement learning can use past experiences to enable agents to adapt to new tasks faster and allows neural networks to train in a short amount of time. One of the methods in meta reinforcement learning is inferring tasks, which helps the exploitation policy to have good performance in new tasks. There’s a need to improve the exploration policy as well as the exploitation policy by gaining informative transitions about the new task.... 

    Optimal Control of a Quadcopter in Fast Descending Maneuvers Based on Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Azadi, Majid (Author) ; Fallah Rajabzadeh, Famida (Supervisor) ; Zohoor, Hassan (Supervisor) ; Nejat Pishkenari, Hossein (Co-Supervisor)
    Abstract
    Quadrotors are limited in performing fast descent maneuvers by the Vortex Ring State (VRS) region, which makes the quadrotor unstable. To avoid entering the VRS, a velocity constraint is imposed that must be satisfied during the maneuver to guarantee a safe and stable fast descent. The purpose of this thesis is to overcome this limitation in the quadrotor's speed space in order to reduce the time of fast descending maneuvers by using reinforcement learning techniques. A new cascade controller is proposed which uses PID in the inner loop as a low-level controller and DDPG, a reinforcement learning technique, in the outer loop as a high-level controller in order to... 

    Design of a HEV’s Controller Using Learning-based Methods

    , M.Sc. Thesis Sharif University of Technology Zare, Aramchehr (Author) ; Boroushaki, Mehrdad (Supervisor)
    Abstract
    Hybrid electric vehicles (HEV) are proving to be one of the most promising innovations in advanced transportation systems to reduce air pollution and fossil fuel consumption. EMS is one of the most vital aspects of the HEV powertrain system. This research aims to design an optimal EMS under the condition of meeting the goals of drivability control, fuel consumption reduction, and battery charge stability. The current EMS is based on the classical rule-based method derived from fuzzy logic, which leads to a suboptimal solution in episodic driving cycles. Previous experiences in implementing Reinforcement Learning (RL) suffer from late convergence, instability in tracking the driving... 

    A Novel Resource Allocation Algorithm in Edge Computing with Deep Reinforcement Learning

    , M.Sc. Thesis Sharif University of Technology Rahmati, Iman (Author) ; Movaghar, Ali (Supervisor)
    Abstract
    With the explosion of mobile smart devices, many computation-intensive applications have emerged, such as interactive gaming and augmented reality. Mobile edge computing (EC) is put forward, as an extension of cloud computing, to meet the low-latency requirements of the applications. In mobile edge computing systems, an edge node may have a high load when a large number of mobile devices offload their tasks to it. Those offloaded tasks may experience large processing delay or even be dropped when their deadlines expire. Due to the uncertain load dynamics at the edge nodes, it is challenging for each device to determine its offloading decision (i.e., whether to offload or not, and which... 

    Supply Chain Optimization with Perishable Products Through Demand Forecasting by a Reinforcement Learning Algorithm

    , M.Sc. Thesis Sharif University of Technology Shams Shemirani, Sadaf (Author) ; Khedmati, Majid (Supervisor)
    Abstract
    Using an efficient method to manage inventory systems is always a challenging issue in supply chain optimization. In supply chains including perishable goods, it is possible to reduce waste and other costs by identifying uncertain demand patterns and managing inventory levels at different stages of the supply chain. Considering the uncertainty and complex conditions of supply chains in the real world, in order to create a suitable model to express these conditions, various uncertain factors must be considered, each of which affects the supply chain inventory level in some way. In this research, a multi-level perishable supply chain model with uncertain demand, lead time and deterioration... 

    An Application of Deep Reinforcement Learning for Ambulance Allocation to Emergency Departments under Overcrowding Situation

    , M.Sc. Thesis Sharif University of Technology Taher Gandomabadi, Mohammad Mahdi (Author) ; Akhavan Niaki, Taghi (Supervisor)
    Abstract
    In the last decade, emergency department (ED) overcrowding has become a national crisis for the US healthcare system. Increasing mortality rates, decreasing quality of care, financial losses due to walkouts, and ambulance diversion are some of the consequences of ED overcrowding. Given the increasing demand for ambulance utilization, an instance of which was seen in the COVID-19 pandemic, being able to allocate service requests to EDs efficiently becomes a key function of emergency medical services. In this investigation, a deep reinforcement learning algorithm called deep Q-learning is used to address this problem and to assign ambulances to EDs appropriately. Under... 

    An Application of Deep Reinforcement Learning in Novel Supply Chain Management Approaches for Inventory Control and Management of Perishable Supply Chain Network

    , M.Sc. Thesis Sharif University of Technology Mohammadi, Navid (Author) ; Akhavan Niaki, Taghi (Supervisor)
    Abstract
    This study proposes a deep reinforcement learning approach to solve a perishable inventory allocation problem in a two-echelon supply chain. The inventory allocation problem is studied considering the stochastic nature of demand and supply. The examined supply chain includes two retailers and one distribution center (DC) under a vendor-managed inventory (VMI) system. This research aims to minimize the waste and shortages occurring at the retailers' sites in the examined supply chain. Given the continuous action space of the considered inventory allocation problem, the Advantage Actor-Critic algorithm is implemented to solve the problem. Numerical experiments are implemented on... 

    Data-driven Methods for Cooperative Control of Wheeled Mobile Robots

    , M.Sc. Thesis Sharif University of Technology Qahremani, Sina (Author) ; Sadati, Nasser (Supervisor)
    Abstract
    The use of wheeled mobile robots is growing in industry, transportation, the space and defense sectors, and many other fields. These robots are used to execute distinct forms of operations and tasks such as exploring the surface of the earth and other planets, serving in public places, assisting in natural disasters, warehousing, and so forth. In some cases, the assigned mission cannot be performed as intended by a single robot. In this case, several robots work together to execute a particular mission. Several research topics that are under investigation currently include the interacting procedure of robots as a multi-agent system in order to perform the... 

    Learning-based Control System Design for the Bipedal Running Robot and Development of a Two-layer Framework for Generating the Optimal Paths in Various Movement Maneuvers

    , M.Sc. Thesis Sharif University of Technology Amiri, Aref (Author) ; Salarieh, Hassan (Supervisor)
    Abstract
    Foot movement is one of the most powerful and adaptable methods of movement in nature. Inspired by humans, the most intelligent creatures on earth, bipedal robots have many uses. In this research, a control method for running a bipedal robot has been designed. In the simulation part of the five-link model, the robot's motion equations for running and walking at different levels are extracted by the Lagrange method. In path generation, using the two-layer optimization method and holonomic and dynamic constraints, optimal paths are produced which are kinematically and dynamically possible (feasible). Additionally, path generation is facilitated by an invariant impact constraint to ensure the... 

    Design and Implementation of an Intelligent Control System Based on Deep Reinforcement Learning for a Lower-limb Hybrid Exoskeleton Robot

    , M.Sc. Thesis Sharif University of Technology Koushki, Amir Reza (Author) ; Vossoughi, Gholamreza (Supervisor) ; Boroushaki, Mehrdad (Supervisor)
    Abstract
    Hybrid exoskeletons refer to the simultaneous use of wearable robots and functional electrical stimulation technology. Hybrid exoskeletons have many advantages compared to the separate application of each of these technologies, such as reducing the robot’s energy consumption and the need for lighter and cheaper actuators for the robot, using human muscle power, and reducing muscle fatigue. As a result, these robots have recently attracted a lot of interest in rehabilitation applications for patients suffering from mobility impairment. Control in hybrid exoskeletons is more complicated than control in traditional exoskeletons, because in addition to robot and functional electrical stimulation...