Search for: q-learning

    Optimizing Replenishment and Pricing in a Vendor-managed Inventory Supply Chain When Customers Negotiate

    M.Sc. Thesis, Sharif University of Technology; Bagherirad, Sonia (Author); Modarres Yazdi, Mohammad (Supervisor)
    Abstract
    In this study, the vendor-managed inventory (VMI) policy in supply chains is investigated, and a formulation is developed to optimize replenishments from vendor to retailer as well as the price offered to negotiating customers. We consider a two-echelon supply chain consisting of a vendor and a retailer managed according to a VMI policy. The goal is to find the optimal replenishment from vendor to retailer at the beginning of the month, using a dynamic programming approach to maximize the supply chain profit. Demand is stochastic and is assumed to follow a Poisson distribution with an unknown parameter. We place a Gamma distribution on this parameter, whose parameters are learned in dynamic...

    Dynamic Pricing of Charter Flight Tickets with Learning

    M.Sc. Thesis, Sharif University of Technology; Mehrdar, Atabak (Author); Modarres, Mohammad (Supervisor)
    Abstract
    In this thesis, an approach is developed to obtain an optimal pricing policy for charter flights. To this end, a dynamic programming model is presented and its main structure is analyzed. Since the dimension of this model is very large in real-world cases, a solution method based on the “Q-learning” technique is developed. This is an appropriate approach within approximate dynamic programming and reinforcement learning. The analysis is carried out under two different assumptions regarding demand, namely “linear-deterministic” and probabilistic demand for transition probabilities. An exact solution for the deterministic demand case is developed. Furthermore, for...

    Cancer Simulation with Markov Decision Process

    M.Sc. Thesis, Sharif University of Technology; Zarepour, Fariborz (Author); Habibi, Jafar (Supervisor)
    Abstract
    Cancer refers to a class of diseases that arise from the abnormal growth of cells and their invasion of normal cells in the human body, and that annually cause a considerable percentage of deaths worldwide. Because cancer can be considered a complex system, various models have been presented for modeling and simulating its behavior, using methods such as cellular automata, agent-based modeling, and game theory. Multi-agent simulation, a special kind of agent-based modeling, is used to simulate real-world phenomena that usually contain many different components interacting in varied and complex ways. Since the cells are located in...

    Optimal Control of Unknown Interconnected Systems via Distributed Learning

    M.Sc. Thesis, Sharif University of Technology; Farjadnasab, Milad (Author); Babazadeh, Maryam (Supervisor)
    Abstract
    This thesis addresses the problem of optimal distributed control of unknown interconnected systems. To deal with this problem, a data-driven learning framework for finding the optimal centralized and the suboptimal distributed controllers has been developed via convex optimization. First, the linear quadratic regulation (LQR) problem is formulated as a nonconvex optimization problem. Using Lagrangian duality theory, a semidefinite program is then developed that requires information about the system dynamics. It is shown that the optimal solution to this problem is independent of the initial conditions and represents the Q-function, an important concept in reinforcement...

    Reinforcement Learning Approach in Self-Assembly Systems to Acquire Desired Structures

    M.Sc. Thesis, Sharif University of Technology; Ravari, Amir Hossein (Author); Bagheri Shouraki, Saeed (Supervisor)
    Abstract
    Self-Assembly (SA) plays a critical role in the formation of different phenomena in nature. It can be defined as the arrangement of meaningful patterns through the aggregate behavior of simpler structures. One example of self-assembly is the formation of ice crystals from water molecules. Previous works mainly focus on graph grammars and self-assembly in fully observable environments. These algorithms consist of two main stages: first, constructing simpler structures, and then joining these simpler structures to form a complex structure. The challenges of previous work include the necessity of a central controller in the formation of...

    Optimal Design and Intelligent Control of Polymer Electrolyte Membrane Fuel Cell Stack

    M.Sc. Thesis, Sharif University of Technology; Ahmadi, Mohammad Reza (Author); Boroushaki, Mehrdad (Supervisor)
    Abstract
    We present an analysis of controlling Polymer Electrolyte Membrane Fuel Cells (PEMFCs) using the Q-learning algorithm, the most widely known reinforcement learning (RL) technique. The method trains the controller to guide and sustain the fuel cell power output at the 2.5 kW mark by manipulating elements of the reaction subsystem, including the fuel cell current, the relative humidity, and the anode/cathode pressures. As the Q-learning algorithm needs to be implemented within a fuel cell simulation environment, the mathematical model known as the Amphlett steady-state model of the PEM fuel cell was employed. The semi-empirical nature of this model necessitates the...