Document No.:
12716
Call No.:
11640
Author:
Vaghef, Ebrahim
Title:

Development of Q-learning algorithms to increase convergence speed and improve the control signal

Degree:
M.Sc.
Specialization:
Control
Institution:
Isfahan: Isfahan University of Technology, Department of Electrical and Computer Engineering
Year of Defense:
1396 (2017)
Pagination:
xi, 82 p.: illustrated, tables, color photographs, charts
Note:
Title page in Farsi and English
Supervisor:
Maryam Zekri
Descriptors:
Q-learning, Reinforcement learning, Inverted pendulum and cart (cart-pole), Lyapunov convergence
Examiners:
Farid Sheikholeslam, Mohammad Danesh
Record Entry Date:
1396/05/31
Bibliography:
Bibliography included
Field of Study:
Electrical and Computer Engineering
Faculty:
Electrical and Computer Engineering
IranDoc Code:
ID11640
English Abstract:
Development of Q-learning algorithms to increase convergence speed and improve control signals
Ebrahim Vaghef (e.vaghe@ec.iut.ac.ir)
May 2017
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran
Degree: M.Sc.
Language: Farsi
Supervisor: Prof. Maryam Zekri (m.zekri@cc.iut.ac.ir)

Abstract: One of the methods used to make control systems intelligent is reinforcement learning. Inspired by human intelligence and behavior, these algorithms give systems the capability to learn. Q-learning, as one of the reinforcement learning methods, learns how to control a system by trial and error. For the design and implementation of a Q-learning algorithm, it is necessary both to increase the speed of learning convergence and to improve the amplitude of the control signals. In this research, several approaches based on developing Q-learning are introduced to control the cart-pole system. In the first step, the convergence speed is improved by using the maximum reward in Q-learning; in addition, a value-function update condition inspired by delayed Q-learning increases the convergence speed further. In the second step, the k-nearest neighbor (k-NN) concept is incorporated into standard Q-learning to reduce the amplitude of the control signals applied to the system; convergence of the presented algorithm is verified by the Lyapunov method. In the third step, a new method combining delayed Q-learning and k-nearest neighbors is presented, and its convergence is proved by Lyapunov theory. A genetic algorithm is applied to obtain the optimal problem parameters, such as the learning rate, discount factor, update condition, number of neighbors, and parameter decay rate. Numerical simulations of the cart-pole system show faster convergence to a given radius around the equilibrium point of the closed-loop system; furthermore, the control signal amplitude is decreased in the k-nearest-neighbor-based methods. Finally, all three presented methods are compared with standard Q-learning, delayed Q-learning, and k-nearest neighbor methods for reinforcement learning, and conclusions are drawn.

Keywords: Q-learning, Reinforcement learning, Cart-pole, Lyapunov convergence
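The three variants described in the abstract all build on the standard tabular Q-learning update. The Python sketch below shows only that baseline update, with an optimistic maximum-reward initialization in the spirit of the first step; the environment, state discretization, and all parameter values are illustrative assumptions, and the thesis's delayed-update condition, k-NN modification, and genetic-algorithm tuning are not reproduced here.

    import numpy as np

    # Minimal sketch of tabular Q-learning (assumed baseline, not the thesis's exact design).
    rng = np.random.default_rng(0)

    N_STATES = 100   # assumed discretization of the cart-pole state space
    N_ACTIONS = 3    # e.g. push left / no force / push right
    ALPHA = 0.1      # learning rate (tuned by a genetic algorithm in the thesis)
    GAMMA = 0.95     # discount factor
    EPSILON = 0.1    # exploration rate

    # Optimistic initialization with the maximum attainable reward, one of the
    # convergence-speed ideas mentioned in the abstract (exact form is assumed).
    R_MAX = 1.0
    Q = np.full((N_STATES, N_ACTIONS), R_MAX / (1.0 - GAMMA))

    def step(state, action):
        """Placeholder environment returning (next_state, reward).
        A real experiment would integrate the cart-pole dynamics instead."""
        next_state = rng.integers(N_STATES)
        reward = 1.0 if next_state == N_STATES // 2 else 0.0
        return next_state, reward

    def choose_action(state):
        """Epsilon-greedy action selection over the tabular Q-values."""
        if rng.random() < EPSILON:
            return int(rng.integers(N_ACTIONS))
        return int(np.argmax(Q[state]))

    state = rng.integers(N_STATES)
    for _ in range(10_000):
        action = choose_action(state)
        next_state, reward = step(state, action)
        # Standard Q-learning temporal-difference update.
        td_target = reward + GAMMA * np.max(Q[next_state])
        Q[state, action] += ALPHA * (td_target - Q[state, action])
        state = next_state

In the thesis, this basic update is further modified: the value function is only updated when a delayed-Q-learning-style condition is met, and the k-NN variants smooth the action values over neighboring states to reduce the applied control signal amplitude.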