Behavior Learning of a Simulated Goalie Humanoid Soccer Robot Using Deep Deterministic Reinforcement Learning

Document number:

14716

Call number:

13245

Author:

Ashkoofaraz, Yaghoub

Title:

Behavior Learning of a Simulated Goalie Humanoid Soccer Robot Using Deep Deterministic Reinforcement Learning

Degree level:

Master of Science (M.Sc.)

Specialization:

Artificial Intelligence and Robotics

Place of study:

Isfahan: Isfahan University of Technology

Year of defense:

1397 (Solar Hijri)

Pagination:

thirteen, 93

Supervisor:

Maziar Palhang

Descriptors:

Humanoid Soccer Robot, Three-Dimensional Soccer Simulation, Goalie Challenge, Deep Reinforcement Learning, Policy Gradient, Deep Neural Networks, Behavior Learning, Continuous Control

Examiners:

Mehran Safayani, Mohammad Ali Khosravifard

Record entry date:

1398/04/12

Bibliography:

Includes bibliography

Field of study:

Computer Engineering

Faculty:

Electrical and Computer Engineering

Record edit date:

1398/04/12

IranDoc code:

2543718

English abstract:

Behavior Learning of Goalie Humanoid Soccer Simulated Robot Using Deep Deterministic Reinforcement Learning

Seyed Yaghoub Ashkoofaraz
February 6, 2019
Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran
Degree: Master of Science (M.Sc.)
Language: Farsi
Supervisor: Maziar Palhang, Assoc. Prof.

Abstract

This research applies reinforcement learning (RL) and neural networks to behavior learning for a goalie humanoid robot in a three-dimensional soccer simulation environment. RL is a branch of machine learning in which an agent chooses actions in an unknown environment so as to maximize the cumulative reward. One of the most important goals of development in robotics and artificial intelligence is for a team of autonomous humanoid robots to win a soccer game against a human team. Among all of the agent's behaviors, the goalie's behavior in a soccer match is an important problem. RL in an environment with continuous states and actions provides a suitable method for learning the agent's behavior at any time. With the advances made in this field, the goalie humanoid robot has been able to stop many more ground shots using designed controllers and RL. Achieving better performance requires a method that controls the agent's behavior so that it responds properly in a more complex environment and to a variety of shots, including aerial shots. Traditional RL algorithms, however, are inefficient in environments with the following two attributes: (1) high-dimensional state spaces, such as the pixels of camera images; (2) high-dimensional continuous action spaces.

This research tackles the goalie problem with an RL algorithm in which two asynchronous RL learners are used to achieve better performance. Performance on this problem is measured as the number of shots stopped by the goalie in the goalie challenge. Recently, powerful RL methods such as deep RL and RL with an actor-critic architecture based on the policy gradient method have been proposed to solve robot control problems over a wide range of action spaces. Using these two methods and deep neural networks with a more robust network architecture, a new hybrid method is proposed that can solve continuous control problems.

In this research, the goalie problem for the humanoid soccer robot is first modeled using two reinforcement learners. To determine the state of the environment, a method is proposed to predict the trajectory of the ball. The skill description language is then used to design skills such as diving, so that the goalie covers more area, and the action space is specified. The two reinforcement learners are then combined to control the behavior of the goalie humanoid robot. Finally, it is shown that the RL agent is more efficient at stopping ground and aerial shots than the methods implemented by top teams.

Keywords: Humanoid Soccer Robot, Three-Dimensional Soccer Simulation, Goalie Challenge, Deep Reinforcement Learning, Policy Gradient, Deep Neural Networks, Behavior Learning, Continuous Control
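The abstract states that the ball's trajectory is predicted in order to determine the state of the environment, but this record does not describe how. As a rough illustration only, the sketch below assumes a drag-free ballistic model and estimates where a shot crosses the goal plane. The function name `predict_goal_line_crossing`, its parameters, and the coordinate convention (x toward the goal, y lateral, z up) are all hypothetical and are not taken from the thesis.

```python
G = 9.81  # gravitational acceleration in m/s^2; air drag is ignored (assumption)


def predict_goal_line_crossing(pos, vel, goal_x):
    """Estimate when and where the ball crosses the plane x = goal_x.

    pos and vel are (x, y, z) tuples in meters and meters per second.
    Returns (t, y, z), or None if the ball is not heading toward the plane.
    """
    x0, y0, z0 = pos
    vx, vy, vz = vel
    if vx == 0:
        return None  # no motion along x, so the plane is never reached
    t = (goal_x - x0) / vx
    if t < 0:
        return None  # the ball is moving away from the goal plane
    y = y0 + vy * t
    z = z0 + vz * t - 0.5 * G * t * t
    # Crude simplification: a ball that would have landed is reported at ground
    # level; bounces and rolling friction are not modeled.
    return t, y, max(z, 0.0)


if __name__ == "__main__":
    # Hypothetical aerial shot taken 15 m in front of a goal located at x = -15.
    print(predict_goal_line_crossing((0.0, 0.0, 0.0), (-10.0, 1.0, 8.0), -15.0))
```

In a setup like the one the abstract outlines, the predicted crossing time and position could serve as part of the state passed to the learners, which would then select a skill such as a dive. That wiring is an assumption here, not something this record confirms.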

Link to this record:

https://library.iut.ac.ir/dL/search/default.aspx?Term=14716&Field=0&DTC=107
