Document No.:
13692
Call No.:
12443
Author:
Agha Ebrahimi, Sepideh
Title:

Reinforcement Learning with Parameterized Actions for Half Field Offense in the 3D Simulation of Soccer Robots

Degree:
Master of Science (M.Sc.)
Specialization:
Artificial Intelligence and Robotics
Place of study:
Isfahan: Isfahan University of Technology, Department of Electrical and Computer Engineering
Defense year:
1397 (2018)
Pagination:
eight, 69 p.: illustrated, tables, diagrams
Supervisor:
Maziar Palhang
Descriptors:
Machine Learning, Reinforcement Learning, Parameterized Actions, Robotics, 3D Simulation, RoboCup, Half Field Offense
Examiners:
Maryam Zekri, Mehran Safayani
Date of entry:
1397/05/08
Bibliography:
Includes bibliography
Discipline:
Electrical and Computer Engineering
Department:
Electrical and Computer Engineering
Irandoc code:
ID12443
English abstract:
Abstract
Problems in the robotics domain are usually characterized by high-dimensional continuous state and action spaces. Partially observable and noisy states are accessible rather than the actual state. Controlling autonomous robots in this domain is consequently challenging. The idea of interacting with the environment to autonomously find an optimal behavior is the essence of reinforcement learning. RoboCup Soccer is a commonly used testbed for reinforcement learning methods. The continuous multi-dimensional state space, noisy actions and perceptions, and high uncertainty of this environment make reinforcement learning appropriate to use in this domain. Many tasks have so far been learned in this domain with these methods. Keepaway and Half Field Offense are two instances that incorporate suitable tasks for reinforcement learning. Such tasks have mostly been learned in 2D soccer. Because of an additional dimension and physical constraints, learning is much more difficult in 3D soccer. Applying 2D-environment algorithms in 3D space faces new challenges. Extending Keepaway from 2D soccer to 3D soccer is an example of such efforts done so far.
Reinforcement learning problems typically feature discrete or continuous action spaces. Parameterizing each discrete action with continuous parameters makes it possible to fine-tune actions in different situations. Learning in such parameterized action spaces is complicated by the necessity of dealing with continuous parameters. However, it provides the most fine-grained control over the agent's behavior. One method of learning in this domain is to define separate policies for the discrete actions and the continuous parameters of each action, and then alternate learning these policies from interaction with the environment.
In this study, a single-agent task of Half Field Offense is learned in a parameterized action space in the domain of 3D soccer simulation. The agent must learn to maintain possession of the ball while it makes its way towards the goal and finally score on the goal at an appropriate time. The performance of the agent is evaluated by the number of goals scored. One of the reasons that makes this study important is that this task has never been implemented in the 3D environment before. Furthermore, making use of a parameterized action space and learning two separate policies for discrete actions and continuous parameters entails using value-based methods along with policy-search methods in an environment of such great complexity. Final results demonstrate that despite a large state space and noisy, intricate actions, the agent succeeds in learning these two policies. The agent has been successful in maintaining an uptrend in the number of goals scored in different test scenarios.
Keywords: Machine Learning, Reinforcement Learning, Parameterized Actions, Robotics, 3D Simulation, RoboCup, Half Field Offense
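The two-policy scheme the abstract describes, a value-based policy choosing the discrete action and a separate policy filling in that action's continuous parameters, can be illustrated with a minimal sketch. The action names, parameter ranges, and placeholder policies below are assumptions for illustration only, not the thesis implementation:

```python
import random

# Illustrative parameterized action space for a soccer agent: each
# discrete action carries its own bounded continuous parameters.
# (Action names and ranges are hypothetical, loosely styled on
# soccer-simulation commands.)
ACTIONS = {
    "dash": [("power", 0.0, 100.0), ("direction", -180.0, 180.0)],
    "turn": [("direction", -180.0, 180.0)],
    "kick": [("power", 0.0, 100.0), ("direction", -180.0, 180.0)],
}


def select_action(state, q_values, param_policy, rng, epsilon=0.1):
    """Two-policy action selection: epsilon-greedy over Q-values picks
    the discrete action; a separate parameter policy proposes values
    for that action's continuous parameters, clipped to legal ranges."""
    names = sorted(ACTIONS)
    if rng.random() < epsilon:
        name = rng.choice(names)          # explore the discrete choice
    else:
        name = max(names, key=lambda a: q_values(state, a))  # exploit
    params = {}
    for pname, lo, hi in ACTIONS[name]:
        raw = param_policy(state, name, pname)
        params[pname] = min(hi, max(lo, raw))  # clip to [lo, hi]
    return name, params


# Placeholder policies for the sketch: a fixed Q-function and a
# parameter policy that always proposes the mid-range value. In the
# actual method both would be learned from interaction.
def q_values(state, action):
    return {"dash": 0.2, "turn": 0.1, "kick": 0.5}[action]


def param_policy(state, action, pname):
    lo, hi = next((l, h) for p, l, h in ACTIONS[action] if p == pname)
    return (lo + hi) / 2.0


rng = random.Random(0)
name, params = select_action(None, q_values, param_policy, rng, epsilon=0.0)
print(name, params)  # with epsilon=0, "kick" has the highest Q-value
```

Keeping the discrete choice and the continuous parameters in separate policies is what lets a value-based method handle the former while a policy-search method handles the latter, as the abstract notes.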