This thesis covers reinforcement learning for simulated humanoid robot walking using the proximal policy optimization (PPO) algorithm. First, the thesis describes the dynamic model of the robot, which is simulated in the Unity development platform where the reinforcement learning runs are performed with the ML-Agents package, along with the agent, its input and output states, and the reward function. The reference animations that the robot imitates in this research come from a publicly available data set that includes time series of the motion of individual body segments for several different walking patterns, the masses and heights of the recorded subjects, and, finally, walking speed, ground reaction force (GRF), and center of pressure (CoP) data; these are exported to the MATLAB environment, where they are analyzed and compared with the corresponding data of the learned robot model.
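To make this setup concrete, the following sketch shows how the agent's input and output states and a pose-imitation reward might be wired up in an ML-Agents agent. It is a minimal illustration under assumed names and constants (segments, referenceSegments, the 0.1 reward scale), not the thesis' actual implementation.

```csharp
// A minimal, hypothetical sketch of the agent's input/output states and an
// imitation reward in Unity ML-Agents. All field names and constants are
// illustrative assumptions, not the thesis' exact implementation.
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

public class GaitImitationAgent : Agent
{
    public Rigidbody[] segments;          // hypothetical body segments
    public Transform[] referenceSegments; // matching pose from the reference clip

    // Input state: orientation and velocity of every body segment.
    public override void CollectObservations(VectorSensor sensor)
    {
        foreach (var s in segments)
        {
            sensor.AddObservation(s.transform.localRotation); // 4 floats
            sensor.AddObservation(s.velocity);                // 3 floats
        }
    }

    // Output state: one continuous action per actuated joint.
    public override void OnActionReceived(ActionBuffers actions)
    {
        for (int i = 0; i < segments.Length; i++)
            ApplyJointTarget(segments[i], actions.ContinuousActions[i]);

        // Imitation reward: exponentially decaying pose error between the
        // robot and the reference animation (a DeepMimic-style objective).
        float poseError = 0f;
        for (int i = 0; i < segments.Length; i++)
            poseError += Quaternion.Angle(segments[i].transform.localRotation,
                                          referenceSegments[i].localRotation);
        AddReward(Mathf.Exp(-0.1f * poseError));
    }

    // Placeholder: drive the joint (e.g. via ConfigurableJoint targets).
    void ApplyJointTarget(Rigidbody segment, float action) { /* not shown */ }
}
```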
The early termination and reference state initialization functionalities are then presented; they accelerate training and improve both learning and walking stability. The final and most extensive chapter presents the entire process of building the robot model and its gait imitation, from the first version to the last. The aim of that chapter is to explain the reasoning behind each model improvement and to demonstrate the achieved imitation of the gait, the CoP, and the GRF.
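As an illustration of these two functionalities, a minimal sketch in the same assumed ML-Agents setting might look as follows; the fall threshold, clip length, and helper names are hypothetical.

```csharp
// A minimal sketch of early termination and reference state initialization
// in a Unity ML-Agents Agent; thresholds and helpers are assumptions.
using Unity.MLAgents;
using UnityEngine;

public class WalkerEpisodeControl : Agent
{
    public Transform torso;                       // hypothetical root segment
    [SerializeField] float minTorsoHeight = 0.7f; // assumed fall threshold [m]
    [SerializeField] float clipLength = 1.2f;     // assumed clip duration [s]

    // Reference state initialization: start each episode at a random phase
    // of the reference clip so every part of the gait cycle is seen early.
    public override void OnEpisodeBegin()
    {
        float phase = Random.Range(0f, clipLength);
        SetPoseFromReference(phase);
    }

    void FixedUpdate()
    {
        // Early termination: end the episode as soon as the robot falls,
        // so no training time is spent in unrecoverable states.
        if (torso.position.y < minTorsoHeight)
        {
            SetReward(-1f); // optional failure penalty
            EndEpisode();
        }
    }

    // Placeholder: pose the articulation at the given reference phase.
    void SetPoseFromReference(float phase) { /* not shown */ }
}
```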