Repository of the University of Ljubljana
Use of relaxed stochastic controls in reinforcement learning : magistrsko delo
Rems, Jan (Author), Agram, Nacira (Mentor), Košir, Tomaž (Co-mentor)
PDF - Presentation file (794,72 KB)
MD5: F758F2DEA8454D2B99EEBBEA38C1DBB4
Abstract
In this work, we investigate how relaxed stochastic controls are used for exploration in continuous time and space reinforcement learning. The environment $X^u$ is modeled by a stochastic differential equation controlled by the control $u$, while the value function $V^u$ is an infinite horizon performance functional. For a relaxed control distribution $\pi$ we introduce relaxed versions of the environment, $X^\pi$, and of the value function, $V^\pi$.
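The abstract does not spell out the equations. For orientation only, one common way to write such a setup is sketched below. The drift $b$, diffusion $\sigma$, reward $r$, discount rate $\rho$, exploration weight $\lambda$, Brownian motion $W$, and the entropy form of the exploration term are all assumptions for illustration and may differ from the definitions used in the thesis.

```latex
% Assumed generic form: classical controlled dynamics and value function,
% followed by their relaxed counterparts under a control distribution \pi.
\begin{align*}
dX^u_t &= b(X^u_t, u_t)\,dt + \sigma(X^u_t, u_t)\,dW_t, \\
V^u(x) &= \mathbb{E}\left[\int_0^\infty e^{-\rho t}\, r(X^u_t, u_t)\,dt \,\middle|\, X^u_0 = x\right], \\
dX^\pi_t &= \tilde b(X^\pi_t, \pi_t)\,dt + \tilde\sigma(X^\pi_t, \pi_t)\,dW_t, \\
\tilde b(x, \pi) &= \int b(x, u)\,\pi(u)\,du, \qquad
\tilde\sigma(x, \pi) = \sqrt{\int \sigma^2(x, u)\,\pi(u)\,du}, \\
V^\pi(x) &= \mathbb{E}\left[\int_0^\infty e^{-\rho t}
  \left(\int r(X^\pi_t, u)\,\pi_t(u)\,du
  - \lambda \int \pi_t(u)\ln \pi_t(u)\,du\right) dt \,\middle|\, X^\pi_0 = x\right].
\end{align*}
```

In this sketch the relaxed environment averages the classical coefficients over $\pi$, and the term weighted by $\lambda$ rewards more spread-out (more exploratory) control distributions.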
In the special linear-quadratic case, the optimal control distribution turns out to be Gaussian, with mean depending on the current state and variance depending on the exploration weight parameter. A reinforcement learning algorithm for the optimal investment strategy in a simple infinite-horizon model of a financial market is developed and tested.
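To make the shape of such a policy concrete, here is a minimal Python sketch of a one-dimensional linear state equation driven by a Gaussian exploratory policy. It is not the algorithm from the thesis: the function name, the gain `k`, the factor `var_scale`, and the coefficients `a`, `b`, `sigma` are hypothetical placeholders, and sampling one action per Euler-Maruyama step is only a discrete-time approximation of relaxed dynamics.

```python
import math
import random


def simulate_gaussian_policy(x0, k, lam, var_scale=1.0, a=-0.5, b=1.0,
                             sigma=0.2, dt=0.01, n_steps=1000, seed=0):
    """Simulate dX = (a*X + b*u) dt + sigma dW with u ~ N(k*X, lam*var_scale).

    All coefficients are illustrative assumptions, not values from the thesis.
    The policy mean depends on the current state; the policy variance depends
    only on the exploration weight lam.
    """
    rng = random.Random(seed)
    policy_std = math.sqrt(lam * var_scale)  # zero when lam == 0 (no exploration)
    xs = [x0]
    us = []
    for _ in range(n_steps):
        x = xs[-1]
        u = rng.gauss(k * x, policy_std)        # sample an action from the policy
        dw = rng.gauss(0.0, math.sqrt(dt))      # Brownian increment over dt
        xs.append(x + (a * x + b * u) * dt + sigma * dw)  # Euler-Maruyama step
        us.append(u)
    return xs, us


if __name__ == "__main__":
    xs, us = simulate_gaussian_policy(x0=1.0, k=-1.0, lam=0.25)
    print(f"final state: {xs[-1]:.4f}, last action: {us[-1]:.4f}")
```

Setting `lam=0.0` collapses the policy to the deterministic feedback `u = k * x`, while larger values of `lam` spread the sampled actions more widely around that mean.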
Language: English
Keywords: reinforcement learning, exploration, stochastic control theory, relaxed controls, dynamic programming, optimal investment strategy
Work type: Master's thesis/paper
Typology: 2.09 - Master's Thesis
Organization: FMF - Faculty of Mathematics and Physics
Year: 2021
PID: 20.500.12556/RUL-130550
UDC: 519.8
COBISS.SI-ID: 79333891
Publication date in RUL: 16.09.2021
Views: 1223
Downloads: 238
REMS, Jan, 2021, Use of relaxed stochastic controls in reinforcement learning : magistrsko delo [online]. Master’s thesis. [Accessed 14 April 2025]. Retrieved from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=eng&id=130550
Secondary language
Language: Slovenian
Title: Uporaba relaksiranih stohastičnih akcij v spodbujevalnem učenju
Abstract:
In this work, we look at how relaxed stochastic controls can be used to define exploration in reinforcement learning in continuous space and time. The environment $X^u$ is modeled by a stochastic differential equation controlled by the control $u$. The value function $V^u$ is a performance functional over an infinite time horizon. For a relaxed control $\pi$ we introduce the exploratory version of the environment, $X^\pi$, and the value function $V^\pi$.
In the special linear-quadratic case, the optimal relaxed control turns out to be Gaussian, where the expected value depends on the current state and the variance depends on the parameter that controls the level of exploration in the model. A reinforcement learning algorithm for predicting the optimal strategy in a simple model of a financial market with an infinite time horizon is presented.
Keywords: reinforcement learning, environment exploration, control theory of stochastic systems, relaxed stochastic controls, dynamic programming, optimal investment strategy