Repository of the University of Ljubljana
Details
Samkodirnik z naključnim gozdom : magistrsko delo
Makovecki, Tine (Author), Todorovski, Ljupčo (Mentor)
PDF - Presentation file (947.77 KB)
MD5: C6A53892DA563A7A6B47EE7473C8B545
Abstract
In machine learning, problems involving high-dimensional data sets are common, and the “curse of dimensionality” makes them hard to solve. Such problems are often tackled with dimensionality reduction methods. Autoencoders, usually built from neural networks, are a popular choice. Neural networks have two drawbacks: they require a lot of processor time, and their complexity gives the user very little insight into how they work. In this master's thesis we therefore aim to develop an autoencoder based on a random forest that would not have these drawbacks. To construct an autoencoder from a random forest, we select a set of leaves that together describe the data set as well as possible and combine them into an encoding vector. The autoencoder then encodes a sample according to its membership in the leaves of the constructed encoding vector. Two pieces of information are available for decoding: the decision tree paths that lead to the leaves in the encoding vector, and the stored predictions of the random forest. We use both to find the best possible reconstruction of the encoded samples. We test our autoencoder to determine the best parameter settings and compare its accuracy with that of neural network autoencoders. We find that it is currently less accurate than the standard approach and consider how it could be improved in the future.
Language: Slovenian
Keywords: machine learning, dimensionality reduction of data, autoencoders, random forests, artificial neural networks
Work type: Master's thesis/paper
Typology: 2.09 - Master's Thesis
Organization: FMF - Faculty of Mathematics and Physics
Year: 2021
PID: 20.500.12556/RUL-128385
COBISS.SI-ID: 69733379
Publication date in RUL: 10.07.2021
Views: 5480
Downloads: 144
Cite this work
MAKOVECKI, Tine, 2021, Samkodirnik z naključnim gozdom : magistrsko delo [online]. Master’s thesis. [Accessed 24 March 2025]. Retrieved from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=eng&id=128385
Secondary language
Language: English
Title: Autoencoder via random forest
Abstract:
In machine learning, problems with high-dimensional data sets are common and are difficult to solve because of the “curse of dimensionality”. To solve these problems, we usually apply dimensionality reduction methods. Autoencoders, which are usually built with neural networks, are a popular method of this kind. The downsides of neural networks are the high computational cost of training and a complexity that obscures the user's insight into how they work. To address these issues, we aim to develop an autoencoder that is based on random forests and does not have such problems. To construct an autoencoder from a random forest, we select a set of forest leaves that describe the data set well and save them into an encoding vector. We use the encoding vector to encode data samples. There are two types of information we can use to decode the data: the decision tree paths leading to the leaves in the encoding vector, and the saved predictions from the random forest. We combine the two to get the best possible reconstruction of the encoded data. We test the constructed autoencoder to tune the parameter settings and evaluate its performance in comparison to neural network autoencoders. We establish that, at this point, our autoencoder is significantly less accurate than common autoencoders, and we consider possibilities for improving it in the future.
Keywords: machine learning, dimensionality reduction, autoencoders, random forests, artificial neural networks
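The encoding and decoding steps outlined in the abstract can be illustrated with a small sketch. It is not the thesis implementation: the `Leaf` class, the `encode` and `decode` functions, the hand-written leaves, and the rule for combining the two sources of information (average the stored predictions of the active leaves, then clamp the result into the region implied by their tree paths) are all assumptions made for illustration. Leaf selection, which the abstract also mentions, is skipped here; the four leaves are simply given.

```python
from dataclasses import dataclass
from typing import List, Tuple

INF = float("inf")


@dataclass
class Leaf:
    # Per-feature interval (low, high] implied by the tree path to this leaf.
    bounds: List[Tuple[float, float]]
    # Stored prediction for the leaf (assumed here: one value per feature).
    prediction: List[float]

    def contains(self, x: List[float]) -> bool:
        return all(lo < v <= hi for v, (lo, hi) in zip(x, self.bounds))


def encode(x: List[float], leaves: List[Leaf]) -> List[int]:
    """Binary code: bit i is 1 when x falls into leaf i of the encoding vector."""
    return [1 if leaf.contains(x) else 0 for leaf in leaves]


def decode(code: List[int], leaves: List[Leaf], n_features: int) -> List[float]:
    """Reconstruct a sample from its code (assumed combination rule)."""
    active = [leaf for bit, leaf in zip(code, leaves) if bit]
    if not active:
        raise ValueError("no active leaf; cannot decode")
    recon = []
    for j in range(n_features):
        # Path information: intersect the intervals of all active leaves.
        lo = max(leaf.bounds[j][0] for leaf in active)
        hi = min(leaf.bounds[j][1] for leaf in active)
        # Prediction information: average the stored leaf predictions.
        guess = sum(leaf.prediction[j] for leaf in active) / len(active)
        # Clamp the averaged prediction into the closed interval [lo, hi].
        recon.append(min(max(guess, lo), hi))
    return recon


# Hypothetical encoding vector: leaves of two depth-1 trees over two features.
leaves = [
    Leaf([(-INF, 0.5), (-INF, INF)], [0.25, 0.5]),  # tree A: x0 <= 0.5
    Leaf([(0.5, INF), (-INF, INF)], [0.75, 0.5]),   # tree A: x0 >  0.5
    Leaf([(-INF, INF), (-INF, 0.5)], [0.5, 0.25]),  # tree B: x1 <= 0.5
    Leaf([(-INF, INF), (0.5, INF)], [0.5, 0.75]),   # tree B: x1 >  0.5
]

sample = [0.9, 0.1]
code = encode(sample, leaves)
reconstruction = decode(code, leaves, n_features=2)
print(code)            # [0, 1, 1, 0]
print(reconstruction)  # [0.625, 0.375]
```

In this toy setting the four-bit code keeps only which side of each split the sample fell on, so the reconstruction is coarse. That is consistent with the abstract's point that reconstruction quality depends on which leaves are chosen for the encoding vector.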