The use of mixture regression in machine learning

Mlakar, Peter

Mlakar, Peter (Author), Oblak, Polona (Mentor), Nummi, Tapio (Co-mentor)

PDF - Presentation file (17.98 MB)
MD5: F71DC5BEAD94C4B25341A04E5D0C8403

Abstract

Regression and clustering are important components of machine learning. The first serves as a tool for discovering relations between dependent and independent variables in a dataset. With the second, data can be ordered into clusters or groups, depending on the similarities between individual data entries. In our thesis, we investigate a novel algorithm that conducts both tasks at the same time. The algorithm for non-parametric regression, which is based on Gaussian mixture models, discovers clusters in longitudinal datasets and, with the help of non-parametric regression, creates smooth mean development curves for those clusters. In the proposed algorithm, the non-parametric regression is based on natural cubic spline regression. We present the theoretical basis for the algorithm and its components. We also incorporate approaches to reduce the proposed algorithm's computational complexity. An implementation of the proposed algorithm and the corresponding speed-ups is constructed in the programming language Julia. The algorithm's performance is demonstrated quantitatively on a synthetic dataset and qualitatively on a real dataset. A Covid-19 dataset available from the World Health Organization was used in the latter evaluation. The goal of this evaluation is to group together countries with similar epidemiological development trends.
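The idea summarized in the abstract, clustering trajectories while estimating a smooth mean curve per cluster, can be illustrated with a minimal sketch. This is not the thesis's Julia implementation. It assumes all trajectories are observed on a common, equally spaced time grid, it assumes spherical Gaussian noise within each cluster, and it swaps the natural cubic spline smoother for a simpler second-difference roughness penalty. All names (`second_difference_penalty`, `fit_smooth_mixture`, `alpha`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def second_difference_penalty(T):
    """Roughness penalty K = D'D, where D takes second differences.

    Assumption: a discrete stand-in for the natural cubic spline
    penalty named in the abstract, not the spline penalty itself.
    """
    D = np.diff(np.eye(T), n=2, axis=0)  # shape (T - 2, T)
    return D.T @ D


def fit_smooth_mixture(Y, n_clusters, alpha=1.0, n_iter=50):
    """EM-style fit of a Gaussian mixture with smoothed mean curves.

    Y has shape (n_trajectories, T). Returns hard labels, the mean
    curves (n_clusters, T) and the mixing proportions.
    """
    n, T = Y.shape
    K = second_difference_penalty(T)

    # Deterministic farthest-point initialization of the mean curves.
    centers = [Y[0]]
    for _ in range(1, n_clusters):
        d = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    mu = np.array(centers, dtype=float)
    var = np.full(n_clusters, Y.var())
    pi = np.full(n_clusters, 1.0 / n_clusters)

    for _ in range(n_iter):
        # E-step: posterior cluster probabilities, computed in log space.
        sq = ((Y[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        log_r = np.log(pi) - 0.5 * T * np.log(2 * np.pi * var) - 0.5 * sq / var
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: each mean curve minimizes
        #   sum_i resp[i, k] * ||Y[i] - mu_k||^2 + alpha * mu_k' K mu_k,
        # a penalized weighted least-squares problem.
        Nk = resp.sum(axis=0) + 1e-12
        pi = Nk / n
        for k in range(n_clusters):
            ybar = resp[:, k] @ Y / Nk[k]
            mu[k] = np.linalg.solve(np.eye(T) + (alpha / Nk[k]) * K, ybar)
            resid = ((Y - mu[k]) ** 2).sum(axis=1)
            var[k] = max((resp[:, k] * resid).sum() / (Nk[k] * T), 1e-8)

    return resp.argmax(axis=1), mu, pi
```

Calling `fit_smooth_mixture(Y, 2)` on a matrix whose rows are trajectories returns one label per trajectory and one smoothed mean curve per cluster. Larger values of `alpha` give smoother curves; the number of clusters is fixed by the caller in this sketch.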

Language:	English
Keywords:	mixture models, regression, natural cubic splines, clustering
Work type:	Master's thesis/paper
Typology:	2.09 - Master's Thesis
Organization:	FRI - Faculty of Computer and Information Science
Year:	2021
PID:	20.500.12556/RUL-130170
COBISS.SI-ID:	77153027
Publication date in RUL:	10.09.2021
Views:	1420
Downloads:	146

Secondary language

Language:	Slovenian
Title:	Uporaba regresije z mešanimi modeli v strojnem učenju
Abstract:	Regresija ter gručenje sta pomembni komponenti strojnega učenja. Prva služi kot pripomoček pri odkrivanju relacij med odvisnimi ter neodvisnimi spremenljivkami v podatkih. S pomočjo druge metode podatke uredimo v skupine glede na njihove medsebojne podobnosti. V našem delu predstavimo nov algoritem, ki hkrati opravlja obe nalogi. Algoritem za neparametrično regresijo, ki temelji na Gaussovih mešanih modelih, v časovno odvisnih podatkih poišče gruče ter s pomočjo neparametrične regresije ustvari povprečne razvojne krivulje posameznih gruč. V predstavljenem algoritmu neparametrična regresija temelji na regresiji z naravnimi kubičnimi zlepki. Na začetku predstavimo teoretično ozadje predlaganega algoritma ter njegovih komponent. Prav tako algoritmu zmanjšamo časovno kompleksnost s pomočjo različnih pohitritev. Algoritem ter uporabljene pohitritve smo implementirali v programskem jeziku Julia. Njegovo delovanje evalviramo kvantitativno na umetni ter kvalitativno na resnični podatkovni zbirki Covid-19. Cilj slednje evalvacije je gručenje podobnih držav glede na potek epidemije Covid-19 v posameznih državah.
Keywords:	mešani modeli, regresija, naravni kubični zlepki, gručenje
