Statistical comparison of machine learning algorithms with respect to multiple performance measures : master's thesis

Dular, Lara

Dular, Lara (Author), Todorovski, Ljupčo (Mentor), Ankerst, Donna (Co-mentor)

PDF - Presentation file, download (1.07 MB)
MD5: 2E71D9306D18E0771BC27788412D04C8

Abstract

In the theory and practice of machine learning, we often face the task of comparing the performance of learning algorithms on multiple data sets. On the one hand, theoretical studies that propose new algorithms or improvements to existing ones compare the newly proposed algorithms with the existing ones. On the other hand, empirical studies on the application of machine learning methods often compare the performance of learning algorithms on various instances of a practical real-world problem. In both cases, an appropriate statistical analysis, which is the subject of this master's thesis, is crucial for determining the significance of the comparison results. This thesis has two main goals. The first is a thorough presentation of the nonparametric statistical tests most commonly used for comparing machine learning algorithms with respect to a single performance measure, namely the Wilcoxon signed-rank test and the Friedman test. The second goal is to overcome the limitation of existing approaches, which compare algorithms with respect to a single, pre-selected performance measure. We present a new approach for comparing machine learning algorithms with respect to multiple performance measures simultaneously. To this end, the concept of Pareto fronts, used in the field of multi-objective optimization, is applied to rank the algorithms according to multiple performance measures. Because both tests operate on ranks, the above-mentioned nonparametric statistical tests can also be used within the new approach. We illustrate the use of the newly developed approach by comparing the performance of four learning algorithms for classification on ten publicly available data sets. We compare the algorithms with respect to two performance measures that assess two aspects of the accuracy of the trained classification models. The results of the comparison show that, in most cases, the new approach rejects the null hypothesis for the comparison of algorithms with respect to both performance measures simultaneously whenever the existing approach rejects at least one of the two null hypotheses for a single performance measure.
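The abstract does not spell out how the Pareto fronts are turned into ranks, so the following is only a minimal sketch under stated assumptions: higher values are better for every measure, each algorithm is ranked by the index of its front under non-dominated sorting, tied front indices receive average ranks, and the Friedman statistic is computed without a tie correction. All function names and the example scores are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

# One value per performance measure; higher is assumed to be better.
Score = Tuple[float, ...]


def dominates(a: Score, b: Score) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_fronts(scores: Sequence[Score]) -> List[int]:
    """Return the 1-based Pareto front index of each algorithm on one data set.

    Front 1 holds the non-dominated algorithms; they are removed and the
    procedure repeats on the remainder (non-dominated sorting).
    """
    fronts = [0] * len(scores)
    remaining = set(range(len(scores)))
    level = 1
    while remaining:
        current = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in current:
            fronts[i] = level
        remaining -= current
        level += 1
    return fronts


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(rank_rows: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square statistic (no tie correction) from per-data-set ranks."""
    n = len(rank_rows)      # number of data sets
    k = len(rank_rows[0])   # number of algorithms
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    return 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )


# Hypothetical scores: three algorithms, two data sets, two measures each.
data_sets = [
    [(0.90, 0.80), (0.85, 0.85), (0.80, 0.70)],
    [(0.90, 0.90), (0.80, 0.80), (0.70, 0.70)],
]
rank_rows = [average_ranks(pareto_fronts(scores)) for scores in data_sets]
statistic = friedman_statistic(rank_rows)
```

On the first hypothetical data set the first two algorithms do not dominate each other, so they share front 1 and the average rank 1.5. The resulting statistic would then be compared against a chi-square distribution with k - 1 degrees of freedom. In practice a library routine such as `scipy.stats.friedmanchisquare` or `scipy.stats.wilcoxon` could be applied to the derived ranks instead of the hand-written statistic.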

Language:	English
Keywords:	comparison of machine learning algorithms, pairwise comparison, comparative studies, multiple performance measures, Wilcoxon signed-rank test, Friedman test, Pareto front
Type of material:	Master's thesis
Typology:	2.09 - Master's Thesis
Organization:	FMF - Faculty of Mathematics and Physics
Year of publication:	2018
PID:	20.500.12556/RUL-102425
UDC:	519.8
COBISS.SI-ID:	18421593
Publication date in RUL:	30.08.2018
Views:	2801
Downloads:	764

Secondary language

Language:	Slovenian
Title:	Statistična primerjava algoritmov strojnega učenja glede na več mer zmogljivosti : magistrsko delo
Abstract:	In the field of machine learning, we often face the task of comparing the performance of learning algorithms on multiple data sets. On the one hand, development studies that present new algorithms or improvements to existing ones compare the developed algorithms with existing ones; on the other hand, empirical studies of machine learning applications often compare the performance of learning algorithms on different instances of practical problems. In either case, an appropriate statistical analysis, which is the subject of this master's thesis, is key to establishing the significance of the comparison results. The thesis has two main goals. The first is a thorough presentation of the nonparametric statistical tests most commonly used for comparing the performance of machine learning algorithms: the Wilcoxon signed-rank test and the Friedman test. The second goal is to overcome the restriction of existing approaches to comparing algorithms with respect to a single, pre-selected performance measure. We present a new approach for comparing machine learning algorithms with respect to multiple performance measures simultaneously. To this end, we use the concept of Pareto fronts, which originates in multi-objective optimization and allows us to rank algorithms according to multiple performance measures. The new approach can therefore also use the nonparametric statistical tests mentioned above. We illustrate the use of the newly developed approach by comparing the performance of four algorithms for learning classification models on ten publicly available data sets. The comparison is carried out with respect to two performance measures that relate to the accuracy of the learned classification models. The results of the comparison show that the newly developed approach rejects the null hypothesis for comparing algorithms with respect to both performance measures simultaneously if the existing approach rejects at least one of the two null hypotheses for an individual measure.
Keywords:	comparison of machine learning algorithms, comparative study, pairwise comparison, performance measures, Wilcoxon signed-rank test, Friedman test, Pareto front
