Statistical comparison of machine learning algorithms with respect to multiple performance measures : master's thesis
Dular, Lara (Author), Todorovski, Ljupčo (Mentor), Ankerst, Donna (Comentor)

PDF - Presentation file (1,07 MB)
MD5: 2E71D9306D18E0771BC27788412D04C8

Abstract
In the theory and practice of machine learning, we often face the task of comparing the performance of learning algorithms on multiple data sets. On the one hand, theoretical studies that propose new algorithms or improvements to existing ones compare the newly proposed algorithms with the existing ones. On the other hand, empirical studies on the application of machine learning methods often compare the performance of learning algorithms on various instances of a practical real-world problem. In both cases, an appropriate statistical analysis, which is the subject of this master's thesis, is crucial for determining the significance of the comparison results. This thesis has two main goals. The first is a thorough presentation of the nonparametric statistical tests most commonly used for comparing machine learning algorithms with respect to a single performance measure, namely the Wilcoxon signed-rank test and the Friedman test. The second goal is to overcome the limitation of existing approaches, which compare algorithms with respect to a single, pre-selected performance measure. We present a new approach for comparing machine learning algorithms with respect to multiple performance measures simultaneously. To this end, the concept of Pareto fronts, taken from the field of multi-objective optimization, is used to rank the algorithms according to multiple performance measures. Since the algorithms are thereby ranked, the above-mentioned nonparametric statistical tests can also be used in the context of the new approach. We illustrate the use of the newly developed approach by comparing the performance of four learning algorithms for classification on ten publicly available data sets. We compare the algorithms with respect to two performance measures that assess two aspects of the accuracy of the trained classification models. The results of the comparison show that, in most cases, the new approach rejects the null hypothesis for the comparison of algorithms with respect to both performance measures simultaneously whenever the existing approach rejects at least one of the two null hypotheses for a single performance measure.
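
The idea of ranking algorithms by Pareto fronts and then applying a rank-based test can be sketched as follows. This is only an illustration under stated assumptions, not the procedure from the thesis: the abstract does not say how ranks are derived from the fronts, so the sketch assumes that successive non-dominated fronts are peeled off, that algorithms on the same front share an average rank, and that higher values are better for every measure. All function names and all scores are hypothetical, and the Friedman statistic is computed without a tie correction even though front-based ranks produce many ties.

```python
from typing import List, Sequence, Tuple

Scores = Tuple[float, ...]  # one value per performance measure, higher is better


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every measure and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front_levels(points: Sequence[Scores]) -> List[int]:
    """Give each point the index (1 = best) of the Pareto front it lies on,
    obtained by repeatedly removing the current non-dominated front."""
    levels = [0] * len(points)
    remaining = set(range(len(points)))
    level = 0
    while remaining:
        level += 1
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            levels[i] = level
        remaining -= front
    return levels


def average_ranks(levels: Sequence[int]) -> List[float]:
    """Turn front levels into ranks; tied algorithms get the mean of the ranks they span."""
    order = sorted(range(len(levels)), key=lambda i: levels[i])
    ranks = [0.0] * len(levels)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and levels[order[end + 1]] == levels[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(rank_rows: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square statistic (no tie correction) from per-data-set ranks."""
    n = len(rank_rows)      # number of data sets
    k = len(rank_rows[0])   # number of algorithms
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    expected = (k + 1) / 2  # mean rank of every algorithm under the null hypothesis
    return 12 * n / (k * (k + 1)) * sum((r - expected) ** 2 for r in mean_ranks)


# Hypothetical scores: one row per data set, one (measure 1, measure 2) pair per algorithm.
scores = [
    [(0.91, 0.80), (0.88, 0.85), (0.84, 0.79)],
    [(0.75, 0.70), (0.78, 0.74), (0.70, 0.69)],
    [(0.95, 0.90), (0.93, 0.92), (0.94, 0.89)],
    [(0.66, 0.61), (0.69, 0.60), (0.62, 0.58)],
]
rank_rows = [average_ranks(pareto_front_levels(row)) for row in scores]
print("ranks per data set:", rank_rows)
print("Friedman statistic:", friedman_statistic(rank_rows))
```

Under the null hypothesis the statistic is approximately chi-square distributed with k - 1 degrees of freedom, so a p-value would be read from that distribution. With few data sets or many ties the approximation is rough, and a tie-corrected variant would be the safer choice.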

Language: English
Keywords: comparison of machine learning algorithms, pairwise comparison, comparative studies, multiple performance measures, Wilcoxon signed-rank test, Friedman test, Pareto front
Work type: Master's thesis/paper
Typology: 2.09 - Master's Thesis
Organization: FMF - Faculty of Mathematics and Physics
Year: 2018
PID: 20.500.12556/RUL-102425
UDC: 519.8
COBISS.SI-ID: 18421593
Publication date in RUL: 30.08.2018
Views: 2809
Downloads: 764

Secondary language

Language: Slovenian
Title: Statistična primerjava algoritmov strojnega učenja glede na več mer zmogljivosti : magistrsko delo
Abstract:
In the field of machine learning, we often face the task of comparing the performance of learning algorithms on multiple data sets. On the one hand, development studies that present new algorithms or improvements to existing ones compare the developed algorithms with existing ones; on the other hand, empirical studies of machine learning applications often compare the performance of learning algorithms on different instances of practical problems. In either case, an appropriate statistical analysis, which is the subject of this master's thesis, is key to establishing the significance of the comparison results. The thesis has two main goals. The first is a thorough presentation of the most commonly used nonparametric statistical tests for comparing the performance of machine learning algorithms: the Wilcoxon signed-rank test and the Friedman test. The second goal is to overcome the restriction of existing approaches to comparing algorithms with respect to a single, pre-selected performance measure. We present a new approach for comparing machine learning algorithms with respect to multiple performance measures simultaneously. To this end, we use the concept of Pareto fronts, which originates in multi-objective optimization and allows us to rank algorithms with respect to multiple performance measures. The new approach can therefore also use the nonparametric statistical tests mentioned above. We illustrate the newly developed approach by comparing the performance of four algorithms for learning classification models on ten publicly available data sets. The comparison is carried out with respect to two performance measures that relate to the accuracy of the trained classification models. The results of the comparison show that the newly developed approach rejects the null hypothesis for comparing algorithms with respect to both performance measures simultaneously if the existing approach rejects at least one of the two null hypotheses for an individual measure.

Keywords: comparison of machine learning algorithms, comparative study, pairwise comparison, performance measures, Wilcoxon signed-rank test, Friedman test, Pareto front
