Two-level modeling of team attributes and prediction of sport matches: master's thesis
Fortuna, Rok (Author), Štrumbelj, Erik (Mentor)

Abstract
This thesis focuses on the problem of predicting the outcomes of sports matches using machine learning. When predicting the outcome of a match, we have data on past matches, which contain measured attributes of the participating teams. Our task is to transform the measured team attributes into quantities that describe team strengths. These are then used in the training and test sets for machine learning. Classical approaches simply use averages of the measured team attributes as the values in the training and test vectors. It turns out that such approaches cause models to overfit and consequently yield poor predictions, especially when little data is available. Classical approaches do not account for the noise present in sports data, which makes the computed averages unreliable and, in turn, leads to a poor predictive model. We propose a two-level modeling approach in which the first level models the relationship between team attributes and team strength, and the second level models the relationship between team strengths and match outcomes. Instead of using averages of the measured team attributes in the training and test vectors, we model the attributes with distributions. This allows us to train several predictive models in parallel, with final predictions obtained by averaging. The aim of two-level modeling is to reduce the influence of noise in the data and improve model predictions. In addition to the approach itself, we provide a package in the R programming language that contains a modular framework for applying two-level modeling approaches to sports (and sports-like) data. In the experimental evaluation, we show that two-level modeling visibly improves prediction results compared to classical approaches and is a promising methodology for further research.

Language:Slovenian
Keywords:machine learning, Bayesian statistics, sport, modeling, noise, overfitting
Work type:Master's thesis/paper
Typology:2.09 - Master's Thesis
Organization:FMF - Faculty of Mathematics and Physics
FRI - Faculty of Computer and Information Science
Year:2018
PID:20.500.12556/RUL-103022
UDC:004
COBISS.SI-ID:18432345
Publication date in RUL:13.09.2018

Secondary language

Language:English
Title:Two-level modeling of team attributes and prediction of sport matches
Abstract:
We focus on the problem of predicting the outcomes of sports matches using machine learning. Predictions are based on data from past matches. Our task is to transform this data into quantities that describe team strengths, which are then used as features in the training and test data sets. Standard approaches use averages of teams' past performance data as features. When the data set is small, these approaches lead to overfitting and consequently to poor predictions. Standard approaches do not account for the uncertainty in sports data, which makes the calculated averages unreliable. We propose a two-level approach to modeling team attributes. The first level models the connection between teams' past performance data and their strengths. The second level contains prediction models, which model the connection between team strengths and match outcomes. The first level allows us to train multiple second-level prediction models. We obtain final predictions by averaging the predictions of all prediction models. The goal of two-level modeling is to reduce the influence of noise in sports data and to improve the predictions of machine learning algorithms. As part of our work, we offer a package in the R programming language that contains a modular framework for two-level modeling. In the empirical evaluation of two-level modeling, we show that it clearly improves predictions compared to standard approaches and offers a promising methodology for further research.

Keywords:machine learning, Bayesian statistics, sport, modeling, noise, overfitting
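
The two-level idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's R package or its actual models, which the abstract does not specify. It assumes, purely for illustration, a normal-normal conjugate model for team strength at the first level and a one-feature logistic regression at the second level. All function names, parameters, and data values are hypothetical.

```python
import math
import random
from statistics import mean


def posterior(observations, prior_mean=0.0, prior_var=1.0, noise_var=4.0):
    """Level 1 (assumed model): normal-normal update for one team's strength.

    Returns the posterior mean and variance instead of a plain average,
    so teams with few observations keep a wide, prior-dominated posterior.
    """
    n = len(observations)
    precision = 1.0 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + sum(observations) / noise_var) / precision
    return post_mean, 1.0 / precision


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, steps=200, lr=0.1):
    """Level 2 (assumed model): logistic regression fitted by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


def two_level_predict(history, matches, outcomes, new_match, draws=50, seed=0):
    """Draw strengths from level 1, fit one level-2 model per draw, average."""
    rng = random.Random(seed)
    posts = {team: posterior(obs) for team, obs in history.items()}
    probs = []
    for _ in range(draws):
        # One plausible set of team strengths sampled from the posteriors.
        s = {t: rng.gauss(m, math.sqrt(v)) for t, (m, v) in posts.items()}
        xs = [s[home] - s[away] for home, away in matches]
        w, b = fit_logistic(xs, outcomes)
        home, away = new_match
        probs.append(sigmoid(w * (s[home] - s[away]) + b))
    # Final prediction: the average over all second-level models.
    return mean(probs)


# Hypothetical data: noisy per-match performance measurements for three teams.
history = {"A": [3.0, 2.0, 4.0], "B": [-1.0, 0.5], "C": [-3.0, -2.0, -2.5]}
matches = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "A"), ("C", "A"), ("C", "B")]
outcomes = [1, 1, 1, 0, 0, 0]  # 1 means the first-listed team won

p = two_level_predict(history, matches, outcomes, ("A", "C"))
print(f"Averaged probability that A beats C: {p:.3f}")
```

Replacing the sampling step with the posterior means alone would recover the single-model approach that the abstract calls standard. Averaging over draws is what lets the uncertainty in each team's strength carry through to the final prediction.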
