Details

Razvoj dinamičnega programa za gručenje podatkov in detekcijo anomalij z knjižnico ML.NET in storitvijo Azure OpenAI
Temelkovski, Bodan (Author), Groznik, Vida (Mentor)

PDF - Presentation file (948.50 KB)
MD5: C6B28685AAEFFCE042D402F73605B671

Abstract
Advances in artificial intelligence and machine learning have substantially simplified data analysis. With the simultaneous exponential growth in the volume of data held by companies, there is a growing need for automated solutions that can replace manual analysis. In this thesis, a dynamic system for data clustering and anomaly detection was developed, based on the K-Means clustering algorithm and the ML.NET library. The system supports automatic data preparation, normalization with different approaches (e.g. min-max normalization and robust scaling), a search for the optimal number of clusters using the elbow method and the silhouette method, and detection of outliers using PCA. In the final phase, the results were interpreted with the help of the large language model GPT-4o through the Azure OpenAI platform, which enables a better understanding of patterns in the data. The solution was tested on real, anonymized data from a pharmaceutical company.

Language:Slovenian
Keywords:artificial intelligence, clustering, K-Means, Azure, OpenAI
Work type:Bachelor thesis/paper
Typology:2.11 - Undergraduate Thesis
Organization:FRI - Faculty of Computer and Information Science
Year:2025
PID:20.500.12556/RUL-170337
COBISS.SI-ID:241388035
Publication date in RUL:03.07.2025

Secondary language

Language:English
Title:Development of a dynamic program for data clustering and anomaly detection using the ML.NET library and the Azure OpenAI service
Abstract:
The development of artificial intelligence and machine learning has significantly simplified data analysis. As data volumes in companies continue to grow, the need for automated systems capable of handling complex analysis without human intervention becomes increasingly important. This thesis presents the development of a dynamic system for data segmentation and anomaly detection based on the K-Means algorithm and the ML.NET framework. The system automatically prepares and normalizes data using methods such as Min-Max normalization and Robust Scaling, determines the optimal number of clusters using the Elbow and Silhouette methods, and detects anomalies through Principal Component Analysis (PCA). In the final stage, the results are interpreted using a large language model (GPT-4o) via the Azure OpenAI platform, providing deeper insights into detected patterns. The solution was tested on real but anonymised data from a pharmaceutical company, demonstrating its practical applicability in real-world environments.
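The two normalization approaches named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation, which is built on ML.NET; it is a plain Python illustration, and the function names `min_max_normalize` and `robust_scale` as well as the sample values are assumptions made for this example. Robust scaling is shown here in its common form, centering on the median and dividing by the interquartile range.

```python
from statistics import median, quantiles


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    if iqr == 0:
        # No spread between the quartiles; map everything to 0.
        return [0.0 for _ in values]
    return [(v - med) / iqr for v in values]


# Hypothetical sample with one extreme value (100.0).
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
print(min_max_normalize(sample))
print(robust_scale(sample))
```

With an extreme value in the column, min-max normalization squeezes the remaining points close to 0, while robust scaling keeps them spread around 0 and leaves the extreme value far from the rest, which is one reason a pipeline might offer both options.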

Keywords:artificial intelligence, segmentation, K-means, Azure, OpenAI
