Technological development is simplifying everyday tasks such as payments, communication, and access to the internet and other services, but it also has negative effects, among them various kinds of fraud that can cause damage to many organizations and their clients.
This work focuses on fraud detection in the telecommunication industry. Telecommunication companies face both well-known and previously unknown types of fraud. Fraudulent activity is not always detected, or it may be detected too late. Telecommunication services usually involve several parties from around the world, which makes punishing the perpetrators very difficult. From a mathematical point of view, fraud detection is treated as the identification of unusual patterns, that is, as anomaly detection in big data. Anomalies, or outliers, are rare cases that differ significantly from the majority of the data. Since all client activity is stored in call detail record (CDR) files, the amount of data is very large.
The test data are not labeled as fraudulent or normal activity; therefore, this work considers unsupervised methods and other advanced anomaly detection techniques that do not require a target variable. The aim of the thesis is to examine different advanced data mining methods for detecting anomalies and to develop a model capable of distinguishing normal from fraudulent activities.