Detecting temporal and spatial anomalies in users' activities for security provisioning in computer networks

Huč, Aleks

Huč, Aleks (Author), Trček, Denis (Mentor), Trebar, Mira (Co-mentor)

PDF - Presentation file, Download (4,42 MB)
MD5: C8ABA5362D8457B2303B28B9C59C0568

Abstract

Communication is essential to humans as social beings: it enables us to build and maintain relationships and to take part in education, work and other private and public social settings. Nowadays, more and more communication takes place through computers, computer networks and other digital devices. Unfortunately, as with many things in our lives, this communication can be compromised and exploited by attackers for monetary gain, social status or curiosity at the expense of legitimate users. Therefore, robust, reliable and rapid detection and prevention of network security threats has become very important. The field of computer network security is very broad, which is why we focused on intrusion detection in computer networks. Over the years, two main techniques of intrusion detection have been developed: anomaly-based detection and signature-based detection. Anomaly-based detection builds a model of normal network activity and focuses on detecting abnormal network activity that differs from the model. Signature-based detection relies on a knowledge database of signatures of known attacks and focuses on detecting network activity that conforms to the stored signatures. Many different intrusion detection approaches have already been developed; however, networks with ever-growing volume, velocity, variety and variability of transmitted data pose an open challenge. Specifically, the open questions are how to identify new types of attacks, effectively analyze large amounts of data, learn from unlabeled data, adapt to changes in data and improve the robustness and accuracy of detection. The goal of this dissertation is to build upon the current state-of-the-art computer network anomaly detection approaches. We explore lightweight, unsupervised and incremental approaches that can handle a large volume of data, adapt to non-stationary changes automatically and do not need prerequisite training on labeled data.
We propose two new approaches for detecting anomalies in computer networks. Instead of packet headers and payloads, they take as input network packet aggregates (so-called network flows), which greatly reduce the volume of data that needs to be analyzed. For every network entity we build a profile that models its activity with an incremental hierarchical clustering algorithm based on BIRCH clustering, which automatically adapts to changes in the input data with the help of a fading function. The first approach detects anomalies inside profiles by tracking cluster changes over time with the ADWIN algorithm, the distance of a newly clustered observation from the cluster center, the distance of a new cluster from its neighboring cluster, and the size and age of the cluster. The second approach adds a further level of incremental hierarchical clustering that groups together similar profiles and detects anomalies in the activity of those groups with the mechanisms presented in the first approach. Second-level clustering analyzes tree data structures, which is why we defined a new metric for determining similarities between them on the basis of distances between clusters and sizes of clusters. In our analysis we used up-to-date data sets of network flows (ISCXIDS2012 and CICIDS2017) with the most common types of network attacks. We evaluated prediction performance, execution time and feature importance, and performed a sensitivity analysis of the most important parameters. Both approaches achieved prediction performance (F1 score over 0.90) comparable to state-of-the-art supervised approaches, even though they see every data point only once, then discard it, and require no prior learning phase with labeled data. The two approaches present a good baseline for further improvement of detection performance with additional detection mechanisms.
They can provide data reduction and a pre-processing step for computationally more demanding methods. They are also of a general nature, which is why they can also be used in other problem domains that can be presented as data streams.
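
The profile-building idea in the abstract (a BIRCH-style cluster summary that fades old data, plus a check of how far a new observation lies from the cluster center) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the class name FadingClusterFeature, the function is_anomalous, the exponential form of the fading function 2^(-rate * elapsed time), and the threshold of three radii are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class FadingClusterFeature:
    """BIRCH-style cluster feature (N, LS, SS) with exponential fading.

    Hypothetical illustration: the fading function 2**(-fade_rate * dt)
    is an assumption, not taken from the dissertation.
    """
    dim: int
    fade_rate: float = 0.01   # assumed decay rate per time unit
    n: float = 0.0            # faded number of points
    ls: List[float] = field(default_factory=list)  # faded linear sum
    ss: float = 0.0           # faded sum of squared norms
    last_t: float = 0.0       # time of the last update

    def __post_init__(self) -> None:
        if not self.ls:
            self.ls = [0.0] * self.dim

    def _fade(self, t: float) -> None:
        # Down-weight all accumulated statistics by the elapsed time.
        w = 2.0 ** (-self.fade_rate * (t - self.last_t))
        self.n *= w
        self.ls = [v * w for v in self.ls]
        self.ss *= w
        self.last_t = t

    def add(self, x: List[float], t: float) -> None:
        # Absorb one observation; the raw point is not stored afterwards.
        self._fade(t)
        self.n += 1.0
        self.ls = [a + b for a, b in zip(self.ls, x)]
        self.ss += sum(v * v for v in x)

    def centroid(self) -> List[float]:
        return [v / self.n for v in self.ls]

    def radius(self) -> float:
        # Root mean squared distance of the absorbed points from the centroid.
        c = self.centroid()
        var = self.ss / self.n - sum(v * v for v in c)
        return math.sqrt(max(var, 0.0))

    def distance(self, x: List[float]) -> float:
        return math.dist(x, self.centroid())


def is_anomalous(cf: FadingClusterFeature, x: List[float],
                 threshold: float = 3.0, min_radius: float = 1e-9) -> bool:
    """Flag x if it lies more than `threshold` radii from the centroid."""
    return cf.distance(x) > threshold * max(cf.radius(), min_radius)
```

Because only the three faded summary statistics are kept, each observation is processed once and then discarded, which matches the single-pass property described above. A full approach would additionally organize such features in a tree and track cluster changes over time.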

Language:	English
Keywords:	anomaly detection, incremental learning, unsupervised learning, clustering, adaptive windowing, profiling, network security, network flows
Work type:	Doctoral dissertation
Typology:	2.08 - Doctoral Dissertation
Organization:	FRI - Faculty of Computer and Information Science
Year:	2022
PID:	20.500.12556/RUL-137562
COBISS.SI-ID:	113961987
Publication date in RUL:	22.06.2022
Views:	1518
Downloads:	156

Secondary language

Abstract:
Language:	Slovenian
Title:	Detekcija časovnih in prostorskih anomalij pri aktivnostih uporabnikov za zagotavljanje varnosti v računalniških omrežjih
Communication is essential to humans as social beings, as it enables us to build and maintain relationships and to take part in education, work and other private and public social settings. Today, an ever larger share of communication takes place through computers, computer networks and other digital devices. As in the real world, attackers can abuse it here as well, out of curiosity, for monetary gain or to gain social status at the expense of legitimate users. Secure computer communication must therefore be ensured through robust, reliable and rapid detection and prevention of security attacks. The field of computer network security is very broad, so we focused on detecting threats in computer networks. Two main approaches have been developed: anomaly-based detection and signature-based detection. Anomaly-based detection detects anomalies by their deviation from a constructed model of normal network activity. Signature-based detection detects anomalies by comparing current network activity with a database of stored signatures of known attacks. Many different approaches to attack detection have already been developed, but networks with frequent changes in the volume, velocity, variety and variability of transmitted data remain an open challenge. Common questions are how to detect new types of attacks, analyze large amounts of data efficiently, learn from unlabeled data, adapt to changes in the data, and improve the robustness and accuracy of detection. The goal of the doctoral dissertation is to develop modern anomaly detection approaches using lightweight, unsupervised and incremental learning approaches that can process large amounts of data in the form of data streams, update their models in real time and do not require prior training on labeled data.
We propose two new approaches for detecting anomalies in computer networks that take as input aggregates of network packets, called network flows, instead of packet headers and payloads, which greatly reduces the amount of data to be analyzed. For every network entity we build a profile that models its activity with an incremental hierarchical clustering algorithm based on the BIRCH method, which updates automatically in response to changes in the input data by using a function that reduces the importance of old data. The first approach detects anomalies within profiles by tracking cluster changes over time with the ADWIN algorithm, the distance of a newly clustered data point from the cluster center, the distance of a new cluster from neighboring clusters, and the size and age of the cluster. The second approach extends the first by adding a second level of incremental hierarchical clustering that groups similar profiles into clusters and detects anomalies in their activity with the mechanisms presented in the first approach. Second-level clustering uses tree data structures as data points, so we defined a new metric for determining the similarity between them based on the distances between clusters and their sizes. We analyzed the approaches with two up-to-date network flow data sets (ISCXIDS2012 and CICIDS2017) containing the most common types of attacks. We evaluated prediction accuracy, execution time and feature importance, and analyzed the sensitivity of the most important parameters. Both approaches achieved anomaly detection accuracy (F1 score above 0.90) comparable to supervised methods, even though they see each data point only once and then forget it, and do not include prior learning with labeled data. Both approaches provide a good basis for further upgrading of detection mechanisms, and for data reduction and pre-processing for computationally more demanding detection methods.
They can also be used in other problem domains that can be represented as data streams.
Keywords:	anomaly detection, incremental learning, unsupervised learning, clustering, adaptive window, profiling, network security, network flows
