Generation of anonymized statistical samples from health databases (Generiranje anonimiziranih statističnih vzorcev iz zdravstvenih podatkovnih zbirk)

Arsovski, Martin

Arsovski, Martin (Author), Brodnik, Andrej (Mentor), Žibert, Janez (Co-mentor)

PDF - Presentation file (549.53 KB)
MD5: 06C545B7C1DB340F87409E8E31EEFE70

Abstract

Studying data from patients' medical examinations has become popular and can be very useful: it can benefit modern medicine and improve the quality of health services. Today probably all hospitals maintain health databases for their patients, containing a great deal of private data about patients, medical treatments, interventions, laboratory results, and so on. Using these data for medical research and analysis, however, requires permission from hospitals and other institutions, which is an obstacle for the people doing this work. Such analyses can also cost a lot of money and time. In addition, the data must be anonymized and prepared so that they preserve the statistical properties of the original database. In this master's thesis we will review and present several methods for generating synthetic data from real data. We will select and implement some of the best methods from the literature and use them to generate synthetic data. The sample generation procedures will be evaluated by comparing the statistical properties of the generated sample with those of the population. Based on this evaluation, we will assess which methods for generating synthetic data are the most successful.

Language:	Slovenian
Keywords:	population sampling, data anonymization, health informatics, statistics, synthetic data
Work type:	Master's thesis/paper
Typology:	2.09 - Master's Thesis
Organization:	FRI - Faculty of Computer and Information Science
Year:	2023
PID:	20.500.12556/RUL-153150
COBISS.SI-ID:	178836995
Publication date in RUL:	19.12.2023
Views:	573
Downloads:	82

Secondary language

Language:	English
Title:	Generation of anonymized statistical samples from health databases
Abstract:	Studying data from patients' medical examinations has become popular and can be very useful: it can benefit modern medicine and improve the quality of health services. Today probably all hospitals maintain medical databases for their patients, which include a lot of private data about patients, medical treatments, interventions, laboratory results, etc. However, using these data for medical research and analysis requires permission from hospitals and other institutions, which is an obstacle for the people involved. In addition, such analyses can cost a lot of money and time. The data must also be anonymized and prepared so that they preserve the statistical properties of the original database. In this master's thesis, we will review and present several methods of generating synthetic data based on real data. Based on the review, we will select some of the best methods from the literature and implement them. We will use the implemented methods to generate synthetic data. The sample generation procedures will be evaluated by comparing the statistical properties of the sample with those of the population. Based on the evaluation, we will assess which methods of generating synthetic data are the most successful.
Keywords:	population sampling, data anonymization, health informatics, statistics, synthetic data
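
The evaluation step named in the abstract, comparing the statistical properties of a generated sample with those of the population, might look roughly like the sketch below. It is only an illustration under stated assumptions: the "population" is made-up numeric data (not patient data), the generator is a simple fitted Gaussian used as a stand-in (the abstract does not say which methods the thesis implements), and the helper names `generate_synthetic`, `ks_statistic`, and `compare` are hypothetical.

```python
import random
import statistics


def generate_synthetic(values, n, rng):
    """Stand-in generator (an assumption, not the thesis's method):
    fit a Gaussian to the real values and draw n new values from it."""
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic:
    the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def compare(population, sample):
    """Summarize how closely the sample matches the population."""
    return {
        "mean_diff": abs(statistics.fmean(population) - statistics.fmean(sample)),
        "stdev_diff": abs(statistics.stdev(population) - statistics.stdev(sample)),
        "ks": ks_statistic(population, sample),
    }


# Made-up "population": 5000 values of a hypothetical numeric measurement.
rng = random.Random(0)
population = [rng.gauss(120.0, 15.0) for _ in range(5000)]
synthetic = generate_synthetic(population, 1000, rng)
report = compare(population, synthetic)
print(report)
```

Smaller values in the report mean the synthetic sample follows the population more closely. A real evaluation would likely also check categorical attributes and the relationships between attributes, which this single-column sketch does not cover.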
