Initial data visualisations in event history analysis : master's thesis
Kregar, Neža (Author), Vidmar, Gaj (Mentor), Ružić Gorenjec, Nina (Co-mentor)

Abstract
Event history analysis is a set of methods and tests used when we are interested in events, states, the connections between them and changes over time. Data in event history analysis characteristically consist of complete and incomplete observations: an event may or may not happen to an individual, or we may simply have no information about it. Incomplete observations are called censored data and must not be omitted from the analysis, because omitting them biases the estimates. Numerous models and methods have been developed for handling censored data, but displays of such data are often neglected and frequently left out of the analysis. They are difficult both because information is missing and because of the many variables involved, which makes it hard to fit everything clearly into a two-dimensional display. Graphical displays can help detect errors in the data, which are common in event history analysis. Errors can be random (data transcription errors, an impossible order of events) or systematic (e.g. assigning the same time to several events of the same unit, data that have not been updated). The data must be examined thoroughly before the analysis so that errors are corrected and the analysis is sound and credible. In this master's thesis we reviewed the existing data displays in event history analysis (the survival curve, cumulative distribution function, cumulative hazard, hazard, histogram for censored data, boxplot for censored data, event chart, Lexis diagram and pencil diagram) and assessed their suitability. We also drew all of these diagrams (except the pencil diagram) for our own data. To draw the histogram and the boxplot for censored data we wrote two functions of our own in R (the code for both is given in the appendix).
To draw the survival curve, cumulative distribution function, cumulative hazard, event charts and Lexis diagram we used existing functions and libraries in R. To detect errors in the data before analysis, we created a user-friendly interactive web application that allows the entered data to be inspected with event charts and also highlights units with an error in the sequence of events and units with identical event times. Units with errors are listed by their identifier, shown in a table and displayed graphically in an event chart. The application was built with the Shiny library in R. The original contribution of the thesis is a fast and simple display of event history data that gives a rough impression of the data and their distribution. In addition, the application lets us inspect the data at the level of each individual, find errors and consequently correct them. This substantially shortens the time needed to clean and reshape the data before analysis and enables a higher-quality analysis free of data errors.

Language:Slovenian
Keywords:event history analysis, survival analysis, data visualisations, histogram for censored data, boxplot for censored data, event chart.
Work type:Master's thesis/paper
Typology:2.09 - Master's Thesis
Organization:FE - Faculty of Electrical Engineering
Place of publishing:Ljubljana
Publisher:[N. Kregar]
Year:2021
Number of pages:IX, 54 pp.
PID:20.500.12556/RUL-128881
UDC:303:311(043.3)
COBISS.SI-ID:74480387
Publication date in RUL:11.08.2021

Secondary language

Language:English
Title:Initial data visualisations in event history analysis : second-cycle master's study programme Applied Statistics
Abstract:
Event history analysis is a range of methods and tests used when events, states, the connections between them and changes over time are of interest. Data in event history analysis characteristically combine complete and incomplete observations: an event may or may not occur to an individual, or information about it may be unavailable. Incomplete observations are called censored data, and they should not be left out of the analysis because doing so biases the estimates. Various models and methods have been developed for censored data; however, visualisation is often neglected or not included in the analysis at all. Censored data are difficult to visualise because of the missing information and the numerous variables, which make it hard to display everything in a two-dimensional graphic. Graphical displays can aid in detecting data errors, which are common in event history analysis. The errors can be random (data transcription errors, impossible event sequences) or systematic (e.g., assigning the same time to multiple events of the same unit, unrefreshed data). The data must be examined in detail before the analysis in order to eliminate errors and obtain a high-quality and reliable analysis. In this Master’s thesis, we examined the existing data visualisations in event history analysis – survival curve, cumulative distribution function, cumulative hazard, hazard, censored data histogram, censored data boxplot, event charts, Lexis diagram and pencil diagram. We assessed their adequacy and drew them for our own data (with the exception of the pencil diagram). We wrote our own functions in R for drawing the censored data histogram and the censored data boxplot (the code is in the appendix). For drawing the survival curve, cumulative distribution function, cumulative hazard, event charts and Lexis diagram, we used existing functions and libraries in R.
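The claim that dropping censored observations biases the estimates can be illustrated with a small sketch. It is written in Python rather than the R used in the thesis, and it assumes the survival curve is estimated with the Kaplan-Meier product-limit estimator, which the abstract does not name. The function `kaplan_meier` and the toy data are hypothetical and are not taken from the thesis.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time of each unit
    events -- 1 if the event was observed, 0 if the unit was censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all units whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += pairs[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored units leave the risk set without lowering the curve.
        n_at_risk -= n_leaving
    return curve


# Toy data (illustrative only): three observed events, three censored units.
times = [2, 3, 4, 5, 6, 8]
events = [1, 0, 1, 0, 1, 0]

with_censoring = kaplan_meier(times, events)

# Naive alternative: discard the censored units before estimating.
kept = [(t, e) for t, e in zip(times, events) if e == 1]
without_censoring = kaplan_meier([t for t, _ in kept], [e for _, e in kept])
```

With the censored units kept, the estimate at the last event time stays well above zero; with them discarded, it falls to zero, because every remaining unit experienced the event. The censored units still contribute the information that they survived at least up to their censoring time.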
We created a user-friendly interactive web-based application for detecting data errors before the analysis. It provides an overview of the entered data using event charts and identifies units with errors in the sequence of events or with identical event times. The application lists the units with errors by their identification, displays them in a table and visualises them using the event chart. We used the Shiny library in R to create the application. The original contribution of the Master’s thesis is a fast and simple visualisation of data from event history analysis, which gives an overview of the data and their distribution. In addition, the application makes it possible to look at each individual’s data, search for errors and consequently eliminate them. This markedly shortens the time needed for editing and transforming the data before the analysis, thus enabling a better analysis without data errors.
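The two checks described above (errors in the sequence of events and identical event times) can be sketched as follows. This is a minimal illustration in Python, not the thesis's Shiny application, which was written in R. The data layout, the function name `find_suspect_units` and the example records are all assumptions made for this sketch.

```python
def find_suspect_units(records):
    """Flag units whose event times look wrong.

    records -- dict mapping a unit identifier to a list of
               (event_name, time) pairs given in the expected order
               of occurrence (hypothetical layout).
    Returns two lists of unit identifiers:
      out_of_order -- some later event has an earlier time
      tied         -- at least two events share the same time
    """
    out_of_order = []
    tied = []
    for unit_id, events in records.items():
        times = [time for _, time in events]
        # A strictly smaller time after a larger one is an impossible order.
        if any(later < earlier for earlier, later in zip(times, times[1:])):
            out_of_order.append(unit_id)
        # Fewer distinct times than events means some times coincide.
        if len(set(times)) < len(times):
            tied.append(unit_id)
    return out_of_order, tied


# Illustrative records only; not data from the thesis.
records = {
    "A": [("entry", 0), ("treatment", 5), ("exit", 12)],
    "B": [("entry", 0), ("treatment", 9), ("exit", 4)],   # exit before treatment
    "C": [("entry", 3), ("treatment", 3), ("exit", 10)],  # two events at time 3
}

out_of_order, tied = find_suspect_units(records)
print(out_of_order)  # ['B']
print(tied)          # ['C']
```

Returning identifiers rather than correcting anything mirrors the workflow in the abstract: the flagged units are listed and inspected, and the decision on how to fix them is left to the analyst.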

Keywords:event history analysis, survival analysis, visualisation, censored data histogram, censored data boxplot, event charts.
