The paired two-sample problem with MCAR missing data (Parni problem dveh vzorcev pri MCAR manjkajočih podatkih)

Brulić, Melisa

Repository of the University of Ljubljana

Brulić, Melisa (Author), Smrekar, Jaka (Mentor)

PDF - Presentation file (737,07 KB)
MD5: 77B6109A2332783617A3DD3CB8A3A634

Abstract

In the two-sample problem we test the hypothesis that two samples come from the same probability distribution. A special case is the paired two-sample problem, in which each unit in one sample is meaningfully linked to a unit in the other. This is a classical statistical problem that often arises in practice, for example in longitudinal studies, where data are collected for each subject at two different time points. The aim of the thesis is to give a test for the paired two-sample problem when data are missing completely at random (MCAR). To this end we present the mathematical framework needed to construct the test. We first treat U- and V-statistics and their asymptotic behavior, then turn to kernel theory, where we introduce reproducing kernel Hilbert spaces (RKHS) and kernel embeddings. Finally, we develop a nonparametric test that uses both complete and incomplete pairs of observations. The test is based on the maximum mean discrepancy (MMD), a (pseudo)distance between probability distributions defined as the distance between their kernel embeddings in an RKHS. We establish the consistency of the test using the asymptotic properties of degenerate V-statistics. We also give a resampling-based algorithm for testing the hypothesis and establish its consistency.
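
For orientation, the MMD named in the abstract is commonly written as below. This is the standard textbook form, given here as a sketch for complete pairs only; the symbols k, \mathcal{H}, \mu_P, h and z_i are notation assumed for this illustration and are not taken from the thesis, whose treatment of incomplete pairs is not reproduced here.

```latex
% Assumed notation: k is a positive definite kernel with RKHS \mathcal{H};
% P and Q are the marginal distributions of the two samples;
% X, X' ~ P and Y, Y' ~ Q are mutually independent copies.
\begin{align*}
\mu_P &= \mathbb{E}_{X \sim P}\, k(X, \cdot), \qquad
\mathrm{MMD}(P, Q) = \lVert \mu_P - \mu_Q \rVert_{\mathcal{H}}, \\
\mathrm{MMD}^2(P, Q) &= \mathbb{E}\, k(X, X') - 2\, \mathbb{E}\, k(X, Y) + \mathbb{E}\, k(Y, Y'), \\
% V-statistic estimate from complete pairs z_i = (x_i, y_i), i = 1, ..., n
V_n &= \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} h(z_i, z_j), \\
h(z_i, z_j) &= k(x_i, x_j) - k(x_i, y_j) - k(y_i, x_j) + k(y_i, y_j).
\end{align*}
```

When P = Q the two embeddings coincide, so the expectation of h(z, Z') over an independent pair Z' is zero for every fixed z. The V-statistic is then degenerate, which is why the asymptotic theory of degenerate V-statistics enters the consistency argument.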

Language:	Slovenian
Keywords:	paired two-sample problem, missing data, MMD, kernel, kernel embedding, RKHS, hypothesis testing, U-statistics, V-statistics
Work type:	Master's thesis/paper
Organization:	FMF - Faculty of Mathematics and Physics
Year:	2025
PID:	20.500.12556/RUL-171684
UDC:	519.2
COBISS.SI-ID:	246912003
Publication date in RUL:	30.08.2025
Views:	139
Downloads:	56

Secondary language

Language:	English
Title:	The paired two-sample problem with MCAR missing data
Abstract:	The two-sample problem asks whether two samples come from the same probability distribution. A special case is the paired two-sample problem, where the samples are dependent in the sense that each observation in one sample is paired with an observation in the other. This is a classical statistical problem that frequently arises in practice, for example in longitudinal studies where data are collected for each subject at two different time points. The aim of this thesis is to develop a test for the paired two-sample problem with MCAR (missing completely at random) data. For this purpose we introduce the mathematical framework required for constructing the test. We begin with the theory of U- and V-statistics and their asymptotic behavior, followed by the theory of kernel functions, where we introduce reproducing kernel Hilbert spaces (RKHS) and kernel mean embeddings. Finally, we develop a nonparametric test that incorporates both the complete and the incomplete pairs of observations. The test is based on the maximum mean discrepancy (MMD) statistic, a (pseudo)distance between probability distributions defined as the distance between their kernel mean embeddings in an RKHS. We establish the consistency of the test using the asymptotic properties of degenerate V-statistics. We also present a bootstrap algorithm for testing the null hypothesis and establish its consistency.
Keywords:	two-sample problem, matched pairs, missing data, MMD, kernel, kernel mean embedding, RKHS, hypothesis testing, U-statistics, V-statistics
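
To make the ingredients of the abstract concrete, the following is a minimal sketch of an MMD V-statistic for complete pairs of scalar observations, together with a generic Rademacher multiplier (wild) bootstrap for the p-value. It is not the algorithm from the thesis: the Gaussian kernel, the fixed bandwidth, the multiplier bootstrap, and all function names (`gaussian_kernel`, `paired_h_matrix`, `mmd2_v`, `wild_bootstrap_pvalue`) are assumptions made for illustration, and incomplete pairs are not handled at all.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel on scalars (the kernel choice is an assumption)."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def paired_h_matrix(x, y, bandwidth=1.0):
    """Matrix of h(z_i, z_j) = k(x_i,x_j) - k(x_i,y_j) - k(y_i,x_j) + k(y_i,y_j)
    for complete pairs z_i = (x_i, y_i)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of complete pairs")
    n = len(x)

    def k(a, b):
        return gaussian_kernel(a, b, bandwidth)

    return [
        [k(x[i], x[j]) - k(x[i], y[j]) - k(y[i], x[j]) + k(y[i], y[j])
         for j in range(n)]
        for i in range(n)
    ]


def mmd2_v(h):
    """V-statistic estimate of MMD^2: the mean of all entries of h."""
    n = len(h)
    return sum(map(sum, h)) / n ** 2


def wild_bootstrap_pvalue(h, n_boot=200, seed=0):
    """Approximate p-value from Rademacher multipliers w_i in {-1, +1}.

    Compares n * V_n with (1/n) * sum_ij w_i w_j h_ij. This generic scheme is
    an assumption and may differ from the resampling used in the thesis.
    """
    rng = random.Random(seed)
    n = len(h)
    observed = sum(map(sum, h)) / n
    exceed = 0
    for _ in range(n_boot):
        w = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        stat = sum(w[i] * w[j] * h[i][j] for i in range(n) for j in range(n)) / n
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_boot)
```

A typical call builds `h = paired_h_matrix(x, y)` once and passes it to both `mmd2_v(h)` and `wild_bootstrap_pvalue(h)`, so the quadratic kernel evaluation is not repeated for each bootstrap replicate. In practice the bandwidth would be chosen from the data (for example by a median heuristic) rather than fixed at 1.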
