Missing data are common in almost all research. It is important to treat the missing data problem comprehensively, following the steps set out in established guidelines. The purpose of the thesis was to compare different methods for handling missing data under various missingness mechanisms, in order to make recommendations for their proper treatment. Simulations (with 1000 repetitions) were performed on data from the EHIS survey (2014), conducted by NIPH, in which the most problematic variables are income-related. The diagnosis of missing data, carried out with logistic regression, indicated the presence of at least a MAR mechanism. The simulation results showed that both the percentage of missing values and the missingness mechanism significantly influence the bias of the estimates. From these results we conclude that the CCA and PD methods are unbiased when values are missing under the MCAR mechanism, but inefficient when the percentage of missing values is high. When the MCAR mechanism does not hold, multiple imputation methods proved to be the most advisable. We used joint modelling multiple imputation for the analysis of the original data. Interestingly, comparing the results of the selected multiple imputation method with those of the analysis on the available data did not show drastic differences between the two approaches. A possible cause is the weakness of the MAR mechanism for the variable under study.
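The contrast between MCAR and MAR described above can be illustrated with a minimal sketch. It is not the simulation used in the thesis and does not use the EHIS data: the synthetic variables `x` and `y`, the function name `simulate_cca_bias`, the logistic slope of 1.5, and all default parameter values are assumptions chosen only for illustration. The sketch estimates the bias of the complete-case mean of `y` when missingness is either independent of the data (MCAR) or dependent on a fully observed covariate (MAR).

```python
import numpy as np


def simulate_cca_bias(mechanism, n=2000, reps=200, miss_rate=0.3, seed=0):
    """Average bias of the complete-case mean of y over `reps` repetitions.

    All settings are hypothetical; the true mean of y is 2.0 by construction.
    """
    rng = np.random.default_rng(seed)
    true_mean = 2.0
    biases = []
    for _ in range(reps):
        x = rng.normal(size=n)                        # fully observed covariate
        y = true_mean + 1.0 * x + rng.normal(size=n)  # variable that will have gaps
        if mechanism == "MCAR":
            # Every value has the same probability of being missing.
            p_missing = np.full(n, miss_rate)
        elif mechanism == "MAR":
            # Missingness depends only on the observed covariate x,
            # through a logistic model (slope 1.5 is an assumption).
            intercept = np.log(miss_rate / (1.0 - miss_rate))
            p_missing = 1.0 / (1.0 + np.exp(-(intercept + 1.5 * x)))
        else:
            raise ValueError("mechanism must be 'MCAR' or 'MAR'")
        missing = rng.random(n) < p_missing
        # Complete-case analysis: use only the observed values of y.
        biases.append(y[~missing].mean() - true_mean)
    return float(np.mean(biases))


if __name__ == "__main__":
    for mech in ("MCAR", "MAR"):
        print(mech, "average bias of complete-case mean:", simulate_cca_bias(mech))
```

In this toy setting the complete-case mean should stay close to the true value under MCAR and drift away from it under MAR, because units with large `x` (and therefore large `y`) are more likely to be dropped.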