In this Master's thesis we study the survival of Slovenian Olympians in comparison with the general Slovenian population, with particular attention to the statistical methods used for estimation.
The work is divided into three major parts. The first part describes the process of collecting the data needed to perform a survival analysis. Because the data are sensitive personal data, collecting them required considerable time and effort and the use of multiple data sources. In the second part we present the characteristics of survival analysis, a branch of statistics that deals with a specific form of incomplete data. We describe the methods of years lost and the standardized mortality ratio, as well as the calculation of parameter estimates and confidence intervals, which is also implemented as a function in the R software environment. For the number of years lost we use theoretical confidence intervals, which do not take into account the variability of demographic factors, and bootstrap confidence intervals, which address this problem. In the last part we estimate the survival of Slovenian Olympians compared with the general Slovenian population according to different periods of life and some other factors, such as gender, year of appearance at the Olympic Games and type of sport, in order to understand the relationship between sport and life expectancy.
It turns out that Slovenian Olympians have better survival than the general population in all stages of life. Both statistical methods prove to be appropriate for estimating the survival of the sample relative to the population. Theoretical confidence intervals for the number of years lost are particularly suitable when the number of observations and events is large enough; otherwise it is more appropriate to use bootstrap confidence intervals.
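
As an illustration of the two ideas named above, the standardized mortality ratio and bootstrap confidence intervals, the following is a minimal sketch in Python. It is not the R function developed in the thesis. The names `smr` and `bootstrap_ci`, the percentile bootstrap, and the toy cohort are assumptions made for illustration only. Each record pairs an observed death indicator with the number of deaths expected for that person under general population mortality.

```python
import random


def smr(records):
    """Standardized mortality ratio: observed deaths / expected deaths.

    records is a list of (died, expected) pairs, where died is 0 or 1 and
    expected is the person's expected number of deaths under population
    mortality over the follow-up (hypothetical input format).
    """
    observed = sum(died for died, _ in records)
    expected = sum(exp for _, exp in records)
    return observed / expected


def bootstrap_ci(records, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval: resample individuals with replacement."""
    rng = random.Random(seed)
    n = len(records)
    estimates = sorted(
        smr([records[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy cohort, not data from the thesis.
cohort = [(1, 0.9), (0, 0.4), (1, 1.6), (0, 0.7),
          (0, 0.3), (1, 1.2), (0, 0.8), (1, 1.1)]

if __name__ == "__main__":
    print("SMR:", round(smr(cohort), 3))
    print("95% bootstrap CI:", bootstrap_ci(cohort))
```

An SMR below 1 indicates fewer deaths in the cohort than expected in the general population. Resampling whole individuals keeps each person's observed outcome together with the matching expected value, which is one simple way of letting the interval reflect variability in the cohort's composition.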