Analyzing data from patients' medical examinations is increasingly common and can be very useful: it supports modern medicine and can improve the quality of health services. Today, probably every hospital maintains a medical database that holds a large amount of private data about its patients, including medical treatments, interventions, and laboratory results. However, using this data for medical research and analysis requires permission from hospitals and other institutions, which is an obstacle for researchers. Such analyses can also be costly and time-consuming. In addition, the data must be anonymized and prepared in a way that preserves the statistical properties of the original database.
In this master's thesis, we will review and present several methods for generating synthetic data from real data. Based on this review, we will select some of the best methods from the literature, implement them, and use them to generate synthetic data. We will evaluate the generation procedures by comparing the statistical properties of each generated sample with those of the population. Based on this evaluation, we will assess which methods for generating synthetic data are the most successful.