In modern public administration, data forms the foundation for quality analysis, policy development, and strategic decision-making. However, the growing amounts of data bring challenges such as slower processing and increasingly complex procedures. In this context, data sampling is gaining recognition as an effective solution, enabling reliable insights from smaller, representative datasets without the need to process entire data collections.
This thesis explores the methodological, technical, and legal dimensions of data sampling, with a focus on its practical application in public administration. Central to the research is the development of a prototype solution that enables sampling directly within relational databases using PostgreSQL and DuckDB. The study combines qualitative and quantitative approaches: the first part involves document analysis of the legal and theoretical sampling framework, while the second focuses on implementing selected algorithms, visualising results using open data, and testing the developed solution.
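The abstract does not name the specific sampling algorithms implemented in the prototype; as an illustrative sketch only, one classic choice for drawing a uniform random sample from a table or query stream of unknown size is reservoir sampling (Vitter's Algorithm R), shown here in Python. In practice, PostgreSQL's `TABLESAMPLE` clause and DuckDB's `USING SAMPLE` clause offer similar built-in functionality directly in SQL.

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Draw k items uniformly at random from an iterable of unknown
    length in a single pass (Vitter's Algorithm R).

    This is a hypothetical illustration, not the thesis's actual code.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample

# Example: a 5-row sample from a simulated table of 1,000 rows.
rows = range(1000)
print(reservoir_sample(rows, 5, seed=42))
```

Because the algorithm makes a single pass and keeps only `k` items in memory, it scales to streams far larger than RAM, which is the property that makes sampling attractive for the large datasets discussed above.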
The findings confirm that sampling significantly reduces the volume of data to be processed, resulting in faster queries and lower processing costs. The study also revealed that Slovenian public administration lacks clear guidelines for sampling, which makes analyses harder to compare. The results contribute to the advancement of the field and the improvement of methodological approaches to data sampling. The developed prototype facilitates more efficient data management in public administration by reducing processing volume, costs, and the execution time of analyses. Furthermore, it encourages the broader adoption of sampling practices in public administration.