The portal Odprti Podatki Slovenije (OPSI) was established by the Ministry of Public Administration to provide an integrated listing of the databases managed by public sector bodies and to allow easy publication of data collections in the form of open data. All the data provided on the portal are records and collections that are created by public sector bodies and are freely accessible.
We have preprocessed the freely accessible data and converted it into the format used by Orange, a program for machine learning and data visualization. We also report statistics on the data available on the OPSI portal. The interface automatically transforms 788 of the 19565 files. The largest source of files, with 375, is the area of Government and the public sector. The interface automatically converts data in a total of 1038 columns, which are grouped into four categories: discrete (464), string (257), continuous (180) and time (137). Most of the files that the interface cannot transform automatically (13765 files) are in HTML format.
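To make the four column categories concrete, the following is a minimal sketch of how a column of raw text cells might be assigned to one of them. It is not the interface's actual procedure: the function name `guess_column_type`, the list of date formats, and the threshold of 10 distinct values for a discrete column are all assumptions chosen for illustration.

```python
from datetime import datetime

# Assumed date formats; the real interface may recognise others.
TIME_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%Y-%m-%dT%H:%M:%S")
# Hypothetical cut-off: few distinct text values are treated as categories.
MAX_DISCRETE_VALUES = 10


def is_float(value):
    """Return True if the cell parses as a floating-point number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def is_time(value):
    """Return True if the cell matches one of the assumed date formats."""
    for fmt in TIME_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def guess_column_type(values):
    """Classify a column as 'time', 'continuous', 'discrete' or 'string'."""
    # Ignore empty cells so that missing values do not affect the guess.
    cells = [v.strip() for v in values if v and v.strip()]
    if not cells:
        return "string"
    if all(is_time(c) for c in cells):
        return "time"
    if all(is_float(c) for c in cells):
        return "continuous"
    if len(set(cells)) <= MAX_DISCRETE_VALUES:
        return "discrete"
    return "string"
```

Under these assumptions, a column of dates is labelled as time, a purely numeric column as continuous, a text column with few distinct values as discrete, and everything else as string.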
Through the interface, we can submit comments on or improvements to the published data and report any irregularities in the portal and its data. The data provided by the interface can be mined for interesting patterns using the Orange data mining software.