This Bachelor’s thesis describes the process of collecting and visualizing data from three different sources. The first is the Statistical Office of Slovenia, which provides data on the frequency of baby names from 1992 to 2017. The second source is the IMDb database, which holds data about actors and movies. The third is Wikipedia, the free encyclopedia, which holds interesting data about names.
Merging all the data sources requires a wide range of frameworks. The programming language Python is used for importing the data. The non-relational database Elasticsearch is used to store the baby-name counts. Data stored on the internet or on local servers is exchanged using either Python or Node.js. In addition, the basic web technologies JavaScript and D3.js are the main tools for data visualization.