In the modern world, we face a daily flood of news. To make searching easier, it is useful to group news items by the events they relate to. In this thesis, we present a methodology for clustering news by events. The methodology combines text embeddings, a clustering algorithm, and news filtering methods. We tested the methodology on a dataset of online news and evaluated it both statistically and manually. The results indicate that the news items within a cluster primarily describe the same event. However, higher accuracy is accompanied by a substantial amount of news that remains unclustered.