Crowd counting is an important research topic in computer vision. It remains difficult to accurately count large crowds at festivals, concerts, protests and choirs, where people stand close together. In recent years, crowd counting has advanced greatly with the help of deep neural networks, and deep learning methods are now the state-of-the-art approach to crowd counting and density estimation. Many such methods have been proposed in the literature, and their estimates are influenced by many factors, such as weather conditions, scene type, perspective and image resolution. In this diploma thesis we examine how two crowd counting methods work. We selected CSRNet and MCNN for our analysis. CSRNet (Congested Scene Recognition Network) is a method designed to count people in large crowds and is based on dilated convolution. The second method, MCNN (Multi-column Convolutional Neural Network), uses a three-column convolutional neural network to better recognize the different sizes at which people appear in an image. Both methods were evaluated on both parts of the ShanghaiTech dataset and on the UCF-CC-50 dataset. The experiments were performed on all three collections, and the data were further split into subsets so that we could analyze the influence of the angle of view and the type of light. Our analysis shows that, on average, CSRNet performs better. Regarding the angle of view, we conclude that the models achieve better results at a lower angle of view. Regarding the type of light, we conclude that the models recognize people well in both natural and artificial light.
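
The dilated convolution on which CSRNet is based can be illustrated in one dimension. The following is a minimal sketch under stated assumptions, not the implementation used in the thesis or in CSRNet itself: the function names `dilated_conv1d` and `receptive_field` are hypothetical, the kernel and input values are made up for illustration, and no padding or stride is applied. It shows that spacing the kernel taps `dilation` positions apart widens the area each output sees without adding weights.

```python
import numpy as np


def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation` spacing between taps.

    Illustrative helper only (hypothetical name, not a library function).
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = receptive_field(len(kernel), dilation)
    out_len = len(signal) - span + 1
    if out_len <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    out = np.empty(out_len)
    for i in range(out_len):
        # Take every `dilation`-th sample inside the window starting at i.
        taps = signal[i:i + span:dilation]
        out[i] = np.dot(taps, kernel)
    return out


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]   # made-up input
    k = [1, 1, 1]               # made-up 3-tap kernel
    print(dilated_conv1d(x, k, dilation=1))
    print(dilated_conv1d(x, k, dilation=2))
    print(receptive_field(3, 2))
```

With the same three weights, a dilation of 2 makes each output depend on five consecutive input positions instead of three, which is the property that lets a network enlarge its receptive field without pooling away spatial resolution.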