Few-shot object counting has advanced considerably in recent years. The most successful global counters to date predict a density map whose sum estimates the number of objects. A major drawback of these methods is that their results are not interpretable: they do not provide object locations, which many applications require. In this thesis, we propose CVDFC, a method that uses a diffusion model to improve the quality of density maps and convert them into precise object location points. The model employs a conditional diffusion process to generate location points, and an object counting module then applies non-maximum suppression (NMS) to the generated points, yielding both a count and the locations of the objects in the image. In our experiments, CVDFC outperforms the reference method, LOCA combined with NMS-based object localization, by 30% on the object counting task. CVDFC is also competitive with other methods, which demonstrates its effectiveness and practical utility for few-shot object counting.
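The two counting strategies contrasted above can be illustrated with a minimal sketch. It is not the CVDFC implementation: the function names `count_from_density` and `point_nms`, the greedy distance-based suppression rule, and the toy data are all assumptions made for illustration. The first function counts by summing a density map, while the second counts by suppressing nearby duplicate points and returning the survivors as locations.

```python
import numpy as np


def count_from_density(density):
    """Global counting: the estimated count is the sum of the density map."""
    return float(density.sum())


def point_nms(points, scores, radius):
    """Greedy point NMS (illustrative assumption, not the thesis method).

    Visits points in order of decreasing score and keeps a point only if it
    lies farther than `radius` from every point kept so far.
    """
    order = np.argsort(-scores)
    kept_idx = []
    for i in order:
        if all(np.linalg.norm(points[i] - points[j]) > radius for j in kept_idx):
            kept_idx.append(int(i))
    return points[np.array(kept_idx, dtype=int)]


# Toy density map: two objects, each spread as unit mass over a 2x2 block.
density = np.zeros((40, 40))
density[10:12, 10:12] = 0.25
density[30:32, 30:32] = 0.25

# Toy candidate points (hypothetical): two near-duplicates and one separate point.
points = np.array([[10.0, 10.0], [11.0, 10.0], [30.0, 30.0]])
scores = np.array([0.9, 0.8, 0.7])

kept = point_nms(points, scores, radius=3.0)
print(count_from_density(density))  # count only, no locations
print(len(kept), kept.tolist())     # count together with locations
```

In this toy case both routes agree on the count, but only the point-based route reports where the objects are, which is the interpretability gap the abstract describes.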