This thesis addresses short-term visual object tracking with deformable parts models (DPMs). DPMs show great potential for handling non-rigid object deformations and self-occlusions, but according to recent benchmarks they often lag behind holistic approaches, which model the object with a single appearance model. The reason is that target localization requires estimating a potentially large number of constellation parameters, and the constellation topology is often simplified to keep inference tractable. Furthermore, the visual model and the geometric constraints are usually combined in an ad hoc fashion. In contrast to related approaches, we present a generative model that jointly treats the contributions of the visual and the geometric model as a single physics-based spring system with a convex energy function. We propose an efficient optimization method for this dual formulation that allows MAP inference of a fully connected constellation model. The proposed optimization method is compared with an existing optimization approach and outperforms it in both stability and efficiency. We further propose a part-based tracker that combines two visual representations of the target, a coarse and a mid-level representation; the proposed optimization method is used for target localization on the mid-level representation. The resulting tracker is rigorously analyzed on the highly challenging VOT2014 benchmark, where it outperforms related part-based and holistic trackers, including the winner of the VOT2014 challenge, and runs in real time. The design of the tracker is further examined by analyzing the contribution of each of its components.