<?xml version="1.0"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:title>Representing visual entities with deep hierarchical and compositional models</dc:title><dc:creator>Tabernik, Domen (Author)</dc:creator>
<dc:creator>Leonardis, Aleš (Mentor)</dc:creator>
<dc:creator>Kristan, Matej (Co-mentor)
</dc:creator><dc:subject>compositional hierarchies</dc:subject><dc:subject>histogram of compositions</dc:subject><dc:subject>displaced aggregation units</dc:subject><dc:subject>deep neural networks</dc:subject><dc:subject>visual image recognition</dc:subject><dc:subject>semantic image segmentation</dc:subject><dc:subject>de-blurring</dc:subject><dc:description>The doctoral thesis explores two prominent hierarchical approaches to modeling visual entities: (a) compositional hierarchies and (b) deep neural networks. Both approaches are explored in detail, together with their advantages and disadvantages. In compositional hierarchies, poor discriminative power is identified as a major limiting factor, which is addressed by a novel discriminative feature, termed Histogram of Compositions (HoC), proposed in the first part of this thesis. HoC is shown to successfully capture important discriminative information and improve classification accuracy. The second part of the thesis highlights the lack of a spatial relationship between features as an important limitation of deep convolutional networks (ConvNets). This limitation leads to rigid and non-learnable receptive field sizes, poor utilization of parameters and low flexibility of deep architectures. All of these problems are addressed by introducing an explicit compositional structure into deep neural networks, implemented with the proposed novel filter unit for ConvNets, termed Displaced Aggregation Unit (DAU). DAUs enable novel properties for deep models: (a) the decoupling of the parameters from the receptive field, (b) the learning of the receptive field sizes and (c) the automatic adjustment of the spatial focus of features. The benefits of DAUs are demonstrated on three practical problems: image classification, semantic segmentation and blind image de-blurring.
In all cases, the inclusion of DAUs into modern architectures enables simpler networks with fewer operations and parameters and significantly reduces the manual modification of architectures for specific tasks and domains, while retaining or even improving the overall prediction accuracy.</dc:description><dc:date>2021</dc:date><dc:date>2021-05-10 13:49:29</dc:date><dc:type>Doctoral thesis</dc:type><dc:identifier>126916</dc:identifier><dc:identifier>VisID: 20805</dc:identifier><dc:identifier>COBISS_ID: 62595075</dc:identifier><dc:language>sl</dc:language></metadata>
