Repository of the University of Ljubljana
Representing visual entities with deep hierarchical and compositional models
Tabernik, Domen (Author), Leonardis, Aleš (Mentor), Kristan, Matej (Co-mentor)
PDF - Presentation file (10.00 MB)
MD5: C46C180284D24D5058D68468CD038758
Abstract
This doctoral thesis explores two prominent hierarchical approaches to modeling visual entities: (a) compositional hierarchies and (b) deep neural networks. Both approaches are examined in detail, together with their advantages and disadvantages. In compositional hierarchies, poor discriminative power is identified as a major limiting factor, which is addressed with a novel discriminative feature, termed Histogram of Compositions (HoC), proposed in the first part of the thesis. HoC is shown to capture important discriminative information and thereby improve classification accuracy. The second part of the thesis highlights the lack of a spatial relationship between features as an important limitation of deep convolutional networks (ConvNets). This limitation leads to rigid, non-learnable receptive field sizes, poor utilization of parameters, and low flexibility of deep architectures. These problems are addressed by introducing an explicit compositional structure into deep neural networks, implemented with a novel filter unit for ConvNets, termed Displaced Aggregation Unit (DAU). DAUs enable novel properties in deep models: (a) decoupling of the number of parameters from the receptive field, (b) learning of the receptive field sizes, and (c) automatic adjustment of the spatial focus of features. The benefits of DAUs are demonstrated on three practical problems: image classification, semantic segmentation, and blind image de-blurring. In all cases, including DAUs in modern architectures enables simpler networks with fewer operations and parameters and significantly reduces the manual modification of architectures for specific tasks and domains, while retaining or even improving the overall prediction accuracy.
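The decoupling of parameter count from receptive field mentioned in the abstract can be illustrated with a toy sketch. This is not the thesis's implementation: it assumes each unit is a hypothetical `(weight, dy, dx)` triple with integer displacements and no smoothing, which may differ from the actual DAU formulation, and all function names are invented for illustration.

```python
# Toy sketch (assumption): a filter built from a few displaced units.
# Each unit is (weight, dy, dx) with integer displacements; the thesis's
# actual DAU formulation may differ (e.g. in how displacements are learned).

def conv_param_count(kernel_size):
    """Weights in one dense kernel_size x kernel_size filter (single channel pair)."""
    return kernel_size * kernel_size

def dau_param_count(num_units):
    """One weight plus two displacement values per unit."""
    return 3 * num_units

def receptive_field_extent(units):
    """Side length of the square window covered by the units' displacements."""
    reach = max(max(abs(dy), abs(dx)) for _, dy, dx in units)
    return 2 * reach + 1

def dau_response(image, units):
    """Sum of weighted, displaced copies of the input (zero outside the image)."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for weight, dy, dx in units:
        for y in range(height):
            for x in range(width):
                sy, sx = y + dy, x + dx
                if 0 <= sy < height and 0 <= sx < width:
                    out[y][x] += weight * image[sy][sx]
    return out

# Same number of parameters, very different receptive fields.
units_near = [(1.0, 0, 0), (0.5, -1, 1)]
units_far = [(1.0, 0, 0), (0.5, -4, 4)]
```

In this sketch, moving a unit farther away enlarges the receptive field without adding parameters, whereas a dense filter covering the same window needs a weight for every position.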
Language: English
Keywords: compositional hierarchies, histogram of compositions, displaced aggregation units, deep neural networks, visual image recognition, semantic image segmentation, de-blurring
Work type: Doctoral dissertation
Typology: 2.08 - Doctoral Dissertation
Organization: FRI - Faculty of Computer and Information Science
Year: 2021
PID: 20.500.12556/RUL-126916
COBISS.SI-ID: 62595075
Publication date in RUL: 10.05.2021
Views: 2688
Downloads: 184
Cite this work
TABERNIK, Domen, 2021, Representing visual entities with deep hierarchical and compositional models [online]. Doctoral dissertation. [Accessed 18 March 2025]. Retrieved from: https://repozitorij.uni-lj.si/IzpisGradiva.php?lang=eng&id=126916
Secondary language
Language: Slovenian
Title: Reprezentacija vizualnih entitet z globokimi hierarhičnimi in kompozicionalnimi modeli
Abstract:
Doktorska disertacija obravnava dva pomembna hierarhična pristopa za modeliranje vizualnih entitet: (a) kompozicijsko hierarhijo in (b) globoke nevronske mreže. Oba pristopa sta podrobno ovrednotena skupaj z njunimi prednostmi in slabostmi. V kompozicijski hierarhiji je kot glavna pomanjkljivost naslovljena slaba diskriminativna moč, kar je obravnavano v prvem delu disertacije. Predlagana je nova diskriminativna značilka, imenovana Histogram Kompozicij (ang. Histogram of Compositions - HoC), ki uspešno zajame pomembne diskriminativne informacije za izboljšanje natančnosti klasifikacije. V drugem delu disertacije je v globokih konvolucijskih mrežah (ConvNet) kot pomembna pomanjkljivost izpostavljena slaba prostorska relacija med značilkami. Slednje pripelje do rigidnih in ne-učljivih velikosti dovzetnih polj, do slabe izkoriščenosti parametrov ter do nizke fleksibilnosti globokih arhitektur. Omenjeni problemi so naslovljeni z integracijo eksplicitne kompozicijske strukture v globoke nevronske mreže. V ta namen je predstavljena nova enota filtra za konvolucijske mreže, imenovana premikajoča agregacijska enota (ang. Displaced Aggregation Unit - DAU), ki omogoči vpeljavo novih lastnosti v globoke mreže: (a) neodvisnost števila parametrov od dovzetnega polja, (b) učenje velikosti dovzetnega polja in (c) samodejno prilagajanje prostorskega fokusa značilk. Prednosti filtra DAU so prikazane na treh praktičnih problemih: klasifikacija slik, semantična segmentacija slik ter razmeglejevanje slik. V vseh primerih vključitev filtra DAU v sodobne arhitekture omogoči enostavnejše globoke mreže z manjšim številom operacij in parametrov ter z manjšo potrebo po ročni modifikaciji arhitekture za specifične naloge in domene, hkrati pa ohranja ali celo izboljša klasifikacijsko točnost.
Keywords: kompozicionalne hierarhije, histogram kompozicij, premikajoča agregacijska enota, globoke nevronske mreže, vizualno razpoznavanje slik, semantična segmentacija slik, razmeglejevanje slik
Similar documents
Similar works from RUL:
Izboljšan vizualni model za sledenje s segmentacijo
Sledenje objektov s segmentacijo in napovedovanjem globinskih barvnih slik
Siamski sledilnik s segmentacijo za robustno lokalizacijo tarče
Segmentacija lebdečih predmetov v podatkih LIDAR
Destilacija znanja globokih modelov za biometrijo beločnice