A semi-automatic video object segmentation method

Pelhan, Jer

Pelhan, Jer (Author), Kristan, Matej (Mentor)

PDF - Presentation file, Download (32,55 MB)
MD5: D825D0B1D2F2E895FF418EF6DFAABDB8

Abstract

Visual object tracking has recently shifted towards target segmentation, which has increased the demand for video datasets with objects segmented in every frame. However, manually annotating large segmented video datasets is time-consuming and costly. We address this problem by introducing a Semi-supervised Annotation by Tracking algorithm (SAT), which is specialized for segmenting targets in the visual object tracking domain with minimal user input. The annotation pipeline is split into two modules. The anchor frame segmentation module predicts a segmentation mask from a few (approximately four) user clicks on the object of interest. This module is used to segment the target in a subset of frames, called anchors, spread throughout the sequence. A mask propagation module then propagates the segmentation masks from the anchors to the in-between frames. On the VOT dataset, SAT achieves an IoU of 73% with only 5% of the frames annotated by the user. It outperforms IVOS, the winner of the DAVIS2020 interactive challenge, and IVS, the winner of the DAVIS2018 interactive challenge, by 40% and 67%, respectively, and shortens the annotation time by 98%. On the DAVIS interactive challenges, SAT performs comparably to the state of the art in video object segmentation.
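
The two quantities the abstract reports, the share of user-annotated anchor frames and the IoU of the resulting masks, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `select_anchor_frames` and `mask_iou` are hypothetical, the uniform spacing of anchors is an assumption (the abstract does not say how anchors are chosen), and treating two empty masks as a perfect match is a convention picked here for illustration.

```python
import numpy as np


def select_anchor_frames(num_frames, fraction=0.05):
    """Pick evenly spaced anchor frame indices (uniform spacing is an assumption)."""
    num_anchors = max(1, round(num_frames * fraction))
    positions = np.linspace(0, num_frames - 1, num_anchors)
    # Round to integer frame indices and drop any duplicates.
    return sorted(set(positions.round().astype(int).tolist()))


def mask_iou(pred, gt):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; counted as perfect agreement (assumed convention).
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


if __name__ == "__main__":
    anchors = select_anchor_frames(100, fraction=0.05)
    print("anchor frames:", anchors)

    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print("IoU:", mask_iou(predicted, ground_truth))
```

In a full pipeline, the frames returned by `select_anchor_frames` would be segmented from user clicks, the remaining frames would receive propagated masks, and `mask_iou` would be averaged over all frames against the ground truth.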

Language:	English
Keywords:	convolutional neural network, video object segmentation, video object tracking
Work type:	Bachelor thesis/paper
Typology:	2.11 - Undergraduate Thesis
Organization:	FRI - Faculty of Computer and Information Science
Year:	2021
PID:	20.500.12556/RUL-127022
COBISS.SI-ID:	63110659
Publication date in RUL:	13.05.2021

Secondary language

Language:	Slovenian
Title:	Delno-avtomatska metoda za segmentacijo objekta v videoposnetku
Abstract:
Driven by rapid progress in visual tracking, reporting the target location with segmentation masks has recently become established practice, which has increased the demand for fully segmented video datasets. Manual annotation of video datasets is time-consuming and expensive, and this thesis addresses exactly that problem. We present SAT, a semi-automatic method for segmenting objects in video, specialized for efficient annotation of visual tracking sequences. Segmentation of a video is split into two modules. The first module segments objects in individual frames efficiently, since it needs only a few clicks on the object boundary to estimate a segmentation mask. The second module, based on the recently introduced D3S tracker, propagates the masks to the remaining frames of the video. On the VOT2020 dataset, SAT reaches an IoU of 73% with only 5% of the frames annotated, a 40% improvement over IVOS, the winning method of the DAVIS2020 interactive challenge, and a 67% improvement over IVS, the winning method of the DAVIS2018 interactive challenge. SAT shortens manual video annotation time by 98%. On the DAVIS interactive challenge, SAT achieves results comparable to state-of-the-art video segmentation methods.
Keywords:	convolutional neural networks, video segmentation, object tracking
