FICE : text-conditioned fashion-image editing with guided GAN inversion
Pernuš, Martin (Author), Fookes, Clinton (Author), Štruc, Vitomir (Author), Dobrišek, Simon (Author)

PDF - Presentation file, download (4.18 MB)
MD5: 159EEA326A40F18F0E79952FA1615B4F
URL - Source URL, visit https://www.sciencedirect.com/science/article/pii/S0031320324007738

Abstract
Fashion-image editing is a challenging computer-vision task where the goal is to incorporate selected apparel into a given input image. Most existing techniques, known as Virtual Try-On methods, deal with this task by first selecting an example image of the desired apparel and then transferring the clothing onto the target person. Conversely, in this paper, we consider editing fashion images with text descriptions. Such an approach has several advantages over example-based virtual try-on techniques: (i) it does not require an image of the target fashion item, and (ii) it allows the expression of a wide variety of visual concepts through the use of natural language. Existing image-editing methods that work with language inputs either require training sets with rich attribute annotations or can handle only simple text descriptions. We address these constraints by proposing a novel text-conditioned editing model called FICE (Fashion Image CLIP Editing) that is capable of handling a wide variety of text descriptions to guide the editing procedure. Specifically, with FICE, we extend the common GAN-inversion process by including semantic, pose-related, and image-level constraints when generating images. We leverage the CLIP model to enforce the text-provided semantics, due to its impressive image–text association capabilities. We furthermore propose a latent-code regularization technique that provides the means to better control the fidelity of the synthesized images. We validate FICE through rigorous experiments on a combination of VITON images and Fashion-Gen text descriptions and in comparison with several state-of-the-art, text-conditioned, image-editing approaches. Experimental results demonstrate that FICE generates very realistic fashion images and leads to better editing than existing, competing approaches.
The source code is publicly available from: https://github.com/MartinPernus/FICE.
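
The idea of extending GAN inversion with several constraints can be illustrated with a small toy sketch. This is not the authors' implementation (see the repository above for that): the generator, the "CLIP-like" image embedding, the pose descriptor, the loss weights, and the form of the latent regularizer (distance to a mean latent code) are all hypothetical stand-ins chosen only to show how a latent code might be optimized against a weighted sum of semantic, pose-related, image-level, and regularization terms.

```python
import math
import random

random.seed(0)
LATENT_DIM, IMAGE_DIM = 4, 6

# Toy stand-in for a pretrained GAN generator: fixed random weights, tanh output.
A = [[random.uniform(-1, 1) for _ in range(LATENT_DIM)] for _ in range(IMAGE_DIM)]

def generator(w):
    return [math.tanh(sum(a * x for a, x in zip(row, w))) for row in A]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)

def mse(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

def image_embedding(img):
    # Hypothetical stand-in for a CLIP image encoder: reads the "garment" pixels.
    return img[:3]

def pose_descriptor(img):
    # Hypothetical stand-in for a pose estimator.
    return [img[0] + img[3], img[1] + img[4]]

def total_loss(w, text_emb, x_in, w_mean, weights):
    img = generator(w)
    l_sem = 1.0 - cosine(image_embedding(img), text_emb)        # follow the text
    l_pose = mse(pose_descriptor(img), pose_descriptor(x_in))   # keep the pose
    l_img = mse(img[3:], x_in[3:])     # keep pixels outside the edited region
    l_reg = mse(w, w_mean)             # assumed regularizer: stay near mean latent
    return (weights["sem"] * l_sem + weights["pose"] * l_pose
            + weights["img"] * l_img + weights["reg"] * l_reg)

def numerical_grad(f, w, h=1e-5):
    # Central finite differences; a real system would use automatic differentiation.
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + h] + w[i + 1:]
        down = w[:i] + [w[i] - h] + w[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def invert(w_init, loss_fn, steps=200, lr=0.5):
    # Gradient descent that accepts a step only if the loss decreases,
    # and halves the step size otherwise.
    w, best = list(w_init), loss_fn(w_init)
    for _ in range(steps):
        g = numerical_grad(loss_fn, w)
        candidate = [x - lr * gx for x, gx in zip(w, g)]
        cand_loss = loss_fn(candidate)
        if cand_loss < best:
            w, best = candidate, cand_loss
        else:
            lr *= 0.5
    return w, best

w_mean = [0.0] * LATENT_DIM            # assumed mean latent code
w_init = [0.3, -0.2, 0.1, 0.4]         # latent code of the toy "input image"
x_in = generator(w_init)               # toy input image
text_emb = [1.0, -1.0, 0.5]            # toy "text embedding" of the edit prompt
weights = {"sem": 1.0, "pose": 1.0, "img": 1.0, "reg": 0.1}  # illustrative values

def loss_fn(w):
    return total_loss(w, text_emb, x_in, w_mean, weights)

initial_loss = loss_fn(w_init)
w_opt, final_loss = invert(w_init, loss_fn)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Raising the weight on the regularization term in this sketch keeps the optimized code closer to the mean latent, which loosely mirrors the abstract's point that regularizing the latent code offers a handle on the fidelity of the synthesized image.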

Language: English
Keywords: text-conditioning, GAN inversion, image editing, generative artificial intelligence, generative adversarial networks, deep learning, multimodality
Work type: Article in a journal
Typology: 1.01 - Original scientific article
Organization: FE - Faculty of Electrical Engineering
Publication status: Published
Publication version: Published version
Year: 2025
Number of pages: 18 pp.
Numbering: Vol. 158, art. 111022
PID: 20.500.12556/RUL-162586
UDC: 004.93
ISSN on article: 0031-3203
DOI: 10.1016/j.patcog.2024.111022
COBISS.SI-ID: 207863555
Publication date in RUL: 25.09.2024
Views: 126
Downloads: 67

Record is a part of a journal

Title: Pattern recognition
Shortened title: Pattern recogn.
Publisher: Elsevier
ISSN: 0031-3203
COBISS.SI-ID: 26103040

Licences

Licence: CC BY 4.0, Creative Commons Attribution 4.0 International
Link: http://creativecommons.org/licenses/by/4.0/deed.sl
Description: This is the standard Creative Commons licence that gives users the most freedom for further use of the work, provided they credit the author.

Secondary language

Language: Slovenian
Keywords: besedilno pogojevanje, invertiranje GAN modelov, urejanje slik, generativna umetna inteligenca

Projects

Funder: ARRS - Slovenian Research Agency (Agencija za raziskovalno dejavnost Republike Slovenije)
Project number: P2-0250
Name: Metrology and biometric systems (Metrologija in biometrični sistemi)

Funder: ARRS - Slovenian Research Agency (Agencija za raziskovalno dejavnost Republike Slovenije)
Project number: J2-2501
Name: Deep generative models for the beauty and fashion industry (DeepBeauty)

Funder: ARRS - Slovenian Research Agency (Agencija za raziskovalno dejavnost Republike Slovenije)
Funding programme: Young researchers
