This thesis investigates the application of the Stable Diffusion model to adapting upper-body garments to human body pose. The objective is to fine-tune a pretrained diffusion model that, given an input image of a shirt and a human pose, adjusts the garment to fit the body shape. The publicly available VITON-HD dataset serves as the primary data source. Four models are trained on it, differing in image resolution and in the types of body poses used. The training set also includes target images generated using the Segment Anything Model (SAM). The results are evaluated both qualitatively and quantitatively, using the metrics \mbox{CLIP-IQA}, CLIP Score, FID, and KID. The findings demonstrate that all four models successfully deform the original garments to conform to the human body, although they show limitations in preserving the garment’s original characteristics.