Generative Adversarial Networks (GANs) have in recent years proven to be one of the most ground-breaking deep-learning approaches for generating semantically meaningful data, and this holds true in the field of biometrics as well. The concept of two modules, a generator and a discriminator, that oppose each other during training is a remarkably powerful idea for designing a neural network architecture: in theory, it leads to an adversarial game between the two actors. In practice, training such systems in a stable manner can be troublesome; however, various methods and approaches make successful training of such models possible.
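For reference, this adversarial game is commonly formalized with the well-known minimax objective of the original GAN formulation, in which the discriminator D and the generator G are optimized against each other (a standard textbook form, not specific to our model):

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]$$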
The generative nature of a successfully trained GAN allows the generated data to be used in practically unlimited ways. In recent years, researchers have shown particular interest in generating and manipulating images of human faces. The best results in this field are achieved by StyleGAN and its successor StyleGAN2, models designed specifically for generating hyper-realistic images of human faces with the possibility of additional manipulation of the generated images.
In this work we describe the development of our model for hairstyle manipulation on high-resolution images of human faces, combining several image-processing methods. Our approach builds on the disentangled latent space of the StyleGAN2 generator, into which we first project an arbitrary real image. For direct manipulation of the latent projections we use a conditional manipulation approach based on hyperplanes, which we obtain by training an SVM (Support Vector Machine) classifier on a labelled dataset. The model additionally allows a complete hairstyle to be transferred from a reference image onto the input image. We combine classical image-processing methods with modern approaches enabled primarily by the powerful StyleGAN2 generator. We also emphasize the importance of preserving facial identity when applying the described methods; this is addressed with dedicated post-correction of the final encoding and the transformation of key facial characteristics.
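To illustrate the hyperplane-based conditional manipulation, the sketch below shows one plausible realization rather than our exact implementation: the file names, attribute labels, step size, and the choice of a second attribute to hold fixed are all assumptions. It fits a linear SVM on labelled latent codes, takes the unit normal of the separating hyperplane as an editing direction, projects out a second attribute direction so the edit is conditioned on that attribute, and shifts a latent code along the resulting direction.

```python
# Illustrative sketch of hyperplane-based conditional latent editing;
# data files, labels, and the step size are hypothetical.
import numpy as np
from sklearn.svm import LinearSVC

def attribute_direction(latents, labels):
    """Fit a linear SVM on latent codes and return the unit normal
    of its separating hyperplane as an editing direction."""
    svm = LinearSVC(C=1.0, max_iter=10000)
    svm.fit(latents, labels)
    n = svm.coef_.reshape(-1)
    return n / np.linalg.norm(n)

def condition(primary, secondary):
    """Remove the component of `primary` along `secondary`, so that moving
    along the result leaves the secondary attribute approximately unchanged."""
    d = primary - np.dot(primary, secondary) * secondary
    return d / np.linalg.norm(d)

# latents: (N, 512) projected W-space codes; labels: binary attribute annotations (assumed data).
latents = np.load("w_codes.npy")          # hypothetical file of projected latent codes
hair_labels = np.load("hair_length.npy")  # hypothetical hair-length labels
pose_labels = np.load("pose.npy")         # hypothetical labels of an attribute to keep fixed

d_hair = attribute_direction(latents, hair_labels)
d_pose = attribute_direction(latents, pose_labels)
d_edit = condition(d_hair, d_pose)

w = latents[0]                 # latent code of the image to edit
w_edited = w + 3.0 * d_edit    # the step size controls the strength of the edit
```

In the full pipeline, such a shifted code would then be passed back through the StyleGAN2 generator to synthesize the edited image.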
Lastly, we present our results and analyse the key elements of the model that affect the success of the manipulations. We perform an ablation study and compare our results with other models that specifically support hairstyle manipulation on images of human faces.