FaceStylizer (MediaPipe)
MediaPipe FaceStylizer is an end-to-end pipeline that transfers a raw face image to a stylized face using one-shot fine-tuning. To enable this, we built the pipeline with a GAN inversion encoder and an efficient face generator model. The encoder and generator can then be adapted to different styles via a few-shot learning process.
To fine-tune the model, first send one or several style image samples to MediaPipe FaceStylizer. The fine-tuning process freezes the encoder module and only fine-tunes the generator. The training process samples multiple latent codes close to the encoding output of the input style images and uses them as the input to the generator. The generator is then trained to reconstruct an image of a person's face in the style of the input style image by optimizing a joint adversarial loss function that also accounts for style and content. After fine-tuning, MediaPipe FaceStylizer approximates the custom style of the input and can be applied to stylize test images of real human faces.
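The latent sampling step can be pictured with a small sketch. It assumes Gaussian noise around the style encoding; the noise model, the `sigma` value, the 4-dimensional code, and the function name `sample_nearby_latents` are all hypothetical and chosen only for illustration, since the text does not say how "close" codes are drawn.

```python
import random


def sample_nearby_latents(style_code, num_samples, sigma=0.1, seed=None):
    """Draw latent codes in a small neighborhood of style_code.

    Assumption: independent Gaussian noise with standard deviation sigma
    is added to every coordinate. The real pipeline may sample differently.
    """
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, sigma) for value in style_code]
        for _ in range(num_samples)
    ]


# Hypothetical 4-dimensional encoding; real latent codes are much larger.
style_code = [0.5, -1.2, 0.0, 2.0]
latents = sample_nearby_latents(style_code, num_samples=8, sigma=0.1, seed=0)
```

Each sampled code would then be fed to the generator while the encoder stays frozen.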
Encoder: The encoder is used to map input images to the latent space of the generator. The encoder is defined by a MobileNet V2 backbone and trained with natural face images. The loss is defined as a combination of image perceptual quality loss and L1 loss. The image perceptual quality loss includes the content reconstruction loss that regulates the image reconstruction quality, style loss that controls the style difference between input and output, and an embedding loss. The image perceptual quality loss is calculated in the VGG feature space of the input and output images.
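As a rough sketch of how the encoder loss terms fit together, the snippet below combines a perceptual term with an L1 term. The unweighted sum of the three perceptual components and the default weights of 1.0 are assumptions; the text only says the terms are combined. The VGG-based components are passed in as precomputed numbers rather than computed here.

```python
def l1_loss(target, output):
    """Mean absolute difference between two equally sized pixel sequences."""
    if len(target) != len(output):
        raise ValueError("target and output must have the same length")
    return sum(abs(t - o) for t, o in zip(target, output)) / len(target)


def perceptual_loss(content, style, embedding):
    """Combine the three perceptual terms (assumed here to be a plain sum).

    Each argument is a scalar computed elsewhere in VGG feature space.
    """
    return content + style + embedding


def encoder_loss(perceptual, l1, perceptual_weight=1.0, l1_weight=1.0):
    """Weighted combination of perceptual and L1 loss (weights are assumptions)."""
    return perceptual_weight * perceptual + l1_weight * l1
```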
Decoder: The decoder is based on the StyleGAN model family and maps the encoded latent code to the image space. We optimize the synthesis network in the StyleGAN model to significantly reduce its complexity while maintaining high quality for on-device face generation.
MediaPipe FaceStylizer is intended for on-device use cases. Using the notebook, you can create a custom FaceStylizer model with your own data, which can be deployed on-device (Android, iOS, Web, desktop, etc.) using MediaPipe Tasks FaceStylizer. Use MediaPipe Studio to evaluate the model through an interactive live demo.
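For desktop Python, running an exported model through MediaPipe Tasks might look like the sketch below. The class and method names follow the MediaPipe Tasks Python vision API as commonly documented and should be checked against the installed version; the helper name `stylize_file` and both path arguments are hypothetical.

```python
def stylize_file(model_path, image_path):
    """Run a FaceStylizer .task model on a single image file.

    Imports are local so this module still loads where MediaPipe is not
    installed. Returns the stylized image as an array, or None if the
    task produced no output (for example, when no face is found).
    """
    import mediapipe as mp
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    options = vision.FaceStylizerOptions(
        base_options=python.BaseOptions(model_asset_path=model_path)
    )
    with vision.FaceStylizer.create_from_options(options) as stylizer:
        image = mp.Image.create_from_file(image_path)
        stylized = stylizer.stylize(image)
        if stylized is None:
            return None
        # Copy so the result does not depend on the task's lifetime.
        return stylized.numpy_view().copy()
```

Android, iOS, and Web deployments use the corresponding MediaPipe Tasks libraries for those platforms rather than this Python API.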
This model can be used in a notebook. Click Open notebook to use the model in Colab.
This model checkpoint was pre-trained on the Google open source web dataset.
Training style images are resized to the same resolution (256, 256) and normalized using a mean and a standard deviation of 127.5 each. For more details, see the notebook.
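The normalization amounts to mapping 8-bit pixel values from [0, 255] to [-1, 1]. The sketch below shows only that arithmetic in plain Python; it assumes the image has already been resized to (256, 256) by an image library, and the function names are illustrative rather than taken from the notebook.

```python
MEAN = 127.5
STD = 127.5


def normalize_pixel(value):
    """Map one channel value from [0, 255] to [-1, 1]."""
    return (value - MEAN) / STD


def normalize_image(image):
    """Normalize an image given as rows of pixels of channel values."""
    return [[[normalize_pixel(c) for c in pixel] for pixel in row] for row in image]
```

In practice the same operation would usually be applied to a whole array at once, for example with NumPy or TensorFlow.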
Given an image, the model outputs a stylized RGB image at resolution (256, 256) with pixel values scaled to [0, 1].
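To display or save an output in [0, 1], the values typically need to be converted back to 8-bit. The following is a minimal sketch of that conversion; the clipping step and the helper names are assumptions added for safety, not something the model card specifies.

```python
def to_uint8(value):
    """Convert one channel value from [0, 1] to an integer in [0, 255]."""
    clipped = min(max(value, 0.0), 1.0)  # guard against slight overshoot
    return int(round(clipped * 255.0))


def image_to_uint8(image):
    """Convert an image given as rows of pixels of channel values."""
    return [[[to_uint8(c) for c in pixel] for pixel in row] for row in image]
```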
Tune hyperparameters such as swap_layers, learning_rate, and epochs to achieve the best performance. See the Colab for details of the hyperparameter settings.

Resource ID | Release date | Release stage | Description |
---|---|---|---|
mediapipe/face-stylizer | 2024-04-01 | General Availability | Fine tuning and on-device serving |
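The hyperparameters named above (swap_layers, learning_rate, epochs) could be passed to MediaPipe Model Maker roughly as sketched here. The class and argument names follow the `mediapipe_model_maker.face_stylizer` module as commonly documented and should be verified against the notebook; the numeric values are illustrative assumptions, not tuned recommendations, and `fine_tune` is a hypothetical helper name.

```python
# Illustrative values only; tune them for your own style image.
SWAP_LAYERS = [10, 11]
HPARAMS = {"learning_rate": 8e-4, "epochs": 200, "batch_size": 2}


def fine_tune(style_image_path):
    """Fine-tune a face stylizer on a single style image.

    The import is local so this module still loads where Model Maker is
    not installed.
    """
    from mediapipe_model_maker import face_stylizer

    data = face_stylizer.Dataset.from_image(filename=style_image_path)
    options = face_stylizer.FaceStylizerOptions(
        model=face_stylizer.SupportedModels.BLAZE_FACE_STYLIZER_256,
        model_options=face_stylizer.ModelOptions(swap_layers=SWAP_LAYERS),
        hparams=face_stylizer.HyperParameters(**HPARAMS),
    )
    return face_stylizer.FaceStylizer.create(train_data=data, options=options)
```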