Official Implementation of LARGE: Latent-Based Regression through GAN Semantics

Last update: Dec 06, 2022

Overview

LARGE: Latent-Based Regression through GAN Semantics

[Project Website] [Google Colab] [Paper]

Yotam Nitzan^*, Rinon Gal^*, Ofir Brenner, and Daniel Cohen-Or

Abstract: We propose a novel method for solving regression tasks using few-shot or weak supervision. At the core of our method is the fundamental observation that GANs are incredibly successful at encoding semantic information within their latent space, even in a completely unsupervised setting. For modern generative frameworks, this semantic encoding manifests as smooth, linear directions which affect image attributes in a disentangled manner. These directions have been widely used in GAN-based image editing. We show that such directions are not only linear, but that the magnitude of change induced on the respective attribute is approximately linear with respect to the distance traveled along them. By leveraging this observation, our method turns a pre-trained GAN into a regression model, using as few as two labeled samples. This enables solving regression tasks on datasets and attributes which are difficult to produce quality supervision for. Additionally, we show that the same latent-distances can be used to sort collections of images by the strength of given attributes, even in the absence of explicit supervision. Extensive experimental evaluations demonstrate that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results in few-shot and low-supervision settings, even when compared to methods designed to tackle a single task.
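The core idea in the abstract can be illustrated with a minimal sketch. This is not the repository's code: the latent codes, the direction, and the helper names (`latent_distance`, `fit_calibration`, `sort_by_attribute`) are hypothetical, and the data is synthetic. It assumes that a semantic direction is already known and that the attribute value is linear in the distance traveled along it, then calibrates a regressor from two labeled samples and sorts the codes by the same distance.

```python
import numpy as np


def latent_distance(latents, direction):
    """Signed distance of each latent code along the (normalized) direction."""
    direction = np.asarray(direction, dtype=float)
    unit = direction / np.linalg.norm(direction)
    return np.asarray(latents, dtype=float) @ unit


def fit_calibration(distances, labels):
    """Least-squares line mapping latent distance to attribute value."""
    slope, intercept = np.polyfit(distances, labels, deg=1)
    return slope, intercept


def sort_by_attribute(latents, direction):
    """Indices that order the latent codes by increasing latent distance."""
    return np.argsort(latent_distance(latents, direction))


# Synthetic stand-ins for inverted latent codes (assumption: in practice these
# would come from a GAN encoder, and the direction from a discovery method).
rng = np.random.default_rng(0)
direction = rng.normal(size=8)
unit = direction / np.linalg.norm(direction)
base = rng.normal(size=8)
steps = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
latents = base + steps[:, None] * unit

# Assumed ground truth: the attribute is exactly linear in the distance.
distances = latent_distance(latents, direction)
labels = 5.0 * distances + 30.0

# Calibrate with only two labeled samples, then predict for all codes.
slope, intercept = fit_calibration(distances[[0, 4]], labels[[0, 4]])
predicted = slope * distances + intercept

# Sorting needs no labels at all, only the distances.
order = sort_by_attribute(latents, direction)
```

Because the synthetic labels are exactly linear in the distance, two samples are enough to recover the mapping here; on real data the relation is only approximately linear, so more labeled samples would typically reduce the error.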

Sorting Examples

Black to Blond hair

Age

Fur Fluffiness

Sickness

Credits

StyleGAN2 implementation:
https://github.com/rosinality/stylegan2-pytorch
Copyright (c) 2019 Kim Seonghyeon
License (MIT) https://github.com/rosinality/stylegan2-pytorch/blob/master/LICENSE

pSp model and implementation:
https://github.com/eladrich/pixel2style2pixel
Copyright (c) 2020 Elad Richardson, Yuval Alaluf
License (MIT) https://github.com/eladrich/pixel2style2pixel/blob/master/LICENSE

e4e model and implementation:
https://github.com/omertov/encoder4editing
Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE

ReStyle model and implementation:
https://github.com/yuval-alaluf/restyle-encoder/
Copyright (c) 2021 Yuval Alaluf
License (MIT) https://github.com/yuval-alaluf/restyle-encoder/blob/main/LICENSE

Acknowledgement

We would like to thank Raja Gyres, Yangyan Li, Or Patashnik, Yuval Alaluf, Amit Attia, Noga Bar and Zonzge Wu for helpful comments. We additionally thank Zonzge Wu for the trained e4e models for AFHQ cats and dogs.

Citation

If you use this code for your research, please cite our paper.

@misc{nitzan2021large,
      title={LARGE: Latent-Based Regression through GAN Semantics}, 
      author={Yotam Nitzan and Rinon Gal and Ofir Brenner and Daniel Cohen-Or},
      year={2021},
      eprint={2107.11186},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
