Augmented CLIP - Training simple models to predict CLIP image embeddings from text embeddings, and vice versa.

Last update: Sep 13, 2022

Overview

Train aug_clip against the laion400m-embeddings found here: https://laion.ai/laion-400-open-dataset/. Note that these embeddings were computed with the base ViT-B/32 CLIP model.
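To make the idea concrete, here is a minimal sketch of fitting a "simple model" that maps text embeddings to image embeddings. It is not the repository's actual model or training code: it assumes a plain linear map fitted by least squares, and it uses synthetic 512-dimensional vectors (the ViT-B/32 embedding width) in place of the real laion400m-embeddings. All names (`fit_linear_map`, `mean_cosine`, `true_map`) are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length, as is conventional for CLIP embeddings."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def fit_linear_map(src_emb, dst_emb):
    """Least-squares fit of W so that src_emb @ W approximates dst_emb."""
    W, *_ = np.linalg.lstsq(src_emb, dst_emb, rcond=None)
    return W


def mean_cosine(a, b):
    """Average cosine similarity between matching rows of a and b."""
    return float(np.mean(np.sum(l2_normalize(a) * l2_normalize(b), axis=1)))


rng = np.random.default_rng(0)
dim = 512                      # embedding width of ViT-B/32
n_train, n_test = 2048, 256

# Synthetic stand-ins (assumption): real runs would load paired text and
# image embeddings from the laion400m-embeddings files instead.
true_map = rng.normal(size=(dim, dim)) / np.sqrt(dim)
text = l2_normalize(rng.normal(size=(n_train + n_test, dim)))
image = text @ true_map + 0.005 * rng.normal(size=(n_train + n_test, dim))

# Fit on the training split, then score on held-out pairs.
W = fit_linear_map(text[:n_train], image[:n_train])
score = mean_cosine(text[n_train:] @ W, image[n_train:])
print(f"held-out mean cosine similarity: {score:.3f}")
```

For the "vice versa" direction, the same helper can be called with the arguments swapped, i.e. `fit_linear_map(image[:n_train], text[:n_train])`. On real CLIP embeddings the relationship is not exactly linear, so a held-out score would be expected to be lower than on this synthetic data.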

Sample notebook adapted from Sadnow's 360Diffusion repo, thanks to all involved!

Latest revision: Beta 1.52 (10/11/21): https://colab.research.google.com/github/sadnow/360Diffusion/blob/main/360Diffusion_Public.ipynb

Latest highlights: Full compatibility with both the 256 and 512 diffusion models, with upscaling to 256, 512, 1024, 2048, and 4096 px.

Note that 4096 files aren’t quite as pretty as 2048, and they’re massive in file size. 2048 is appealing in most cases. If you intend to upscale to anything higher than 1024, I recommend using the 512 diffusion model found in the settings.

Credits & Acknowledgements

Katherine Crowson (https://github.com/crowsonkb, https://twitter.com/RiversHaveWings): Founder of OG Diffusion Notebook. Original notebook founder; [I think] has a large involvement in both VQGAN and Diffusion!
Daniel Russell (https://github.com/russelldc, https://twitter.com/danielrussruss): Fast Diffusion Fork Founder. Made the OG Fast Diffusion notebook.
Dango233 and nsheppard: Contributed to Daniel’s Fast Diffusion Notebook.
Sadnow (twitter.com/sadly_existent): 360Diffusion Fork Founder. Forked Daniel Russell’s Fast Diffusion Notebook to include Real-ESRGAN integration.
airguitararchon (steven): Init Research.
Everyone else on the VQLIPSE Discord (https://www.patreon.com/sportsracer48): Support & Research.

Prior release(s): Implemented Daniel Russ’s Perlin revisions, fixed init_bug, 4096 double-pass, VRAM fixes, practical debug_mode (set to higher skip_timestep)

All edits & additions are welcome and appreciated~

Owner

Peter Baylies
