glide-finetune

Overview

Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset.

Installation

git clone https://github.com/afiaka87/glide-finetune.git
cd glide-finetune/
python3 -m venv .venv  # create a virtual environment to keep the global install clean
source .venv/bin/activate
(.venv) # optionally install PyTorch manually for your specific environment first...
(.venv) python -m pip install -r requirements.txt

Usage

(.venv) python glide-finetune.py \
    --data_dir=./data \
    --batch_size=1 \
    --grad_acc=1 \
    --guidance_scale=4.0 \
    --learning_rate=2e-5 \
    --dropout=0.1 \
    --timestep_respacing=1000 \
    --side_x=64 \
    --side_y=64 \
    --resume_ckpt='' \
    --checkpoints_dir='./glide_checkpoints/' \
    --use_fp16 \
    --device='' \
    --freeze_transformer \
    --freeze_diffusion \
    --weight_decay=0.0 \
    --project_name='glide-finetune'

Known issues:

  • Batching isn't handled in the dataloader.
  • NaN/Inf errors during training.
  • Resizing doesn't handle non-square aspect ratios properly (see the sketch below).
  • Some of the code is messy and needs refactoring.
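
One possible fix for the aspect-ratio issue is to center-crop to a square before resizing. A minimal sketch, assuming PIL images (the resize_square helper below is illustrative and not part of this repo):

    from PIL import Image

    def resize_square(image: Image.Image, side: int = 64) -> Image.Image:
        """Center-crop to a square, then resize, so non-square images
        are not distorted by a naive resize."""
        width, height = image.size
        crop = min(width, height)
        left = (width - crop) // 2
        top = (height - crop) // 2
        image = image.crop((left, top, left + crop, top + crop))
        return image.resize((side, side), Image.BICUBIC)
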
Comments

  • Fixed a couple of minor issues

    • Pinned the webdataset version to work with Python 3.7, which is the version used in Colab and Kaggle. A new version of this module was released a few days ago that only works with 3.8/3.9.
    • Fixed an issue with the data_dir arg not getting picked up.
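
    The pin would look something like this in requirements.txt (the exact version bound here is an assumption; check the PR diff for the version actually pinned):

      # keep a webdataset release that still supports Python 3.7
      webdataset<0.2
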
    opened by vanga 1
  • Fix NameError when using --data_dir

    Hello and thank you for your great work.

    Right now using a local data folder with --data_dir results in

    Traceback (most recent call last):
      File "/content/glide-finetune/train_glide.py", line 292, in <module>
        data_dir=data_dir,
    NameError: name 'data_dir' is not defined
    

    This PR fixes that.
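
    The fix is presumably along these lines (the parser and argument names are assumptions inferred from the traceback, not the actual diff):

      import argparse

      parser = argparse.ArgumentParser()
      parser.add_argument("--data_dir", type=str, default="./data")
      args = parser.parse_args()

      # Bind the parsed value to a local name before it is used,
      # so `data_dir` is defined at the call site.
      data_dir = args.data_dir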

    opened by tillfalko 0
  • mention mpi4py dependency

    Installing the requirements will fail unless the user already has an MPI implementation installed, because of the mpi4py dependency. Since MPI is not a ubiquitous dependency, it should probably be mentioned. Edit: since torch==1.10.1 is a requirement, and torch versions come with their own CUDA versions (torch 1.10.1 uses CUDA 10.2), I don't see a reason not to just include bitsandbytes-cuda102 in requirements.txt.

    $ py -m venv .venv
    $ source .venv/bin/activate
    $ pip install torch==1.10.1
    Collecting torch==1.10.1
      Downloading torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl (881.9 MB)
         |████████████████████████████████| 881.9 MB 15 kB/s
    Collecting typing-extensions
      Downloading typing_extensions-4.0.1-py3-none-any.whl (22 kB)
    Installing collected packages: typing-extensions, torch
    Successfully installed torch-1.10.1 typing-extensions-4.0.1
    $ py -c "import torch; print(torch.__version__)"
    1.10.1+cu102
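
    For reference, on Debian/Ubuntu the mpi4py build failure can usually be avoided by installing an MPI implementation first (the package name below is an assumption; adjust for your distribution):

      $ sudo apt-get install libopenmpi-dev  # provides the MPI headers mpi4py builds against
      $ pip install mpi4py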
    
    opened by tillfalko 0
  • Fixed half precision optimizer bug

    Problem

    In half precision, NaN values start appearing after the first iteration, regardless of input data or gradients, because the Adam optimizer breaks in float16. The discussion of that can be viewed here.

    Solution

    This can be fixed by setting the eps parameter to 1e-4 instead of the default 1e-8. That is the only thing this PR does.
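
    A minimal sketch of the kind of change involved (whether the repo uses Adam or AdamW, and the model and learning rate below, are assumptions):

      import torch

      model = torch.nn.Linear(8, 8).half()  # stand-in for the GLIDE model

      # The default eps=1e-8 underflows to zero in float16, which can zero
      # out Adam's denominator and yield NaNs; 1e-4 is still representable.
      optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, eps=1e-4)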

    opened by isamu-isozaki 0
  • Training on half precision leads to nan values

    I was training my model and noticed that after just the first iteration I was running into NaN values. As it turns out, my gradients and input values/images were all normal, but PyTorch's Adam optimizer has some weird behavior at float16 precision where it produces NaNs, probably because of a divide-by-zero error. A discussion can be found below:

    https://discuss.pytorch.org/t/adam-half-precision-nans/1765/4

    I hear changing the epsilon parameter of the Adam optimizer works at half precision, but I haven't tested it yet. I will make a PR once I have tested it.

    And also let me say thanks for this repo. I wanted to fine tune the glide model and this made it so much easier.

    opened by isamu-isozaki 1
  • Where is the resume_ckpt

    Hi, thanks for your work.

    I noticed that to finetune GLIDE we should have a base model, namely the "resume_ckpt": --resume_ckpt 'ckpt_to_resume_from.pt'.
    Where can we get this model? I find the GLIDE repo also doesn't provide any checkpoint. Thanks for your help.
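
    For what it's worth, OpenAI's glide-text2im package can download the released base weights, which could then be saved and passed as --resume_ckpt (assuming the training script accepts a plain state-dict checkpoint):

      import torch
      from glide_text2im.download import load_checkpoint

      # Downloads and caches OpenAI's released base GLIDE weights.
      state_dict = load_checkpoint("base", device=torch.device("cpu"))
      torch.save(state_dict, "glide-base.pt")  # then: --resume_ckpt 'glide-base.pt'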

    opened by zhaobingbingbing 0
Releases (v0.0.1)
  • v0.0.1 (Feb 20, 2022)

    Having some experience with finetuning GLIDE on LAION, Alamy, etc., I think this code works great now and hope as many people as possible can use it. Please file bugs - I know there may be a few.

    New additions:

    • dataloader for LAION400M
    • dataloader for Alamy
    • train the upsample model instead of just the base model
    • (early) code for training the released noisy CLIP; still a WIP
Owner
Clay Mullis
Software engineer working with multi-modal deep learning.