pix2pix in tensorflow.js

Overview

pix2pix in tensorflow.js

This repo is moved to https://github.com/yining1023/pix2pix_tensorflowjs_lite

See a live demo here: https://yining1023.github.io/pix2pix_tensorflowjs/

Screen_Shot_2018_06_17_at_11_06_09_PM

Try it yourself: Download/clone the repository and run it locally:

git clone https://github.com/yining1023/pix2pix_tensorflowjs.git
cd pix2pix_tensorflowjs
python3 -m http.server

Credits: This project is based on affinelayer's pix2pix-tensorflow. I want to thank christopherhesse, nsthorat, and dsmilkov for their help and suggestions from this Github issue.

How to train a pix2pix(edges2xxx) model from scratch

    1. Prepare the data
    1. Train the model
    1. Test the model
    1. Export the model
    1. Port the model to tensorflow.js
    1. Create an interactive interface in the browser

1. Prepare the data

  • 1.1 Scrape images from google search
  • 1.2 Remove the background of the images
  • 1.3 Resize all images into 256x256 px
  • 1.4 Detect edges of all images
  • 1.5 Combine input images and target images
  • 1.6 Split all combined images into two folders: train and val

Before we start, check out affinelayer's Create your own dataset. I followed his instrustion for steps 1.3, 1.5 and 1.6.

1.1 Scrape images from google search

We can create our own target images. But for this edge2pikachu project, I downloaded a lot of images from google. I'm using this google_image_downloader to download images from google. After downloading the repo above, run -

$ python image_download.py <query> <number of images>

It will download images and save it to the current directory.

1.2 Remove the background of the images

Some images have some background. I'm using grabcut with OpenCV to remove background Check out the script here: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/grabcut.py To run the script-

$ python grabcut.py <filename>

It will open an interactive interface, here are some instructions: https://github.com/symao/InteractiveImageSegmentation Here's an example of removing background using grabcut:

Screen Shot 2018 03 13 at 7 03 28 AM

1.3 Resize all images into 256x256 px

Download pix2pix-tensorflow repo. Put all images we got into photos/original folder Run -

$ python tools/process.py --input_dir photos/original --operation resize --output_dir photos/resized

We should be able to see a new folder called resized with all resized images in it.

1.4 Detect edges of all images

The script that I use to detect edges of images from one folder at once is here: https://github.com/yining1023/pix2pix-tensorflow/blob/master/tools/edge-detection.py, we need to change the path of the input images directory on line 31, and create a new empty folder called edges in the same directory. Run -

$ python edge-detection.py

We should be able to see edged-detected images in the edges folder. Here's an example of edge detection: left(original) right(edge detected)

0_batch2 0_batch2_2

1.5 Combine input images and target images

python tools/process.py --input_dir photos/resized --b_dir photos/blank --operation combine --output_dir photos/combined

Here is an example of the combined image: Notice that the size of the combined image is 512x256px. The size is important for training the model successfully.

0_batch2

Read more here: affinelayer's Create your own dataset

1.6 Split all combined images into two folders: train and val

python tools/split.py --dir photos/combined

Read more here: affinelayer's Create your own dataset

I collected 305 images for training and 78 images for testing.

2. Train the model

# train the model
python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA

Read more here: https://github.com/affinelayer/pix2pix-tensorflow#getting-started

I used the High Power Computer(HPC) at NYU to train the model. You can see more instruction here: https://github.com/cvalenzuela/hpc. You can request GPU and submit a job to HPC, and use tunnels to tranfer large files between the HPC and your computer.

The training takes me 4 hours and 16 mins. After train, there should be a pikachu_train folder with checkpoint in it. If you add --ngf 32 --ndf 32 when training the model: python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA --ngf 32 --ndf 32, the model will be smaller 13.6 MB, and it will take less time to train.

3. Test the model

# test the model
python pix2pix.py --mode test --output_dir pikachu_test --input_dir pikachu/val --checkpoint pikachu_train

After testing, there should be a new folder called pikachu_test. In the folder, if you open the index.html, you should be able to see something like this in your browser:

Screen_Shot_2018_03_15_at_8_42_48_AM

Read more here: https://github.com/affinelayer/pix2pix-tensorflow#getting-started

4. Export the model

python pix2pix.py --mode export --output_dir /export/ --checkpoint /pikachu_train/ --which_direction BtoA

It will create a new export folder

5. Port the model to tensorflow.js

I followed affinelayer's instruction here: https://github.com/affinelayer/pix2pix-tensorflow/tree/master/server#exporting

cd server
python tools/export-checkpoint.py --checkpoint ../export --output_file static/models/pikachu_BtoA.pict

We should be able to get a file named pikachu_BtoA.pict, which is 54.4 MB. If you add --ngf 32 --ndf 32 when training the model: python pix2pix.py --mode train --output_dir pikachu_train --max_epochs 200 --input_dir pikachu/train --which_direction BtoA --ngf 32 --ndf 32, the model will be smaller 13.6 MB, and it will take less time to train.

6. Create an interactive interface in the browser

Copy the model we get from step 5 to the models folder.

Owner
Yining Shi
Creative Coding 👩‍💻+ Machine Learning 🤖
Yining Shi
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

Princeton Natural Language Processing 119 Sep 28, 2022
[NeurIPS '21] Adversarial Attacks on Graph Classification via Bayesian Optimisation (GRABNEL)

Adversarial Attacks on Graph Classification via Bayesian Optimisation @ NeurIPS 2021 This repository contains the official implementation of GRABNEL,

Xingchen Wan 12 Dec 23, 2022
Using Clinical Drug Representations for Improving Mortality and Length of Stay Predictions

Using Clinical Drug Representations for Improving Mortality and Length of Stay Predictions Usage Clone the code to local. https://github.com/tanlab/MI

Computational Biology and Machine Learning lab @ TOBB ETU 3 Oct 18, 2022
Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations"

FAME: Feature-based Adversarial Meta-Embeddings This is the companion code for the experiments reported in the paper "FAME: Feature-Based Adversarial

Bosch Research 11 Nov 27, 2022
DeepLabv3+:Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

DeepLabv3+:Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download

Bubbliiiing 31 Nov 25, 2022
RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

RoBERTa base model for Marathi Language (मराठी भाषा) Pretrained model on Marathi language using a masked language modeling (MLM) objective. RoBERTa wa

Nipun Sadvilkar 23 Oct 19, 2022
Code for Domain Adaptive Video Segmentation via Temporal Consistency Regularization in ICCV 2021

Domain Adaptive Video Segmentation via Temporal Consistency Regularization Updates 08/2021: check out our domain adaptation for sematic segmentation p

36 Dec 12, 2022
Vehicle Detection Using Deep Learning and YOLO Algorithm

VehicleDetection Vehicle Detection Using Deep Learning and YOLO Algorithm Dataset take or find vehicle images for create a special dataset for fine-tu

Maryam Boneh 96 Jan 05, 2023
A fast Evolution Strategy implementation in Python

Evostra: Evolution Strategy for Python Evolution Strategy (ES) is an optimization technique based on ideas of adaptation and evolution. You can learn

Mika 251 Dec 08, 2022
This repository provides an efficient PyTorch-based library for training deep models.

s3sec Test AWS S3 buckets for read/write/delete access This tool was developed to quickly test a list of s3 buckets for public read, write and delete

Bytedance Inc. 123 Jan 05, 2023
A Python library for common tasks on 3D point clouds

Point Cloud Utils (pcu) - A Python library for common tasks on 3D point clouds Point Cloud Utils (pcu) is a utility library providing the following fu

Francis Williams 622 Dec 27, 2022
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification Created by Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Ch

Yongming Rao 414 Jan 01, 2023
StyleGAN2-ada for practice

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. Tested on Python 3.7 + Py

vadim epstein 170 Nov 16, 2022
Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin 15 Nov 03, 2022
TiP-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling

TiP-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling This is the official code release for the paper 'TiP-Adapter: Training-fre

peng gao 189 Jan 04, 2023
HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval

HCQ: Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval [toc] 1. Introduction This repository provides the code for our paper at

13 Dec 08, 2022
SMIS - Semantically Multi-modal Image Synthesis(CVPR 2020)

Semantically Multi-modal Image Synthesis Project page / Paper / Demo Semantically Multi-modal Image Synthesis(CVPR2020). Zhen Zhu, Zhiliang Xu, Anshen

316 Dec 01, 2022
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

This repository contains code for the following two papers: VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv) with a short

Natural Language Processing @UCLA 463 Dec 09, 2022
Code base of object detection

rmdet code base of object detection. 环境安装: 1. 安装conda python环境 - `conda create -n xxx python=3.7/3.8` - `conda activate xxx` 2. 运行脚本,自动安装pytorch1

3 Mar 08, 2022
[ACM MM 2021] Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Diverse Image Inpainting with Bidirectional and Autoregressive Transformers Installation pip install -r requirements.txt Dataset Preparation Given the

Yingchen Yu 25 Nov 09, 2022