The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

Overview

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral)

MM | ArXiv

This repository implements the paper "Text-Guided Neural Image Inpainting" by Lisai Zhang, Qingcai Chen, Baotian Hu and Shuoran Jiang. Given one masked image, the proposed TDANet generates diverse plausible results according to guidance text.

Inpainting example

Manipulation Extension example

Getting started

Installation

This code was tested with Pytoch 1.2.0, CUDA 10.1, Python 3.6 and Ubuntu 16.04 with a 2080Ti GPU

pip install visdom dominate
  • Clone this repo (we suggest to only clone the depth 1 version):
git clone https://github.com/idealwhite/tdanet --depth 1
cd tdanet
  • Download the dataset and pre-processed files as in following steps.

Datasets

  • CUB_200: dataset from Caltech-UCSD Birds 200.
  • COCO: object detection 2014 datset from MS COCO.
  • pre-processed datafiles: train/test split, caption-image mapping, image sampling and pre-trained DAMSM from GoogleDrive and extarct them to dataset/ directory as specified in config.bird.yml/config.coco.yml.

Training Demo

python train.py --name tda_bird  --gpu_ids 0 --model tdanet --mask_type 0 1 2 3 --img_file ./datasets/CUB_200_2011/train.flist --mask_file ./datasets/CUB_200_2011/train_mask.flist --text_config config.bird.yml
  • Important: Add --mask_type in options/base_options.py for different training masks. --mask_file path is needed for object mask, use train_mask.flist for CUB and image_mask_coco_all.json for COCO. --text_config refer to the yml configuration file for text setup, --img_file is the image file dir or file list.
  • To view training results and loss plots, run python -m visdom.server and copy the URL http://localhost:8097.
  • Training models will be saved under the ./checkpoints folder.
  • More training options can be found in ./options folder.
  • Suggestion: use mask type 0 1 2 3 for CUB dataset and 0 1 2 4 for COCO dataset. Train more than 2000 epochs for CUB and 200 epochs for COCO.

Evaluation Demo

Test

python test.py --name tda_bird  --img_file datasets/CUB_200_2011/test.flist --results_dir results/tda_bird  --mask_file datasets/CUB_200_2011/test_mask.flist --mask_type 3 --no_shuffle --gpu_ids 0 --nsampling 1 --no_variance

Note:

  • Remember to add the --no_variance option to get better performance.
  • For COCO object mask, use image_mask_coco_all.json as the mask file..

A eval_tda_bird.flist will be generated after the test. Then in the evaluation, this file is used as the ground truth file list:

python evaluation.py --batch_test 60 --ground_truth_path eval_tda_bird.flist --save_path results/tda_bird
  • Add --ground_truth_path to the dir of ground truth image path or list. --save_path as the result dir.

Pretrained Models

Download the pre-trained models bird inpainting or coco inpainting and put them undercheckpoints/ directory.

GUI

  • Install the PyQt5 for GUI operation
pip install PyQt5

The GUI could now only avaliable in debug mode, please refer to this issues for detailed instructions. The author is not good at solving PyQt5 problems, wellcome contrbutions.

TODO

  • Debug the GUI application
  • Further improvement on COCO quality.

License

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Acknowledge

We would like to thanks Zheng et al. for providing their source code. This project is fit from their greate Pluralistic Image Completion Project.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{10.1145/3394171.3414017,
author = {Zhang, Lisai and Chen, Qingcai and Hu, Baotian and Jiang, Shuoran},
title = {Text-Guided Neural Image Inpainting},
year = {2020},
booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
pages = {1302–1310},
location = {Seattle, WA, USA},
}
Owner
LisaiZhang
Enjoy thinking about everything.
LisaiZhang
🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

🐤 Nix-TTS An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji

Rendi Chevi 156 Jan 09, 2023
Official implementation of the NeurIPS'21 paper 'Conditional Generation Using Polynomial Expansions'.

Conditional Generation Using Polynomial Expansions Official implementation of the conditional image generation experiments as described on the NeurIPS

Grigoris 4 Aug 07, 2022
A Pytorch Implementation of Domain adaptation of object detector using scissor-like networks

A Pytorch Implementation of Domain adaptation of object detector using scissor-like networks Please follow Faster R-CNN and DAF to complete the enviro

2 Oct 07, 2022
Background-Click Supervision for Temporal Action Localization

Background-Click Supervision for Temporal Action Localization This repository is the official implementation of BackTAL. In this work, we study the te

LeYang 221 Oct 09, 2022
Official repository for the paper F, B, Alpha Matting

FBA Matting Official repository for the paper F, B, Alpha Matting. This paper and project is under heavy revision for peer reviewed publication, and s

Marco Forte 404 Jan 05, 2023
Automates Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning :rocket:

MLJAR Automated Machine Learning Documentation: https://supervised.mljar.com/ Source Code: https://github.com/mljar/mljar-supervised Table of Contents

MLJAR 2.4k Dec 31, 2022
PyTorch implementation for ComboGAN

ComboGAN This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN) [ComboGAN Paper] If you use

Asha Anoosheh 139 Dec 20, 2022
The software associated with a paper accepted at EMNLP 2021 titled "Open Knowledge Graphs Canonicalization using Variational Autoencoders".

Open-KG-canonicalization The software associated with a paper accepted at EMNLP 2021 titled "Open Knowledge Graphs Canonicalization using Variational

International Business Machines 13 Nov 11, 2022
Sionna: An Open-Source Library for Next-Generation Physical Layer Research

Sionna: An Open-Source Library for Next-Generation Physical Layer Research Sionna™ is an open-source Python library for link-level simulations of digi

NVIDIA Research Projects 313 Dec 22, 2022
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

Phil Wang 128 Dec 24, 2022
You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling Transformer-based models are widely used in natural language processi

Zhanpeng Zeng 12 Jan 01, 2023
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
The Face Mask recognition system uses AI technology to detect the person with or without a mask.

Face Mask Detection Face Mask Detection system built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect

Rohan Kasabe 4 Apr 05, 2022
AVD Quickstart Containerlab

AVD Quickstart Containerlab WARNING This repository is still under construction. It's fully functional, but has number of limitations. For example: RE

Carl Buchmann 3 Apr 10, 2022
PyTorch Implementation of Realtime Multi-Person Pose Estimation project.

PyTorch Realtime Multi-Person Pose Estimation This is a pytorch version of Realtime_Multi-Person_Pose_Estimation, origin code is here Realtime_Multi-P

Dave Fang 157 Nov 12, 2022
Awesome Long-Tailed Learning

Awesome Long-Tailed Learning This repo pays specially attention to the long-tailed distribution, where labels follow a long-tailed or power-law distri

Stomach_ache 284 Jan 06, 2023
Various operations like path tracking, counting, etc by using yolov5

Object-tracing-with-YOLOv5 Various operations like path tracking, counting, etc by using yolov5

Pawan Valluri 5 Nov 28, 2022
I will implement Fastai in each projects present in this repository.

DEEP LEARNING FOR CODERS WITH FASTAI AND PYTORCH The repository contains a list of the projects which I have worked on while reading the book Deep Lea

Thinam Tamang 43 Dec 20, 2022
A standard framework for modelling Deep Learning Models for tabular data

PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike.

801 Jan 08, 2023
FFTNet vocoder implementation

Unofficial Implementation of FFTNet vocode paper. implement the model. implement tests. overfit on a single batch (sanity check). linearize weights fo

Eren Gölge 81 Dec 08, 2022