Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization

Official PyTorch implementation for our URST (Ultra-Resolution Style Transfer) framework.

URST is a versatile framework for ultra-high resolution style transfer under limited memory resources, and it can be easily plugged into most existing neural style transfer methods.

As the input resolution grows, the memory cost of URST hardly increases. In theory, it supports style transfer for images of arbitrarily high resolution.
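
The idea behind thumbnail instance normalization can be illustrated with a toy example: normalization statistics are computed once from a small thumbnail of the input and reused for every patch, so that patches stylized independently stay consistent with one another. The following is a minimal pure-Python sketch on single-channel values, assuming plain mean/standard-deviation normalization; it is not the code in this repository, and `thumbnail_stats` and `normalize_patch` are hypothetical names.

```python
from statistics import mean, pstdev

def thumbnail_stats(values):
    """Mean and (population) standard deviation of a flat list of values."""
    return mean(values), pstdev(values)

def normalize_patch(patch, mu, sigma, eps=1e-5):
    """Normalize a patch with externally supplied statistics (not its own)."""
    return [(v - mu) / (sigma + eps) for v in patch]

# Toy "image": the first half is dark, the second half is bright.
image = [0.1, 0.2, 0.1, 0.2, 0.8, 0.9, 0.8, 0.9]
thumbnail = image[::2]  # crude subsampling stands in for real resizing
t_mu, t_sigma = thumbnail_stats(thumbnail)

patches = [image[:4], image[4:]]

# Shared thumbnail statistics: the dark/bright difference between patches survives.
shared = [normalize_patch(p, t_mu, t_sigma) for p in patches]

# Per-patch statistics, for comparison: both patches collapse to the same values.
own = [normalize_patch(p, *thumbnail_stats(p)) for p in patches]
```

With per-patch statistics the two patches become indistinguishable after normalization, which is the kind of inconsistency that sharing the thumbnail's statistics is meant to avoid.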

One ultra-high resolution stylized result of 12000 x 8000 pixels (i.e., 96 megapixels).
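
To put this resolution in perspective, the back-of-the-envelope calculation below compares the size of the full input with a single 1000 x 1000 patch. It assumes a 3-channel float32 tensor, which is an assumption about the data layout rather than something measured in this repository, and it ignores intermediate feature maps, which usually dominate memory use.

```python
width, height = 12000, 8000
channels, bytes_per_value = 3, 4  # assumption: RGB stored as float32

pixels = width * height
full_bytes = pixels * channels * bytes_per_value
patch_bytes = 1000 * 1000 * channels * bytes_per_value

print(pixels / 1e6)               # megapixels in the full image
print(full_bytes / 2**30)         # GiB for the full input tensor
print(patch_bytes / 2**20)        # MiB for one patch tensor
print(full_bytes // patch_bytes)  # how many patch-sized tensors fit in the input
```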

This repository builds on six representative style transfer methods: Johnson et al., MSG-Net, AdaIN, WCT, LinearWCT, and Wang et al. (Collaborative Distillation).

For details, see the paper Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization (arXiv:2103.11784).

If you use this code for a paper, please cite:

@misc{chen2021towards,
      title={Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization}, 
      author={Zhe Chen and Wenhai Wang and Enze Xie and Tong Lu and Ping Luo},
      year={2021},
      eprint={2103.11784},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Environment

  • Python 3.6, pillow, tqdm, torchfile, PyTorch 1.1+ (for inference)

    pip install pillow
    pip install tqdm
    pip install torchfile
    conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch
  • tensorboardX (for training)

    pip install tensorboardX

Then, clone the repository locally:

git clone https://github.com/czczup/URST.git
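
Before running anything, it can help to confirm that the packages listed above are importable. The snippet below is a small sketch using only the standard library; `missing_modules` is a hypothetical helper, and the import names (`PIL` for pillow, `torch` for PyTorch) are the usual ones for these packages.

```python
import importlib.util

# Import names for the packages listed above; pillow is imported as PIL
# and PyTorch as torch.
INFERENCE = ["PIL", "tqdm", "torchfile", "torch", "torchvision"]
TRAINING = ["tensorboardX"]

def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    print("missing for inference:", missing_modules(INFERENCE))
    print("missing for training:", missing_modules(TRAINING))
```

An empty list means every package in that group was found; this checks only that a module is present, not its version.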

Test (Ultra-high Resolution Style Transfer)

Step 1: Prepare images

  • Content images and style images are placed in examples/.
  • Since the ultra-high resolution images are quite large, we do not place them in this repository. Please download them from this Google Drive.
  • All content images used in this repository are collected from pexels.com.

Step 2: Prepare models

  • Download models from this Google Drive. Unzip and merge them into this repository.

Step 3: Stylization

First, choose a specific style transfer method and enter the directory.

Then, please run the corresponding script. The stylized results will be saved in output/.

  • For Johnson et al., we use the PyTorch implementation Fast-Neural-Style-Transfer.

    cd Johnson2016Perceptual/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --model <model_path> --URST
  • For MSG-Net, we use the official PyTorch implementation PyTorch-Multi-Style-Transfer.

    cd Zhang2017MultiStyle/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For AdaIN, we use the PyTorch implementation pytorch-AdaIN.

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For WCT, we use the PyTorch implementation PytorchWCT.

    cd Li2017Universal/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For LinearWCT, we use the official PyTorch implementation LinearStyleTransfer.

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For Wang et al. (Collaborative Distillation), we use the official PyTorch implementation Collaborative-Distillation.

    cd Wang2020Collaborative/PytorchWCT/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
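
The six invocations above differ only in the directory and in whether a model or a style image is passed. As a convenience, the sketch below assembles the directory and command string for a given method; `METHOD_DIRS` and `build_test_command` are hypothetical helpers that are not part of this repository, and only the flags shown above are used.

```python
import shlex

# Directory of each method, as listed above.
METHOD_DIRS = {
    "Johnson": "Johnson2016Perceptual",
    "MSG-Net": "Zhang2017MultiStyle",
    "AdaIN": "Huang2017AdaIN",
    "WCT": "Li2017Universal",
    "LinearWCT": "Li2018Learning",
    "Wang": "Wang2020Collaborative/PytorchWCT",
}

def build_test_command(method, content, style_or_model, gpu_id=0):
    """Return (working_directory, shell command) for one stylization run."""
    # Johnson et al. takes a trained model via --model; the others take --style.
    flag = "--model" if method == "Johnson" else "--style"
    args = ["python", "test.py", "--content", content, flag, style_or_model, "--URST"]
    quoted = " ".join(shlex.quote(a) for a in args)
    return METHOD_DIRS[method], "CUDA_VISIBLE_DEVICES={} {}".format(gpu_id, quoted)

if __name__ == "__main__":
    # Placeholder file names; substitute your own paths.
    directory, command = build_test_command("AdaIN", "content.jpg", "style.jpg")
    print("cd", directory)
    print(command)
```

The helper only builds strings; run the printed command from inside the printed directory.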

Options:

  • --patch_size: The maximum size of each patch. The default setting is 1000.
  • --style_size: The size of the style image. The default setting is 1024.
  • --thumb_size: The size of the thumbnail image. The default setting is 1024.
  • --URST: Use our URST framework to process ultra-high resolution images.
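
To give a feel for how `--patch_size` and `--thumb_size` relate to the input, the sketch below counts the patches needed for a given image and computes a thumbnail size. It assumes a plain non-overlapping grid and a thumbnail whose longer side equals `--thumb_size`; the repository's actual tiling (padding, overlap, rounding) may differ, and `patch_grid` and `thumbnail_size` are hypothetical helpers.

```python
import math

def patch_grid(width, height, patch_size=1000):
    """Yield (left, top, right, bottom) boxes covering the image without overlap."""
    cols = math.ceil(width / patch_size)
    rows = math.ceil(height / patch_size)
    for r in range(rows):
        for c in range(cols):
            left, top = c * patch_size, r * patch_size
            # Boxes on the right and bottom edges are clipped to the image.
            yield (left, top,
                   min(left + patch_size, width),
                   min(top + patch_size, height))

def thumbnail_size(width, height, thumb_size=1024):
    """Scale so the longer side equals thumb_size, keeping the aspect ratio."""
    scale = thumb_size / max(width, height)
    return round(width * scale), round(height * scale)

boxes = list(patch_grid(12000, 8000))
print(len(boxes))  # 12 columns x 8 rows of patches
print(thumbnail_size(12000, 8000))
```

Under these assumptions, a smaller `--patch_size` lowers the memory needed per patch at the cost of more patches to process.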

Train (Enlarge the Stroke Size)

Step 1: Prepare datasets

Download the MS-COCO 2014 dataset and WikiArt dataset.

  • MS-COCO

    wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip
  • WikiArt

    • Either manually download from Kaggle.
    • Or install kaggle-cli and download by running:
    kg download -u <username> -p <password> -c painter-by-numbers -f train.zip

Step 2: Prepare models

Same as Step 2 in the test phase.

Step 3: Train the decoder with our stroke perceptual loss

  • For AdaIN:

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --content_dir <coco_path> --style_dir <wikiart_path>
  • For LinearWCT:

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --contentPath <coco_path> --stylePath <wikiart_path>

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
