traiNNer is an open source image and video restoration (super-resolution, denoising, deblurring and others) and image to image translation toolbox based on PyTorch.

Overview

traiNNer

Python Version License DeepSource Issues PR's Accepted

traiNNer is an open source image and video restoration (super-resolution, denoising, deblurring and others) and image to image translation toolbox based on PyTorch.

Here you will find: boilerplate code for training and testing computer vision (CV) models, different methods and strategies integrated in a single pipeline and modularity to add and remove components as needed, including new network architectures and templates for different training strategies. The code is under a constant state of change, so if you find an issue or bug please open a issue, a discussion or write in one of the Discord channels for help.

Different from other repositories, here the focus is not only on repeating previous papers' results, but to enable more people to train their own models more easily, using their own custom datasets, as well as integrating new ideas to increase the performance of the models. For these reasons, a lot of the code is made in order to automatically take care of fixing potential issues, whenever possible.

Details of the currently supported architectures can be found here.

For a changelog and general list of features of this repository, check here.

Table of Contents

  1. Dependencies
  2. Codes
  3. Usage
  4. Pretrained models
  5. Datasets
  6. How to help

Dependencies

  • Python 3 (Recommend to use Anaconda)
  • PyTorch >= 0.4.0. PyTorch >= 1.7.0 required to enable certain features (SWA, AMP, others), as well as torchvision.
  • NVIDIA GPU + CUDA
  • Python packages: pip install numpy opencv-python
  • JSON files can be used for the configuration option files, but in order to use YAML, the PyYAML python package is also a dependency: pip install PyYAML

Optional Dependencies

Codes

This repository is a full framework for training different kinds of networks, with multiple enhancements and options. In ./codes you will find a more detailed explaination of the code framework ).

You will also find:

  1. Some useful scripts. More details in ./codes/scripts.
  2. Evaluation codes, e.g., PSNR/SSIM metric.

Additionally, it is complemented by other repositories like DLIP, that can be used in order to extract estimated kernels and noise patches from real images, using a modified KernelGAN and patches extraction code. Detailed instructions about how to use the estimated kernels are available here

Usage

Training

Data and model preparation

In order to train your own models, you will need to create a dataset consisting of images, and prepare these images, both considering IO constrains, as well as the task the model should target. Detailed data preparation can be seen in codes/data.

Pretrained models that can be used for fine-tuning are available.

Detailed instructions on how to train are also available.

Augmentations strategies for training real-world models (blind SR) like Real-SR, BSRGAN and Real-ESRGAN are provided via presets that define the blur, resizing and noise configurations, but many more augmentations are available to define custom training strategies.

How to Test

For simple testing

The recommended way to get started with some of the models produced by the training codes available in this repository is by getting the pretrained models to be tested and run them in the companion repository iNNfer, with the purpose of model inference.

Additionally, you can also use a GUI (for ESRGAN models, for video) or a smaller repo for inference (for ESRGAN, for video).

If you are interested in obtaining results that can automatically return evaluation metrics, it is also possible to do inference of batches of images and some additional options with the instructions in how to test.

Pretrained models

The most recent community pretrained models can be found in the Wiki, Discord channels (game upscale and animation upscale) and nmkd's models.

For more details about the original and experimental pretrained models, please see pretrained models.

You can put the downloaded models in the default experiments/pretrained_models directory and use them in the options files with the corresponding network architectures.

Model interpolation

Models that were trained using the same pretrained model or are derivates of the same pretrained model are able to be interpolated to combine the properties of both. The original author demostrated this by interpolating the PSNR pretrained model (which is not perceptually good, but results in smooth images) with the ESRGAN resulting models that have more details but sometimes is excessive to control a balance in the resulting images, instead of interpolating the resulting images from both models, giving much better results.

The capabilities of linearly interpolating models are also explored in "DNI": Deep Network Interpolation for Continuous Imagery Effect Transition (CVPR19) with very interesting results and examples. The script for interpolation can be found in the net_interp.py file. This is an alternative to create new models without additional training and also to create pretrained models for easier fine tuning. Below is an example of interpolating between a PSNR-oriented and a perceptual ESRGAN model (first row), and examples of interpolating CycleGAN style transfer models.

More details and explanations of interpolation can be found here in the Wiki.

Datasets

Many datasets are publicly available and used to train models in a way that can be benchmarked and compared with other models. You are also able to create your own datasets with your own images.

Any dataset can be augmented to expose the model to information that might not be available in the images, such a noise and blur. For this reason, a data augmentation pipeline has been added to the options in this repository. It is also possible to add other types of augmentations, such as Batch Augmentations to apply them to minibatches instead of single images. Lastly, if your dataset is small, you can make use of Differential Augmentations to allow the discriminator to extract more information from the available images and train better models. More information can be found in the augmentations document.

How to help

There are multiple ways to help this project. The first one is by using it and trying to train your own models. You can open an issue if you find any bugs or start a discussion if you have ideas, questions or would like to showcase your results.

If you would like to contribute in the form of adding or fixing code, you can do so by cloning this repo and creating a PR. Ideally, it's better for PR to be precise and not changing many parts of the code at the same time, so it can be reviewed and tested. If possible, open an issue or discussion prior to creating the PR and we can talk about any ideas.

You can also join the discord servers and share results and questions with other users.

Lastly, after it has been suggested many times before, now there are options to donate to show your support to the project and help stir it in directions that will make it even more useful. Below you will find those options that were suggested.

Patreon

Bitcoin Address: 1JyWsAu7aVz5ZeQHsWCBmRuScjNhCEJuVL

Ethereum Address: 0xa26AAb3367D34457401Af3A5A0304d6CbE6529A2


Additional Help

If you have any questions, we have a couple of discord servers (game upscale and animation upscale) where you can ask them and a Wiki with more information.


Acknowledgement

Code architecture is originally inspired by pytorch-cyclegan and the first version of BasicSR.

Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
An executor that loads ONNX models and embeds documents using the ONNX runtime.

ONNXEncoder An executor that loads ONNX models and embeds documents using the ONNX runtime. Usage via Docker image (recommended) from jina import Flow

Jina AI 2 Mar 15, 2022
Unofficial Implement PU-Transformer

PU-Transformer-pytorch Pytorch unofficial implementation of PU-Transformer (PU-Transformer: Point Cloud Upsampling Transformer) https://arxiv.org/abs/

Lee Hyung Jun 7 Sep 21, 2022
An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners

An pytorch implementation of Masked Autoencoders Are Scalable Vision Learners This is a coarse version for MAE, only make the pretrain model, the fine

FlyEgle 214 Dec 29, 2022
Constraint-based geometry sketcher for blender

Constraint-based sketcher addon for Blender that allows to create precise 2d shapes by defining a set of geometric constraints like tangent, distance,

1.7k Dec 31, 2022
๐Ÿ“š A collection of all the Deep Learning Metrics that I came across which are not accuracy/loss.

๐Ÿ“š A collection of all the Deep Learning Metrics that I came across which are not accuracy/loss.

Rahul Vigneswaran 1 Jan 17, 2022
Multi-Person Extreme Motion Prediction

Multi-Person Extreme Motion Prediction Implementation for paper Wen Guo, Xiaoyu Bie, Xavier Alameda-Pineda, Francesc Moreno-Noguer, Multi-Person Extre

GUO-W 38 Nov 15, 2022
Process text, including tokenizing and representing sentences as vectors and Applying some concepts like RNN, LSTM and GRU to create a classifier can detect the language in which a sentence is written from among 17 languages.

Language Identifier What is this ? The goal of this project is to create a model that is able to predict a given sentence language through text proces

Hossam Asaad 9 Dec 15, 2022
A repository for benchmarking neural vocoders by their quality and speed.

License The majority of VocBench is licensed under CC-BY-NC, however portions of the project are available under separate license terms: Wavenet, Para

Meta Research 177 Dec 12, 2022
Fast, differentiable sorting and ranking in PyTorch

Torchsort Fast, differentiable sorting and ranking in PyTorch. Pure PyTorch implementation of Fast Differentiable Sorting and Ranking (Blondel et al.)

Teddy Koker 655 Jan 04, 2023
Solving reinforcement learning tasks which require language and vision

Multimodal Reinforcement Learning JAX implementations of the following multimodal reinforcement learning approaches. Dual-coding Episodic Memory from

Henry Prior 31 Feb 26, 2022
Source code for "UniRE: A Unified Label Space for Entity Relation Extraction.", ACL2021.

UniRE Source code for "UniRE: A Unified Label Space for Entity Relation Extraction.", ACL2021. Requirements python: 3.7.6 pytorch: 1.8.1 transformers:

Wang Yijun 109 Nov 29, 2022
This is a official repository of SimViT.

SimViT This is a official repository of SimViT. We will open our models and codes about object detection and semantic segmentation soon. Our code refe

ligang 57 Dec 15, 2022
diablo2 resurrected loot filter

Only For Chinese and Traditional Chinese The filter only for Chinese and Traditional Chinese, i didn't change it for other language.Maybe you could mo

elmagnifico 249 Dec 04, 2022
WHENet - ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L

HeadPoseEstimation-WHENet-yolov4-onnx-openvino ONNX, OpenVINO, TFLite, TensorRT, EdgeTPU, CoreML, TFJS, YOLOv4/YOLOv4-tiny-3L 1. Usage $ git clone htt

Katsuya Hyodo 49 Sep 21, 2022
repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments

repro_eval repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments. The measures were d

IR Group at Technische Hochschule Kรถln 9 May 25, 2022
Implementation of DropLoss for Long-Tail Instance Segmentation in Pytorch

[AAAI 2021]DropLoss for Long-Tail Instance Segmentation [AAAI 2021] DropLoss for Long-Tail Instance Segmentation Ting-I Hsieh*, Esther Robb*, Hwann-Tz

Tim 37 Dec 02, 2022
Code for KHGT model, AAAI2021

KHGT Code for KHGT accepted by AAAI2021 Please unzip the data files in Datasets/ first. To run KHGT on Yelp data, use python labcode_yelp.py For Movi

32 Nov 29, 2022
Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Self-Supervised Multi-Frame Monocular Scene Flow 3D visualization of estimated depth and scene flow (overlayed with input image) from temporally conse

Visual Inference Lab @TU Darmstadt 85 Dec 22, 2022
Hierarchical Attentive Recurrent Tracking

Hierarchical Attentive Recurrent Tracking This is an official Tensorflow implementation of single object tracking in videos by using hierarchical atte

Adam Kosiorek 147 Aug 07, 2021