Repo for my TensorFlow/Keras CV experiments, mostly revolving around the Danbooru20xx dataset.

SW-CV-ModelZoo


Framework: TF/Keras 2.7

Training SQLite DB built using fire-eggs' tools: https://github.com/fire-eggs/Danbooru2019

Currently training on Danbooru2021, 512px SFW subset (sans the rating:q images that had been included in the 2022-01-21 release of the dataset)

Reference:

Anonymous, The Danbooru Community, & Gwern Branwen; “Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, 2022-01-21. Web. Accessed 2022-01-28 https://www.gwern.net/Danbooru2021


Journal

06/02/2022: great news, crew! TRC (TPU Research Cloud) allowed me to use a bunch of TPUs!

To make better use of this amount of compute I had to overhaul a number of components, so a bunch of things have likely fallen to bitrot in the process. I can only guarantee that NFNet works pretty much as before, given the right arguments.
NFResNet changes should have left it backward compatible with the previous version.
ResNet has been streamlined to be mostly in line with the Bag-of-Tricks paper (arXiv:1812.01187), with the exception of the stem. It is not compatible with the previous version of the code.

The training labels have been included in the 2021_0000_0899 folder for convenience.
The list of files used for training is going to be uploaded as a GitHub Release.
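
The folder name 2021_0000_0899 appears to encode the dataset year and the range of ID buckets ("modulos") used for training. Danbooru20xx stores images in folders named after the image ID modulo 1000, zero-padded to four digits. A minimal sketch of selecting images by bucket under that assumption; the function names and the example IDs are hypothetical and not taken from this repo:

```python
def bucket_of(image_id: int) -> str:
    """Return the Danbooru20xx folder name for an image ID (ID modulo 1000, zero-padded)."""
    return f"{image_id % 1000:04d}"


def in_bucket_range(image_id: int, first: int = 0, last: int = 899) -> bool:
    """True if the image falls in buckets first..last (inclusive)."""
    return first <= image_id % 1000 <= last


# Hypothetical image IDs, for illustration only.
ids = [4970000, 4970899, 4970900, 4970984]
training_ids = [i for i in ids if in_bucket_range(i)]
```

Splitting on the ID modulo keeps the training and held-out buckets disjoint without having to store a separate split file.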

Now for some numbers:
compared to my previous best run, the one that resulted in NFNetL1V1-100-0.57141:

  • I'm using 1.86x the number of images: 2.8M vs 1.5M
  • I'm training bigger models: 61M vs 45M params
  • ... in less time: 232 vs 700 hours of processor time
  • don't get me started on actual wall clock time
  • with a few amenities thrown in: ECA for channel attention, SiLU activation

And it's all thanks to the folks at TRC, so shout out to them!

I currently have a few runs in progress across a couple of dimensions:

  • effect of model size with NFNet L0/L1/L2, with SiLU and ECA for all three of them
  • effect of activation function with NFNet L0, with SiLU/HSwish/ReLU, no ECA

Once the experiments are over, the plan is to select the network definitions that lie on the Pareto front between throughput and F1 score and release the trained weights.
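
A minimal sketch of that selection step, assuming each run is summarized by a throughput figure and an F1 score, both "higher is better". The Run type, the run names and all the numbers are placeholders made up for illustration, not measurements from this repo:

```python
from typing import List, NamedTuple


class Run(NamedTuple):
    name: str
    throughput: float  # images per second (higher is better)
    f1: float          # F1 score (higher is better)


def pareto_front(runs: List[Run]) -> List[Run]:
    """Keep the runs that no other run matches or beats on both axes."""
    front = []
    for r in runs:
        dominated = any(
            o.throughput >= r.throughput and o.f1 >= r.f1
            and (o.throughput > r.throughput or o.f1 > r.f1)
            for o in runs
        )
        if not dominated:
            front.append(r)
    return front


# Placeholder numbers, NOT measurements from this repo.
runs = [
    Run("small", throughput=900.0, f1=0.65),
    Run("medium", throughput=600.0, f1=0.67),
    Run("medium_slow", throughput=500.0, f1=0.66),
    Run("large", throughput=400.0, f1=0.68),
]
selected = [r.name for r in pareto_front(runs)]
```

In this toy example "medium_slow" is dropped because "medium" is both faster and more accurate; the other three each represent a different trade-off and stay on the front.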

One last thing.
I'd like to call your attention to the tools/cleanlab_stuff.py script.
It reads two files: one with the binarized labels from the database, the other with the predicted probabilities.
It then uses the cleanlab package to estimate whether an image in the set could be missing a given label. At the end it stores its conclusions in a JSON file.
This file could, potentially, be used in some tool to assist human intervention to add the missing tags.
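
The script itself relies on cleanlab; the snippet below does not call it. It is a dependency-free sketch of the thresholding idea behind confident learning (the approach cleanlab builds on), simplified and applied one tag at a time: the threshold for a tag is the average predicted probability among images that carry the tag, and any image without the tag that scores at least that high is flagged. The function name, the toy data and the output layout are all hypothetical:

```python
import json
from typing import Dict, List


def find_possibly_missing(labels: List[List[int]],
                          probs: List[List[float]],
                          tags: List[str]) -> Dict[str, List[int]]:
    """For each tag, list the image indices that lack the tag but score
    at least as high as the average image that has it."""
    candidates: Dict[str, List[int]] = {}
    for t, tag in enumerate(tags):
        positives = [p[t] for y, p in zip(labels, probs) if y[t] == 1]
        if not positives:
            continue  # no labelled example, so no threshold can be estimated
        threshold = sum(positives) / len(positives)
        missing = [i for i, (y, p) in enumerate(zip(labels, probs))
                   if y[t] == 0 and p[t] >= threshold]
        if missing:
            candidates[tag] = missing
    return candidates


# Toy data: 4 images, 2 tags. Rows are images, columns are tags.
tags = ["1girl", "solo"]
labels = [[1, 1], [1, 0], [0, 0], [1, 0]]
probs = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.7, 0.1]]

report = find_possibly_missing(labels, probs, tags)
report_json = json.dumps(report)  # what a tagging-assist tool could consume
```

Here image 1 is not tagged "solo" but the model gives it 0.9, above the 0.8 average of the images that are, so it is the only candidate reported.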

Releases
  • models_db2021_5500_2022_10_21 (Oct 21, 2022)

    ConvNext B, ViT B16
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    top 5500 tags (2021_0000_0899_5500/selected_tags.csv)
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name                         | definition_name | params_human | image_size |  thres |     F1 |
    |:---------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | ConvNextBV1_09_25_2022_05h13m55s | B               | 93.2M        |        448 | 0.3673 | 0.6941 |
    | ViTB16_09_25_2022_04h53m38s      | B16             | 90.5M        |        448 | 0.3663 | 0.6918 |

    ConvNextBV1_09_25_2022_05h13m55s.7z(322.58 MB)
    ViTB16_09_25_2022_04h53m38s.7z(312.96 MB)
  • convnexts_db2021_2022_03_22(Mar 22, 2022)

    ConvNext, T/S/B
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name                         | definition_name | params_human | image_size |  thres |     F1 |
    |:---------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | ConvNextBV1_03_10_2022_21h41m23s | B               | 90.01M       |        448 | 0.3372 | 0.6892 |
    | ConvNextSV1_03_11_2022_17h49m56s | S               | 51.28M       |        384 | 0.3301 | 0.6798 |
    | ConvNextTV1_03_05_2022_15h56m42s | T               | 29.65M       |        320 | 0.3259 | 0.6595 |

    ConvNextBV1_03_10_2022_21h41m23s.7z(311.29 MB)
    ConvNextSV1_03_11_2022_17h49m56s.7z(177.36 MB)
    ConvNextTV1_03_05_2022_15h56m42s.7z(102.96 MB)
  • nfnets_db2021_2022_03_04(Mar 4, 2022)

    NFNet, L0/L1/L2 (based on timm Lx model definitions)
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name                       | definition_name | params_human | image_size |  thres |     F1 |
    |:-------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | NFNetL2V1_02_20_2022_10h27m08s | L2              | 60.96M       |        448 | 0.3231 | 0.6785 |
    | NFNetL1V1_02_17_2022_20h18m38s | L1              | 45.65M       |        384 | 0.3259 | 0.6691 |
    | NFNetL0V1_02_10_2022_17h50m14s | L0              | 27.32M       |        320 | 0.3190 | 0.6509 |

    NFNetL0V1_02_10_2022_17h50m14s.7z(94.98 MB)
    NFNetL1V1_02_17_2022_20h18m38s.7z(157.97 MB)
    NFNetL2V1_02_20_2022_10h27m08s.7z(210.49 MB)
  • nfnet_tpu_training(Feb 6, 2022)

  • NFNetL1V1-100-0.57141(Dec 31, 2021)

    • NFNet, L1 (based on timm Lx model definitions), 100 epochs, F1 @ 0.4 at the end of the 100th epoch was 0.57141
    • Trained on Danbooru2020 512px SFW subset, modulos 0000-0599
    • 320px per side
    • alpha to white
    • padding to make the image square is white
    • channel order is BGR, scaled to 0-1
    • mixup alpha = 0.2 during epochs 76-100
    • analyze_metrics on Danbooru2020 original set, modulos 0984-0999: {'thres': 0.3485, 'F1': 0.6133, 'F2': 0.6133, 'MCC': 0.6094, 'A': 0.9923, 'R': 0.6133, 'P': 0.6133}
    • analyze_metrics on image IDs 4970000-5000000: {'thres': 0.3148, 'F1': 0.5942, 'F2': 0.5941, 'MCC': 0.5892, 'A': 0.9901, 'R': 0.5940, 'P': 0.5943}
    NFNetL1V1-100-0.57141.7z(158.09 MB)
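
The thres and F1 figures in the release notes come from the repo's analyze_metrics step, whose implementation is not reproduced here. The following is a rough sketch of how such numbers can be computed, under two assumptions: metrics are micro-averaged over all (image, tag) pairs, and the threshold is picked where precision and recall are closest. The second assumption is an inference from P, R and F1 being nearly identical in the figures above; the notes do not state it. Function names and toy data are hypothetical:

```python
from typing import List, Tuple


def micro_prf(labels: List[List[int]], probs: List[List[float]],
              thres: float) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over all (image, tag) pairs."""
    tp = fp = fn = 0
    for y_row, p_row in zip(labels, probs):
        for y, p in zip(y_row, p_row):
            pred = p >= thres
            if pred and y == 1:
                tp += 1
            elif pred and y == 0:
                fp += 1
            elif not pred and y == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


def balanced_threshold(labels: List[List[int]], probs: List[List[float]],
                       steps: int = 100) -> float:
    """Scan thresholds in (0, 1); return the first one minimizing |P - R|."""
    best_t, best_gap = 0.5, float("inf")
    for i in range(1, steps):
        t = i / steps
        p, r, f1 = micro_prf(labels, probs, t)
        if f1 == 0.0:
            continue  # skip thresholds that yield no true positives
        gap = abs(p - r)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


# Toy data: 2 images, 3 tags.
labels = [[1, 0, 1], [0, 1, 0]]
probs = [[0.9, 0.4, 0.3], [0.2, 0.8, 0.1]]

thres = balanced_threshold(labels, probs)
precision, recall, f1 = micro_prf(labels, probs, thres)
```

When precision equals recall, F1 and F2 collapse to the same value, which would be consistent with the identical F1/F2/R/P figures reported for the NFNetL1V1-100-0.57141 release.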