Synthesize photos from PhotoDNA using machine learning 🌱

Related tags

Deep Learningribosome
Overview

Ribosome Build Status

Synthesize photos from PhotoDNA.

Ribosome demo

See the blog post for more information.

Installation

Dependencies

You can install Python dependencies using pip install -r requirements.txt. If you want to install the packages manually, here is a list:

Pre-trained models

Ribosome is released with 4 pre-trained models:

Use the models trained on NSFW data at your own risk.

Usage

Inference

Use the infer.py script to produce images from hashes:

python infer.py [--model MODEL] [--output OUTPUT] hash

The hash is a base64-encoded string, e.g. cVwhQ58OSCEOIwF+AigAkT0GAWdwAQs8o04KGYMfHBUANRUOAycUEFABCh6PABIghDBzCa4RTysQYVcvMDdkMypBPSyNAgRCcTf2AC9PfiYSWDw3KTcxPxM2HSqTDSIsgxJFFA+iihERcU4fHEY4Lj0xhw3QJN4OXQwbIzJjVTsUodIVVy3/FY8I/wcui11O.

Training

Datasets

Datasets consist of images paired with hashes, in the format of a CSV file with paths/hashes, and image files in a directory. The CSV file has two colums, path and hash (no header row). The hash is base64-encoded. Images are 100x100 in size. After producing such a CSV, it may be convenient to shuffle it and split it into a training set and validation set.

Example dataset

Ribosome includes an example dataset in this format, produced from COCO:

Preparing a dataset

To produce 100x100 images from an existing dataset, it may be convenient to use ImageMagick.

To resize image.jpg to 100x100 ignoring the original aspect ratio:

mogrify -resize '100x100!' image.jpg

To resize image.jpg to 100x100 by taking a center crop:

mogrify -resize '100x100^' -gravity Center -extent '100x100' image.jpg

You can process files in parallel using find / xargs, e.g. to convert all .jpg images using 24 threads:

find . -name '*.jpg' | xargs -n 1 -P 24 mogrify -resize '100x100!'

Ribosome does not provide code to compute PhotoDNA hashes, but such code is available in pyPhotoDNA.

Train a model

Use the train.py script to train a model on a dataset:

python train.py --train-data TRAIN_DATA ...
  • --train-data is the path to the train data CSV
  • Paths in the CSV are interpreted relative to --data-dir (or . if not supplied)
  • --val-data is the path to the validation data CSV; if provided, the script will report the validation loss after every epoch

See python train.py --help for all the options.

License

Copyright (c) Anish Athalye. Released under the MIT License. See LICENSE.md for details.

You might also like...
Patient-Survival - Using Python, I developed a Machine Learning model using classification techniques such as Random Forest and SVM classifiers to predict a patient's survival status that have undergone breast cancer surgery.

Patient-Survival - Using Python, I developed a Machine Learning model using classification techniques such as Random Forest and SVM classifiers to predict a patient's survival status that have undergone breast cancer surgery.

Intrusion Detection System using ensemble learning (machine learning)
Intrusion Detection System using ensemble learning (machine learning)

IDS-ML implementation of an intrusion detection system using ensemble machine learning methods Data set This project is carried out using the UNSW-15

Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.

Codes-for-Algorithms Codes for realizing theories learned from Data Mining, Machine Learning, Deep Learning without using the present Python packages.

An open source machine learning library for performing regression tasks using RVM technique.

Introduction neonrvm is an open source machine learning library for performing regression tasks using RVM technique. It is written in C programming la

Knowledge Management for Humans using Machine Learning & Tags
Knowledge Management for Humans using Machine Learning & Tags

HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.

Pneumonia Detection using machine learning - with PyTorch
Pneumonia Detection using machine learning - with PyTorch

Pneumonia Detection Pneumonia Detection using machine learning. Training was done in colab: DEMO: Result (Confusion Matrix): Data I uploaded my datase

Optimising chemical reactions using machine learning
Optimising chemical reactions using machine learning

Summit Summit is a set of tools for optimising chemical processes. We’ve started by targeting reactions. What is Summit? Currently, reaction optimisat

Providing the solutions for high-frequency trading (HFT) strategies using data science approaches (Machine Learning) on Full Orderbook Tick Data.
Providing the solutions for high-frequency trading (HFT) strategies using data science approaches (Machine Learning) on Full Orderbook Tick Data.

Modeling High-Frequency Limit Order Book Dynamics Using Machine Learning Framework to capture the dynamics of high-frequency limit order books. Overvi

Algorithmic trading using machine learning.
Algorithmic trading using machine learning.

Algorithmic Trading This machine learning algorithm was built using Python 3 and scikit-learn with a Decision Tree Classifier. The program gathers sto

Tooling for GANs in TensorFlow

TensorFlow-GAN (TF-GAN) TF-GAN is a lightweight library for training and evaluating Generative Adversarial Networks (GANs). Can be installed with pip

803 Dec 24, 2022
Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks

Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks This is a Pytorch-Lightning implementation of the paper "Self-s

Photogrammetry & Robotics Bonn 111 Dec 06, 2022
The Face Mask recognition system uses AI technology to detect the person with or without a mask.

Face Mask Detection Face Mask Detection system built with OpenCV, Keras/TensorFlow using Deep Learning and Computer Vision concepts in order to detect

Rohan Kasabe 4 Apr 05, 2022
MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021)

MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021) Overview We release the code of the MVFNet (Multi-View Fusion Network).

2 Jan 29, 2022
[ECCV 2020] Reimplementation of 3DDFAv2, including face mesh, head pose, landmarks, and more.

Stable Head Pose Estimation and Landmark Regression via 3D Dense Face Reconstruction Reimplementation of (ECCV 2020) Towards Fast, Accurate and Stable

Remilia Scarlet 221 Dec 30, 2022
Code for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators"

Query Variation Generators This repository contains the code and annotation data for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelin

Gustavo Penha 12 Nov 20, 2022
[ArXiv 2021] One-Shot Generative Domain Adaptation

GenDA - One-Shot Generative Domain Adaptation One-Shot Generative Domain Adaptation Ceyuan Yang*, Yujun Shen*, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Z

GenForce: May Generative Force Be with You 46 Dec 19, 2022
Implementation of H-UCRL Algorithm

Implementation of H-UCRL Algorithm This repository is an implementation of the H-UCRL algorithm introduced in Curi, S., Berkenkamp, F., & Krause, A. (

Sebastian Curi 25 May 20, 2022
Flaxformer: transformer architectures in JAX/Flax

Flaxformer is a transformer library for primarily NLP and multimodal research at Google.

Google 116 Jan 05, 2023
BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation

BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation Installing The Dependencies $ conda create --name beametrics python

7 Jul 04, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI'22)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
Intrusion Detection System using ensemble learning (machine learning)

IDS-ML implementation of an intrusion detection system using ensemble machine learning methods Data set This project is carried out using the UNSW-15

4 Nov 25, 2022
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022
A python module for configuration of block devices

Blivet is a python module for system storage configuration. CI status Licence See COPYING Installation From Fedora repositories Blivet is available in

78 Dec 14, 2022
DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations This repository contains the data, scripts and baseline co

Alexa 51 Dec 17, 2022
Few-NERD: Not Only a Few-shot NER Dataset

Few-NERD: Not Only a Few-shot NER Dataset This is the source code of the ACL-IJCNLP 2021 paper: Few-NERD: A Few-shot Named Entity Recognition Dataset.

THUNLP 319 Dec 30, 2022
To build a regression model to predict the concrete compressive strength based on the different features in the training data.

Cement-Strength-Prediction Problem Statement To build a regression model to predict the concrete compressive strength based on the different features

Ashish Kumar 4 Jun 11, 2022
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

46 Nov 09, 2022
Pytorch implementation of DeepMind's differentiable neural computer paper.

DNC pytorch This is a Pytorch implementation of DeepMind's Differentiable Neural Computer (DNC) architecture introduced in their recent Nature paper:

Yuanpu Xie 91 Nov 21, 2022