code for Image Manipulation Detection by Multi-View Multi-Scale Supervision

Related tags

Deep LearningMVSS-Net
Overview

MVSS-Net

Code and models for ICCV 2021 paper: Image Manipulation Detection by Multi-View Multi-Scale Supervision

Image text

Update

To Be Done.

  • 21.12.17, Something new: MVSS-Net++

We now have an improved version of MVSS-Net, denoted as MVSS-Net++. Check here.

Environment

  • Ubuntu 16.04.6 LTS
  • Python 3.6
  • cuda10.1+cudnn7.6.3

Requirements

Usage

Dataset

An example of the dataset index file is given as data/CASIAv1plus.txt, where each line contains:

img_path mask_path label
  • 0 represents the authentic and 1 represents the manipulated.
  • For an authentic image, the mask_path is "None".
  • For wild images without mask groundtruth, the index should at least contain "img_path" per line.
Training sets
Test sets
  • DEFACTO-12k
  • Columbia
  • COVER
  • NIST16
  • CASIAv1plus: Note that some of the authentic images in CASIAv1 also appear in CASIAv2. With those images fully replaced by Corel images that are new to both CASIAv1 and CASIAv2, we constructed a revision of CASIAv1 termed as CASIAv1plus. We recommend to use CASIAv1plus as an alternative to the original CASIAv1.

Trained Models

We offer FCNs and MVSS-Nets trained on CASIAv2 and DEFACTO_84k, respectively. Please download the models and place them in the ckpt directory:

The performance of these models for image-level manipulation detection (metric: AUC and image-level F1) is as follows. More details are reported in the paper.

Performance metric: AUC
Model Training data CASIAv1plus Columbia COVER DEFACTO-12k
MVSS_Net CASIAv2 0.932 0.980 0.731 0.573
MVSS_Net DEFACTO-84k 0.771 0.563 0.525 0.886
FCN CASIAv2 0.769 0.762 0.541 0.551
FCN DEFACTO-84k 0.629 0.535 0.543 0.840
Performance metric: Image-level F1 (threshold=0.5)
Model Training data CASIAv1plus Columbia COVER DEFACTO-12k
MVSS_Net CASIAv2 0.759 0.802 0.244 0.404
MVSS_Net DEFACTO-84k 0.685 0.353 0.360 0.799
FCN CASIAv2 0.684 0.481 0.180 0.458
FCN DEFACTO-84k 0.561 0.492 0.511 0.709

Inference & Evaluation

You can specify which pre-trained model to use by setting model_path in do_pred_and_eval.sh. Given a test_collection (e.g. CASIAv1plus or DEFACTO12k-test), the prediction maps and evaluation results will be saved under save_dir. The default threshold is set as 0.5.

bash do_pred_and_eval.sh $test_collection
#e.g. bash do_pred_and_eval.sh CASIAv1plus

For inference only, use following command to skip evaluation:

bash do_pred.sh $test_collection
#e.g. bash do_pred.sh CASIAv1plus

Demo

  • demo.ipynb: A step-by-step notebook tutorial showing the usage of a pre-trained model to detect manipulation in a specific image.

Citation

If you find this work useful in your research, please consider citing:

@InProceedings{MVSS_2021ICCV,  
author = {Chen, Xinru and Dong, Chengbo and Ji, Jiaqi and Cao, juan and Li, Xirong},  
title = {Image Manipulation Detection by Multi-View Multi-Scale Supervision},  
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},  
year = {2021}  
}

Acknowledgments

Contact

If you enounter any issue when running the code, please feel free to reach us either by creating a new issue in the github or by emailing

Owner
dong_chengbo
dong_chengbo
FeTaQA: Free-form Table Question Answering

FeTaQA: Free-form Table Question Answering FeTaQA is a Free-form Table Question Answering dataset with 10K Wikipedia-based {table, question, free-form

Language, Information, and Learning at Yale 40 Dec 13, 2022
Sudoku solver - A sudoku solver with python

sudoku_solver A sudoku solver What is Sudoku? Sudoku (Japanese: 数独, romanized: s

Sikai Lu 0 May 22, 2022
ScriptProfilerPy - Module to visualize where your python script is slow

ScriptProfiler helps you track where your code is slow It provides: Code lines t

Lucas BLP 3 Jun 02, 2022
sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code

sequitur sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code. It implements three differ

Jonathan Shobrook 305 Dec 21, 2022
When in Doubt: Improving Classification Performance with Alternating Normalization

When in Doubt: Improving Classification Performance with Alternating Normalization Findings of EMNLP 2021 Menglin Jia, Austin Reiter, Ser-Nam Lim, Yoa

Menglin Jia 13 Nov 06, 2022
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.

Conceptual 12M We introduce the Conceptual 12M (CC12M), a dataset with ~12 million image-text pairs meant to be used for vision-and-language pre-train

Google Research Datasets 226 Dec 07, 2022
Activating More Pixels in Image Super-Resolution Transformer

HAT [Paper Link] Activating More Pixels in Image Super-Resolution Transformer Xiangyu Chen, Xintao Wang, Jiantao Zhou and Chao Dong BibTeX @article{ch

XyChen 270 Dec 27, 2022
Twin-deep neural network for semi-supervised learning of materials properties

Deep Semi-Supervised Teacher-Student Material Synthesizability Prediction Citation: Semi-supervised teacher-student deep neural network for materials

MLEG 3 Dec 14, 2022
State-to-Distribution (STD) Model

State-to-Distribution (STD) Model In this repository we provide exemplary code on how to construct and evaluate a state-to-distribution (STD) model fo

<a href=[email protected]"> 2 Apr 07, 2022
Collection of TensorFlow2 implementations of Generative Adversarial Network varieties presented in research papers.

TensorFlow2-GAN Collection of tf2.0 implementations of Generative Adversarial Network varieties presented in research papers. Model architectures will

41 Apr 28, 2022
This program uses trial auth token of Azure Cognitive Services to do speech synthesis for you.

🗣️ aspeak A simple text-to-speech client using azure TTS API(trial). 😆 TL;DR: This program uses trial auth token of Azure Cognitive Services to do s

Levi Zim 359 Jan 05, 2023
TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat

Mandar Sharma 7 Oct 31, 2021
GE2340 project source code without credentials.

GE2340-Project-Public GE2340 project source code without credentials. Run the bot.py to start the bot Telegram: @jasperwong_ge2340_bot If the bot does

0 Feb 10, 2022
SOTR: Segmenting Objects with Transformers [ICCV 2021]

SOTR: Segmenting Objects with Transformers [ICCV 2021] By Ruohao Guo, Dantong Niu, Liao Qu, Zhenbo Li Introduction This is the official implementation

186 Dec 20, 2022
Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Dual-task Pose Transformer Network The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation“ (CVPR2022)

63 Dec 15, 2022
GEP (GDB Enhanced Prompt) - a GDB plug-in for GDB command prompt with fzf history search, fish-like autosuggestions, auto-completion with floating window, partial string matching in history, and more!

GEP (GDB Enhanced Prompt) GEP (GDB Enhanced Prompt) is a GDB plug-in which make your GDB command prompt more convenient and flexibility. Why I need th

Alan Li 23 Dec 21, 2022
Collection of sports betting AI tools.

sports-betting sports-betting is a collection of tools that makes it easy to create machine learning models for sports betting and evaluate their perf

George Douzas 109 Dec 31, 2022
Simple image captioning model - CLIP prefix captioning.

Simple image captioning model - CLIP prefix captioning.

688 Jan 04, 2023
YOLOX Win10 Project

Introduction 这是一个用于Windows训练YOLOX的项目,相比于官方项目,做了一些适配和修改: 1、解决了Windows下import yolox失败,No such file or directory: 'xxx.xml'等路径问题 2、CUDA out of memory等显存不

5 Jun 08, 2022
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

简体中文 | English PaddleRobotics paddleRobotics是基于paddle的机器人开源算法库集,包括人机交互、复杂运动控制、环境感知、slam定位导航等开源算法部分。 人机交互 主动多模交互技术TFVT-HRI 主动多模交互技术是通过视觉、语音、触摸传感器等输入机器人

185 Dec 26, 2022