[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Overview

Visual-Reasoning-eXplanation

Project Page | Video | Paper

Figure: An example result with the proposed VRX. To explain the prediction (i.e., fire engine and not alternatives like ambulance), VRX provides both visual and structural clues.

A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts
Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

We consider the challenging problem of interpreting the reasoning logic behind a neural network's decision. We propose a novel framework for interpreting neural networks that extracts relevant class-specific visual concepts and organizes them into structural concept graphs based on pairwise concept relationships. By means of knowledge distillation, we show that VRX can take a step towards mimicking the reasoning process of NNs and provide logical, concept-level explanations for final model decisions. With extensive experiments, we empirically show that VRX can meaningfully answer “why” and “why not” questions about a prediction, providing easy-to-understand insights into the reasoning process. We also show that these insights can potentially provide guidance on improving an NN’s performance.

Figure: Examples of representing images as structural concept graphs.
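
As a rough illustration of the idea (not the repository's actual data format), a structural concept graph can be thought of as a small graph whose nodes are detected concepts and whose edges describe pairwise relationships. The sketch below assumes each concept is summarized by a hypothetical image-space center and that an edge stores the offset between two centers; `ConceptNode`, `build_scg`, and the concept names are illustrative only.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class ConceptNode:
    """A detected visual concept (hypothetical representation)."""
    name: str       # placeholder label, made up for illustration
    center: tuple   # (x, y) position of the concept patch in the image


def build_scg(nodes):
    """Build a fully connected directed graph over the concept nodes.

    Each edge (src, dst) stores the spatial offset from src to dst, which is
    one simple way to encode a pairwise relationship (an assumption here).
    """
    edges = {}
    for a, b in permutations(nodes, 2):
        edges[(a.name, b.name)] = (b.center[0] - a.center[0],
                                   b.center[1] - a.center[1])
    return edges


concepts = [
    ConceptNode("concept_1", (40, 120)),
    ConceptNode("concept_2", (160, 120)),
    ConceptNode("concept_3", (100, 60)),
]
graph = build_scg(concepts)
print(len(graph))                          # 3 nodes -> 6 directed edges
print(graph[("concept_1", "concept_2")])   # (120, 0)
```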

Figure: Pipeline for Visual Reasoning Explanation framework.

Thanks to sssufmug for a re-implementation; we added more features and finished the whole pipeline.

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/gyhandy/Visual-Reasoning-eXplanation.git
cd Visual-Reasoning-eXplanation
  • Dependencies
pip install -r requirements.txt

Datasets

  • We use a subset of ImageNet as our source data. It contains the classes of interest that we want to reason about, such as fire engine, ambulance and school bus, as well as other random images used for discovering concepts. You can download the source data that we used in our paper here: source [http://ilab.usc.edu/andy/dataset/source.zip]

  • Input files for training the GNN and doing reasoning. You can generate these files yourself by running the discover concept and match concept steps, but we also provide them so that you can run inference directly. You can download the result data here: result [http://ilab.usc.edu/andy/dataset/result.zip]

Dataset Preprocessing

Unzip source.zip and result.zip, and place their contents in ./source and ./result respectively. If you only want to run inference, you can skip the discover concept, match concept and Structural Concept Graph (SCG) training steps.
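
If you prefer to script this step, the following is a minimal sketch using only the Python standard library. It assumes both archives have been downloaded into the repository root and that they do not already contain a top-level source/ or result/ folder; if they do, extract into `.` instead. The helper name `extract_archive` is not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the archive's member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Extract each downloaded archive into its matching folder, if present.
for name in ("source", "result"):
    zip_path = Path(f"{name}.zip")
    if zip_path.exists():
        members = extract_archive(zip_path, Path(".") / name)
        print(f"{name}: extracted {len(members)} entries")
    else:
        print(f"{name}.zip not found; download it first")
```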

Discover concept

For more information about concept discovery, refer to ACE: Towards Automatic Concept Based Explanations. We use the pretrained model provided by TensorFlow to discover concepts. With the default settings, you can simply run:

python3 discover_concept.py

If you want to do this step with a custom model, you should write a wrapper for it containing the following methods:

run_examples(images, BOTTLENECK_LAYER): returns the activations of the images in the BOTTLENECK_LAYER. 'images' are the original images without preprocessing (floats between 0 and 1).
get_image_shape(): returns the shape of the model's input
label_to_id(CLASS_NAME): returns the id of the given class name.
get_gradient(activations, CLASS_ID, BOTTLENECK_LAYER): computes the gradient of the CLASS_ID logit in the logit layer with respect to activations in the BOTTLENECK_LAYER.
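
To make the expected interface concrete, here is a minimal sketch of such a wrapper around a toy NumPy "model" (flatten, one linear + ReLU bottleneck, linear logits). Everything other than the four method names is an assumption for illustration: the class name `ToyModelWrapper`, the random weights, the tiny input shape and the label list are hypothetical, and a real wrapper would call into your own framework instead.

```python
import numpy as np


class ToyModelWrapper:
    """Toy stand-in for a real network, exposing the four wrapper methods."""

    def __init__(self, image_shape=(8, 8, 3), bottleneck_dim=16,
                 labels=("fire_engine", "ambulance", "school_bus"), seed=0):
        rng = np.random.default_rng(seed)
        self.image_shape = image_shape
        self.labels = list(labels)
        in_dim = int(np.prod(image_shape))
        # Random weights stand in for a trained model (illustration only).
        self.w_in = rng.normal(size=(in_dim, bottleneck_dim))
        self.w_out = rng.normal(size=(bottleneck_dim, len(self.labels)))

    def run_examples(self, images, bottleneck_layer):
        """Return bottleneck activations for images with values in [0, 1]."""
        # bottleneck_layer is ignored: this toy model has a single bottleneck.
        flat = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
        return np.maximum(flat @ self.w_in, 0.0)

    def get_image_shape(self):
        """Return the input shape the model expects."""
        return self.image_shape

    def label_to_id(self, class_name):
        """Return the integer id of class_name."""
        return self.labels.index(class_name)

    def get_gradient(self, activations, class_id, bottleneck_layer):
        """Gradient of the class_id logit w.r.t. the bottleneck activations.

        The logits are linear in the activations, so the gradient is the same
        weight column for every example.
        """
        column = self.w_out[:, class_id]
        return np.tile(column, (len(activations), 1))


model = ToyModelWrapper()
images = np.random.default_rng(1).random((2, *model.get_image_shape()))
acts = model.run_examples(images, "bottleneck")
grads = model.get_gradient(acts, model.label_to_id("ambulance"), "bottleneck")
print(acts.shape, grads.shape)  # (2, 16) (2, 16)
```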

If you want to discover concepts with GradCam, please also implement a 'gradcam.py' for your model and place it in ./src. Then run:

python3 discover_concept.py --model_to_run YOUR_LOCAL_PRETRAINED_MODEL_NAME --model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --labels_path LABEL_PATH_OF_YOUR_MODEL_LABEL --use_gradcam TRUE/FALSE

Match concept

This step uses the concepts you discovered in the previous step to match new images. If you want to match your own images, create a new folder named IMAGE_CLASS_NAME in ./source and put the images there. Then run:

python3 match_concept.py --model_to_run YOUR_LOCAL_PRETRAINED_MODEL_NAME --model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --labels_path LABEL_PATH_OF_YOUR_MODEL_LABEL --use_gradcam TRUE/FALSE

Training Structural Concept Graph (SCG)

python3 VR_training_XAI.py

You can then find the model checkpoints in ./result/model.

Reasoning about an image

Before reasoning about an image, first run the match concept step on it to extract its concept knowledge. Once the graph knowledge for the SCG has been extracted, you can run inference. For example, to run inference on ./source/fire_engine/n03345487_19835.JPEG, the "img_class" is "fire_engine" and the "img_idx" is 19835, so run:

python3 Xception_WhyNot.py --img_class fire_engine --img_idx 19835
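
The command's arguments correspond to the class folder and the numeric suffix of the image file name. A small sketch of that mapping follows; it assumes the ./source/<img_class>/<wnid>_<img_idx>.JPEG layout seen in the example path, and the helper `parse_image_path` is hypothetical rather than part of this repository.

```python
from pathlib import Path


def parse_image_path(image_path):
    """Split a source image path into (img_class, img_idx).

    Assumes the layout ./source/<img_class>/<wnid>_<img_idx>.JPEG; other
    naming schemes are not handled.
    """
    path = Path(image_path)
    img_class = path.parent.name            # name of the class folder
    img_idx = path.stem.rsplit("_", 1)[-1]  # digits after the last underscore
    return img_class, int(img_idx)


img_class, img_idx = parse_image_path("./source/fire_engine/n03345487_19835.JPEG")
print(f"python3 Xception_WhyNot.py --img_class {img_class} --img_idx {img_idx}")
```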

Contact / Cite

Got Questions? We would love to answer them! Please reach out by email! You may cite us in your research as:

@inproceedings{ge2021peek,
  title={A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts},
  author={Ge, Yunhao and Xiao, Yao and Xu, Zhi and Zheng, Meng and Karanam, Srikrishna and Chen, Terrence and Itti, Laurent and Wu, Ziyan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2195--2204},
  year={2021}
}

We will post other relevant resources, implementations, applications and extensions of this work here. Please stay tuned.
