ParmeSan: Sanitizer-guided Greybox Fuzzing

Related tags

Deep Learningparmesan
Overview

ParmeSan: Sanitizer-guided Greybox Fuzzing

License

ParmeSan is a sanitizer-guided greybox fuzzer based on Angora.

Published Work

USENIX Security 2020: ParmeSan: Sanitizer-guided Greybox Fuzzing.

The paper can be found here: ParmeSan: Sanitizer-guided Greybox Fuzzing

Building ParmeSan

See the instructions for Angora.

Basically run the following scripts to install the dependencies and build ParmeSan:

build/install_rust.sh
PREFIX=/path/to/install/llvm build/install_llvm.sh
build/install_tools.sh
build/build.sh

ParmeSan also builds a tool bin/llvm-diff-parmesan, which can be used for target acquisition.

Building a target

First build your program into a bitcode file using clang (e.g., base64.bc). Then build your target in the same way, but with your selected sanitizer enabled. To get a single bitcode file for larger projects, the easiest solution is to use gllvm.

# Build the bitcode files for target acquisition
USE_FAST=1 $(pwd)/bin/angora-clang -emit-llvm -o base64.fast.bc -c base64.bc
USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -emit-llvm -o base64.fast.asan.bc -c base64.bc
# Build the actual binaries to be fuzzed
USE_FAST=1 $(pwd)/bin/angora-clang -o base64.fast -c base64.bc
USE_TRACK=1 $(pwd)/bin/angora-clang -o base64.track -c base64.bc

Then acquire the targets using:

bin/llvm-diff-parmesan -json base64.fast.bc base64.fast.asan.bc

This will output a file targets.json, which you provide to ParmeSan with the -c flag.

For example:

$(pwd)/bin/fuzzer -c ./targets.json -i in -o out -t ./base64.track -- ./base64.fast -d @@

Options

ParmeSan's SanOpt option can speed up the fuzzing process by dynamically switching over to a sanitized binary only once the fuzzer reaches one of the targets specified in the targets.json file.

Enable using the -s [SANITIZED_BIN] option.

Build the sanitized binary in the following way:

USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -o base64.asan.fast -c base64.bc

Targets input file

The targets input file consisit of a JSON file with the following format:

{
  "targets":  [1,2,3,4],
  "edges":   [[1,2], [2,3]],
  "callsite_dominators": {"1": [3,4,5]}
}

Where the targets denote the identify of the cmp instruction to target (i.e., the id assigned by the __angora_trace_cmp() calls) and edges is the overlay graph of cmp ids (i.e., which cmps are connected to each other). The edges filed can be empty, since ParmeSan will add newly discovered edges automatically, but note that the performance will be better if you provide the static CFG.

It is also possible to run ParmeSan in pure directed mode (-D option), meaning that it will only consider new seeds if the seed triggers coverage that is on a direct path to one of the specified targets. Note that this requires a somewhat complete static CFG to work (an incomplete CFG might contain no paths to the targets at all, which would mean that no new coverage will be considered at all).

ParmeSan Screenshot

How to get started

Have a look at BUILD_TARGET.md for a step-by-step tutorial on how to get started fuzzing with ParmeSan.

FAQ

  • Q: I get a warning like ==1561377==WARNING: DataFlowSanitizer: call to uninstrumented function gettext when running the (track) instrumented program.
  • A: In many cases you can ignore this, but it will lose the taint (meaning worse performance). You need to add the function to the abilist (e.g., llvm_mode/dfsan_rt/dfsan/done_abilist.txt) and add a custom DFSan wrapper (in llvm_mode/dfsan_rt/dfsan/dfsan_custom.cc). See the Angora documentation for more info.
  • Q: I get an compiler error when building the track binary.
  • A: ParmeSan/ Angora uses DFSan for dynamic data-flow analysis. In certain cases building target applications can be a bit tricky (especially in the case of C++ targets). Make sure to disable as much inline assembly as possible and make sure that you link the correct libraries/ llvm libc++. Some programs also do weird stuff like an indirect call to a vararg function. This is not supported by DFSan at the moment, so the easy solution is to patch out these calls, or do something like indirect call promotion.
  • Q: llvm-diff-parmesan generates too many targets!
  • A: You can do target pruning using the scripts in tools/ (in particular tools/prune.py) or use ASAP to generate a target bitcode file with fewer sanitizer targets.

Docker image

You can also get the pre-built docker image of ParmeSan.

docker pull vusec/parmesan
docker run --rm -it vusec/parmesan
# In the container you can build objdump
/parmesan/misc/build_objdump.sh
Owner
VUSec
VUSec
A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

FedML-AI 175 Dec 01, 2022
Quantized models with python

quantized-network download .pth files to qmodels/: googlenet : https://download.

adreamxcj 2 Dec 28, 2021
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

Yongming Rao 90 Dec 31, 2022
Official pytorch implementation for Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion (CVPR 2022)

Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion This repository contains a pytorch implementation of "Learning to Listen: Modeling

50 Dec 17, 2022
Deeper DCGAN with AE stabilization

AEGeAN Deeper DCGAN with AE stabilization Parallel training of generative adversarial network as an autoencoder with dedicated losses for each stage.

Tyler Kvochick 36 Feb 17, 2022
DexterRedTool - Dexter's Red Team Tool that creates cronjob/task scheduler to consistently creates users

DexterRedTool Author: Dexter Delandro CSEC 473 - Spring 2022 This tool persisten

2 Feb 16, 2022
"Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion"(WWW 2021)

STAR_KGC This repo contains the source code of the paper accepted by WWW'2021. "Structure-Augmented Text Representation Learning for Efficient Knowled

Bo Wang 60 Dec 26, 2022
A hobby project which includes a hand-gesture based virtual piano using a mobile phone camera and OpenCV library functions

Overview This is a hobby project which includes a hand-gesture controlled virtual piano using an android phone camera and some OpenCV library. My moti

Abhinav Gupta 1 Nov 19, 2021
FaRL for Facial Representation Learning

FaRL for Facial Representation Learning This repo hosts official implementation of our paper General Facial Representation Learning in a Visual-Lingui

Microsoft 19 Jan 05, 2022
This repo provides function call to track multi-objects in videos

Custom Object Tracking Introduction This repo provides function call to track multi-objects in videos with a given trained object detection model and

Jeff Lo 51 Nov 22, 2022
A Pytorch implement of paper "Anomaly detection in dynamic graphs via transformer" (TADDY).

TADDY: Anomaly detection in dynamic graphs via transformer This repo covers an reference implementation for the paper "Anomaly detection in dynamic gr

Yue Tan 21 Nov 24, 2022
[ICCV'21] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery

PlaneTR: Structure-Guided Transformers for 3D Plane Recovery This is the official implementation of our ICCV 2021 paper News There maybe some bugs in

73 Nov 30, 2022
Lightweight Python library for adding real-time object tracking to any detector.

Norfair is a customizable lightweight Python library for real-time 2D object tracking. Using Norfair, you can add tracking capabilities to any detecto

Tryolabs 1.7k Jan 05, 2023
[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore

[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6101 of Semester 1, AY2021-2022, starting from 08/2021. The instructors of

AccSrd 1 Sep 22, 2022
'Solving the sampling problem of the Sycamore quantum supremacy circuits

solve_sycamore This repo contains data, contraction code, and contraction order for the paper ''Solving the sampling problem of the Sycamore quantum s

Feng Pan 29 Nov 28, 2022
Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

This dataset is a large-scale dataset for moving object detection and tracking in satellite videos, which consists of 40 satellite videos captured by Jilin-1 satellite platforms.

Qingyong 87 Dec 22, 2022
Towards Debiasing NLU Models from Unknown Biases

Towards Debiasing NLU Models from Unknown Biases Abstract: NLU models often exploit biased features to achieve high dataset-specific performance witho

Ubiquitous Knowledge Processing Lab 22 Jun 14, 2022
Repository aimed at compiling code, papers, demos etc.. related to my PhD on 3D vision and machine learning for fruit detection and shape estimation at the university of Lincoln

PhD_3DPerception Repository aimed at compiling code, papers, demos etc.. related to my PhD on 3D vision and machine learning for fruit detection and s

lelouedec 2 Oct 06, 2022
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Pyserini Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. Retrieval using sparse re

Castorini 706 Dec 29, 2022
PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020).

Scaffold-Federated-Learning PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020). Environment numpy=

KI 30 Dec 29, 2022