Code for CSIG audio deepfake detection


FMFCC Audio Deepfake Detection Solution

This repo provides a solution for the Multimedia Forgery Forensics Competition (多媒体伪造取证大赛). Our solution achieved 1st place in the Audio Deepfake Detection track. The ranking can be seen here.

Authors

Institution: Shenzhen Key Laboratory of Media Information Content Security (MICS)

Team name: Forensics_SZU

Username:

Pipeline

EfficientNet-B2 + mel features (two models trained with different mel parameters, then ensembled)

Requirements:

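  • Activate the pre-configured environment [recommended]
conda activate audio
  • Or run the following to set up the environment again
pip install -r requirements.txt
  • Some packages may be missing after installation; install any that remain with pip install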
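
Because some packages may still be missing after installation, it can help to list them before running inference. The sketch below is not part of the repository: `requirement_name`, `normalize`, `missing_requirements` and `installed_distributions` are hypothetical helper names, and it assumes requirements.txt contains plain `name==version` style lines (option lines such as `-r other.txt` are skipped).

```python
import re
from importlib import metadata


def requirement_name(line):
    """Return the package name from one requirements.txt line, or None."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if not line or line.startswith("-"):  # skip blanks and pip options
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def normalize(name):
    """Normalize a package name so that 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def missing_requirements(lines, installed):
    """List required package names that are absent from `installed`."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if name and normalize(name) not in installed:
            missing.append(name)
    return missing


def installed_distributions():
    """Collect normalized names of all distributions in this environment."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            names.add(normalize(name))
    return names
```

To use it, read the lines of requirements.txt, pass them to `missing_requirements` together with `installed_distributions()`, and install each name it returns with pip.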

Submission

  • cd to the code location: /home/audio5/py_project/CSIG_audio
cd /home/audio5/py_project/CSIG_audio
  • Next, change the root_path on line 98 of inference.py to the path of your own audio test set

1. Predict with the first model

  • Then run
python inference.py
  • When the run finishes, the results.json file can be found in "./output/results/efficientnet-b2_96_k2e6_hop48_t"

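2. Predict with the second model

  • 2.1 Change the json_root path: comment out line 105 of inference.py and uncomment line 106, so that the second model's predictions are saved to a new path. The modified code is as follows
105        # json_root = f'./output/results/{model_name}_96_k2e6_hop48_t'
106        json_root = f'./output/results/{model_name}_96_k2e9_t'
  • 2.2 Change the model_path (i.e. switch the model weights): comment out line 114 of inference.py and uncomment line 115, so that the weights path is updated. The modified code is as follows
114        # model_path = './output/weights/efficientnet-b2_96_hop48_k2/audio6_acc0.9832.pth'
115        model_path = './output/weights/efficientnet-b2_96_k2/audio9_acc0.9752.pth'
  • 2.3 Change the mel feature configuration: on line 35 of dataset/preprocess/vggish_params.py, change "EXAMPLE_HOP_SECONDS = 0.48" to "EXAMPLE_HOP_SECONDS = 0.96". Pay special attention to EXAMPLE_HOP_SECONDS: after validation it must be changed back to 0.48 so that the final-round run works as expected
  • After making the three changes above, run the command below. The first two changes (2.1 and 2.2) also need to be reverted afterwards.
python inference.py
  • When the run finishes, the results.json file can be found in "./output/results/efficientnet-b2_96_k2e9_t"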
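
To see why EXAMPLE_HOP_SECONDS matters, the sketch below counts how many fixed-length examples a clip is split into for each hop value. It is an illustration, not the repository's code: `count_examples` is a hypothetical helper, and the 0.96 s example window and 10 ms STFT hop are assumed from the upstream VGGish defaults, which this repository's vggish_params.py may or may not keep.

```python
def count_examples(num_frames, hop_seconds, window_seconds=0.96, stft_hop_seconds=0.010):
    """Count the examples cut from a log-mel spectrogram with `num_frames` frames.

    Assumed defaults: 0.96 s example window and 10 ms STFT hop (upstream VGGish).
    """
    frames_per_second = 1.0 / stft_hop_seconds
    window = int(round(window_seconds * frames_per_second))  # frames per example
    hop = int(round(hop_seconds * frames_per_second))        # frames between example starts
    if num_frames < window:
        return 0
    return 1 + (num_frames - window) // hop


# A clip with 480 spectrogram frames (about 4.8 s under the assumptions above)
overlapping = count_examples(480, hop_seconds=0.48)      # 50% overlap between examples
non_overlapping = count_examples(480, hop_seconds=0.96)  # back-to-back examples
```

Under these assumptions the 0.48 setting produces overlapping examples and roughly twice as many of them per clip, which is why leaving the parameter at the wrong value changes the input seen by each model.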

Ensemble

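  • Ensemble the two JSON files just generated: on line 88 of inference.py, change "json_ensemble = False" to "json_ensemble = True"
  • After the change, run:
python inference.py
  • When the run finishes, the results.json file can be found in "./output/results/ensemble_01_t"; this is the final results JSON file that we submitted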
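
The README does not say how inference.py combines the two result files. As a rough illustration only, the sketch below averages two score files. It assumes that each results.json maps an audio file name to a single numeric score and that the ensemble is a plain mean; both are assumptions, and `load_scores` and `average_scores` are hypothetical names.

```python
import json
from pathlib import Path


def load_scores(path):
    """Read a results.json file, assumed to map file name -> score."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def average_scores(first, second):
    """Average two score mappings that cover the same audio files."""
    if first.keys() != second.keys():
        raise ValueError("both result files must cover the same audio files")
    return {name: (first[name] + second[name]) / 2.0 for name in first}
```

With the two output folders from the steps above, the inputs would be the results.json files under efficientnet-b2_96_k2e6_hop48_t and efficientnet-b2_96_k2e9_t.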

Note: after validation, revert all of the changes above to their original state, ready for the final round.
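
Editing and restoring lines by hand is easy to get wrong. One possible alternative, sketched here as a hypothetical refactor rather than anything inference.py currently does, is to keep the settings for both models in one table and select a variant by name. The paths and hop values are copied from the steps above; `ModelVariant`, `VARIANTS`, `select_variant` and the variant keys `hop48` and `hop96` are made-up names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    json_root: str              # folder where results.json is written
    model_path: str             # weights file to load
    example_hop_seconds: float  # value for EXAMPLE_HOP_SECONDS


MODEL_NAME = "efficientnet-b2"

# Values taken from the README steps; the keys are hypothetical labels.
VARIANTS = {
    "hop48": ModelVariant(
        json_root=f"./output/results/{MODEL_NAME}_96_k2e6_hop48_t",
        model_path="./output/weights/efficientnet-b2_96_hop48_k2/audio6_acc0.9832.pth",
        example_hop_seconds=0.48,
    ),
    "hop96": ModelVariant(
        json_root=f"./output/results/{MODEL_NAME}_96_k2e9_t",
        model_path="./output/weights/efficientnet-b2_96_k2/audio9_acc0.9752.pth",
        example_hop_seconds=0.96,
    ),
}


def select_variant(name):
    """Return the settings for one model variant, or raise on an unknown name."""
    try:
        return VARIANTS[name]
    except KeyError:
        raise ValueError(
            f"unknown variant {name!r}; expected one of {sorted(VARIANTS)}"
        ) from None
```

With a table like this, the default state never changes, so nothing needs to be restored after validation.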

Owner
BokingChen