Analysis of voices based on the Mel-frequency band

Overview

Speaker_partition_module

Analysis of voices based on the Mel-frequency band.
Goal: Identification of voices speaking (diarization) and calculation of speech partition (in %).

Methodology:

  • Collect voice data
  • Sample audio data of x speakers that talk y times to represent a round of people talking
  • Annotate samples with labels and merge audio file
  • Create train & test split of samples
  • Train unsupervised clustering module to detect number of people
  • Train supervised RNN classifier to determine who is speaking at time x

Preprocessing

  • Convert files to .wav convertFlac2Wav.py
  • Collect data via LibriSpeech voices library (audiofiles) audio_manipulation02.py
  • Extract x random speakers with y audio samples per speaker Result: Generated audio samples of length 30-60 seconds

Feature extraction:

  • Create mel-frequency spectrum for each audio file feature_extraction.py
  • Define overlapping feature window for training

Training:

  • Implementation of google-diarizer module
  • Training accuracy is only at 40 %

Further activity

  • Create own unsupervised clustering module
  • Try out different libraries
๐™ฐ ๐™ผ๐šž๐šœ๐š’๐šŒ ๐™ฑ๐š˜๐š ๐™ฒ๐š›๐šŽ๐šŠ๐š๐šŽ๐š ๐™ฑ๐šข ๐šƒ๐šŽ๐šŠ๐š–๐™ณ๐š•๐š ๐Ÿ’–

TeamDltmusic ๐™ฐ ๐™ผ๐šž๐šœ๐š’๐šŒ ๐™ฑ๐š˜๐š ๐™ฒ๐š›๐šŽ๐šŠ๐š๐šŽ๐š ๐™ฑ๐šข ๐šƒ๐šŽ๐šŠ๐š–๐™ณ๐š•๐š ๐Ÿ’– Deploy String Session String Click hear you can find string session OR join He

TeamDlt 5 Jan 18, 2022
๐ŸŽต A music bot for discord servers!

music bot A music bot for Discord Servers Features Play songs in your discord server Get the lyrics without going on a web explorer Commands Command P

1 Jul 25, 2022
Nayeli: cool telegram groups vc music project

Nayeli-music Nayeli ๐Ÿฅ€ is cool telegram ๐ŸŽ groups vc music project ๐ŸŽ‹ . Nayeli-music Nayeli Deployment ๐ŸŽ‹ ๐Ÿ“ฒ Esy deploy ๐Ÿพ๏ธ Source Owner โ™ฅ๏ธ โ„๏ธ He is s

Kasun bandara 2 Dec 20, 2021
The official repository for Audio ALBERT

AALBERT Here is also the official repository of AALBERT, which is Pytorch lightning reimplementation of the paper, Audio ALBERT: A Lite Bert for Self-

pohan 55 Dec 11, 2022
This is a short program that takes the input from your microphone and uses OpenGL to draw a live colourful pattern

Visual-Music This is a short program that takes the input from your microphone and uses OpenGL to draw a live colourful pattern Installation and Setup

Tom Jebbo 1 Dec 26, 2021
Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the ๅคšๅช’ไฝ“ไผช้€ ๅ–่ฏๅคง่ต›. Our solution achieve the 1st in the Audio Deepfake Detection

BokingChen 9 Jun 04, 2022
Anaphones are like anagrams, but for sounds.

Anaphones Anaphones are like anagrams but for sounds (phonemes). Examples include: salami-awesomely, atari-tiara, and beefy-phoebe. Anaphones can be a

James Murphy 18 Nov 02, 2022
A music player designed for a University Project.

A music player designed for a University Project. Very flexibe and easy to use, a real life working application with user friendly controls. Hope u enjoy!!

Aditya Johorey 1 Nov 19, 2021
Music generation using ml / dl

Data analysis Document here the project: deep_music Description: Project Description Data Source: Type of analysis: Please document the project the be

0 Jul 03, 2022
Delta TTA(Text To Audio) SoftWare

Text-To-Audio-Windows Delta TTA(Text To Audio) SoftWare Info You Can Use It For Convert Your Text To Audio File You Just Write Your Text And Your End

Delta Inc. 2 Dec 14, 2021
Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

Mina - A Telegram Music Bot 5 mandatory Assistant written in Python using Pyrogram and Py-Tgcalls

3 Feb 07, 2022
A Music Player Bot for Discord Servers

A Music Player Bot for Discord Servers

Halil Acar 2 Oct 25, 2021
Audio pitch-shifting & re-sampling utility, based on the EMU SP-1200

Pitcher.py Free & OS emulation of the SP-12 & SP-1200 signal chain (now with GUI) Pitch shift / bitcrush / resample audio files Written and tested in

morgan 13 Oct 03, 2022
A Python library and tools AUCTUS A6 based radios.

A Python library and tools AUCTUS A6 based radios.

Jonathan Hart 6 Nov 23, 2022
SomaFM Plugin for Kodi

SomaFM XBMC Plugin This description is a bit outdated. You can simply install this addon by browsing the official repositories from within Kodi. Insta

7 Jan 21, 2022
TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music

TONet Introduction The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music", in ICASSP 2022 We

Knut(Ke) Chen 29 Dec 01, 2022
An audio guide for destroying oracles in Destiny's Vault of Glass raid

prophet An audio guide for destroying oracles in Destiny's Vault of Glass raid. This project allows you to make any encounter with oracles without hav

24 Sep 15, 2022
Voice helper on russian

Voice helper on russian

KreO 1 Jun 30, 2022
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Audiomentations A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio a

Iver Jordal 1.2k Jan 07, 2023
Royal Music You can play music and video at a time in vc

Royals-Music Royal Music You can play music and video at a time in vc Commands SOON String STRING_SESSION Deployment ๐ŸŽ– Credits โ€ข ๐Ÿ‡ธแดแดสแด€โƒ๐Ÿ‡ฏแด‡แด‡แด› โ€ข ๐Ÿ‡ดา“า“ษช

2 Nov 23, 2021