A Python wrapper for the high-quality vocoder "World"

Overview

PyWORLD - A Python wrapper of WORLD Vocoder

Linux Windows
Build Status Build Status

WORLD Vocoder is a fast and high-quality vocoder which parameterizes speech into three components:

  1. f0: Pitch contour
  2. sp: Harmonic spectral envelope
  3. ap: Aperiodic spectral envelope (relative to the harmonic spectral envelope)

It can also (re)synthesize speech using these features (see examples below).

For more information, please visit Dr. Morise's WORLD repository and the official website of WORLD Vocoder

APIs

Vocoder Functions

import pyworld as pw
_f0, t = pw.dio(x, fs)    # raw pitch extractor
f0 = pw.stonemask(x, _f0, t, fs)  # pitch refinement
sp = pw.cheaptrick(x, f0, t, fs)  # extract smoothed spectrogram
ap = pw.d4c(x, f0, t, fs)         # extract aperiodicity

y = pw.synthesize(f0, sp, ap, fs) # synthesize an utterance using the parameters

Utility

# Convert speech into features (using default arguments)
f0, sp, ap = pw.wav2world(x, fs)

You can change the default arguments of the function, too. See more info using help.

Installation

Using Pip

pip install pyworld

Building from Source

git clone https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder.git
cd Python-Wrapper-for-World-Vocoder
git submodule update --init
pip install -U pip
pip install -r requirements.txt
pip install .

It will automatically git clone Morise's World Vocoder (C++ version).
(It seems to me that using virtualenv or conda is the best practice.)

Installation Validation

You can validate installation by running

cd demo
python demo.py

to see if you get results in test/ direcotry. (Please avoid writing and executing codes in the Python-Wrapper-for-World-Vocoder folder for now.)

Environment/Dependencies

  • Operating systems
    • Linux Ubuntu 14.04+
    • Windows (thanks to wuaalb)
    • WSL
  • Python
    • 2.7 (Windows is currently not supported)
    • 3.7/3.6/3.5

You can install dependencies these by pip install -r requirements.txt

Notice

  • WORLD vocoder is designed for speech sampled ≥ 16 kHz. Applying WORLD to 8 kHz speech will fail. See a possible workaround here.
  • When the SNR is low, extracting pitch using harvest instead of dio is a better option.

Troubleshooting

  1. Upgrade your Cython version to 0.24.
    (I failed to build it on Cython 0.20.1post0)
    It'll require you to download Cython form http://cython.org/
    Unzip it, and python setup.py install it.
    (I tried pip install Cython but the upgrade didn't seem correct)
    (Again, add --user if you don't have root access.)
  2. Upon executing demo/demo.py, the following code might be needed in some environments (e.g. when you're working on a remote Linux server):
import matplotlib
matplotlib.use('Agg')
  1. If you encounter library not found: sndfile error upon executing demo.py,
    you might have to install it by apt-get install libsoundfile1.
    You can also replace pysoundfile with scipy or librosa, but some modification is needed:

    • librosa:
      • load(fiilename, dtype=np.float64)
      • output.write_wav(filename, wav, fs)
      • remember to pass dtype argument to ensure that the method gives you a double.
    • scipy:
      • You'll have to write a customized utility function based on the following methods
      • scipy.io.wavfile.read (but this gives you short)
      • scipy.io.wavfile.write
  2. If you have installation issue on Windows, I probably could not provide much help because my development environment is Ubuntu and Windows Subsystem for Linux (read this if you are interested in installing it).

Other Installation Suggestions

  1. Use pip install . is safer and you can easily uninstall pyworld by pip uninstall pyworld
  • For Mac users: You might need to do MACOSX_DEPLOYMENT_TARGET=10.9 pip install . See issue.
  1. Another way to install pyworld is via
    python setup.py install
    • Add --user if you don't have root access
    • Add --record install.txt to track the installation dir
  2. If you just want to try out some experiments, execute
    python setup.py build_ext --inplace
    Then you can use PyWorld from this directory.
    You can also copy the resulting pyworld.so (pyworld.{arch}.pyd on Windows) file to ~/.local/lib/python2.7/site-packages (or corresponding Windows directory) so that you can use it everywhere like an installed package.
    Alternatively you can copy/symlink the compiled files using pip, e.g. pip install -e .

Acknowledgement

Thank all contributors (tats-u, wuaalb, r9y9, rikrd, kudan2510) for making this repo better and sotelo whose world.py inspired this repo.

Owner
Jeremy Hsu
A PhD student drowning in the ocean of generative models.
Jeremy Hsu
Spotify Song Recommendation Program

Spotify-Song-Recommendation-Program Made by Esra Nur Özüm Written in Python The aim of this project was to build a recommendation system that recommen

esra nur özüm 1 Jun 30, 2022
Code for csig audio deepfake detection

FMFCC Audio Deepfake Detection Solution This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection

BokingChen 9 Jun 04, 2022
Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Y-Net Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021 Project page: ipcv.github.io

Juan F. Montesinos 12 Oct 22, 2022
Jarvis From Basic to Advance - make a voice assistant similar to JARVIS (in iron man movie)

JARVIS (Basic to Advance) This was my attempt to make a voice assistant similar to JARVIS (in iron man movie) Let's be honest, it's not as intelligent

codesempai 17 Dec 25, 2022
Nayeli: cool telegram groups vc music project

Nayeli-music Nayeli 🥀 is cool telegram 🍎 groups vc music project 🎋 . Nayeli-music Nayeli Deployment 🎋 📲 Esy deploy 🐾️ Source Owner ♥️ ❄️ He is s

Kasun bandara 2 Dec 20, 2021
Guide & Examples to create deeplearning gstreamer plugins and use them in your pipeline

upai-gst-dl-plugins Guide & Examples to create deeplearning gstreamer plugins and use them in your pipeline Introduction Thanks to the work done by @j

UPAI.IO 11 Dec 11, 2022
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats

SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Note Neither this, or PyTgCalls are fully

SU Projects 58 Jan 02, 2023
Hide Your Secret Message in any Wave Audio File.

HiddenWave Embedding secret messages in wave audio file What is HiddenWave Hiddenwave is a python based program for simple audio steganography. You ca

TechChip 99 Dec 28, 2022
Code for paper 'Audio-Driven Emotional Video Portraits'.

Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G

197 Dec 31, 2022
Noinoi music is smoothly playing music on voice chat of telegram.

NOINOI MUSIC BOT ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Resume, Stop feature Music & Video do

2 Feb 13, 2022
Music player - endlessly plays your music

Music player First, if you wonder about what is supposed to be a music player or what makes a music player different from a simple media player, read

Albert Zeyer 482 Dec 19, 2022
A library for augmenting annotated audio data

muda A library for Musical Data Augmentation. muda package implements annotation-aware musical data augmentation, as described in the muda paper. The

Brian McFee 214 Nov 22, 2022
This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks

This is a realtime voice translator program which gets input from user at any language and converts it to the desired language that the user asks ...

Mohan Ram S 1 Dec 30, 2021
Conferencing Speech Challenge

ConferencingSpeech 2021 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech challenge. For more detai

73 Nov 29, 2022
In this project we can see how we can generate automatic music using character RNN.

Automatic Music Genaration Table of Contents Project Description Approach towards the problem Limitations Libraries Used Summary Applications Referenc

Pronay Ghosh 2 May 27, 2022
Audio library for modelling loudness

Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc

Dominic Ward 33 Oct 02, 2022
XA Music Player - Telegram Music Bot

XA Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) MongoDB (3.12.1) 2nd Telegram Ac

RexAshh 3 Jun 30, 2022
Open-Source bot to play songs in your Telegram's Group Voice Chat. Powered by @Akki_ThePro

VcPlayer Telegram Voice-Chat Bot [PyTGCalls] ⇝ Requirements ⇜ Account requirements A Telegram account to use as the music bot, You cannot use regular

Akki ThePro 2 Dec 25, 2021
cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python

audioread Decode audio files using whichever backend is available. The library currently supports: Gstreamer via PyGObject. Core Audio on Mac OS X via

beetbox 419 Dec 26, 2022
Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.

LPC_for_TTS Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm. 基于Levinson-Durbin

Zewang ZHANG 58 Nov 17, 2022