Debut as a VTuber on Zoom and Google Meet

Overview

EasyVtuber

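  • Character face generation using facial landmarks and a GAN
  • Chat as your own webtoon or cartoon character on Google Meet, Zoom, and more!
  • It still works well even if you add some accessories to the character!
  • Unfortunately, it may not run well in real time on GPUs below the RTX 2070.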



Demo



Requirements

  • Python >= 3.8
  • PyTorch >= 1.7
  • pyvirtualcam
  • mediapipe
  • opencv-python
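
Before running the project, it can help to confirm which of the packages above are actually installed. The following is a minimal sketch using only the standard library; the distribution names in `REQUIRED` are assumptions based on the usual PyPI names (for example, PyTorch is published as `torch`), and the helper `installed_versions` is hypothetical, not part of EasyVtuber.

```python
from importlib import metadata

# Assumed PyPI distribution names for the requirements listed above.
REQUIRED = ["torch", "pyvirtualcam", "mediapipe", "opencv-python"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

This only reports what is installed; it does not compare against the minimum versions, which you would still check by eye.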



Quick Start

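  • Note: OBS must be installed before you use this project.
  • Please follow the installation order below!
  1. Install OBS Studio

    • OBS Studio must be installed first in order to use the OBS virtual camera.
  2. pip install -r requirements.txt

    • The pyvirtualcam package listed in requirements.txt installs and works correctly only when the OBS virtual camera is already installed.
  3. Download the pretrained models

    • Put the following files in the pretrained folder: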
      • combiner.pt
      • eyebrow_decomposer.pt
      • eyebrow_morphing_combiner.pt
      • face_morpher.pt
      • two_algo_face_rotator.pt
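  4. Put your character image in the character folder

    • The character image file must meet the following conditions:
      • It includes an alpha channel (PNG format)
      • It shows a single humanoid character
      • The character faces forward
      • The character's head fits within 128 x 128 pixels (the image is resized to 256 x 256 by default, so the head must fit within 128 x 128 at that size)

    The example image is referenced from TalkingHeadAnime2.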

  5. python main.py --output_webcam

    • To see how the facial features are actually detected, add the --debug option when running.
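
The alpha-channel condition from step 4 can be checked without opening an image editor. The sketch below reads the PNG header directly with the standard library; the function names `read_png_header` and `has_alpha_channel` are hypothetical helpers, not part of EasyVtuber, and the check assumes a regular PNG with a full alpha channel (it does not detect palette images that use a tRNS transparency chunk).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color types with a full alpha channel: 4 = grayscale+alpha, 6 = RGBA.
ALPHA_COLOR_TYPES = {4, 6}


def read_png_header(data: bytes):
    """Parse (width, height, color_type) from the IHDR chunk of PNG bytes."""
    if len(data) < 33 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG file")
    # IHDR payload starts at byte 16: width, height, bit depth, color type.
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, color_type


def has_alpha_channel(data: bytes) -> bool:
    """Return True if the PNG stores an alpha channel for every pixel."""
    _, _, color_type = read_png_header(data)
    return color_type in ALPHA_COLOR_TYPES


# Example usage (the file name is a placeholder):
# with open("character/my_character.png", "rb") as f:
#     print(has_alpha_channel(f.read()))
```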



How to make Custom Character

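  1. Find the character you want on Naver, Google, or another search engine!
    • Try to meet the four conditions above as closely as possible! google search

  2. Crop the image to a 1:1 aspect ratio so that the character's face is centered!
  3. Remove the background so that the image has an alpha channel!
  4. Done!
    • Put the image in the character folder and run python main.py --output_webcam --character (character file name without the .png extension)!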
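
The cropping step and the head-size condition come down to a little arithmetic, sketched below. Both functions are hypothetical helpers, not part of EasyVtuber: `square_crop_box` picks the largest square that is centered on the face as far as the image bounds allow, and `head_fits` checks whether a head measured inside that square stays within 128 x 128 once the square is resized to 256 x 256. The face center and head size are assumed to be measured by hand in pixels.

```python
def square_crop_box(width, height, center_x, center_y):
    """Return (left, top, right, bottom) of the largest square crop,
    centered on (center_x, center_y) as far as the image bounds allow."""
    side = min(width, height)
    left = min(max(center_x - side // 2, 0), width - side)
    top = min(max(center_y - side // 2, 0), height - side)
    return left, top, left + side, top + side


def head_fits(head_w, head_h, square_side, target=256, limit=128):
    """Check that the head stays within limit x limit pixels after the
    square crop is resized to target x target pixels."""
    scale = target / square_side
    return head_w * scale <= limit and head_h * scale <= limit


# A 1920 x 1080 image with the face at the image center:
print(square_crop_box(1920, 1080, 960, 540))  # (420, 0, 1500, 1080)
print(head_fits(500, 500, 1080))              # True
```

If the face sits near an edge, the box is clamped to the image, so the face will end up off-center; in that case a smaller square may be the better choice.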



Folder Structure

      │
      ├── character/ - character images 
      ├── pretrained/ - save pretrained models 
      ├── tha2/ - Talking Head Anime2 Library source files 
      ├── facial_points.py - facial feature point constants
      ├── main.py - main script to execute
      ├── models.py - GAN models defined
      ├── pose.py - process facial landmark to pose vector
      └── utils.py - util functions for pre/postprocessing image
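
Given this layout, a quick way to catch a missing download is to check the pretrained folder for the five model files named in Quick Start. This is a minimal sketch; `missing_pretrained` is a hypothetical helper, and it assumes it is run with the repository root as `root`.

```python
from pathlib import Path

# File names as listed in the Quick Start section.
PRETRAINED_FILES = [
    "combiner.pt",
    "eyebrow_decomposer.pt",
    "eyebrow_morphing_combiner.pt",
    "face_morpher.pt",
    "two_algo_face_rotator.pt",
]


def missing_pretrained(root="."):
    """Return the expected model files that are absent from root/pretrained."""
    folder = Path(root) / "pretrained"
    return [name for name in PRETRAINED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained()
    print("All pretrained files found." if not missing else f"Missing: {missing}")
```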



Usage

Output to the webcam

  • python main.py --output_webcam

Select a character

  • python main.py --character (name of the character file in the character folder, without .png)

Check the facial features

  • python main.py --debug

Inference on a video file

  • python main.py --input path_to_video_file --output_dir directory_to_save_frames
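
The flags above can be combined on one command line. As a rough sketch, the hypothetical helper below assembles such a command from Python; it uses only the flags listed in this README, assumes `python` is on the PATH, and makes no claim about how main.py handles the flags internally.

```python
def build_command(character=None, output_webcam=False, debug=False,
                  input_path=None, output_dir=None):
    """Assemble a main.py command line from the flags listed in Usage."""
    cmd = ["python", "main.py"]
    if output_webcam:
        cmd.append("--output_webcam")
    if character:
        cmd += ["--character", character]
    if debug:
        cmd.append("--debug")
    if input_path:
        cmd += ["--input", input_path]
    if output_dir:
        cmd += ["--output_dir", output_dir]
    return cmd


# "my_character" is a placeholder for a file character/my_character.png.
print(" ".join(build_command(character="my_character", output_webcam=True)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.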



TODOs

  • Add eyebrow feature
  • Parameter Controller GUI
  • Automation of Making Drivable Character



Thanks to



Acknowledgements

  • EasyVtuber is built on TalkingHeadAnime2.
  • Before using the source in the tha2 folder and the pretrained model files, please check the license of the original author's repo.
Owner
Gunwoo Han
EE student / Deep Learning / Computer Vision / Manager of 파이썬 처음처럼