A repository for generating stylized talking 3D faces and 2D videos

Overview

style_avatar

A repository for generating stylized talking 3D faces and 2D videos. This is the repository for the paper "Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis" (MM 2021). The demo video can be viewed at https://hcsi.cs.tsinghua.edu.cn/demo/MM21-HAOZHEWU.mp4.

[Figure: framework]


Quick start

Installation

conda create -n python36 python=3.6 
conda activate python36
  • Install the necessary packages with pip install -r requirements.txt.
  • Download the pretrained deepspeech model from the Link, then unzip it into the ./deepspeech folder.
  • Follow the setup instructions of Deep 3D Face Reconstruction:
    • Download the Basel Face Model. Due to its license agreement, you have to submit an application on its home page before downloading the BFM09 model. Once you have access to the BFM data, download "01_MorphableModel.mat" and put it into the ./deep_3drecon/BFM subfolder.
    • Download the Expression Basis provided by Guo et al. You can find a link named "CoarseData" in the first row of the Introduction part of their repository. Download and unzip Coarse_Dataset.zip, then put "Exp_Pca.bin" into the ./deep_3drecon/BFM subfolder. The expression basis is constructed using FaceWarehouse data and transferred to the BFM topology.
    • Download the pre-trained reconstruction network, unzip it, and put "FaceReconModel.pb" into the ./deep_3drecon/network subfolder.
    • Run git lfs checkout ./deep_3drecon/BFM/BFM_model_front.mat
  • Download the pretrained audio2motion model and put it into ./audio2motion/model.
  • Download the pretrained texture encoder and renderer and put them into ./render/model.
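
Before running the demo, it can help to confirm that every downloaded file landed in the folder named in the steps above. The following is a minimal sketch of such a check; the helper find_missing_assets is hypothetical and not part of this repository, and only the paths themselves come from the installation steps.

```python
from pathlib import Path

# Paths are taken from the installation steps; the check itself is illustrative.
REQUIRED_PATHS = [
    "deepspeech",
    "deep_3drecon/BFM/01_MorphableModel.mat",
    "deep_3drecon/BFM/Exp_Pca.bin",
    "deep_3drecon/BFM/BFM_model_front.mat",
    "deep_3drecon/network/FaceReconModel.pb",
    "audio2motion/model",
    "render/model",
]


def find_missing_assets(repo_root="."):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in REQUIRED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing_assets(".")
    if missing:
        print("Missing files or folders:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected assets are in place.")
```

Run it from the repository root. It only checks that the paths exist, not that the files are valid or complete.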

Run

To run our demo, you need at least one GPU with 11 GB of memory.

python demo.py --in_img [*.png] --in_audio [*.wav] --output_path [path]

We provide 10 example talking styles in style.npy. You can also calculate your own style codes with the following code, where exp is the 3DMM expression series and pose is the pose matrix reconstructed by Deep 3D Face Reconstruction. We usually calculate style codes from videos of 5-20 seconds.

import pickle as pkl

import numpy as np

def get_style_code(exp, pose):
  # Dataset-level statistics used to normalize the per-clip statistics.
  with open("./data/ted_hd/exp_mean_std.pkl", 'rb') as f:
    exp_mean_std = pkl.load(f)
  exp_std_mean = exp_mean_std['s_m']
  exp_std_std = exp_mean_std['s_s']
  exp_diff_std_mean = exp_mean_std['d_s_m']
  exp_diff_std_std = exp_mean_std['d_s_s']

  with open("./data/ted_hd/pose_mean_std.pkl", 'rb') as f:
    pose_mean_std = pkl.load(f)
  pose_diff_std_mean = pose_mean_std['d_s_m']
  pose_diff_std_std = pose_mean_std['d_s_s']

  # Normalized std of the expression and of its frame-to-frame difference.
  diff_exp = exp[:-1, :] - exp[1:, :]
  exp_std = (np.std(exp, axis = 0) - exp_std_mean) / exp_std_std
  diff_exp_std = (np.std(diff_exp, axis = 0) - exp_diff_std_mean) / exp_diff_std_std

  # Normalized std of the frame-to-frame pose difference.
  diff_pose = pose[:-1, :] - pose[1:, :]
  diff_pose_std = (np.std(diff_pose, axis = 0) - pose_diff_std_mean) / pose_diff_std_std

  # The style code concatenates the three normalized statistics.
  return np.concatenate((exp_std, diff_exp_std, diff_pose_std))

Note that the pose of each talking face is static in the current demo. You can control the pose by modifying coeff_array at line 93 of demo.py. coeff_array has shape $N \times 257$, where $N$ is the number of frames. Each $257$-dimensional vector follows the definition used by Deep 3D Face Reconstruction: dimensions $254$-$257$ control the translation, and dimensions $224$-$227$ control the Euler angles of the pose.
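
As an illustration of editing the pose, the sketch below adds a slow sinusoidal offset to one Euler angle of a coefficient array. It is not code from demo.py: the function add_head_sway and its parameters are hypothetical, the index range is read as the half-open Python slice 224:227, and which of the three angles corresponds to yaw, as well as the angle unit, are assumptions to verify against the renderer.

```python
import numpy as np

# Euler angles are assumed to occupy columns 224, 225 and 226 (slice 224:227).
ANGLE_START = 224


def add_head_sway(coeff_array, amplitude=0.1, period=50, axis=1):
    """Return a copy of coeff_array with a sinusoidal offset on one Euler angle.

    amplitude is in the unit of the stored angles (assumed radians),
    period is in frames, and axis (0, 1 or 2) selects the angle to modulate.
    """
    coeff = np.array(coeff_array, dtype=np.float32, copy=True)
    if coeff.ndim != 2 or coeff.shape[1] != 257:
        raise ValueError("expected an array of shape (N, 257)")
    if axis not in (0, 1, 2):
        raise ValueError("axis must be 0, 1 or 2")
    frames = np.arange(coeff.shape[0])
    coeff[:, ANGLE_START + axis] += amplitude * np.sin(2 * np.pi * frames / period)
    return coeff
```

The function leaves the input untouched and changes a single column, so the expression, identity, and translation coefficients stay as they were.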


Project Overview

Our project organizes the files as follows:

├── README.md
├── data_process
├── deepspeech
├── face_alignment
├── deep_3drecon
├── render
├── audio2motion

Data process

The data_process folder contains the processing code for several datasets.

DeepSpeech

We leverage the DeepSpeech project to extract audio-related features. Please download the pretrained deepspeech model from the Link. In deepspeech/evaluate.py, we implement the function get_prob, which returns the latent deepspeech features for an input audio path. The latent deepspeech features have 50 frames per second, so they need to be aligned to 25 fps videos in subsequent steps.
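
One simple way to bring 50 fps features down to 25 fps is to average each pair of consecutive feature frames. The sketch below shows that idea under the assumption that the features form a 2D array of shape (frames, dims). The function name is hypothetical, and the repository may align the features differently (for example by stacking or subsampling).

```python
import numpy as np


def align_features_to_video(features, feature_fps=50, video_fps=25):
    """Average consecutive feature frames so one row matches one video frame."""
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 2:
        raise ValueError("expected features of shape (frames, dims)")
    if feature_fps % video_fps != 0:
        raise ValueError("this sketch only handles integer frame-rate ratios")
    ratio = feature_fps // video_fps
    n_video_frames = features.shape[0] // ratio  # drop a trailing partial group
    trimmed = features[: n_video_frames * ratio]
    return trimmed.reshape(n_video_frames, ratio, -1).mean(axis=1)
```

With the default rates, an input of 11 feature frames yields 5 aligned frames, and the last unpaired frame is discarded.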

Face Alignment

We modify Face Alignment for data preprocessing. Unlike the original project, we force the face alignment to detect only the largest face in each frame, which speeds up processing.
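
The idea of keeping only the largest face can be sketched as picking the bounding box with the greatest area. This is an illustration only: the function largest_face and the (x1, y1, x2, y2) box format are assumptions, not the code in the face_alignment folder.

```python
def largest_face(boxes):
    """Return the box with the largest area, or None if there are no boxes.

    Each box is a sequence (x1, y1, x2, y2) in pixel coordinates.
    """
    if len(boxes) == 0:
        return None
    return max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
```

Running landmark detection on this single box, rather than on every detected face, is what saves time in frames with several people.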

Deep 3D Face Reconstruction

We modify Deep 3D Face Reconstruction for data preprocessing. We add a batch API, a UV-texture unwrapping API, and a UV coordinate image generation API in deep_3drecon/utils.py.

Render

We implement our texture encoder and rendering model in the render folder. We also implement some other renderers, such as Neural Voice Puppetry.

Audio to Motion

We implement our stylized audio-to-facial-motion model in the audio2motion folder.


Data

Ted-HD data

We use lmdb to store the fragmented data. The data can be downloaded from link; after downloading, run cat xa* > data.mdb. You can obtain the train/test videos with the code below. We use the Ted-HD data to train the audio2motion model. We also provide the reconstructed 3D parameters and landmarks in the lmdb.

import lmdb

def test():
    lmdb_path = "./lmdb"
    env = lmdb.open(lmdb_path, map_size=1099511627776, max_dbs = 64)

    train_video = env.open_db("train_video".encode())
    train_audio = env.open_db("train_audio".encode())
    train_lm5 = env.open_db("train_lm5".encode())
    test_video = env.open_db("test_video".encode())
    test_audio = env.open_db("test_audio".encode())
    test_lm5 = env.open_db("test_lm5".encode())

    with env.begin(write = False) as txn:
        video = txn.get(str(0).encode(), db=test_video)
        audio = txn.get(str(0).encode(), db=test_audio)
        video_file = open("test.mp4", "wb")
        audio_file = open("test.wav", "wb")
        video_file.write(video)
        audio_file.write(audio)
        video_file.close()
        audio_file.close()
        print(txn.stat(db=train_video))
        print(txn.stat(db=test_video)) # we can obtain the database size here  
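
The reassembly step cat xa* > data.mdb can also be done in Python, which may be convenient on systems without a Unix shell. The sketch below is an assumption-laden equivalent: the function join_parts is hypothetical, and the output folder ./lmdb is taken from lmdb_path in the snippet above. When lmdb.open is given a directory path, it looks for a file named data.mdb inside that directory, so the joined file is written there.

```python
import glob
import os
import shutil


def join_parts(part_dir=".", pattern="xa*", out_dir="./lmdb"):
    """Concatenate split files in lexicographic order into out_dir/data.mdb."""
    parts = sorted(glob.glob(os.path.join(part_dir, pattern)))
    if not parts:
        raise FileNotFoundError("no files matching %r in %r" % (pattern, part_dir))
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, "data.mdb")
    with open(out_path, "wb") as out_file:
        for part in parts:
            # Stream each part so large files are not loaded into memory.
            with open(part, "rb") as in_file:
                shutil.copyfileobj(in_file, out_file)
    return out_path
```

If any part is missing or truncated, the joined file will open but reads may fail, so it is worth comparing the final size against the sum of the part sizes.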

We do not provide the processed dataset for training the renderer, due to the license of LRW.


Citation

@inproceedings{wu2021imitating,
  title={Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis},
  author={Wu, Haozhe and Jia, Jia and Wang, Haoyu and Dou, Yishun and Duan, Chao and Deng, Qingshan},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={1478--1486},
  year={2021}
}

Further works

  • The current renderer is still buggy: there are noisy dots in the synthesized videos. We will fix this problem.
  • We will optimize the rendering results for a particular person using video footage of only 2-3 seconds.
  • We will blend the synthesized results with backgrounds.
  • We will add controllable dynamic textures and light control.
Comments
  • tex_encode.pkl load failed

    Hi, I have loaded 'backbone.pkl' successfully, but failed to load tex_encode.pkl in the same way. The error is "./render/model/tex_encoder.pkl is a zip archive (did you mean to use torch.jit.load()?)". Was the model uploaded incorrectly? @wuhaozhe

    opened by liyuanyaun 11
  • Getting the UV mapping from UVAtlas

    @wuhaozhe Thank you very much for sharing your code! I have a question about how you generate the UV mappings. In the paper, you mention that you use UVAtlas, but as I go through your demo.py code, it seems that the UV maps are created by passing the 3DMM face model to the Google Mesh_UV renderer. To be exact: in this method of Face3D, the mesh_uv from mesh_renderer is called. Could you elaborate on how the UV maps of shape (H, W, 2) are generated?

    opened by Armen-J 9
  • unknown mat file type, version 49, 50

    When I load "BFM_model_front.mat", the error "unknown mat file type, version 49, 50" occurs. Can you share which version of scipy you installed? @wuhaozhe

    opened by liyuanyaun 3
  • question

    This is amazing work. I was wondering, how would I add eye blinking to the generated output video? Would I need to train a new talking style?

    thank you

    opened by skunkwerk 2
  • The code implementation is inconsistent with the paper

    Thank you for sharing this awesome work! In the code, you extract both speech features and energy features from the audio clips. However, in your paper, you only mention leveraging the DeepSpeech model to extract speech features. Could you give me some advice on this discrepancy?

    opened by xiao-keeplearning 2
  • Any ideas for imitating the expression, head and body movement at the same time?

    Thanks for your nice work. I have a question: If I want to imitate the expression, head and body movement at the same time (Given a source full-body image, a driving full-body video and a corresponding audio), any good ideas?

    opened by aishoot 0
  • RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    Hi! Thanks for your great work! However, I ran into this problem when running python demo.py --in_img ./example/example.png --in_audio ./example/example.wav --output_path ./output. Can you please give me some tips to overcome this issue?

    opened by muxiddin19 2
  • Reading the LMDB data issue

    I have downloaded the xa* data and followed the commands to create "data.mdb", but when I run this code:

    env = lmdb.open(lmdb_path, map_size=1099511627776, max_dbs = 64)
    train_video = env.open_db("train_video".encode())
    

    I get this error: "lmdb.PageNotFoundError: mdb_dbi_open: MDB_PAGE_NOTFOUND: Requested page not found"

    Well, it seems strange because the data.mdb is almost 17 gigabytes

    When I print the env.stat() I get this: {'psize': 4096, 'depth': 1, 'branch_pages': 0, 'leaf_pages': 1, 'overflow_pages': 0, 'entries': 35}

    opened by Armen-J 6
  • error

    Downloading: "https://www.adrianbulat.com/downloads/python-fan/2DFAN4-11f355bf06.pth.tar" to /root/.cache/torch/hub/checkpoints/2DFAN4-11f355bf06.pth.tar
    100% 91.2M/91.2M [00:05<00:00, 19.1MB/s]
    WARNING: Logging before flag parsing goes to stderr.
    W1124 05:18:17.878163 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:68: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
    
    W1124 05:18:17.879213 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:14: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
    
    W1124 05:18:17.879425 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:15: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.
    
    W1124 05:18:18.932533 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/face_decoder.py:129: The name tf.cross is deprecated. Please use tf.linalg.cross instead.
    
    W1124 05:18:18.933541 140281471424384 deprecation.py:506] From /content/style_avatar/deep_3drecon/face_decoder.py:131: calling l2_normalize (from tensorflow.python.ops.nn_impl) with dim is deprecated and will be removed in a future version.
    Instructions for updating:
    dim is deprecated, use axis instead
    W1124 05:18:19.503320 140281471424384 deprecation.py:323] From /content/style_avatar/deep_3drecon/mesh_renderer/mesh_renderer.py:165: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use tf.where in 2.0, which has the same broadcast rule as np.where
    W1124 05:18:19.969749 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:85: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.
    
    W1124 05:18:19.970036 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:86: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
    
    W1124 05:18:19.970237 140281471424384 module_wrapper.py:139] From /content/style_avatar/deep_3drecon/utils.py:86: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
    
    2021-11-24 05:18:19.983109: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299995000 Hz
    2021-11-24 05:18:19.983801: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5565daed5d40 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2021-11-24 05:18:19.983842: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
    2021-11-24 05:18:19.987771: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
    2021-11-24 05:18:19.992980: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:19.993832: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5565daed5b80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
    2021-11-24 05:18:19.993868: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla K80, Compute Capability 3.7
    2021-11-24 05:18:19.994966: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:19.995591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
    name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
    pciBusID: 0000:00:04.0
    2021-11-24 05:18:20.022505: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    2021-11-24 05:18:20.211194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2021-11-24 05:18:20.237375: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2021-11-24 05:18:20.260865: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2021-11-24 05:18:20.509169: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2021-11-24 05:18:20.528118: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2021-11-24 05:18:20.896984: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2021-11-24 05:18:20.897234: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:20.898117: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:20.898852: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
    2021-11-24 05:18:20.902326: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    2021-11-24 05:18:20.903938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-11-24 05:18:20.903995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0 
    2021-11-24 05:18:20.904026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N 
    2021-11-24 05:18:20.905518: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:20.906512: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:20.907194: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10199 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
    2021-11-24 05:18:24.483044: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.483631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
    name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
    pciBusID: 0000:00:04.0
    2021-11-24 05:18:24.483758: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    2021-11-24 05:18:24.483833: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2021-11-24 05:18:24.483898: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2021-11-24 05:18:24.483972: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2021-11-24 05:18:24.484039: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2021-11-24 05:18:24.484101: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2021-11-24 05:18:24.484165: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2021-11-24 05:18:24.484282: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.484860: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.485335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
    2021-11-24 05:18:24.486420: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.486925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
    name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
    pciBusID: 0000:00:04.0
    2021-11-24 05:18:24.487000: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    2021-11-24 05:18:24.487065: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2021-11-24 05:18:24.487127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2021-11-24 05:18:24.487192: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2021-11-24 05:18:24.487254: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2021-11-24 05:18:24.487315: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2021-11-24 05:18:24.487376: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2021-11-24 05:18:24.487489: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.488060: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.488592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
    2021-11-24 05:18:24.488651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-11-24 05:18:24.488687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0 
    2021-11-24 05:18:24.488714: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N 
    2021-11-24 05:18:24.488897: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.489464: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:24.489974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10199 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
    /content/style_avatar/align_img.py:21: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
    To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
      k,_,_,_ = np.linalg.lstsq(A,b)
    /content/style_avatar/align_img.py:97: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
      trans_params = np.array([w0,h0,102.0/s,t[0],t[1]])
    2021-11-24 05:18:27.965252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2021-11-24 05:18:30.431787: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2021-11-24 05:18:32.221454: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 4.00G (4294967296 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.233011: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.244708: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.255462: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 2.92G (3131030784 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.268217: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 2.62G (2817927680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.280419: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 2.36G (2536134912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
    2021-11-24 05:18:32.280486: W tensorflow/core/common_runtime/bfc_allocator.cc:305] Garbage collection: deallocate free memory regions (i.e., allocations) so that we can re-allocate a larger region to avoid OOM due to memory fragmentation. If you see this message frequently, you are running near the threshold of the available device memory and re-allocation may incur great performance overhead. You may try smaller batch sizes to observe the performance impact. Set TF_ENABLE_GPU_GARBAGE_COLLECTION=false if you'd like to disable this feature.
    2021-11-24 05:18:34.034494: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 75497472 exceeds 10% of system memory.
    rm: cannot remove '/content/outt/*.png': No such file or directory
    2021-11-24 05:18:35.471827: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:35.472222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
    name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
    pciBusID: 0000:00:04.0
    2021-11-24 05:18:35.472343: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
    2021-11-24 05:18:35.472443: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2021-11-24 05:18:35.472528: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2021-11-24 05:18:35.472621: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2021-11-24 05:18:35.472738: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2021-11-24 05:18:35.472816: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2021-11-24 05:18:35.472911: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2021-11-24 05:18:35.473096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:35.476636: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:35.477987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
    2021-11-24 05:18:35.479073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-11-24 05:18:35.479127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0 
    2021-11-24 05:18:35.479157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N 
    2021-11-24 05:18:35.479720: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:35.480132: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2021-11-24 05:18:35.480518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10199 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
    2021-11-24 05:18:36.163937: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 57802752 exceeds 10% of system memory.
    FATAL Flags parsing error: Unknown command line flag 'in_img'
    Pass --helpshort or --helpfull to see help on flags.
    
    opened by loboere 7
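  • How can it be adapted to also support Chinese input?

    Hello, and thank you for sharing! If I want to adapt this so that it also works with Chinese input, what would need to be changed? I also saw a similar question elsewhere: https://github.com/wuhaozhe/style_avatar/issues/1#issuecomment-968499818 "The deepspeech is trained on English, you can test it in Chinese, but the result wouldn't be satisfactory." One more question: if the input audio is not in the training set, how good is the audio-visual synchronization?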

    opened by DWCTOD 1
Releases(0.1)
Owner
Haozhe Wu
Research interests in Computer Vision and Machine Learning.