OneFlow

OneFlow is a performance-centered and open-source deep learning framework.


Latest News

  • Version 0.5.0 is out!
    • First-class support for eager execution. The deprecated APIs are moved to oneflow.compatible.single_client
    • Drop-in replacement of import torch for existing PyTorch projects. You can test it by changing import torch to import oneflow as torch in your code, as in the sketch below.
    • Full changelog
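    A minimal sketch of the drop-in usage (assuming the ops used here, randn and relu, are available in your OneFlow build):

      import oneflow as torch  # OneFlow standing in for the PyTorch namespace
      x = torch.randn(2, 3)
      y = torch.nn.functional.relu(x)
      print(y.shape)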

Install OneFlow

System Requirements

  • Python 3.6, 3.7, 3.8, 3.9

  • (Highly recommended) Upgrade pip

    python3 -m pip install --upgrade pip  # optionally add --user
    
  • CUDA Toolkit Linux x86_64 Driver

    • The CUDA runtime is statically linked into OneFlow. OneFlow works with the minimum supported driver version and any newer driver. For more information, please refer to the CUDA compatibility documentation.

    • Please upgrade your Nvidia driver to version 440.33 or above and install OneFlow for CUDA 10.2 if possible.

Install with Pip Package

  • To install the latest stable release of OneFlow with CUDA support:

    python3 -m pip install -f https://release.oneflow.info oneflow==0.5.0+cu102
  • To install the nightly release of OneFlow with CUDA support:

    python3 -m pip install oneflow -f https://staging.oneflow.info/branch/master/cu102
  • To install other available builds for different variants:

    • Stable
      python3 -m pip install --find-links https://release.oneflow.info oneflow==0.5.0+[PLATFORM]
    • Nightly
      python3 -m pip install oneflow -f https://staging.oneflow.info/branch/master/[PLATFORM]
      
    • All available [PLATFORM] values:

      Platform            CUDA Driver Version   Supported GPUs
      cu112               >= 450.80.02          GTX 10xx, RTX 20xx, A100, RTX 30xx
      cu111               >= 450.80.02          GTX 10xx, RTX 20xx, A100, RTX 30xx
      cu110, cu110_xla    >= 450.36.06          GTX 10xx, RTX 20xx, A100
      cu102, cu102_xla    >= 440.33             GTX 10xx, RTX 20xx
      cu101, cu101_xla    >= 418.39             GTX 10xx, RTX 20xx
      cu100, cu100_xla    >= 410.48             GTX 10xx, RTX 20xx
      cpu                 N/A                   N/A
  • If you are in China, you can run the following so that pip downloads packages from a domestic PyPI mirror:

    python3 -m pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
    

    For more information, please refer to the PyPI mirror usage guide (pypi 镜像使用帮助).

Use docker image

docker pull oneflowinc/oneflow:nightly-cuda10.2
docker pull oneflowinc/oneflow:nightly-cuda11.1

Build from Source

Clone Source Code
  • Option 1: Clone source code from GitHub

    git clone https://github.com/Oneflow-Inc/oneflow --depth=1
  • Option 2: Download from Aliyun

    If you are in China, please download OneFlow source code from: https://oneflow-public.oss-cn-beijing.aliyuncs.com/oneflow-src.zip

    curl https://oneflow-public.oss-cn-beijing.aliyuncs.com/oneflow-src.zip -o oneflow-src.zip
    unzip oneflow-src.zip
Build OneFlow
  • Option 1: Build with Conda (recommended)

    Please refer to this repo

  • Option 2: Build in docker container (recommended)

    • Pull a docker image:

      docker pull oneflowinc/oneflow-manylinux2014-cuda10.2:0.1
      

      All available images: https://hub.docker.com/u/oneflowinc

    • In the root directory of OneFlow source code, run:

      python3 docker/package/manylinux/build_wheel.py --python_version=3.6
      

      This should produce .whl files in the directory wheelhouse

    • If you are in China, you might need to add these flags:

      --use_tuna --use_system_proxy --use_aliyun_mirror
      
    • You can choose the CUDA/Python versions of the wheel by adding:

      --cuda_version=10.1 --python_version=3.6,3.7
      
    • For more useful flags, please run the script with the --help flag or refer to the source code of the script.

  • Option 3: Build on bare metal

    • Install dependencies

      • on Ubuntu 20.04, run:
        sudo apt install -y libopenblas-dev nasm g++ gcc python3-pip cmake autoconf libtool
        
      • on macOS, run:
        brew install nasm
        
    • In the root directory of OneFlow source code, run:

      mkdir build
      cd build
      
    • Configure the project, inside the build directory:

      • If you are in China

        run this to configure for CUDA:

        cmake .. -C ../cmake/caches/cn/cuda.cmake
        

        run this to configure for CPU-only:

        cmake .. -C ../cmake/caches/cn/cpu.cmake
        
      • If you are not in China

        run this to configure for CUDA:

        cmake .. -C ../cmake/caches/international/cuda.cmake
        

        run this to configure for CPU-only:

        cmake .. -C ../cmake/caches/international/cpu.cmake
        
    • Build the project; inside the build directory, run:

      make -j$(nproc)
      
    • Add oneflow to your PYTHONPATH; inside the build directory, run:

      source source.sh
      

      Please note that this change is not permanent.

    • Simple validation

      python3 -m oneflow --doctor
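
      Alternatively, a quick sanity check from Python (a minimal sketch; oneflow.__version__ and oneflow.cuda.is_available() are assumed to be present in your build):

      import oneflow as flow
      print(flow.__version__)          # e.g. 0.5.0+cu102
      print(flow.cuda.is_available())  # True if the CUDA build sees a usable GPU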
      

Troubleshooting

Please refer to troubleshooting for common issues you might encounter when compiling and running OneFlow.

Advanced features

XRT
  • You can check this doc to obtain more details about how to use XLA and TensorRT with OneFlow.

Getting Started

3 minutes to run MNIST.
  • Clone the demo code from OneFlow documentation

    git clone https://github.com/Oneflow-Inc/oneflow-documentation.git
    cd oneflow-documentation/cn/docs/single_client/code/quick_start/
    
  • Run it in Python

    python mlp_mnist.py
    
  • OneFlow is running, and you should see the training loss:

    2.7290366
    0.81281316
    0.50629824
    0.35949975
    0.35245502
    ...
    
  • For more information on this demo, please refer to the quick start doc.

Documentation

Model Zoo and Benchmark

Communication

The Team

OneFlow was originally developed by OneFlow Inc and Zhejiang Lab.

License

Apache License 2.0

Comments
  • source op support s and fixed generator bug


    Purpose of this PR:

    • [x] randperm op supports S0; unit tests added
    • [x] Fix the bug where, while adding S support to random ops, the local tensors on every rank were always identical; the background is recorded in https://github.com/Oneflow-Inc/oneflow/pull/7434#issuecomment-1033306931

    Keeping global tensors consistent for random ops

    1. For randint op and rand op, B/S support keeps the global tensor consistent by means of the utility function GetOpKernelRandomSeed(ctx). When the op uses S, each rank gets a different value from GetOpKernelRandomSeed(ctx); generator->set_current_seed(ctx->Attr<int64_t>("seed") + GetOpKernelRandomSeed(ctx)) then assigns a different seed to every rank, so uniform-style kernels produce local tensors that follow the same distribution but hold different values under S. When the op uses B, every rank's kernel gets the same value from GetOpKernelRandomSeed(ctx), so the same set_current_seed call gives every rank an identical seed and the global tensor stays consistent (a usage sketch follows below).
    2. For randperm op and arange op, the planned way to keep the global tensor consistent under B/S is to share one seed across ranks, generate the full tensor on every rank, then use the inferred physical shape together with the utility GetTensorSliceView4ParallelId(parallel_hierarchy, nd_sbp, logical_shape, parallel_id) to obtain the index range that this rank_id and physical shape cover, and copy the data at those positions into this rank's local tensor.

    The plan above was worked out in a meeting with xiaoyu and yinggang.
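
    A minimal sketch of the user-visible behavior described in item 1, assuming the current global-tensor API (flow.placement / flow.sbp) and a hypothetical 2-rank launch (e.g. python3 -m oneflow.distributed.launch --nproc_per_node 2 demo.py):

    import oneflow as flow

    placement = flow.placement("cuda", ranks=[0, 1])
    # split(0): each rank derives a different seed, so the local slices differ
    # in value but follow the same distribution.
    x = flow.rand(4, 4, placement=placement, sbp=flow.sbp.split(0))
    # broadcast: every rank derives the same seed, so the local tensors are
    # identical and the global tensor stays consistent.
    y = flow.rand(4, 4, placement=placement, sbp=flow.sbp.broadcast)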

    fixed: https://github.com/Oneflow-Inc/OneTeam/issues/1167

    enhancement automerge op test graph global 
    opened by grybd 82
  • Dev non-contiguous view ops


    This PR is split off from the https://github.com/Oneflow-Inc/oneflow/tree/dev_contiguous_view_ops branch and implements the following:

    1. When registering an op via ODS, a SupportNonContiguous attribute can be added to mark whether the op supports non-contiguous input tensors; if it does not, the interpreter uniformly applies tensor->contiguous().
    2. ~~Export the interface flow._oneflow_internal.has_same_tensor_storage to check whether the original tensor and the view tensor share storage~~
    3. Support the following non-contiguous view ops (a usage sketch follows after this list):

    • [x] transpose
    • [x] permute
    • [x] narrow
    • [x] expand/expand_as
    • [x] split
    • [x] chunk
    • [x] unfold_tensor
    • [x] movedim
    • [x] as_strided
    • [x] select
    • [x] swapaxes
    • [x] T/t
    • [x] hsplit/vsplit/tensor_split
    • [ ] ~~TODO (to be done in another PR): slice/slice_update~~
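
    A minimal sketch of how the non-contiguous view behavior looks from Python (assuming a build that includes this PR, so is_contiguous()/contiguous() behave as described):

    import oneflow as flow

    x = flow.randn(2, 3, 4)
    y = x.permute(2, 0, 1)      # a view: strides change, storage is shared
    print(y.is_contiguous())    # False for this non-contiguous view
    z = y.contiguous()          # materializes a contiguous copy when needed
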
    enhancement automerge eager op api 
    opened by Flowingsun007 67
  • Fix fill_


    Fixes the slow oneflow.Tensor.fill_ reported in https://github.com/Oneflow-Inc/oneflow/issues/8278.

    The fill_ kernel is implemented in two ways:

    • If value is a Scalar, it is implemented with the fill primitive
    • If value is a Tensor, dedicated GPU and CPU kernel logic is implemented

    Benchmark results:

    | OP | Args | Library | Kernel Time (us, GPU) | Kernel Time (us, 1 CPU) | End-to-end Time (us, 1 CPU) | Kernel Time (us, 32 CPUs) | End-to-end Time (us, 32 CPUs) |
    | ------------ | ----------------------------- | ------- | ---- | ----- | ----- | ----- | ----- |
    | Tensor.fill_ | ones(1, 8, 16, 16), 2         | OneFlow | 7    | 2.5   | 10.5  | 2.4   | 9.8   |
    | Tensor.fill_ | ones(1, 8, 16, 16), 2         | PyTorch | 1.1  | 2.4   | 7     | 1.2   | 3.7   |
    | Tensor.fill_ | ones(1000, 1000), 2           | OneFlow | 21.6 | 187.6 | 189.2 | 183   | 184.6 |
    | Tensor.fill_ | ones(1000, 1000), 2           | PyTorch | 11   | 186.4 | 191.3 | 26.4  | 30.7  |
    | Tensor.fill_ | ones(1, 8, 16, 16), tensor(2) | OneFlow | 20.4 | 3.1   | 21.5  | 3.1   | 21.8  |
    | Tensor.fill_ | ones(1, 8, 16, 16), tensor(2) | PyTorch | 1.2  | 7.8   | 9.3   | 3.6   | 5.7   |
    | Tensor.fill_ | ones(1000, 1000), tensor(2)   | OneFlow | 26.7 | 180.4 | 184.4 | 175.9 | 179.8 |
    | Tensor.fill_ | ones(1000, 1000), tensor(2)   | PyTorch | 11   | 184.2 | 187.8 | 23.8  | 25.9  |
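
    For reference, the two code paths correspond to the following usage (a small sketch; shapes chosen to mirror the benchmark rows):

    import oneflow as flow

    x = flow.ones(1000, 1000)
    x.fill_(2)                  # scalar value: handled by the fill primitive
    x.fill_(flow.tensor(2.0))   # tensor value: dedicated CPU/GPU kernels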

    enhancement automerge eager 
    opened by zhongshsh 64
  • Graph rename v2


    This PR removes the attribute and config on Block, in order to:

    • 1. completely avoid the name-collision problem;
    • 2. remove the block config;

    The implementation plan:

    |        | Eager original | Proxy (base class Proxy) | GraphBlock (base class GraphBlock) |
    |--------|----------------|--------------------------|------------------------------------|
    | Role   | Gives access to the original eager type | Proxies execution; the interface is the same as Module and Tensor, but the behavior changes, e.g. it is lazy and the executed op may have been rewritten | A GraphBlock corresponds to one Graph code block and stores the information graph execution needs, such as name/scope/lazy op or tensor, plus the per-module graph optimization switches |
    | Module | Module | ProxyModule, containing a Module member and a GraphModule member | GraphModule |
    | Tensor | Tensor | ProxyTensor, containing a Tensor member and a GraphTensor member | GraphTensor |

    Example

    from oneflow.nn.graph import GraphModule
    import oneflow.nn as nn

    class AGraph(nn.Graph):
        def __init__(self, module: nn.Module):
            super().__init__()

            self.m = module
            # self.m is a ProxyModule.
            # A ProxyModule has two parts: the original module and a GraphModule.
            self.m.name                  # by default returns the eager module's name
            self.m.to(GraphModule).name  # returns the GraphModule's name
            self.m.to(nn.Module)         # returns the original nn.Module

            # How to reach the config on the GraphModule
            self.m.to(GraphModule).set_stage(id, placement)

    

    Fix issue: https://github.com/Oneflow-Inc/oneflow/issues/9193

    Also supports property lookup when nn.Module uses multiple inheritance.

    Fix issues: https://github.com/Oneflow-Inc/oneflow/issues/9345 and https://github.com/Oneflow-Inc/oneflow/issues/9186

    enhancement automerge bug api python 
    opened by strint 60
  • add searchsorted op


    Background: the NeRF network needs this operator. Op description: the implementation follows PyTorch's torch.searchsorted (https://pytorch.org/docs/stable/generated/torch.searchsorted.html?highlight=searchsorted#torch.searchsorted), and the interface is fully aligned with the PyTorch 1.10 implementation.
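
    A small usage sketch, assuming the interface matches torch.searchsorted as stated:

    import oneflow as flow

    sorted_seq = flow.tensor([1, 3, 5, 7, 9])
    values = flow.tensor([3, 6, 9])
    print(flow.searchsorted(sorted_seq, values))              # tensor([1, 3, 4], dtype=...)
    print(flow.searchsorted(sorted_seq, values, right=True))  # tensor([2, 3, 5], dtype=...)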

    enhancement automerge op 
    opened by yoonlee888 59
  • Optimize slice and tensor getitem


    • [x] Based on the tensor getitem optimization mentioned in issue https://github.com/Oneflow-Inc/OneTeam/issues/1268#issuecomment-1085433728; it benefits every network that uses the eager dataloader (a small sketch follows after this list).
    • [x] test case
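
    A trivial sketch of the kind of indexing this optimization targets (shapes are hypothetical):

    import oneflow as flow

    x = flow.randn(8, 3, 32, 32)
    y = x[2:6, :, ::2]          # the basic tensor getitem / slicing path exercised here
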
    enhancement feature automerge eager 
    opened by Flowingsun007 57
  • Decouple vm mem and compute


    Let the vm worker threads concentrate on OpKernel::Compute; if the other parts of the system are optimized well enough, eager mode can in theory reach its peak performance.

    Instruction execution is now split into two steps:

    1. Infer: memory allocation and release, plus preparation of the opkernel state and cache.
    2. Compute: run only the user_op::OpKernel::Compute function.

    The Infer stage always runs on the scheduler thread. The Compute stage runs on a worker thread by default; setting ONEFLOW_VM_WORKLOAD_ON_SCHEDULER_THREAD=1 makes it run on the scheduler thread instead (see the sketch below).
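
    A sketch of how the flag would be used from Python (an assumption: the variable must be set before oneflow is imported so the VM picks it up at initialization):

    import os
    os.environ["ONEFLOW_VM_WORKLOAD_ON_SCHEDULER_THREAD"] = "1"  # run Compute on the scheduler thread
    import oneflow as flow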

    This PR depends on several other PRs/branches. VM optimization PRs:

    1. https://github.com/Oneflow-Inc/oneflow/pull/7923: migrate the instruction implementations to ep
    2. https://github.com/Oneflow-Inc/oneflow/pull/7623: merge InstructionMsg and Instruction

    Call instruction optimization PRs:

    3. https://github.com/Oneflow-Inc/oneflow/pull/7617: make StatefullOpKernel thread-safe.
    4. https://github.com/Oneflow-Inc/oneflow/tree/refactor_eager_tmp_buffer_x_merge_instruction_msg_to_instruction: completely refactor how instructions handle temp storage so that Infer/Compute can run asynchronously.
    enhancement eager system 
    opened by lixinqi 55
  • Refactor MemoryCase to eliminate determine statements of device_type


    Refactor the MemoryCase struct to eliminate the device-specific special-case logic in the code.

    MemoryCase becomes an open structure, so it no longer has to be modified every time a new DeviceType enum value is added.

    Once MemoryCase is an open structure, many special-case checks such as if (device_type == DeviceType::kGPU) and if (mem_case.has_device_cuda_mem()) can also be removed.

    In theory, the only check left after the refactor is whether device mem is host mem, because some code paths have to treat host mem specially.

    The refactor does not completely remove the GPU-device special cases; some of them are unrelated to mem_case, currently concentrated mostly in the memory-reuse logic, with some leftovers in the task graph as well, to be cleaned up in follow-up PRs.

    enhancement graph need-test-distributed 
    opened by leaves-zwx 53
  • Implement oneflow.embedding op


    Overview

    This PR adds a full implementation of oneflow.nn.Embedding. The previous implementation did not consider the four parameters padding_idx, max_norm, norm_type, and scale_grad_by_freq, so it simply used oneflow.gather; once these parameters are introduced, the gather op can no longer be reused directly and a dedicated Embedding op is required (a usage sketch follows).
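
    A small usage sketch of the extended module (a sketch under the assumption that the PyTorch-style parameters above are supported as described):

    import oneflow as flow
    import oneflow.nn as nn

    emb = nn.Embedding(10, 3, padding_idx=0)
    idx = flow.tensor([[1, 2, 0], [4, 0, 9]])
    out = emb(idx)              # shape (2, 3, 3); rows at padding_idx stay zero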


    PyTorch interface link

    Feature Checklist

    Note: the feature checkboxes are all optional; if one is left unchecked, just explain why. For example: this op is composed from the Python API, so there is no SetBatchAxisInferFn op registration; or: this op has no inputs, so there is no SetInputArgModifyFn.

    Op

    • [ ] Op SetBatchAxisInferFn
    • [x] Op SetGetSbpFn
    • [x] Op SetInputArgModifyFn
    • [x] Op backward gradient registration

    CPU Kernel

    • [x] CPU in:float32
    • [x] CPU in:float64
    • [ ] CPU in:int32
    • [ ] CPU in:int64
    • [ ] CPU in:int8

    GPU Kernel

    • [x] GPU in:float32
    • [x] GPU in:float64
    • [ ] GPU in:int32
    • [ ] GPU in:int64
    • [x] GPU in:float16
    • [ ] GPU in:int8

    Python Wrapper

    • [x] Python API 参数检查及异常提示
    • [x] 接口注释
    • [x] Example

    Tests

    • [x] Single-node, single-device CPU test case
    • [x] Single-node, single-device GPU test case
    • [ ] Single-node, multi-device CPU test case
    • [ ] Single-node, multi-device GPU test case
    • [ ] Distributed CPU test case
    • [ ] Distributed GPU test case

    GPU effective bandwidth

    For ops with GPU kernels, please measure the effective bandwidth following https://github.com/Oneflow-Inc/OneTeam/issues/167 and attach the test report. A sample report:

    Theoretical bandwidth:

     Device to Device Bandwidth, 1 Device(s)
     PINNED Memory Transfers
       Transfer Size (Bytes)	Bandwidth(MB/s)
       33554432			250798.5
    

    Measured bandwidth:

    PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2 elapsed(ms): 0.196064 memory_size(Byte): 50331648 bandwidth(GB/s): 239.08
    PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2_grad elapsed(ms): 0.29072 memory_size(Byte): 75497472 bandwidth(GB/s): 241.856
    

    PR Checklist

    • [x] The PR title reads smoothly, clearly describes the PR, and is suitable to go directly into the changelog of a new release
    • [x] Code is formatted
    • [x] Compiles locally
    • [x] The changes were tested locally
    • [x] A type label has been added (type label name, e.g. bug, enhancement, purge, feature, documentation)
    • [x] A component label has been added (component label name, e.g. op, system, eager, build, xla, python, ci, test, tooling)
    • [x] The PR was reviewed before being converted from Draft to a formal PR
    enhancement automerge op 
    opened by EsdeathYZH 46
  • check graph op global test


    This PR is done:

    • [x] Run Graph global tests for some ops (CUDA only).

    Some global ops still have their graph tests disabled; for details see https://github.com/Oneflow-Inc/oneflow/pull/8614#issuecomment-1185097594.

    enhancement automerge test graph global 
    opened by lixiang007666 39
  • Implement exponential_ and multinomial


    Source of the requirement: https://github.com/Oneflow-Inc/OneTeam/issues/1184#issuecomment-1232440993

    Todo lists

    • [x] Implement the exponential_ op (a usage sketch follows after this list)
      • [x] functor logic
      • [x] cpu kernel
      • [x] cuda kernel
      • [x] tests
    • [x] Implement the multinomial op
      • [x] functor logic
      • [x] cpu kernel
      • [x] cuda kernel
      • [x] tests
    • [x] Add the Distribution module
      • [x] Implement Categorical
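
    A small usage sketch, assuming the PyTorch-aligned signatures:

    import oneflow as flow

    x = flow.empty(3, 4).exponential_(lambd=1.0)    # in-place samples from Exp(1)
    probs = flow.tensor([0.1, 0.2, 0.7])
    idx = flow.multinomial(probs, num_samples=5, replacement=True)
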
    feature automerge op api python need-clean-ccache 
    opened by Ldpe2G 37
  • dev add spectral_norm


    @BBuf Fixed several bugs encountered while implementing spectral_norm:

    • [x] Fix the bug where dot does not support int32 and int64 computation on CPU (because of matmul)
    • [x] Add the basic functionality of spectral_norm (a usage sketch follows after this list)
    • [x] Fix the division-by-zero bug in kaiming_uniform_ and kaiming_normal_ when the input is a 0-size tensor
    • [x] Add oneflow.linalg.multi_dot()
    • [ ] oneflow.contiguous_format
    • [ ] load_state_dict tests and global tests for spectral_norm; load and hook
    • [ ] Docs for spectral_norm and multi_dot

    There may be some redundant header includes; I will check them later.
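
    A usage sketch; the import path oneflow.nn.utils.spectral_norm and the weight_u attribute are assumptions (mirroring torch.nn.utils), not confirmed by this PR description:

    import oneflow.nn as nn
    from oneflow.nn.utils import spectral_norm  # assumed location, mirroring torch.nn.utils

    m = spectral_norm(nn.Linear(20, 40))
    print(m.weight_u.shape)     # power-iteration vector registered by the hook (assumed attribute)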

    feature bug op 
    opened by hhhfccz 0
  • Oneflow fails in einops CI, likely due to conflict with new numpy


    Summary

    ___________________ ERROR collecting tests/test_examples.py ____________________
    tests/test_examples.py:5: in <module>
        from tests.test_ops import imp_op_backends
    <frozen importlib._bootstrap>:1007: in _find_and_load
        ???
    <frozen importlib._bootstrap>:986: in _find_and_load_unlocked
        ???
    <frozen importlib._bootstrap>:680: in _load_unlocked
        ???
    /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
        exec(co, module.__dict__)
    tests/test_ops.py:10: in <module>
        imp_op_backends = collect_test_backends(symbolic=False, layers=False)
    tests/__init__.py:64: in collect_test_backends
        result.append(backend_type())
    einops/_backends.py:554: in __init__
        import oneflow as flow
    ../../../.local/lib/python3.9/site-packages/oneflow/__init__.py:199: in <module>
        import oneflow.framework.register_class_method_util as register_class_method_util
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/register_class_method_util.py:17: in <module>
        import oneflow.framework.check_point_v2 as check_point_v2
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/check_point_v2.py:30: in <module>
        import oneflow.framework.dtype as dtype_util
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/dtype.py:49: in <module>
        oneflow.bool: np.bool,
    /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/numpy/__init__.py:284: in __getattr__
        raise AttributeError("module {!r} has no attribute "
    E   AttributeError: module 'numpy' has no attribute 'bool'
    ------------------------------- Captured stderr --------------------------------
    2022-12-27 07:50:33.696556: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.16/x64/lib
    2022-12-27 07:50:33.696647: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.16/x64/lib
    2022-12-27 07:50:33.696656: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    

    Code to reproduce bug

    See CI job for full detailed messages and configuration:

    https://github.com/arogozhnikov/einops/actions/runs/3785978910/jobs/6436456017
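
    The AttributeError itself comes from NumPy 1.24 removing the long-deprecated np.bool alias that oneflow/framework/dtype.py still references. A quick local check (a sketch; pinning numpy below 1.24 is only a workaround assumption, not an official fix):

    import numpy as np

    # NumPy 1.24 removed the deprecated np.bool alias, which is exactly what the
    # traceback above trips over inside oneflow/framework/dtype.py.
    print(hasattr(np, "bool"))  # False on NumPy 1.24/1.25, so the oneflow import fails
    print(np.bool_)             # the still-available scalar type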

    System Information

    • What is your OneFlow installation (pip, source, dockerhub): pip
    • OS: linux
    • OneFlow version (run python3 -m oneflow --doctor):
    • Python version: 3.9
    • CUDA driver version: None
    • GPU models: None
    • Other info:
    bug community 
    opened by arogozhnikov 8