MediaPipe is an open-source framework from Google for building multimodal

Last update: Sep 30, 2022

Overview

MediaPipe is an open-source framework from Google for building multimodal (e.g. video, audio, or any time-series data), cross-platform (i.e. Android, iOS, web, and edge devices) applied ML pipelines. It is optimized for performance, with end-to-end on-device inference.

To visit the official site click here

This repository includes the following implementations:

Hand Key Point Detection
Face Key Point Detection
Pose Detection
Image Segmentation

Other curations are on their way :)

Media Credits: MediaPipe

Owner

Bhavishya Pandit

To be able to learn something new is the biggest gift to the human race.

GitHub Repository

Attention modules and other plug-and-play modules used in computer vision: PyTorch Implementation Collection of Attention Module and Plug&Play Module

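PyTorch implementations of various attention mechanisms used in network design for computer vision, along with a collection of plug-and-play modules. Because of limited ability and time, many modules may not be included yet; if you have any suggestions or improvements, feel free to open an issue or submit a PR.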

599 Dec 23, 2022

Learning Calibrated-Guidance for Object Detection in Aerial Images

We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance

51 Sep 22, 2022

This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

12 Nov 22, 2022

Joint Detection and Identification Feature Learning for Person Search

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL 2021

The codes and related files to reproduce the results for Image Similarity Challenge Track 2.

Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

A set of examples around hub for creating and processing datasets

Official git for "CTAB-GAN: Effective Table Data Synthesizing"

Video Frame Interpolation without Temporal Priors (a general method for blurry video interpolation)

Scales, Chords, and Cadences: Practical Music Theory for MIR Researchers

Code associated with the ACL 2021 DExperts paper

Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

Fight Recognition from Still Images in the Wild @ WACVW2022, Real-world Surveillance Workshop

Implementation of gMLP, an all-MLP replacement for Transformers, in PyTorch

Leaf: Multiple-Choice Question Generation

A real-time speech emotion recognition application using Scikit-learn and gradio

Melanoma Skin Cancer Detection using Convolutional Neural Networks and Transfer Learning🕵🏻‍♂️

Chinese named entity recognition with BiLSTM using Keras

A library for uncertainty quantification based on PyTorch