Ego4D dataset repository. Download the dataset, visualize it, extract features, and explore example usage.

Last update: Jan 07, 2023

Overview

Ego4D

Ego4D is the world's largest egocentric (first-person) video ML dataset and benchmark suite, with 3,600 hours (and counting) of densely narrated video and a wide range of annotations across five new benchmark tasks. It covers hundreds of daily-life scenarios (household, outdoor, workplace, leisure, etc.) captured in the wild by 926 unique camera wearers from 74 worldwide locations and 9 different countries. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Data collection was designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant.

Public Documentation/Start Here: Ego4D Docs

For the CLI readme (to download/access): CLI README

For a demo notebook: Annotation Notebook

For the visualization engine: Viz README

For feature extraction: Feature README

License

Ego4D is released under the MIT License.


Owner

Meta Research
