Contextual speed detection for python

Overview

Speed Prediction using Optical Flow and 2D CNN

About the challenge:

Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the speed of a car from a video.

Pipeline

Model

Tensorflow Version: 2.2.0

Steps for implementing speed estimation:

  1. Save the images from the train.mp4 and test.mp4 video using DatasetConverter.py.
  2. Convert the images from the videos, computer dense optical flow on the image sequence and save optical flow images using VideoToOpticalFlowImage.py.
  3. Train the network below on optical flow images and save the best performing model using custom callback.
  4. Use the saved model on the testing dataset using UseModel.py.

Optical Flow

Optical flow is computed on two adjacent image frames in a video, converted it to grayscal and applying cv2.calcOpticalFlowFarneback() which outputs two matrices of same shape as compared to the input shape. Each pixel of the output images denotes the change in its position and speed respectively with respect to the previous image frame. For visualization and training, the output images are combined into single HSV color channel based image.

Data Augmentation

Every single images is flipped horizontally having the target value same as the images from which it is derived. This data augmentation played significant role in reducing validation loss.

Model

The following model is a 2D CNN based model made to be used on optical flow images. As compared to a 3D CNN based model trained on images from video, using optical flow with 2D CNN is faster to train and has lower MSE loss.

Training:

Trained the 2D CNN for 150 epochs to get a validation MSE loss of 0.18 and training MSE loss of 0.05

Output:

This gif below has the prediction vs ground truth for the images on which the model is trained:

Train Prediction

This gif is the prediction on the test images:

Test Prediction

Learning:

  1. Image augmentation significantly improves the speed estimation of the model
  2. Writing custom data generators for reading batches of images and ground truth
  3. 2D CNN with optical flow performs better than 3D CNN in terms of training time and accuracy

Reference:

  1. speed-estimation-of-car-with-optical-flow
  2. speed-prediction-challenge
Owner
Mahimana Bhatt
Solving problems through code
Mahimana Bhatt
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database.

A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database. The structure, shape and proportions of the faces are comp

Pavankumar Khot 4 Mar 19, 2022
A curated list of resources dedicated to scene text localization and recognition

Scene Text Localization & Recognition Resources A curated list of resources dedicated to scene text localization and recognition. Any suggestions and

CarlosTao 1.6k Dec 22, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Motion Detection Squid Game with OpenCV Python

*Motion Detection Squid Game with OpenCV Python i am newbie in python. In this project I made a simple game to follow the trend about the red light gr

Nayan 17 Nov 22, 2022
Distort a video using Seam Carving (video) and Vibrato effect (sound)

Distort videos Applies a Seam Carving algorithm (aka liquid rescale) on every frame of a video, and a vibrato effect on the audio to distort the video

AlexZeGamer 6 Dec 06, 2022
EAST for ICPR MTWI 2018 Challenge II (Text detection of network images)

EAST_ICPR2018: EAST for ICPR MTWI 2018 Challenge II (Text detection of network images) Introduction This is a repository forked from argman/EAST for t

QichaoWu 49 Dec 24, 2022
✌️Using this you can control your PC/Laptop volume by Hand Gestures created with Python.

Hand Gesture Volume Controller ✋ Hand recognition 👆 Finger recognition 🔊 you can decrease and increase volume Demo Code Firstly I have created a Mod

Abbas Ataei 19 Nov 17, 2022
Thresholding-and-masking-using-OpenCV - Image Thresholding is used for image segmentation

Image Thresholding is used for image segmentation. From a grayscale image, thresholding can be used to create binary images. In thresholding we pick a threshold T.

Grace Ugochi Nneji 3 Feb 15, 2022
An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss This is an unofficial implementation of AutoVC based on the official one. The reposi

Chien-yu Huang 27 Jun 16, 2022
Automatically resolve RidderMaster based on TensorFlow & OpenCV

AutoRiddleMaster Automatically resolve RidderMaster based on TensorFlow & OpenCV 基于 TensorFlow 和 OpenCV 实现的全自动化解御迷士小马谜题 Demo How to use Deploy the ser

神龙章轩 5 Nov 19, 2021
PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

About PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV) Colorizor Приложение для проекта Yand

1 Apr 04, 2022
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. This is the official Roboflow python package that interfaces with the Roboflow API.

Roboflow 52 Dec 23, 2022
Textboxes implementation with Tensorflow (python)

tb_tensorflow A python implementation of TextBoxes Dependencies TensorFlow r1.0 OpenCV2 Code from Chaoyue Wang 03/09/2017 Update: 1.Debugging optimize

Jayne Shin (신재인) 20 May 31, 2019
Solution for Problem 1 by team codesquad for AIDL 2020. Uses ML Kit for OCR and OpenCV for image processing

CodeSquad PS1 Solution for Problem Statement 1 for AIDL 2020 conducted by @unifynd technologies. Problem Given images of bills/invoices, the task was

Burhanuddin Udaipurwala 111 Nov 27, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
A dataset handling library for computer vision datasets in LOST-fromat

A dataset handling library for computer vision datasets in LOST-fromat

8 Dec 15, 2022
A simple component to display annotated text in Streamlit apps.

Annotated Text Component for Streamlit A simple component to display annotated text in Streamlit apps. For example: Installation First install Streaml

Thiago Teixeira 312 Dec 30, 2022
This is a GUI program which consist of 4 OpenCV projects

Tkinter-OpenCV Project Using Tkinter, Opencv, Mediapipe This is a python GUI program using Tkinter which consist of 4 OpenCV projects 1. Finger Counte

Arya Bagde 3 Feb 22, 2022
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022