Argos


A spatial-temporal pattern detection system for home automation. Built on OpenCV and TensorFlow, it can run on a Raspberry Pi and notify Home Assistant via MQTT or webhooks.

Demo

Have a spare Raspberry Pi or Jetson Nano (or an old laptop or Mac mini) lying around? Have Wi-Fi-connected security cams in your house (or a Raspberry Pi camera)? Want to get notified when someone enters or exits your main door? When someone waters your plants (or forgets to)? When your dog hasn't been fed in a while, or hasn't eaten? When someone left the fridge door open and forgot? Left the gas stove running and forgot? When birds are drinking from your dog's water bowl? Well, you're not alone, and you're in the right place :)

Architecture


  • Takes a video input (a Raspberry Pi camera if running on an RPi, an RTMP stream from a security cam, or a video file)
  • Runs a simple motion detection algorithm on the stream, applying minimum box thresholds, negative masks and masks (a minimal sketch of this style of motion detection follows this list)
  • Runs object detection on either the cropped frame where motion was detected, or the whole frame if needed, using the TensorFlow Object Detection API. TensorFlow 1, TensorFlow 2, TensorFlow Lite and custom models are all supported
  • Serves a Flask web server to let you see the motion detection and object detection in action, and serves an MJPEG stream which can be configured as a camera in Home Assistant
  • Object detection is also highly configurable, with thresholds and masks to suppress false positives
  • Object detection features an optional "detection buffer", which averages detections over a moving window of frames before reporting the maximum cumulative average detection
  • Supports sending notifications to Home Assistant via MQTT or webhooks. Webhook notifications send the frame on which the detection was triggered, allowing you to create rich media notifications from it via the HA Android or iOS apps
  • Pattern detection: both the motion detector and object detector send events to a queue which is monitored and analyzed by a pattern detector. You can configure your own "movement patterns" - e.g. a person exiting or entering a door, or your dog going to the kitchen. It keeps a configurable history of states (motion detected in a mask, outside a mask, object detected (e.g. person), etc.), and your movement patterns are pre-configured sequences of states which identify that movement. door_detect.py provides a movement pattern detector to detect if someone is entering or exiting a door
  • All of the above functionality is provided by running stream.py. There's also serve.py, which runs an object detection service that can be called remotely from a low-powered device like a Raspberry Pi Zero W, which cannot run TensorFlow Lite on its own. The motion detector can still run on the Pi Zero, with only the object detection done remotely by calling this service, making for a distributed setup
  • Architected to be highly concurrent and asynchronous (threads and queues connect all the components: Flask server, motion detector, object detector, pattern detector, notifier, MQTT, etc.)
  • Has tools to help you generate masks, test and tune the detectors, etc.
  • Every aspect of every detector can be tuned in the config files (which are purposefully kept as Python classes and not YAML), and every aspect is logged with colored console output so you can debug what is going on
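For reference, here is a minimal sketch of the frame-differencing approach the motion detector bullet above describes (minimum box thresholds, negative masks). This is purely illustrative pseudocode of the general OpenCV technique, not Argos's actual implementation:

import cv2

def detect_motion(prev_gray, frame, min_area=500, neg_mask=None):
    # Grayscale and blur to reduce per-pixel sensor noise
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    if prev_gray is None:
        return gray, []
    delta = cv2.absdiff(prev_gray, gray)                # pixel-wise change vs. last frame
    thresh = cv2.threshold(delta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.dilate(thresh, None, iterations=2)     # fill small holes
    if neg_mask is not None:
        thresh = cv2.bitwise_and(thresh, neg_mask)      # zero out ignored regions
    # OpenCV 4.x returns (contours, hierarchy)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Apply the minimum box threshold to drop tiny movements
    boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
    return gray, boxes

cap = cv2.VideoCapture("rtmp://camera/stream")  # or 0 for a local camera
prev = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    prev, boxes = detect_motion(prev, frame)
    if boxes:
        print("motion at:", boxes)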

Installation

On a pi, as a systemd service
cd ~
git clone https://github.com/angadsingh/argos
sudo apt-get install python3-pip
sudo apt-get install python3-venv
pip3 install --upgrade pip
python3 -m venv argos-venv/
source argos-venv/bin/activate
pip install https://github.com/bitsy-ai/tensorflow-arm-bin/releases/download/v2.4.0/tensorflow-2.4.0-cp37-none-linux_armv7l.whl
pip install wheel
pip install -r argos/requirements.txt

#only required for tf2
git clone https://github.com/tensorflow/models.git
cd models/research/object_detection/packages/tf2
python -m pip install . --no-deps

make a systemd service to run it automatically

cd ~/argos
sudo cp resources/systemd/argos_serve.service /etc/systemd/system/
sudo cp resources/systemd/argos_stream.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable argos_serve.service
sudo systemctl enable argos_stream.service
sudo systemctl start argos_serve
sudo systemctl start argos_stream

see the logs

journalctl --unit argos_stream.service -f
As a docker container

You can use the following instructions to install Argos as a docker container (e.g. if you already use docker on your RPi for hassio-supervised, you intend to install it on your Synology NAS which has docker, or you just like docker)

Install docker (optional)

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

Run argos as a docker container

Note: replace the docker tag name below with the one for your CPU architecture

image                        example device       notes
angadsingh/argos:armv7       raspberry pi 2/3/4+
angadsingh/argos:x86_64      PC, Mac
angadsingh/argos:x86_64_gpu  PC, Mac              tensorflow with GPU support; run with docker flag --runtime=nvidia

stream.py:

docker run --rm -p 8081:8081 -v configs:/configs \
    -v /home/pi/detections:/output_detections \
    -v /home/pi/argos-ssh:/root/.ssh angadsingh/argos:armv7 \
    /usr/src/argos/stream.py --ip 0.0.0.0 --port 8081 \
    --config configs.your_config

serve.py:

docker run --rm -p 8080:8080 -v configs:/configs \
    -v /home/pi/upload:/upload angadsingh/argos:armv7 \
    /usr/src/argos/serve.py --ip 0.0.0.0 --port 8080 \
    --config configs.your_config --uploadfolder "/upload"

make a systemd service to run it automatically. These services automatically download the latest docker image and run it for you (note: you'll have to change the docker tag inside the service file for your CPU architecture):

sudo wget https://raw.githubusercontent.com/angadsingh/argos/main/resources/systemd/argos_serve_docker.service -P /etc/systemd/system/
sudo wget https://raw.githubusercontent.com/angadsingh/argos/main/resources/systemd/argos_stream_docker.service -P /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable argos_serve_docker.service
sudo systemctl enable argos_stream_docker.service
sudo systemctl start argos_serve_docker
sudo systemctl start argos_stream_docker

see the logs

journalctl --unit argos_serve_docker.service -f
journalctl --unit argos_stream_docker.service -f

Usage

stream.py - runs the motion detector, object detector (with detection buffer) and pattern detector

stream.py --ip 0.0.0.0 --port 8081 --config configs.config_tflite_ssd_example
Method  Endpoint                 Description
Browse  /                        shows a web page with the real-time processing of the input video stream, and a separate video stream showing the object detector output
GET     /status                  shows the current load on the system
GET     /config                  shows the config
GET     /config?<param>=<value>  lets you edit any config parameter without restarting the service
GET     /image                   returns the latest frame as a JPEG image (useful in the HA generic camera platform)
GET     /video_feed              streams an MJPEG video stream of the motion detector (useful in the HA generic camera platform)
GET     /od_video_feed           streams an MJPEG video stream of the object detector

serve.py

serve.py --ip 0.0.0.0 --port 8080 --config configs.config_tflite_ssd_example --uploadfolder upload
Method  Endpoint  Description
POST    /detect   params:
                  file: the JPEG file to run the object detector on
                  threshold: object detector threshold (overrides config.tf_accuracy_threshold)
                  nmask: base64-encoded negative mask to apply. format: (xmin, ymin, xmax, ymax)
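As a sketch of how a remote client (e.g. a Pi Zero running only the motion detector, as in the distributed setup described earlier) might call this service - the endpoint and parameter names come from the table above, but the host, file name, and exact encoding of nmask are illustrative assumptions:

import base64
import requests

# Hypothetical client call; assumes serve.py is reachable at this host/port
url = "http://raspberrypi.local:8080/detect"
# Assumed encoding of the (xmin, ymin, xmax, ymax) tuple as base64 text
nmask = base64.b64encode(b"(0, 0, 200, 150)").decode()

with open("frame.jpg", "rb") as f:
    resp = requests.post(
        url,
        files={"file": f},
        data={"threshold": "0.6", "nmask": nmask},
    )
print(resp.status_code, resp.text)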

Home Assistant automations

ha_automations/notify_door_movement_at_entrance.yaml - triggered by the pattern detector
ha_automations/notify_person_is_at_entrance.yaml - triggered by the object detector

Both of these use HA webhooks. I used MQTT earlier, but it was too delayed and unreliable for my taste. The project still supports MQTT, though - you'll have to create MQTT sensors in HA for the topics you're sending the notifications to.

Configuration

Both stream.py and serve.py share some configuration for object detection, but stream.py builds on top of that with a lot more configuration for the motion detector, object detection buffer, pattern detector, stream input, etc. The example config documents the meaning of all the parameters.
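Since the configs are plain Python classes, a config module might look roughly like the sketch below. Every parameter name here (except tf_accuracy_threshold, which appears in serve.py's docs above) is an illustrative assumption - see the example configs in the repo for the real schema. It shows the kinds of knobs described earlier: motion detection thresholds, the detection buffer window, and a movement pattern expressed as a sequence of states:

# configs/my_config.py - purely illustrative; parameter names are assumptions,
# not the real Argos config schema
class Config:
    # motion detector
    min_contour_area = 500             # ignore motion boxes smaller than this
    negative_mask = (0, 0, 200, 150)   # (xmin, ymin, xmax, ymax) region to ignore

    # object detector
    tf_accuracy_threshold = 0.6        # minimum detection confidence
    detection_buffer_size = 10         # moving window of frames to average detections over

    # pattern detector: a movement pattern as an ordered sequence of states,
    # e.g. "person seen, motion outside the door mask, then inside it" = entering
    door_entry_pattern = [
        "object_person",               # object detector reported a person
        "motion_outside_mask",         # motion outside the door mask
        "motion_in_mask",              # then motion inside the door mask
    ]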

Performance

This runs at the following FPS with every component enabled:

device           component                  fps
raspberry pi 4B  motion detector            18 fps
raspberry pi 4B  object detector (tflite)   5 fps

I actually run multiple instances of this for different RTMP cameras, each at 1 fps (which is more than enough for all real-time home automation use cases).

Note:

This is my own personal project. It is not really written in a readable way with friendly abstractions, as that wasn't the goal - the goal was to solve my home automation problem quickly so that I could get back to real work :) So feel free to pick and choose snippets of code as you like, or the whole solution if it fits your use case. No compromises were made in performance or accuracy, only in 'coding best practices'. I usually keep such projects private, but thought this is now meaty enough to be usable to someone else in ways I cannot imagine, so don't judge this project on its maturity or reuse readiness ;) Feel free to fork this project and make it an extendable framework if you have the time.

If you have any questions, feel free to raise a GitHub issue and I'll respond as soon as possible.

Special thanks to these resources on the web for helping me build this.
