PyTorch code for ICPR 2020 paper Future Urban Scene Generation Through Vehicle Synthesis

Overview

Future urban scene generation through vehicle synthesis

This repository contains Pytorch code for the ICPR2020 paper "Future Urban Scene Generation Through Vehicle Synthesis" [arXiv]

Model architecture

Our framework is composed by two stages:

  1. Interpretable information extraction: high level interpretable information is gathered from raw RGB frames (bounding boxes, trajectories, keypoints).
  2. Novel view completion: condition a reprojected 3D model with the original 2D appearance.

Multi stage pipeline

Abstract

In this work we propose a deep learning pipeline to predict the visual future appearance of an urban scene. Despite recent advances, generating the entire scene in an end-to-end fashion is still far from being achieved. Instead, here we follow a two stage approach, where interpretable information are included in the loop and each actor is modelled independently. We leverage a per-object novel view synthesis paradigm; i.e. generating a synthetic representation of an object undergoing a geometrical roto-translation in the 3D space. Our model can be easily conditioned with constraints (e.g. input trajectories) provided by state-of-the-art tracking methods or by the user.

Sequence result example


Code

Code was tested with an Anaconda environment (Python version 3.6) on both Linux and Windows based systems.

Install

Run the following commands to install all requirements in a new virtual environment:

conda create -n <env_name> python=3.6
conda activate <env_name>
pip install -r requirements.txt

Install PyTorch package (version 1.3 or above).

How to run test

To run the demo of our project, please firstly download all the required data at this link and save them in a of your choice. We tested our pipeline on the Cityflow dataset that already have annotated bounding boxes and trajectories of vehicles.

The test script is run_test.py that expects some arguments as mandatory: video, 3D keypoints and checkpoints directories.

python run_test.py <data_dir>/<video_dir> <data_dir>/pascal_cads <data_dir>/checkpoints --det_mode ssd512|yolo3|mask_rcnn --track_mode tc|deepsort|moana --bbox_scale 1.15 --device cpu|cuda

Add the parameter --inpaint to use the inpainting on the vehicle instead of the static background.

Description and GUI usage

If everything went well, you should see the main GUI in which you can choose whichever vehicle you want that was detected in the video frame or change the video frame.

GUI window

The commands working on this window are:

  1. RIGHT ARROW = go to next frame
  2. LEFT ARROW = go to previous frame
  3. SINGLE MOUSE LEFT BUTTON CLICK = visualize car trajectory
  4. BACKSPACE = delete the drawn trajectories
  5. DOUBLE MOUSE LEFT BUTTON CLICK = select one of the vehicles bounding boxes

Once you selected some vehicles of your chioce by double-clicking in their bounding boxes, you can push the RUN button to start the inference. The resulting frames will be saved in ./results directory.

Cite

If you find this repository useful for your research, please cite the following paper:

@inproceedings{simoni2021future,
  title={Future urban scenes generation through vehicles synthesis},
  author={Simoni, Alessandro and Bergamini, Luca and Palazzi, Andrea and Calderara, Simone and Cucchiara, Rita},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={4552--4559},
  year={2021},
  organization={IEEE}
}
Owner
Alessandro Simoni
PhD Student @ University of Modena and Reggio Emilia (@aimagelab)
Alessandro Simoni
MEND: Model Editing Networks using Gradient Decomposition

MEND: Model Editing Networks using Gradient Decomposition Setup Environment This codebase uses Python 3.7.9. Other versions may work as well. Create a

Eric Mitchell 141 Dec 02, 2022
Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy" (ICLR 2022 Spotlight)

About Code release for Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy (ICLR 2022 Spotlight)

THUML @ Tsinghua University 221 Dec 31, 2022
Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).

Densely Connected Convolutional Networks (DenseNets) This repository contains the code for DenseNet introduced in the following paper Densely Connecte

Zhuang Liu 4.5k Jan 03, 2023
Neural Point-Based Graphics

Neural Point-Based Graphics Project   Video   Paper Neural Point-Based Graphics Kara-Ali Aliev1 Artem Sevastopolsky1,2 Maria Kolos1,2 Dmitry Ulyanov3

Ali Aliev 252 Dec 13, 2022
Code for "Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks", CVPR 2021

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks This repository contains the code that accompanies our CVPR 20

Despoina Paschalidou 161 Dec 20, 2022
Official implementation for the paper: Multi-label Classification with Partial Annotations using Class-aware Selective Loss

Multi-label Classification with Partial Annotations using Class-aware Selective Loss Paper | Pretrained models Official PyTorch Implementation Emanuel

99 Dec 27, 2022
Revisiting Video Saliency: A Large-scale Benchmark and a New Model (CVPR18, PAMI19)

DHF1K =========================================================================== Wenguan Wang, J. Shen, M.-M Cheng and A. Borji, Revisiting Video Sal

Wenguan Wang 126 Dec 03, 2022
Refactoring dalle-pytorch and taming-transformers for TPU VM

Text-to-Image Translation (DALL-E) for TPU in Pytorch Refactoring Taming Transformers and DALLE-pytorch for TPU VM with Pytorch Lightning Requirements

Kim, Taehoon 61 Nov 07, 2022
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, & Ling Shao This

Daniel-Ji 55 Dec 22, 2022
Code for paper Adaptively Aligned Image Captioning via Adaptive Attention Time

Adaptively Aligned Image Captioning via Adaptive Attention Time This repository includes the implementation for Adaptively Aligned Image Captioning vi

Lun Huang 45 Aug 27, 2022
Single/multi view image(s) to voxel reconstruction using a recurrent neural network

3D-R2N2: 3D Recurrent Reconstruction Neural Network This repository contains the source codes for the paper Choy et al., 3D-R2N2: A Unified Approach f

Chris Choy 1.2k Dec 27, 2022
The official repository for Deep Image Matting with Flexible Guidance Input

FGI-Matting The official repository for Deep Image Matting with Flexible Guidance Input. Paper: https://arxiv.org/abs/2110.10898 Requirements easydict

Hang Cheng 51 Nov 10, 2022
Implementation of SwinTransformerV2 in TensorFlow.

SwinTransformerV2-TensorFlow A TensorFlow implementation of SwinTransformerV2 by Microsoft Research Asia, based on their official implementation of Sw

Phan Nguyen 2 May 30, 2022
Implementation of Axial attention - attending to multi-dimensional data efficiently

Axial Attention Implementation of Axial attention in Pytorch. A simple but powerful technique to attend to multi-dimensional data efficiently. It has

Phil Wang 250 Dec 25, 2022
Kinetics-Data-Preprocessing

Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow

Kaihua Tang 7 Oct 27, 2022
Unsupervised Image to Image Translation with Generative Adversarial Networks

Unsupervised Image to Image Translation with Generative Adversarial Networks Paper: Unsupervised Image to Image Translation with Generative Adversaria

Hao 71 Oct 30, 2022
A GPT, made only of MLPs, in Jax

MLP GPT - Jax (wip) A GPT, made only of MLPs, in Jax. The specific MLP to be used are gMLPs with the Spatial Gating Units. Working Pytorch implementat

Phil Wang 53 Sep 27, 2022
CATE: Computation-aware Neural Architecture Encoding with Transformers

CATE: Computation-aware Neural Architecture Encoding with Transformers Code for paper: CATE: Computation-aware Neural Architecture Encoding with Trans

16 Dec 27, 2022
[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

Yi Wei 369 Dec 24, 2022