Official implementation of "Implicit Neural Representations with Periodic Activation Functions"

Related tags

Deep Learningsiren
Overview

Implicit Neural Representations with Periodic Activation Functions

Project Page | Paper | Data

Explore Siren in Colab

Vincent Sitzmann*, Julien N. P. Martel*, Alexander W. Bergman, David B. Lindell, Gordon Wetzstein
Stanford University, *denotes equal contribution

This is the official implementation of the paper "Implicit Neural Representations with Periodic Activation Functions".

siren_video

Google Colab

If you want to experiment with Siren, we have written a Colab. It's quite comprehensive and comes with a no-frills, drop-in implementation of SIREN. It doesn't require installing anything, and goes through the following experiments / SIREN properties:

  • Fitting an image
  • Fitting an audio signal
  • Solving Poisson's equation
  • Initialization scheme & distribution of activations
  • Distribution of activations is shift-invariant
  • Periodicity & behavior outside of the training range.

Tensorflow Playground

You can also play arond with a tiny SIREN interactively, directly in the browser, via the Tensorflow Playground here. Thanks to David Cato for implementing this!

Get started

If you want to reproduce all the results (including the baselines) shown in the paper, the videos, point clouds, and audio files can be found here.

You can then set up a conda environment with all dependencies like so:

conda env create -f environment.yml
conda activate siren

High-Level structure

The code is organized as follows:

  • dataio.py loads training and testing data.
  • training.py contains a generic training routine.
  • modules.py contains layers and full neural network modules.
  • meta_modules.py contains hypernetwork code.
  • utils.py contains utility functions, most promintently related to the writing of Tensorboard summaries.
  • diff_operators.py contains implementations of differential operators.
  • loss_functions.py contains loss functions for the different experiments.
  • make_figures.py contains helper functions to create the convergence videos shown in the video.
  • ./experiment_scripts/ contains scripts to reproduce experiments in the paper.

Reproducing experiments

The directory experiment_scripts contains one script per experiment in the paper.

To monitor progress, the training code writes tensorboard summaries into a "summaries"" subdirectory in the logging_root.

Image experiments

The image experiment can be reproduced with

python experiment_scripts/train_img.py --model_type=sine

The figures in the paper were made by extracting images from the tensorboard summaries. Example code how to do this can be found in the make_figures.py script.

Audio experiments

This github repository comes with both the "counting" and "bach" audio clips under ./data.

They can be trained with

python experiment_scipts/train_audio.py --model_type=sine --wav_path=<path_to_audio_file>

Video experiments

The "bikes" video sequence comes with scikit-video and need not be downloaded. The cat video can be downloaded with the link above.

To fit a model to a video, run

python experiment_scipts/train_video.py --model_type=sine --experiment_name bikes_video

Poisson experiments

For the poisson experiments, there are three separate scripts: One for reconstructing an image from its gradients (train_poisson_grad_img.py), from its laplacian (train_poisson_lapl_image.py), and to combine two images (train_poisson_gradcomp_img.py).

Some of the experiments were run using the BSD500 datast, which you can download here.

SDF Experiments

To fit a Signed Distance Function (SDF) with SIREN, you first need a pointcloud in .xyz format that includes surface normals. If you only have a mesh / ply file, this can be accomplished with the open-source tool Meshlab.

To reproduce our results, we provide both models of the Thai Statue from the 3D Stanford model repository and the living room used in our paper for download here.

To start training a SIREN, run:

python experiments_scripts/train_single_sdf.py --model_type=sine --point_cloud_path=<path_to_the_model_in_xyz_format> --batch_size=250000 --experiment_name=experiment_1

This will regularly save checkpoints in the directory specified by the rootpath in the script, in a subdirectory "experiment_1". The batch_size is typically adjusted to fit in the entire memory of your GPU. Our experiments show that with a 256, 3 hidden layer SIREN one can set the batch size between 230-250'000 for a NVidia GPU with 12GB memory.

To inspect a SDF fitted to a 3D point cloud, we now need to create a mesh from the zero-level set of the SDF. This is performed with another script that uses a marching cubes algorithm (adapted from the DeepSDF github repo) and creates the mesh saved in a .ply file format. It can be called with:

python experiments_scripts/test_single_sdf.py --checkpoint_path=<path_to_the_checkpoint_of_the_trained_model> --experiment_name=experiment_1_rec 

This will save the .ply file as "reconstruction.ply" in "experiment_1_rec" (be patient, the marching cube meshing step takes some time ;) ) In the event the machine you use for the reconstruction does not have enough RAM, running test_sdf script will likely freeze. If this is the case, please use the option --resolution=512 in the command line above (set to 1600 by default) that will reconstruct the mesh at a lower spatial resolution.

The .ply file can be visualized using a software such as Meshlab (a cross-platform visualizer and editor for 3D models).

Helmholtz and wave equation experiments

The helmholtz and wave equation experiments can be reproduced with the train_wave_equation.py and train_helmholtz.py scripts.

Torchmeta

We're using the excellent torchmeta to implement hypernetworks. We realized that there is a technical report, which we forgot to cite - it'll make it into the camera-ready version!

Citation

If you find our work useful in your research, please cite:

@inproceedings{sitzmann2019siren,
    author = {Sitzmann, Vincent
              and Martel, Julien N.P.
              and Bergman, Alexander W.
              and Lindell, David B.
              and Wetzstein, Gordon},
    title = {Implicit Neural Representations
              with Periodic Activation Functions},
    booktitle = {arXiv},
    year={2020}
}

Contact

If you have any questions, please feel free to email the authors.

Owner
Vincent Sitzmann
Incoming Assistant Professor @mit EECS. I'm researching neural scene representations - the way neural networks learn to represent information on our world.
Vincent Sitzmann
noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.

ProSelfLC: CVPR 2021 ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks For any specific discussion or potential fu

amos_xwang 57 Dec 04, 2022
A PaddlePaddle implementation of STGCN with a few modifications in the model architecture in order to forecast traffic jam.

About This repository contains the code of a PaddlePaddle implementation of STGCN based on the paper Spatio-Temporal Graph Convolutional Networks: A D

Tianjian Li 1 Jan 11, 2022
Lightweight, Python library for fast and reproducible experimentation :microscope:

Steppy What is Steppy? Steppy is a lightweight, open-source, Python 3 library for fast and reproducible experimentation. Steppy lets data scientist fo

minerva.ml 134 Jul 10, 2022
Defocus Map Estimation and Deblurring from a Single Dual-Pixel Image

Defocus Map Estimation and Deblurring from a Single Dual-Pixel Image This repository is an implementation of the method described in the following pap

21 Dec 15, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
Implementations of LSTM: A Search Space Odyssey variants and their training results on the PTB dataset.

An LSTM Odyssey Code for training variants of "LSTM: A Search Space Odyssey" on Fomoro. Check out the blog post. Training Install TensorFlow. Clone th

Fomoro AI 95 Apr 13, 2022
(NeurIPS '21 Spotlight) IQ-Learn: Inverse Q-Learning for Imitation

Inverse Q-Learning (IQ-Learn) Official code base for IQ-Learn: Inverse soft-Q Learning for Imitation, NeurIPS '21 Spotlight IQ-Learn is an easy-to-use

Divyansh Garg 102 Dec 20, 2022
Code for the paper One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation, CVPR 2021.

One Thing One Click One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation (CVPR2021) Code for the paper One Thi

44 Dec 12, 2022
Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Pop-Out Motion Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022) Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Ky

Jihyun Lee 88 Nov 22, 2022
Improving adversarial robustness by a coupling rejection strategy

Adversarial Training with Rectified Rejection The code for the paper Adversarial Training with Rectified Rejection. Environment settings and libraries

Tianyu Pang 29 Jan 06, 2023
[ICCV'21] Pri3D: Can 3D Priors Help 2D Representation Learning?

Pri3D: Can 3D Priors Help 2D Representation Learning? [ICCV 2021] Pri3D leverages 3D priors for downstream 2D image understanding tasks: during pre-tr

Ji Hou 124 Jan 06, 2023
Line-level Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Line-level Handwritten Text Recognition with TensorFlow This model is an extended version of the Simple HTR system implemented by @Harald Scheidl and

Hoàng Tùng Lâm (Linus) 72 May 07, 2022
This repository allows the user to automatically scale a 3D model/mesh/point cloud on Agisoft Metashape

Metashape-Utils This repository allows the user to automatically scale a 3D model/mesh/point cloud on Agisoft Metashape, given a set of 2D coordinates

INSCRIBE 4 Nov 07, 2022
Tutorial repo for an end-to-end Data Science project

End-to-end Data Science project This is the repo with the notebooks, code, and additional material used in the ITI's workshop. The goal of the session

Deena Gergis 127 Dec 30, 2022
Pre-trained Deep Learning models and demos (high quality and extremely fast)

OpenVINO™ Toolkit - Open Model Zoo repository This repository includes optimized deep learning models and a set of demos to expedite development of hi

OpenVINO Toolkit 3.4k Dec 31, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

TensorFlow implementation of 3D Convolutional Neural Networks for Speaker Verification - Official Project Page - Pytorch Implementation This repositor

Amirsina Torfi 753 Dec 17, 2022
Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling

TGraM Multi-Object Tracking in Satellite Videos with Graph-Based Multi-Task Modeling, Qibin He, Xian Sun, Zhiyuan Yan, Beibei Li, Kun Fu Abstract Rece

Qibin He 6 Nov 25, 2022
GBIM(Gesture-Based Interaction map)

手势交互地图 GBIM(Gesture-Based Interaction map),基于视觉深度神经网络的交互地图,通过电脑摄像头观察使用者的手势变化,进而控制地图进行简单的交互。网络使用PaddleX提供的轻量级模型PPYOLO Tiny以及MobileNet V3 small,使得整个模型大小约10MB左右,即使在CPU下也能快速定位和识别手势。

8 Feb 10, 2022
Realtime_Multi-Person_Pose_Estimation

Introduction Multi Person PoseEstimation By PyTorch Results Require Pytorch Installation git submodule init && git submodule update Demo Download conv

tensorboy 1.3k Jan 05, 2023