Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

naqs-for-quantum-chemistry
This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio quantum chemistry.


Figure: (a) Architecture of a neural autoregressive quantum state (NAQS). (b) Energy surface of N2.

TL;DR

Certain parts of the notebooks relating to generating molecular data are currently not working due to updates to the underlying OpenFermion and Psi4 packages (I'll fix it!) - however, the experimental results of NAQS can still be reproduced, as we also provide pre-generated data in this repository.

If you don't care about that for now and just want to see it running, here are two links to notebooks that will set up and run on Colab. Just note that Colab will not have enough memory to run experiments on the largest molecules we considered.

  • run_naqs.ipynb Open In Colab: Run individual experiments or batches of experiments, including those to recreate published results.

  • generate_molecular_data_and_baselines.ipynb Open In Colab:

    1. Create the required [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files (these are provided in the molecules directory for the molecules used in the paper).
    2. Solve these molecules with various canonical QC methods using Psi4.

Overview

Quantum chemistry with neural networks

A grand challenge of ab initio quantum chemistry (QC) is to solve the many-body Schrödinger equation describing the interaction of heavy nuclei and orbiting electrons. Unfortunately, this is an extremely (read, NP) hard problem, and so a significant amount of research effort has been, and continues to be, directed towards numerical methods in QC. Typically, these methods work by optimising the wavefunction in a basis set of "Slater determinants". (In practice these are antisymmetrised tensor products of single-electron orbitals, but for our purposes let's not worry about the details.) The number of Slater determinants - and so the complexity of optimisation - grows exponentially with the system size, but recently machine learning (ML) has emerged as a possible tool with which to tackle this seemingly intractable scaling issue.
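To make that scaling concrete, here is a small illustrative calculation (the molecules and basis are just examples; spin-up and spin-down electrons are placed independently in the spatial orbitals):

from math import comb

def n_determinants(n_spatial_orbitals, n_alpha, n_beta):
    # Determinants = ways to place spin-up electrons x ways to place spin-down.
    return comb(n_spatial_orbitals, n_alpha) * comb(n_spatial_orbitals, n_beta)

print(n_determinants(6, 2, 2))    # LiH in STO-3G: 225 determinants
print(n_determinants(10, 7, 7))   # N2 in STO-3G: 14,400 determinants

Going to larger bases or molecules multiplies both binomial factors, which is where the exponential wall comes from.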

Translation/disclaimer: we can use ML and it has displayed some promising properties, but right now the SOTA results still belong to the established numerical methods (e.g. coupled-cluster) in practical settings.

Project summary

We follow the approach proposed by Choo et al. to map the exponentially complex system of interacting fermions to an equivalent (and still exponentially large) system of interacting qubits (see their or our paper for details). The advantage is that we can then apply neural network quantum states (NNQS), originally developed for condensed matter physics (CMP) with distinguishable interacting particles, to electronic structure calculations (with indistinguishable electrons and fermionic anti-symmetries).
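For a flavour of what is then done with the resulting qubit Hamiltonian, below is a minimal sketch (not the repo's optimised Cython implementation) of the standard variational Monte Carlo local-energy estimator, E_loc(x) = sum_x' <x|H|x'> psi(x')/psi(x), written against OpenFermion's QubitOperator term format; psi here is a stand-in for any amplitude function:

def apply_pauli_string(term, bits):
    """Apply one Pauli string to a computational basis state: P|x> = phase * |x'>."""
    bits = list(bits)
    phase = 1.0 + 0.0j
    for qubit, op in term:  # OpenFermion term format, e.g. ((0, 'Z'), (2, 'X'))
        b = bits[qubit]
        if op == 'Z':
            phase *= (-1.0) ** b       # Z|b> = (-1)^b |b>
        elif op == 'X':
            bits[qubit] = 1 - b        # X flips the bit
        else:  # 'Y'
            phase *= 1j * (-1.0) ** b  # Y|b> = i(-1)^b |1-b>
            bits[qubit] = 1 - b
    return tuple(bits), phase

def local_energy(qubit_hamiltonian, psi, x):
    """E_loc(x) = sum over Hamiltonian terms of coeff * phase * psi(x') / psi(x)."""
    e_loc = 0.0 + 0.0j
    for term, coeff in qubit_hamiltonian.terms.items():
        x_prime, phase = apply_pauli_string(term, x)
        e_loc += coeff * phase * psi(x_prime) / psi(x)
    return e_loc

Averaging E_loc over samples drawn from |psi|^2 gives the variational energy, which is why sampling efficiency matters so much below.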

This project proposes that simply applying techniques from CMP to QC will inevitably fail to take advantage of our significant a priori knowledge of molecular systems. Moreover, the stochastic optimisation of NNQS relies on repeatedly sampling the wavefunction, which can be prohibitively expensive. This project is a sandbox for trialling different NNQS, in particular an ansatz based on autoregressive neural networks that we present in the paper. The major benefits of our approach are that it:

  1. allows for highly efficient sampling, especially of the asymmetric wavefunctions typically found in QC,
  2. allows for physical priors - such as conservation of electron number, overall spin and possible symmetries - to be embedded into the network without sacrificing expressibility (a toy sketch of the sampling-with-masking idea follows this list).
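As a toy illustration of both points, the sketch below samples bitstrings autoregressively while masking out any choice that would break electron-number conservation. It is a deliberate simplification: net is a hypothetical conditional model, and the paper's actual architecture factorises over spin-up/spin-down orbitals and handles amplitudes and phases separately.

import torch

def sample_with_number_mask(net, n_qubits, n_elec, batch):
    """Autoregressively sample bitstrings with exactly n_elec ones.

    net(prefix) is assumed to return logits of shape (batch, 2) for the
    conditional distribution of the next qubit given the sampled prefix.
    """
    bits = torch.zeros(batch, n_qubits, dtype=torch.long)
    ones = torch.zeros(batch, dtype=torch.long)
    for i in range(n_qubits):
        logits = net(bits[:, :i]).clone()            # (batch, 2)
        remaining = n_qubits - i                     # qubits left, incl. this one
        # Mask choices that make exactly n_elec ones unreachable.
        must_be_zero = ones == n_elec                # already placed all electrons
        must_be_one = (ones + remaining) == n_elec   # every remaining site must be 1
        logits[must_be_zero, 1] = -float('inf')
        logits[must_be_one, 0] = -float('inf')
        probs = torch.softmax(logits, dim=-1)
        sample = torch.multinomial(probs, 1).squeeze(-1)
        bits[:, i] = sample
        ones += sample
    return bits

Because every sample is generated in a single forward sweep (no Markov chain), and invalid configurations are never proposed, no samples are wasted on states with the wrong electron number.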

Getting started

In this repo

notebooks
  • run_naqs.ipynb Open In Colab: Run individual experiments or batches of experiments, including those to recreate published results.

  • generate_molecular_data_and_baselines.ipynb Open In Colab:

    1. Create the required [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files (these are provided in the molecules directory for the molecules used in the paper).
    2. Solve these molecules with various canonical QC methods using Psi4 (a sketch of this pipeline follows).
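A hedged sketch of this data-generation pipeline, assuming OpenFermion 0.11.0 with the openfermionpsi4 plugin (the LiH geometry and output filename here are illustrative; the notebook is the authoritative version):

import pickle

from openfermion.hamiltonians import MolecularData
from openfermion.transforms import get_fermion_operator, jordan_wigner
from openfermionpsi4 import run_psi4

geometry = [('Li', (0., 0., 0.)), ('H', (0., 0., 1.595))]  # assumed bond length (angstrom)
molecule = MolecularData(geometry, basis='sto-3g', multiplicity=1)

# Run the Psi4 backend calculations (SCF, FCI baselines) and save the .hdf5 file.
molecule = run_psi4(molecule, run_scf=True, run_fci=True)
molecule.save()

# Map the second-quantised Hamiltonian to qubits and pickle it.
qubit_hamiltonian = jordan_wigner(get_fermion_operator(
    molecule.get_molecular_hamiltonian()))
with open('LiH_qubit_hamiltonian.pkl', 'wb') as f:
    pickle.dump(qubit_hamiltonian, f)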
experiments

Experimental scripts for NAQS and Psi4, including those to reproduce published results.

molecules

The molecular data required to reproduce published results.

src / src_cpp

Python and Cython source code for the main codebase and fast calculations, respectively.

Running experiments

Further details are provided in the run_naqs.ipynb notebook; however, the published experiments can be run using the provided batch scripts.

>>> experiments/bash/naqs/batch_train.sh 0 LiH

Here, 0 is the GPU number to use (if one is available; otherwise the CPU will be used by default) and LiH can be replaced by any folder in the molecules directory. Similarly, the experimental ablations can be run using the corresponding bash scripts.

>>> experiments/bash/naqs/batch_train_no_amp_sym.sh 0 LiH
>>> experiments/bash/naqs/batch_train_no_mask.sh 0 LiH
>>> experiments/bash/naqs/batch_train_full_mask.sh 0 LiH

Requirements

The underlying neural networks require PyTorch. The molecular systems are handled by OpenFermion, with the backend calculations and baselines requiring Psi4. Note that this code expects OpenFermion 0.11.0 and will need refactoring to work with newer versions. All other required packages - numpy, matplotlib, seaborn if you want pretty plots, etc. - are standard. To be concrete, the linked Colab notebooks provide an environment in which the code can be run.

Reference

If you find this project or the associated paper useful, it can be cited as below.

@article{barrett2021autoregressive,
  title={Autoregressive neural-network wavefunctions for ab initio quantum chemistry},
  author={Barrett, Thomas D and Malyshev, Aleksei and Lvovsky, AI},
  journal={arXiv preprint arXiv:2109.12606},
  year={2021}
}