Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

Overview

naqs-for-quantum-chemistry

Generic badge MIT License


This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio quantum chemistry.


(a) Architecture of a neural autoregressive quantum state (NAQS) (b) Energy surface of N2

TL;DR

Certain parts of the notebooks relating to generating molecular data are currently not working due to updates to the underlying OpenFermion and Psi4 packages (I'll fix it!) - however the experimental results of NAQS can still be reproduced as we also provide pre-generated data in this repository.

If you don't care for now, and you just want to see it running, here are two links to notebooks that will set-up and run on Colab. Just note that Colab will not have enough memory to run experiments on the largest molecules we considered.

  • run_naqs.ipynb Open In Colab: Run individual experiments or batches of experiments, including those to recreate published results.

  • generate_molecular_data_and_baselines.ipynb Open In Colab:

    1. Create the [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files required (these are provided for molecules used in the paper in the molecules directory.)
    2. Solve these molecules using various canconical QC methods using Psi4.

Overview

Quantum chemistry with neural networks

A grand challenge of ab-inito quantum chemistry (QC) is to solve the many-body Schrodinger equation describing interaction of heavy nuclei and orbiting electrons. Unfortunatley, this is an extremely (read, NP) hard problem, and so a significant amout of research effort has, and continues, to be directed towards numerical methods in QC. Typically, these methods work by optimising the wavefunction in a basis set of "Slater determinants". (In practice these are anti-symetterised tensor products of single-electron orbitals, but for our purposes let's not worry about the details.) Typically, the number of Slater determinants - and so the complexity of optimisation - grows exponentially with the system size, but recently machine learning (ML) has emerged as a possible tool with which to tackle this seemingly intractable scaling issue.

Translation/disclaimer: we can use ML and it has displayed some promising properties, but right now the SOTA results still belong to the established numerical methods (e.g. coupled-cluster) in practical settings.

Project summary

We follow the approach proposed by Choo et al. to map the exponentially complex system of interacting fermions to an equivilent (and still exponentially large) system of interacting qubits (see their or our paper for details). The advantage being that we can then apply neural network quantum states (NNQS) originally developed for condensed matter physics (CMP) (with distinguishable interacting particles) to the electron structure calculations (with indistinguishable electrons and fermionic anti-symettries).

This project proposes that simply applying techniques from CMP to QC will inevitably fail to take advantage of our significant a priori knowledge of molecular systems. Moreover, the stochastic optimisation of NNQS relies on repeatedly sampling the wavefunction, which can be prohibitively expensive. This project is a sandbox for trialling different NNQS, in particular an ansatz based on autoregressive neural networks that we present in the paper. The major benefits of our approach are that it:

  1. allows for highly efficient sampling, especially of the highly asymmetric wavefunction typical found in QC,
  2. allows for physical priors - such as conservation of electron number, overall spin and possible symettries - to be embedded into the network without sacrificing expressibility.

Getting started

In this repo

notebooks
  • run_naqs.ipynb Open In Colab: Run individual experiments or batches of experiments, including those to recreate published results.

  • generate_molecular_data_and_baselines.ipynb Open In Colab:

    1. Create the [molecule].hdf5 and [molecule]_qubit_hamiltonian.pkl files required (these are provided for molecules used in the paper in the molecules directory.)
    2. Solve these molecules using various canconical QC methods using Psi4.
experiments

Experimental scripts, including those to reproduced published results, for NAQS and Psi4.

molecules

The molecular data required to reproduce published results.

src / src_cpp

Python and cython source code for the main codebase and fast calculations, respectively.

Running experiments

Further details are provided in the run_naqs.ipynb notebook, however the published experiments can be run using the provided batch scripts.

>>> experiments/bash/naqs/batch_train.sh 0 LiH

Here, 0 is the GPU number to use (if one is available, otherwise the CPU will be used by default) and LiH can be replaced by any folder in the molecules directory. Similarly, the experimental ablations can be run using the corresponding bash scripts.

>>> experiments/bash/naqs/batch_train_no_amp_sym.sh 0 LiH
>>> experiments/bash/naqs/batch_train_no_mask.sh 0 LiH
>>> experiments/bash/naqs/batch_train_full_mask.sh 0 LiH

Requirements

The underlying neural networks require PyTorch. The molecular systems are typically handled by OpenFermion with the backend calculations and baselines requiring and Psi4. Note that this code expects OpenFermion 0.11.0 and will need refactoring to work with newer versions. Otherwise, all other required packages - numpy, matplotlib, seaborn if you want pretty plots etc - are standard. However, to be concrete, the linked Colab notebooks will provide an environment in which the code can be run.

Reference

If you find this project or the associated paper useful, it can be cited as below.

@article{barrett2021autoregressive,
  title={Autoregressive neural-network wavefunctions for ab initio quantum chemistry},
  author={Barrett, Thomas D and Malyshev, Aleksei and Lvovsky, AI},
  journal={arXiv preprint arXiv:2109.12606},
  year={2021}
}
You might also like...
TensorFlow code for the neural network presented in the paper:
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

NeuralTextures This is repository with inference code for paper "StylePeople: A Generative Model of Fullbody Human Avatars" (CVPR21). This code is for

A code generator from ONNX to PyTorch code

onnx-pytorch Generating pytorch code from ONNX. Currently support onnx==1.9.0 and torch==1.8.1. Installation From PyPI pip install onnx-pytorch From

This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

KGI (Knowledge Graph Induction) for slot filling This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code fo

Convert Python 3 code to CUDA code.

Py2CUDA Convert python code to CUDA. Usage To convert a python file say named py_file.py to CUDA, run python generate_cuda.py --file py_file.py --arch

Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Transformers for variable misuse, function naming and code completion tasks The official PyTorch implementation of: Empirical Study of Transformers fo

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

This repository is a toolkit to do machine learning for programming languages. It implements tokenization, dataset preprocessing, model training and m

Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

CoProtector Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Low-code/No-code approach for deep learning inference on devices
Low-code/No-code approach for deep learning inference on devices

EzEdgeAI A concept project that uses a low-code/no-code approach to implement deep learning inference on devices. It provides a componentized framewor

Comments
  • pip installation

    pip installation

    Great code. It runs very smoothly and clearly outperforms the results in Choo et al. Would you consider re-engineering the code slightly to allow for a pipy installation?

    opened by kastoryano 0
Releases(v1.0.0)
Owner
Tom Barrett
Research Scientist @ InstaDeep, formerly postdoctoral researcher @ Oxford. RL, GNN's, quantum physics, optical computing and the intersection thereof.
Tom Barrett
Pytorch implementation of DeepMind's differentiable neural computer paper.

DNC pytorch This is a Pytorch implementation of DeepMind's Differentiable Neural Computer (DNC) architecture introduced in their recent Nature paper:

Yuanpu Xie 91 Nov 21, 2022
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight) Abstract Due to the limited and even imbalanced dat

Hanzhe Hu 99 Dec 12, 2022
Pytorch implementation of Masked Auto-Encoder

Masked Auto-Encoder (MAE) Pytorch implementation of Masked Auto-Encoder: Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick

Jiyuan 22 Dec 13, 2022
An Efficient Implementation of Analytic Mesh Algorithm for 3D Iso-surface Extraction from Neural Networks

AnalyticMesh Analytic Marching is an exact meshing solution from neural networks. Compared to standard methods, it completely avoids geometric and top

Karbo 45 Dec 21, 2022
🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series

🌾 PASTIS 🌾 Panoptic Agricultural Satellite TIme Series (optical and radar) The PASTIS Dataset Dataset presentation PASTIS is a benchmark dataset for

86 Jan 04, 2023
A Graph Neural Network Tool for Recovering Dense Sub-graphs in Random Dense Graphs.

PYGON A Graph Neural Network Tool for Recovering Dense Sub-graphs in Random Dense Graphs. Installation This code requires to install and run the graph

Yoram Louzoun's Lab 0 Jun 25, 2021
Teaching end to end workflow of deep learning

Deep-Education This repository is now available for public use for teaching end to end workflow of deep learning. This implies that learners/researche

Data Lab at College of William and Mary 2 Sep 26, 2022
MogFace: Towards a Deeper Appreciation on Face Detection

MogFace: Towards a Deeper Appreciation on Face Detection Introduction In this repo, we propose a promising face detector, termed as MogFace. Our MogFa

48 Dec 20, 2022
Using modified BiSeNet for face parsing in PyTorch

face-parsing.PyTorch Contents Training Demo References Training Prepare training data: -- download CelebAMask-HQ dataset -- change file path in the pr

zll 1.6k Jan 08, 2023
WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction"

BiRTE WSDM2022 "A Simple but Effective Bidirectional Extraction Framework for Relational Triple Extraction" Requirements The main requirements are: py

9 Dec 27, 2022
AFLFast (extends AFL with Power Schedules)

AFLFast Power schedules implemented by Marcel Böhme [email protected]

Marcel Böhme 380 Jan 03, 2023
Generate images from texts. In Russian. In PaddlePaddle

ruDALL-E PaddlePaddle ruDALL-E in PaddlePaddle. Install: pip install rudalle_paddle==0.0.1rc1 Run with free v100 on AI Studio. Original Pytorch versi

AgentMaker 20 Oct 18, 2022
code and data for paper "GIANT: Scalable Creation of a Web-scale Ontology"

GIANT Code and data for paper "GIANT: Scalable Creation of a Web-scale Ontology" https://arxiv.org/pdf/2004.02118.pdf Please cite our paper if this pr

Excalibur 39 Dec 29, 2022
PyTorch implementation of UNet++ (Nested U-Net).

PyTorch implementation of UNet++ (Nested U-Net) This repository contains code for a image segmentation model based on UNet++: A Nested U-Net Architect

4ui_iurz1 642 Jan 04, 2023
Gauge equivariant mesh cnn

Geometric Mesh CNN The code in this repository is an implementation of the Gauge Equivariant Mesh CNN introduced in the paper Gauge Equivariant Mesh C

50 Dec 18, 2022
MISSFormer: An Effective Medical Image Segmentation Transformer

MISSFormer Code for paper "MISSFormer: An Effective Medical Image Segmentation Transformer". Please read our preprint at the following link: paper_add

Fong 22 Dec 24, 2022
[NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training Code for NeurIPS 2021 paper "Better Safe Than Sorry: Preventing Delu

Lue Tao 29 Sep 20, 2022
Multi-scale discriminator feature-wise loss function

Multi-Scale Discriminative Feature Loss This repository provides code for Multi-Scale Discriminative Feature (MDF) loss for image reconstruction algor

Graphics and Displays group - University of Cambridge 76 Dec 12, 2022
On the Limits of Pseudo Ground Truth in Visual Camera Re-Localization

On the Limits of Pseudo Ground Truth in Visual Camera Re-Localization This repository contains the evaluation code and alternative pseudo ground truth

Torsten Sattler 36 Dec 22, 2022
data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

C2F-FWN data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer" (https://arxiv.org/abs/

EKILI 46 Dec 14, 2022