Towards Representation Learning for Atmospheric Dynamics (AtmoDist)

Related tags

Deep LearningAtmoDist
Overview

Towards Representation Learning for Atmospheric Dynamics (AtmoDist)

The prediction of future climate scenarios under anthropogenic forcing is critical to understand climate change and to assess the impact of potentially counter-acting technologies. Machine learning and hybrid techniques for this prediction rely on informative metrics that are sensitive to pertinent but often subtle influences. For atmospheric dynamics, a critical part of the climate system, no well established metric exists and visual inspection is currently still often used in practice. However, this "eyeball metric" cannot be used for machine learning where an algorithmic description is required. Motivated by the success of intermediate neural network activations as basis for learned metrics, e.g. in computer vision, we present a novel, self-supervised representation learning approach specifically designed for atmospheric dynamics. Our approach, called AtmoDist, trains a neural network on a simple, auxiliary task: predicting the temporal distance between elements of a randomly shuffled sequence of atmospheric fields (e.g. the components of the wind field from reanalysis or simulation). The task forces the network to learn important intrinsic aspects of the data as activations in its layers and from these hence a discriminative metric can be obtained. We demonstrate this by using AtmoDist to define a metric for GAN-based super resolution of vorticity and divergence. Our upscaled data matches both visually and in terms of its statistics a high resolution reference closely and it significantly outperform the state-of-the-art based on mean squared error. Since AtmoDist is unsupervised, only requires a temporal sequence of fields, and uses a simple auxiliary task, it has the potential to be of utility in a wide range of applications.

Original implementation of

Hoffmann, Sebastian, and Christian Lessig. "Towards Representation Learning for Atmospheric Dynamics." arXiv preprint arXiv:2109.09076 (2021). https://arxiv.org/abs/2109.09076

presented as part of the NeurIPS 2021 Workshop on Tackling Climate Change with Machine Learning

We would like to thank Stengel et al. for openly making available their implementation (https://github.com/NREL/PhIRE) of Adversarial super-resolution of climatological wind and solar data on which we directly based the super-resolution part of this work.


Requirements

  • tensorflow 1.15.5
  • pyshtools (for SR evaluation)
  • pyspharm (for SR evaluation)
  • h5py
  • hdf5plugin
  • dask.array

Installation

pip install -e ./

This also makes available multiple command line tools that provide easy access to preprocessing, training, and evaluation routines. It's recommended to install the project in a virtual environment as to not polutte the global PATH.


CLI Tools

The provided CLI tools don't accept parameters but rather act as a shortcut to execute the corresponding script files. All parameters controlling the behaviour of the training etc. should thus be adjusted in the script files directly. We list both the command-line command, as well as the script file the command executes.

  • rplearn-data (python/phire/data_tool.py)
    • Samples patches and generates .tfrecords files from HDF5 data for the self-supervised representation-learning task.
  • rplearn-train (python/phire/rplearn/train.py)
    • Trains the representation network. By toggling comments, the same script is also used for evaluation of the trained network.
  • phire-data (python/phire/data_tool.py)
    • Samples patches and generates .tfrecords files from HDF5 data for the super-resolution task.
  • phire-train (python/phire/main.py)
    • Trains the SRGAN model using either MSE or a content-loss based on AtmoDist.
  • phire-eval (python/phire/evaluation/cli.py)
    • Evaluates trained SRGAN models using various metrics (e.g. energy spectrum, semivariogram, etc.). Generation of images is also part of this.

Project Structure

  • python/phire
    • Mostly preserved from the Stengel et al. implementation, this directory contains the code for the SR training. sr_network.py contains the actual GAN model, whereas PhIREGANs.py contains the main training loop, data pipeline, as well as interference procedure.
  • python/phire/rplearn
    • Contains everything related to representation learning task, i.e. AtmoDist. The actual ResNet models are defined in resnet.py, while the training procedure can be found in train.py.
  • python/phire/evaluation
    • Dedicated to the evaluation of the super-resolved fields. The main configuration of the evaluation is done in cli.py, while the other files mostly correspond to specific evaluation metrics.
  • python/phire/data
    • Static data shipped with the python package.
  • python/phire/jetstream
    • WiP: Prediction of jetstream latitude as downstream task.
  • scripts/
    • Various utility scripts, e.g. used to generate some of the figures seen in the paper.

Preparing the data

AtmoDist is trained on vorticity and divergence fields from ERA5 reanalysis data. The data was directly obtained as spherical harmonic coefficients from model level 120, before being converted to regular lat-lon grids (1280 x 2560) using pyshtools (right now not included in this repository).

We assume this gridded data to be stored in a hdf5 file for training and evaluation respectively containing a single dataset /data with dimensions C x T x H x W. These dimensions correspond to channel (/variable), time, height, and width respectively. Patches are then sampled from this hdf5 data and stored in .tfrecords files for training.

In practice, these "master" files actually contained virtual datasets, while the actual data was stored as one hdf5 file per year. This is however not a hard requirement. The script to create these virtual datasets is currently not included in the repository but might be at a later point of time.

To sample patches for training or evaluation run rplearn-data and phire-data.

Normalization

Normalization is done by the phire/data_tool.py script. This procedure is opaque to the models and data is only de-normalized during evaluation. The mean and standard deviations used for normalization can be specified using DataSampler.mean, DataSampler.std, DataSampler.mean_log1p, DataSampler.std_log1p. If specified as None, then these statistics will be calculated from the dataset using dask (this will take some time).


Training the AtmoDist model

  1. Specify dataset location, model name, output location, and number of classes (i.e. max delta T) in phire/rplearn/train.py
  2. Run training using rplearn-train
  3. Switch to evaluation by calling evaluate_all() and compute metrics on eval set
  4. Find optimal epoch and calculate normalization factors (for specific layer) using calculate_loss()

Training the SRGAN model

  1. Specify dataset location, model name, AtmoDist model to use, and training regimen in phire/main.py
  2. Run training using phire-train

Evaluating the SRGAN models

  1. Specify dataset location, models to evaluate, output location, and metrics to calculate in phire/evaluation/cli.py
  2. Evaluate using phire-eval
  3. Toggle the if-statement to generate comparing plots and data between different models and rerun phire-eval
Owner
Sebastian Hoffmann
Sebastian Hoffmann
Permeability Prediction Via Multi Scale 3D CNN

Permeability-Prediction-Via-Multi-Scale-3D-CNN Data: The raw CT rock cores are obtained from the Imperial Colloge portal. The CT rock cores are sub-sa

Mohamed Elmorsy 2 Jul 06, 2022
Collections for the lasted paper about multi-view clustering methods (papers, codes)

Multi-View Clustering Papers Collections for the lasted paper about multi-view clustering methods (papers, codes). There also exists some repositories

Andrew Guan 10 Sep 20, 2022
Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)

E2FGVI (CVPR 2022) English | 简体中文 This repository contains the official implementation of the following paper: Towards An End-to-End Framework for Flo

Media Computing Group @ Nankai University 537 Jan 07, 2023
CATE: Computation-aware Neural Architecture Encoding with Transformers

CATE: Computation-aware Neural Architecture Encoding with Transformers Code for paper: CATE: Computation-aware Neural Architecture Encoding with Trans

16 Dec 27, 2022
Creating Artificial Life with Reinforcement Learning

Although Evolutionary Algorithms have shown to result in interesting behavior, they focus on learning across generations whereas behavior could also be learned during ones lifetime.

Maarten Grootendorst 49 Dec 21, 2022
Gesture-controlled Video Game. Just swing your finger and play the game without touching your PC

Gesture Controlled Video Game Detailed Blog : https://www.analyticsvidhya.com/blog/2021/06/gesture-controlled-video-game/ Introduction This project is

Devbrat Anuragi 35 Jan 06, 2023
Picasso: A CUDA-based Library for Deep Learning over 3D Meshes

The Picasso Library is intended for complex real-world applications with large-scale surfaces, while it also performs impressively on the small-scale applications over synthetic shape manifolds. We h

97 Dec 01, 2022
Approaches to modeling terrain and maps in python

topography 🌎 Contains different approaches to modeling terrain and topographic-style maps in python Features Inverse Distance Weighting (IDW) A given

John Gutierrez 1 Aug 10, 2022
Fairness Metrics: All you need to know

Fairness Metrics: All you need to know Testing machine learning software for ethical bias has become a pressing current concern. Recent research has p

Anonymous2020 1 Jan 17, 2022
The official repo for CVPR2021——ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search.

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search [paper] Introduction This is the official implementation of ViPNAS: Efficient V

Lumin 42 Sep 26, 2022
Wide Residual Networks (WideResNets) in PyTorch

Wide Residual Networks (WideResNets) in PyTorch WideResNets for CIFAR10/100 implemented in PyTorch. This implementation requires less GPU memory than

Jason Kuen 296 Dec 27, 2022
Gems & Holiday Package Prediction

Predictive_Modelling Gems & Holiday Package Prediction This project is based on 2 cases studies : Gems Price Prediction and Holiday Package prediction

Avnika Mehta 1 Jan 27, 2022
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark 🔥 🔥 Based on MMsegmentation 🔥 🔥 Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework that ensures reliability, high concurrency and scalability of services.

savior是一个能够进行快速集成算法模块并支持高性能部署的轻量开发框架。能够帮助将团队进行快速想法验证(PoC),避免重复的去github上找模型然后复现模型;能够帮助团队将功能进行流程拆解,很方便的提高分布式执行效率;能够有效减少代码冗余,减少不必要负担。

Tao Luo 125 Dec 22, 2022
Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud

Google Cloud Vertex AI Samples Welcome to the Google Cloud Vertex AI sample repository. Overview The repository contains notebooks and community conte

Google Cloud Platform 560 Dec 31, 2022
GANSketchingJittor - Implementation of Sketch Your Own GAN in Jittor

GANSketching in Jittor Implementation of (Sketch Your Own GAN) in Jittor(计图). Or

Bernard Tan 10 Jul 02, 2022
Semi-supervised Stance Detection of Tweets Via Distant Network Supervision

SANDS This is an annonymous repository containing code and data necessary to reproduce the results published in "Semi-supervised Stance Detection of T

2 Sep 22, 2022
Codebase for ECCV18 "The Sound of Pixels"

Sound-of-Pixels Codebase for ECCV18 "The Sound of Pixels". *This repository is under construction, but the core parts are already there. Environment T

Hang Zhao 318 Dec 20, 2022
A curated list of neural network pruning resources.

A curated list of neural network pruning and related resources. Inspired by awesome-deep-vision, awesome-adversarial-machine-learning, awesome-deep-learning-papers and Awesome-NAS.

Yang He 1.7k Jan 09, 2023
This repository contains code demonstrating the methods outlined in Path Signature Area-Based Causal Discovery in Coupled Time Series presented at Causal Analysis Workshop 2021.

signed-area-causal-inference This repository contains code demonstrating the methods outlined in Path Signature Area-Based Causal Discovery in Coupled

Will Glad 1 Mar 11, 2022