simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

Overview

Summary

This simple_pytorch_example project is a toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset with several common and useful features:

  • Choose between two different neural network architectures
  • Make architectures parametrizable
  • Read input arguments from config file or command line
    • (command line arguments override config file ones)
  • Download FashionMNIST dataset if not already downloaded
  • Monitor training progress on the terminal and/or with TensorBoard logs
    • Accuracy, loss, confusion matrix

More details about FashionMNIST can be found here.

It may be useful as a starting point for people who are starting to learn about PyTorch and neural networks.

Prerequisites

We assume that most users will have a GPU driver correctly configured, although the script can also be run on the CPU.

The project should work with your preferred python environment, but I have only tested it with conda (MiniConda 3) local environments. To create a local environment for this project,

conda create --name simple_pytorch_example python=3.9

and then activate it with

conda activate simple_pytorch_example

Installation on Ubuntu Linux

(Tested on Ubuntu Linux Focal 20.04.3 LTS)

Go to the directory where you want to have the project, e.g.

cd Software

Clone the simple_pytorch_example github repository

git clone https://github.com/rcasero/simple_pytorch_example.git

Install the python dependencies

cd simple_pytorch_example
python setup.py install

train_simple_pytorch_example.py: Main script to train the neural network

You can run the script train_simple_pytorch_example.py as

./train_simple_pytorch_example.py [options]

or

python train_simple_pytorch_example.py [options]

Usage summary

usage: train_simple_pytorch_example.py [-h] [-c CONFIG_FILE] [-v] [--workdir DIR] [-d STR] [-e N] [-b N] [-l F] [--validation_ratio F] [-n STR] [--conv_out_features N [N ...]]
                                       [--conv_kernel_size N] [--maxpool_kernel_size N]

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config CONFIG_FILE
                        config file path
  -v, --verbose         verbose output for debugging
  --workdir DIR         working directory to place data, logs, weights, etc subdirectories (def .)
  -d STR, --device STR  device to train on (def 'cuda', 'cpu')
  -e N, --epochs N      number of epochs for training (def 10)
  -b N, --batch_size N  batch size for training (def 64)
  -l F, --learning_rate F
                        learning rate for training (def 1e-3)
  --validation_ratio F  ratio of training dataset reserved for validation (def 0.0)
  -n STR, --nn STR      neural network architecture (def 'SimpleCNN', 'SimpleLinearNN')
  --conv_out_features N [N ...]
                        (SimpleCNN only) number of output features for each convolutional block (def 8 16)
  --conv_kernel_size N  (SimpleCNN only) kernel size of convolutional layers (def 3)
  --maxpool_kernel_size N
                        (SimpleCNN only) kernel size of max pool layers (def 2)

Args that start with '--' (eg. -v) can also be set in a config file (specified via -c). Config file syntax allows: key=value, flag=true, stuff=[a,b,c]
(for details, see syntax at https://goo.gl/R74nmi). If an arg is specified in more than one place, then commandline values override config file values
which override defaults.

Options not provided to the script take default values, e.g. running ./train_simple_pytorch_example.py -v produces the output

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v
Defaults:
  --workdir:         .
  --device:          cuda
  --epochs:          10
  --batch_size:      64
  --learning_rate:   0.001
  --validation_ratio:0.0
  --nn:              SimpleCNN
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

Arguments that start with -- can have their default values overridden using a configuration file (-c CONFIG_FILE). A configuration file is just a text file (e.g. config.txt) that looks like this:

device = cuda
epochs = 20
batch_size = 64
learning_rate = 1e-3
validation_ratio = 0.2
nn = SimpleCNN
conv_out_features = [8, 16]
conv_kernel_size = 3
maxpool_kernel_size = 2

Note that when running ./train_simple_pytorch_example.py -v -c config.txt the defaults have been replaced by the arguments provided in the config file:

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v -c config.txt
Config File (config.txt):
  device:            cuda
  epochs:            20
  batch_size:        64
  learning_rate:     1e-3
  validation_ratio:  0.2
  nn:                SimpleCNN
  conv_out_features: [8, 16]
  conv_kernel_size:  3
  maxpool_kernel_size:2
Defaults:
  --workdir:         .

Command line arguments override both defaults and configuration file arguments, e.g.

./train_simple_pytorch_example.py --nn SimpleCNN -v --conv_out_features 8 16 32 -e 5

FashionMNIST data download

When train_simple_pytorch_example.py runs, it checks whether the FashionMNIST data has already been downloaded to WORKDIR/data, and if not, it downloads it automatically.

Network architectures

We provide two neural network architectures that can be selected with option --nn SimpleLinearNN or --nn SimpleCNN.

SimpleLinearNN is a network with fully connected layers

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================

SimpleCNN is a traditional convolutional neural network (CNN) formed by concatenation of convolutional blocks (Conv2d + ReLU + MaxPool2d + BatchNorm2d). Those blocks are followed by a 1x1 convolution and a fully connected layer with 10 outputs. The hyperparameters that the user can configure are (they are ignored for the other network):

  • --conv_kernel_size N: Size of the convolutional kernels (NxN, dafault 3x3).
  • --maxpool_kernel_size N: Size of the maxpool kernels (NxN, dafault 2x2).
  • --conv_out_features N1 [N2 ...]: Each number adds a convolutional block with the corresponding number of output features. E.g. --conv_out_features 8 16 32 creates a network with 3 blocks
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 32, 7, 7]             4,640
│    └─ReLU: 2-10                        [1, 32, 7, 7]             --
│    └─MaxPool2d: 2-11                   [1, 32, 3, 3]             --
│    └─BatchNorm2d: 2-12                 [1, 32, 3, 3]             64
│    └─Conv2d: 2-13                      [1, 1, 3, 3]              289
│    └─Flatten: 2-14                     [1, 9]                    --
│    └─Linear: 2-15                      [1, 10]                   100
==========================================================================================

General training options

Currently, the loss (torch.nn.CrossEntropyLoss) and optimizer (torch.optim.SGD) are fixed.

Parameters common to both architectures are

  • --epochs N: number of training epochs.
  • --batch_size N: size of the training batch (if the dataset size is not a multiple of the batch size, the last batch will be smaller).
  • --learning_rate F: learning rate.
  • --validation_ratio F: by default, the script uses all the training data in FashionMNIST for training. But the user can choose to split the training data between training and validation. (The test data is a separate dataset in FashionMNIST).

Output network parameters

Once the network is trained, the model.state_dict() is saved to WORKDIR/models/LOGFILENAME.state_dict.

Monitoring

Option --verbose outputs detailed information about the script arguments, datasets, network architecture and training progress.

** Training:
Epoch 1/10
-------------------------------
train mean loss: 2.3913  [     0/ 60000]
train mean loss: 2.1813  [  6400/ 60000]
train mean loss: 2.1227  [ 12800/ 60000]
train mean loss: 2.0780  [ 19200/ 60000]
train mean loss: 1.9196  [ 25600/ 60000]
train mean loss: 1.6919  [ 32000/ 60000]
train mean loss: 1.4112  [ 38400/ 60000]
train mean loss: 1.2632  [ 44800/ 60000]
train mean loss: 1.0215  [ 51200/ 60000]
train mean loss: 0.8559  [ 57600/ 60000]
Training: Mean loss: 1.6672
Test: Accuracy: 63.8%, Mean loss: 0.9794
Validation: Accuracy: nan%, Mean loss:    nan
Epoch 2/10
-------------------------------
train mean loss: 1.0026  [     0/ 60000]
train mean loss: 0.8822  [  6400/ 60000]
...

Training progress can also be monitored with TensorBoard. The script saves TensorBoard logs to WORKDIR/runs, with a filename formed by the date (YYYY-MM-DD), time (HH-MM-SS), hostname and network architecture (e.g. 2021-11-25_01-15-49_marcel_SimpleCNN). To monitor the logs either during training or afterwards, run

tensorboard --logdir=runs &

and browse the URL displayed on the terminal, e.g. http://localhost:6006/.

If you are working remotely on the GPU server, you need to forward the remote server's port to your local machine

ssh -L 6006:localhost:6006 [email protected]_IP 

We provide plots for Accuracy (%), Mean loss and the Confusion Matrix

Accuracy and loss plots Confusion matrix

Results

SimpleLinearNN

Experiment 2021-11-26_01-33-52_marcel_SimpleLinearNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleLinearNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
Total mult-adds (M): 0.67
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.01
Params size (MB): 2.68
Estimated Total Size (MB): 2.69
==========================================================================================

The final metrics (after 100 epochs) are shown under each corresponding figure:

Mean loss plots

  • Mean loss:
    • Training (brown): 0.4125
    • Test (dark blue): 0.4571
    • Validation (cyan): 0.4478

Accuracy plots

  • Accuracy:
    • Test (pink): 83.8%
    • Validation (green): 84.3%

SimpleCNN

Experiment 2021-11-26_02-17-18_marcel_SimpleCNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleCNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 1, 7, 7]              145
│    └─Flatten: 2-10                     [1, 49]                   --
│    └─Linear: 2-11                      [1, 10]                   500
==========================================================================================
Total params: 1,941
Trainable params: 1,941
Non-trainable params: 0
Total mult-adds (M): 0.30
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.09
Params size (MB): 0.01
Estimated Total Size (MB): 0.11
==========================================================================================

Mean loss plots

  • Mean loss:
    • Training (dark blue): 0.3186
    • Test (orange): 0.3686
    • Validation (brown): 0.3372

Accuracy plots

  • Accuracy:
    • Test (cyan): 87.2%
    • Validation (pink): 88.1%
You might also like...
A python-image-classification web application project, written in Python and served through the Flask Microframework. This Project implements the VGG16 covolutional neural network, through Keras and Tensorflow wrappers, to make predictions on uploaded images. Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.
Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E. Evaluated on benchmark dataset Office31.

Deep-Unsupervised-Domain-Adaptation Pytorch implementation of four neural network based domain adaptation techniques: DeepCORAL, DDC, CDAN and CDAN+E.

In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.
In this project we investigate the performance of the SetCon model on realistic video footage. Therefore, we implemented the model in PyTorch and tested the model on two example videos.

Contrastive Learning of Object Representations Supervisor: Prof. Dr. Gemma Roig Institutions: Goethe University CVAI - Computational Vision & Artifici

This is a model made out of Neural Network specifically a Convolutional Neural Network model
This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternative libraries that can be used for this purpose, one of which is the PyTorch library.

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), Attentive Neural Processes (ANPs).

The Neural Process Family This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CN

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.

An implementation of quantum convolutional neural network with MindQuantum. Huawei, classifying MNIST dataset

关于实现的一点说明 山东大学 2020级 苏博南 www.subonan.com 文件说明 tools.py 这里面主要有两个函数: resize(a, lenb) 这其实是我找同学写的一个小算法hhh。给出一个$28\times 28$的方阵a,返回一个$lenb\times lenb$的方阵。因

This is the official repo for TransFill:  Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations at CVPR'21. According to some product reasons, we are not planning to release the training/testing codes and models. However, we will release the dataset and the scripts to prepare the dataset.
Releases(v1.0.0)
  • v1.0.0(Jan 7, 2022)

    Toy example of a python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset with several common and useful features:

    • Choose between two different neural network architectures
    • Make architectures parametrizable
    • Read input arguments from config file or command line
      • (command line arguments override config file ones)
    • Download FashionMNIST dataset if not already downloaded
    • Monitor training progress on the terminal and/or with TensorBoard logs
      • Accuracy, loss, confusion matrix
    Source code(tar.gz)
    Source code(zip)
Owner
Ramón Casero
Ramón Casero
Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch

Neural Distance Embeddings for Biological Sequences Official implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTo

Gabriele Corso 56 Dec 23, 2022
A decent AI that solves daily Wordle puzzles. Works with different websites with similar wordlists,.

Wordle-AI A decent AI that solves daily "Wordle" puzzles. Works with different websites with similar wordlists. When prompted with "Word:" enter the w

Ethan 1 Feb 10, 2022
Put blind watermark into a text with python

text_blind_watermark Put blind watermark into a text. Can be used in Wechat dingding ... How to Use install pip install text_blind_watermark Alice Pu

郭飞 164 Dec 30, 2022
Efficient Two-Step Networks for Temporal Action Segmentation (Neurocomputing 2021)

Efficient Two-Step Networks for Temporal Action Segmentation This repository provides a PyTorch implementation of the paper Efficient Two-Step Network

8 Apr 16, 2022
This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural tree born form a large search space

SeBoW: Self-Born Wiring for neural trees(PaddlePaddle version) This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural

HollyLee 13 Dec 08, 2022
A modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (prediction model)

ParallelFold Author: Bozitao Zhong This is a modified version of DeepMind's Alphafold2 to divide CPU part (MSA and template searching) and GPU part (p

Bozitao Zhong 77 Dec 22, 2022
Robot Reinforcement Learning on the Constraint Manifold

Implementation of "Robot Reinforcement Learning on the Constraint Manifold"

31 Dec 05, 2022
Fine-Tune EleutherAI GPT-Neo to Generate Netflix Movie Descriptions in Only 47 Lines of Code Using Hugginface And DeepSpeed

GPT-Neo-2.7B Fine-Tuning Example Using HuggingFace & DeepSpeed Installation cd venv/bin ./pip install -r ../../requirements.txt ./pip install deepspe

Nikita 180 Jan 05, 2023
Second Order Optimization and Curvature Estimation with K-FAC in JAX.

KFAC-JAX - Second Order Optimization with Approximate Curvature in JAX Installation | Quickstart | Documentation | Examples | Citing KFAC-JAX KFAC-JAX

DeepMind 90 Dec 22, 2022
BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

qinxin 260 Dec 06, 2022
Official implement of Paper:A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images 深度监督影像融合网络DSIFN用于高分辨率双时相遥感影像变化检测 Of

Chenxiao Zhang 135 Dec 19, 2022
Learning cell communication from spatial graphs of cells

ncem Features Repository for the manuscript Fischer, D. S., Schaar, A. C. and Theis, F. Learning cell communication from spatial graphs of cells. 2021

Theis Lab 77 Dec 30, 2022
Python Auto-ML Package for Tabular Datasets

Tabular-AutoML AutoML Package for tabular datasets Tabular dataset tuning is now hassle free! Run one liner command and get best tuning and processed

Sagnik Roy 18 Nov 20, 2022
Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

StochFuzz: A New Solution for Binary-only Fuzzing StochFuzz is a (probabilistically) sound and cost-effective fuzzing technique for stripped binaries.

Zhuo Zhang 164 Dec 05, 2022
Automatic packaging of the open-composite libs for OvGME

OvGME Packager for OpenXR – OpenComposite for DCS Note This repository is currently unsupported and needs to be migrated to the upstream OpenComposite

12 Nov 03, 2022
Turning SymPy expressions into PyTorch modules.

sympytorch A micro-library as a convenience for turning SymPy expressions into PyTorch Modules. All SymPy floats become trainable parameters. All SymP

Patrick Kidger 89 Dec 13, 2022
Code implementation for the paper 'Conditional Gaussian PAC-Bayes'.

CondGauss This repository contains PyTorch code for the paper Stochastic Gaussian PAC-Bayes. A novel PAC-Bayesian training method is implemented. Ther

0 Nov 01, 2021
Disentangled Lifespan Face Synthesis

Disentangled Lifespan Face Synthesis Project Page | Paper Demo on Colab Preparation Please follow this github to prepare the environments and dataset.

何森 50 Sep 20, 2022
Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

The Official Implementation of CLIB (Continual Learning for i-Blurry) Online Continual Learning on Class Incremental Blurry Task Configuration with An

NAVER AI 34 Oct 26, 2022
Asymmetric metric learning for knowledge transfer

Asymmetric metric learning This is the official code that enables the reproduction of the results from our paper: Asymmetric metric learning for knowl

20 Dec 06, 2022