simple_pytorch_example: a toy example of a Python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset

Summary

The simple_pytorch_example project is a toy example of a Python script that instantiates and trains a PyTorch neural network on the FashionMNIST dataset, with several common and useful features:

  • Choose between two different neural network architectures
  • Make architectures parametrizable
  • Read input arguments from config file or command line
    • (command line arguments override config file ones)
  • Download FashionMNIST dataset if not already downloaded
  • Monitor training progress on the terminal and/or with TensorBoard logs
    • Accuracy, loss, confusion matrix

More details about FashionMNIST can be found in the dataset's own documentation.

This project may be useful as a starting point for people who are beginning to learn about PyTorch and neural networks.

Prerequisites

We assume that most users will have a GPU driver correctly configured, although the script can also be run on the CPU.

The project should work with your preferred Python environment, but I have only tested it with conda (MiniConda 3) local environments. To create a local environment for this project, run

conda create --name simple_pytorch_example python=3.9

and then activate it with

conda activate simple_pytorch_example

Installation on Ubuntu Linux

(Tested on Ubuntu Linux Focal 20.04.3 LTS)

Go to the directory where you want to have the project, e.g.

cd Software

Clone the simple_pytorch_example GitHub repository

git clone https://github.com/rcasero/simple_pytorch_example.git

Install the Python dependencies

cd simple_pytorch_example
python setup.py install

train_simple_pytorch_example.py: Main script to train the neural network

You can run the script train_simple_pytorch_example.py as

./train_simple_pytorch_example.py [options]

or

python train_simple_pytorch_example.py [options]

Usage summary

usage: train_simple_pytorch_example.py [-h] [-c CONFIG_FILE] [-v] [--workdir DIR] [-d STR] [-e N] [-b N] [-l F] [--validation_ratio F] [-n STR] [--conv_out_features N [N ...]]
                                       [--conv_kernel_size N] [--maxpool_kernel_size N]

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config CONFIG_FILE
                        config file path
  -v, --verbose         verbose output for debugging
  --workdir DIR         working directory to place data, logs, weights, etc subdirectories (def .)
  -d STR, --device STR  device to train on (def 'cuda', 'cpu')
  -e N, --epochs N      number of epochs for training (def 10)
  -b N, --batch_size N  batch size for training (def 64)
  -l F, --learning_rate F
                        learning rate for training (def 1e-3)
  --validation_ratio F  ratio of training dataset reserved for validation (def 0.0)
  -n STR, --nn STR      neural network architecture (def 'SimpleCNN', 'SimpleLinearNN')
  --conv_out_features N [N ...]
                        (SimpleCNN only) number of output features for each convolutional block (def 8 16)
  --conv_kernel_size N  (SimpleCNN only) kernel size of convolutional layers (def 3)
  --maxpool_kernel_size N
                        (SimpleCNN only) kernel size of max pool layers (def 2)

Args that start with '--' (eg. -v) can also be set in a config file (specified via -c). Config file syntax allows: key=value, flag=true, stuff=[a,b,c]
(for details, see syntax at https://goo.gl/R74nmi). If an arg is specified in more than one place, then commandline values override config file values
which override defaults.

Options not provided to the script take default values, e.g. running ./train_simple_pytorch_example.py -v produces the output

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v
Defaults:
  --workdir:         .
  --device:          cuda
  --epochs:          10
  --batch_size:      64
  --learning_rate:   0.001
  --validation_ratio:0.0
  --nn:              SimpleCNN
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

Arguments that start with -- can have their default values overridden using a configuration file (-c CONFIG_FILE). A configuration file is just a text file (e.g. config.txt) that looks like this:

device = cuda
epochs = 20
batch_size = 64
learning_rate = 1e-3
validation_ratio = 0.2
nn = SimpleCNN
conv_out_features = [8, 16]
conv_kernel_size = 3
maxpool_kernel_size = 2

Note that when running ./train_simple_pytorch_example.py -v -c config.txt the defaults have been replaced by the arguments provided in the config file:

** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v -c config.txt
Config File (config.txt):
  device:            cuda
  epochs:            20
  batch_size:        64
  learning_rate:     1e-3
  validation_ratio:  0.2
  nn:                SimpleCNN
  conv_out_features: [8, 16]
  conv_kernel_size:  3
  maxpool_kernel_size:2
Defaults:
  --workdir:         .

Command line arguments override both defaults and configuration file arguments, e.g.

./train_simple_pytorch_example.py --nn SimpleCNN -v --conv_out_features 8 16 32 -e 5
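
The precedence described above (command line over config file over defaults) can be illustrated with a minimal sketch. It uses only the standard library argparse module and a simplified key = value parser; the helper names parse_config_text and resolve_args are hypothetical, only three of the script's options are reproduced, and the real script may well rely on a dedicated config-aware argument parser instead.

```python
import argparse

# Defaults and types for a small subset of the script's options
# (values taken from the usage summary; the helpers are illustrative).
DEFAULTS = {"epochs": 10, "batch_size": 64, "nn": "SimpleCNN"}
OPTION_TYPES = {"epochs": int, "batch_size": int, "nn": str}


def parse_config_text(text):
    """Parse 'key = value' lines into a dict of strings (simplified syntax)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def resolve_args(argv, config_text=""):
    """Merge options so that command line > config file > defaults."""
    parser = argparse.ArgumentParser()
    # No argparse defaults here, so None means "not given on the command line".
    parser.add_argument("-e", "--epochs", type=int)
    parser.add_argument("-b", "--batch_size", type=int)
    parser.add_argument("-n", "--nn", type=str)

    options = dict(DEFAULTS)
    # Config file values replace the defaults...
    for key, value in parse_config_text(config_text).items():
        options[key] = OPTION_TYPES[key](value)
    # ...and command line values replace both.
    cli = vars(parser.parse_args(argv))
    options.update({k: v for k, v in cli.items() if v is not None})
    return options


if __name__ == "__main__":
    print(resolve_args(["-e", "5"], "epochs = 20\nnn = SimpleLinearNN"))
```

In this sketch, epochs ends up as 5 (command line), nn as SimpleLinearNN (config file) and batch_size as 64 (default).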

FashionMNIST data download

When train_simple_pytorch_example.py runs, it checks whether the FashionMNIST data has already been downloaded to WORKDIR/data, and if not, it downloads it automatically.

Network architectures

We provide two neural network architectures that can be selected with option --nn SimpleLinearNN or --nn SimpleCNN.

SimpleLinearNN is a network with three fully connected layers (784 → 512 → 512 → 10) and ReLU activations between them:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================
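
The parameter counts in the summary above can be checked by hand: a fully connected layer has one weight per input/output pair plus one bias per output. The short sketch below does that arithmetic; the helper name linear_params is made up for illustration and is not part of the project.

```python
def linear_params(in_features, out_features):
    """Weights (in * out) plus biases (out) of a fully connected layer."""
    return in_features * out_features + out_features


# Flattened 28x28 image, two hidden layers of 512 units, 10 classes.
layer_sizes = [28 * 28, 512, 512, 10]
params = [linear_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
total = sum(params)

if __name__ == "__main__":
    print(params)  # per-layer counts, in the order of the summary
    print(total)
```

The three values agree with the Param # column (401,920, 262,656 and 5,130).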

SimpleCNN is a traditional convolutional neural network (CNN) formed by concatenating convolutional blocks (Conv2d + ReLU + MaxPool2d + BatchNorm2d). Those blocks are followed by a convolution with a single output feature and a fully connected layer with 10 outputs. The user can configure the following hyperparameters (they are ignored when SimpleLinearNN is selected):

  • --conv_kernel_size N: Size of the convolutional kernels (NxN, default 3x3).
  • --maxpool_kernel_size N: Size of the maxpool kernels (NxN, default 2x2).
  • --conv_out_features N1 [N2 ...]: Each number adds a convolutional block with the corresponding number of output features. E.g. --conv_out_features 8 16 32 creates a network with 3 blocks:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 32, 7, 7]             4,640
│    └─ReLU: 2-10                        [1, 32, 7, 7]             --
│    └─MaxPool2d: 2-11                   [1, 32, 3, 3]             --
│    └─BatchNorm2d: 2-12                 [1, 32, 3, 3]             64
│    └─Conv2d: 2-13                      [1, 1, 3, 3]              289
│    └─Flatten: 2-14                     [1, 9]                    --
│    └─Linear: 2-15                      [1, 10]                   100
==========================================================================================
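
The output shapes and parameter counts in the summary above can be reproduced with plain arithmetic. The sketch below is not the project's code: the function name simple_cnn_summary is hypothetical, and it assumes that convolutions preserve the spatial size ("same" padding), that pooling uses a stride equal to its kernel size with rounding down, and that the final single-output convolution uses the same kernel size as the blocks. That last point is an inference from the parameter counts (289 = 32·3·3 + 1), not something the text states.

```python
def simple_cnn_summary(conv_out_features, conv_kernel_size=3,
                       maxpool_kernel_size=2, image_size=28,
                       in_channels=1, num_classes=10):
    """Return (layer name, output shape, parameter count) tuples."""
    layers = []
    channels, size = in_channels, image_size
    k2 = conv_kernel_size ** 2
    for out_channels in conv_out_features:
        # Conv2d: one k x k kernel per (input, output) channel pair + biases.
        layers.append(("Conv2d", (out_channels, size, size),
                       channels * k2 * out_channels + out_channels))
        # MaxPool2d: no parameters, spatial size divided (rounded down).
        size //= maxpool_kernel_size
        layers.append(("MaxPool2d", (out_channels, size, size), 0))
        # BatchNorm2d: one scale and one shift per channel.
        layers.append(("BatchNorm2d", (out_channels, size, size),
                       2 * out_channels))
        channels = out_channels
    # Final convolution with a single output channel (assumed same kernel).
    layers.append(("Conv2d", (1, size, size), channels * k2 + 1))
    # Fully connected layer on the flattened single-channel feature map.
    layers.append(("Linear", (num_classes,),
                   size * size * num_classes + num_classes))
    return layers


if __name__ == "__main__":
    for name, shape, n_params in simple_cnn_summary([8, 16, 32]):
        print(f"{name:12s} {str(shape):14s} {n_params:6d}")
```

ReLU and Flatten are omitted because they have no parameters. With the default --conv_out_features 8 16 the counts add up to 1,941, which matches the total reported for the SimpleCNN experiment in the Results section.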

General training options

Currently, the loss (torch.nn.CrossEntropyLoss) and optimizer (torch.optim.SGD) are fixed.

Parameters common to both architectures are:

  • --epochs N: number of training epochs.
  • --batch_size N: size of the training batch (if the dataset size is not a multiple of the batch size, the last batch will be smaller).
  • --learning_rate F: learning rate.
  • --validation_ratio F: ratio of the FashionMNIST training data reserved for validation. By default (0.0), the script uses all the training data for training. (The test data is a separate dataset in FashionMNIST.)
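
To make the effect of --validation_ratio and --batch_size concrete, here is a small arithmetic sketch. The helper names split_sizes and batch_sizes are hypothetical, and rounding the validation size to the nearest integer is an assumption; the script may compute the split differently.

```python
def split_sizes(n_samples, validation_ratio):
    """Return (training, validation) sample counts for a given ratio."""
    n_val = round(n_samples * validation_ratio)  # rounding is an assumption
    return n_samples - n_val, n_val


def batch_sizes(n_samples, batch_size):
    """Sizes of the batches in one epoch; the last one may be smaller."""
    n_full, remainder = divmod(n_samples, batch_size)
    return [batch_size] * n_full + ([remainder] if remainder else [])


if __name__ == "__main__":
    print(split_sizes(60000, 0.2))   # FashionMNIST has 60000 training images
    sizes = batch_sizes(60000, 64)
    print(len(sizes), sizes[-1])     # number of batches, size of the last one
```

With 60000 training images and a ratio of 0.2, this gives the 48000 training and 12000 validation samples reported in the Results section. With the default batch size of 64 and no validation split, an epoch has 938 batches, the last of which holds only 32 samples.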

Output network parameters

Once the network is trained, the model.state_dict() is saved to WORKDIR/models/LOGFILENAME.state_dict.

Monitoring

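Option --verbose outputs detailed information about the script arguments, datasets, network architecture and training progress. In the example below, the validation metrics are reported as nan because, with the default --validation_ratio 0.0, no training data is reserved for validation.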

** Training:
Epoch 1/10
-------------------------------
train mean loss: 2.3913  [     0/ 60000]
train mean loss: 2.1813  [  6400/ 60000]
train mean loss: 2.1227  [ 12800/ 60000]
train mean loss: 2.0780  [ 19200/ 60000]
train mean loss: 1.9196  [ 25600/ 60000]
train mean loss: 1.6919  [ 32000/ 60000]
train mean loss: 1.4112  [ 38400/ 60000]
train mean loss: 1.2632  [ 44800/ 60000]
train mean loss: 1.0215  [ 51200/ 60000]
train mean loss: 0.8559  [ 57600/ 60000]
Training: Mean loss: 1.6672
Test: Accuracy: 63.8%, Mean loss: 0.9794
Validation: Accuracy: nan%, Mean loss:    nan
Epoch 2/10
-------------------------------
train mean loss: 1.0026  [     0/ 60000]
train mean loss: 0.8822  [  6400/ 60000]
...
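
As a sanity check on the numbers in the log above, the cross-entropy loss of a classifier that assigns equal scores to all 10 classes is ln(10) ≈ 2.30, which is close to the first reported training loss (2.3913) of the untrained network. The sketch below computes the loss for a single sample in plain Python, taking unnormalized scores (logits) as input, as torch.nn.CrossEntropyLoss does; the function name cross_entropy is chosen for illustration only.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy loss of one sample: -log(softmax(logits)[target])."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


# Equal scores for the 10 FashionMNIST classes, i.e. a uniform prediction.
chance_loss = cross_entropy([0.0] * 10, target=3)

if __name__ == "__main__":
    print(f"{chance_loss:.4f}")
```

A loss that stays near this value after several epochs suggests the network is not learning; in the log above it drops below 1.0 within the first epoch.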

Training progress can also be monitored with TensorBoard. The script saves TensorBoard logs to WORKDIR/runs, with a filename formed by the date (YYYY-MM-DD), time (HH-MM-SS), hostname and network architecture (e.g. 2021-11-25_01-15-49_marcel_SimpleCNN). To monitor the logs either during training or afterwards, run

tensorboard --logdir=runs &

and browse the URL displayed on the terminal, e.g. http://localhost:6006/.

If you are working remotely on the GPU server, you need to forward the remote server's port to your local machine (replace USER and SERVER_IP with your user name and the server's address):

ssh -L 6006:localhost:6006 USER@SERVER_IP
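
The naming scheme for the TensorBoard logs (date, time, hostname, architecture) can be sketched as follows. The helper names run_name and output_paths are hypothetical, and treating LOGFILENAME in WORKDIR/models/LOGFILENAME.state_dict as the same name used under WORKDIR/runs is an assumption rather than something confirmed by the text.

```python
import socket
from datetime import datetime
from pathlib import Path


def run_name(nn, now=None, hostname=None):
    """Build a name like 2021-11-25_01-15-49_marcel_SimpleCNN."""
    now = now or datetime.now()
    hostname = hostname or socket.gethostname()
    return f"{now:%Y-%m-%d_%H-%M-%S}_{hostname}_{nn}"


def output_paths(workdir, name):
    """Return (TensorBoard log path, state_dict path) under workdir.

    Assumes the saved weights reuse the run name, which is a guess.
    """
    workdir = Path(workdir)
    return workdir / "runs" / name, workdir / "models" / f"{name}.state_dict"


if __name__ == "__main__":
    name = run_name("SimpleCNN")
    log_path, weights_path = output_paths(".", name)
    print(log_path)
    print(weights_path)
```

Including the date, time and hostname in the name keeps runs launched from different machines or at different times from overwriting each other.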

We provide plots for accuracy (%), mean loss and the confusion matrix.

[Figures: accuracy and loss plots; confusion matrix]
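
For readers unfamiliar with the confusion matrix, the sketch below shows one common way to build it and to derive the accuracy from it. It is plain Python written for illustration, not the project's implementation; the convention that rows are true classes and columns are predicted classes is an assumption, as are the function names.

```python
def confusion_matrix(targets, predictions, num_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_class, predicted_class in zip(targets, predictions):
        matrix[true_class][predicted_class] += 1
    return matrix


def accuracy_percent(matrix):
    """Accuracy (%) is the diagonal sum over the total; nan if empty."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total if total else float("nan")


if __name__ == "__main__":
    targets = [0, 0, 1, 2, 2, 2]
    predictions = [0, 1, 1, 2, 2, 0]
    matrix = confusion_matrix(targets, predictions, num_classes=3)
    for row in matrix:
        print(row)
    print(f"{accuracy_percent(matrix):.1f}%")
```

Off-diagonal entries show which classes are confused with each other, information that a single accuracy figure hides.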

Results

SimpleLinearNN

Experiment 2021-11-26_01-33-52_marcel_SimpleLinearNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleLinearNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleLinearNN --validation_ratio 0.2 -e 100
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001
  --conv_out_features:[8, 16]
  --conv_kernel_size:3
  --maxpool_kernel_size:2

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleLinearNN                           --                        --
├─Flatten: 1-1                           [1, 784]                  --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-1                       [1, 512]                  401,920
│    └─ReLU: 2-2                         [1, 512]                  --
│    └─Linear: 2-3                       [1, 512]                  262,656
│    └─ReLU: 2-4                         [1, 512]                  --
│    └─Linear: 2-5                       [1, 10]                   5,130
==========================================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
Total mult-adds (M): 0.67
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.01
Params size (MB): 2.68
Estimated Total Size (MB): 2.69
==========================================================================================

The final metrics (after 100 epochs) are shown under each corresponding figure:

Mean loss plots

  • Mean loss:
    • Training (brown): 0.4125
    • Test (dark blue): 0.4571
    • Validation (cyan): 0.4478

Accuracy plots

  • Accuracy:
    • Test (pink): 83.8%
    • Validation (green): 84.3%

SimpleCNN

Experiment 2021-11-26_02-17-18_marcel_SimpleCNN run with parameters:

./train_simple_pytorch_example.py -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2

** All args:
Namespace(config_file=None, verbose=True, workdir='.', device='cuda', epochs=100, batch_size=64, learning_rate=0.001, validation_ratio=0.2, nn='SimpleCNN', conv_out_features=[8, 16], conv_kernel_size=3, maxpool_kernel_size=2)
** Arg breakdown (defaults / config file / command line):
Command Line Args:   -v --nn SimpleCNN --validation_ratio 0.2 -e 100 --conv_out_features 8 16 --conv_kernel_size 3 --maxpool_kernel_size 2
Defaults:
  --workdir:         .
  --device:          cuda
  --batch_size:      64
  --learning_rate:   0.001

** GPU found:
NVIDIA GeForce GTX 1050
** Datasets:
Image size (H, W): (28, 28)
Training samples: 48000
Validation samples: 12000
Testing samples: 10000
Classes: {'T-shirt/top': 0, 'Trouser': 1, 'Pullover': 2, 'Dress': 3, 'Coat': 4, 'Sandal': 5, 'Shirt': 6, 'Sneaker': 7, 'Bag': 8, 'Ankle boot': 9}
** Neural network architecture:
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
SimpleCNN                                --                        --
├─ModuleList: 1-1                        --                        --
│    └─Conv2d: 2-1                       [1, 8, 28, 28]            80
│    └─ReLU: 2-2                         [1, 8, 28, 28]            --
│    └─MaxPool2d: 2-3                    [1, 8, 14, 14]            --
│    └─BatchNorm2d: 2-4                  [1, 8, 14, 14]            16
│    └─Conv2d: 2-5                       [1, 16, 14, 14]           1,168
│    └─ReLU: 2-6                         [1, 16, 14, 14]           --
│    └─MaxPool2d: 2-7                    [1, 16, 7, 7]             --
│    └─BatchNorm2d: 2-8                  [1, 16, 7, 7]             32
│    └─Conv2d: 2-9                       [1, 1, 7, 7]              145
│    └─Flatten: 2-10                     [1, 49]                   --
│    └─Linear: 2-11                      [1, 10]                   500
==========================================================================================
Total params: 1,941
Trainable params: 1,941
Non-trainable params: 0
Total mult-adds (M): 0.30
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.09
Params size (MB): 0.01
Estimated Total Size (MB): 0.11
==========================================================================================

Mean loss plots

  • Mean loss:
    • Training (dark blue): 0.3186
    • Test (orange): 0.3686
    • Validation (brown): 0.3372

Accuracy plots

  • Accuracy:
    • Test (cyan): 87.2%
    • Validation (pink): 88.1%

Releases

  • v1.0.0 (Jan 7, 2022)

Owner

Ramón Casero