Complete U-Net Implementation with Keras

Overview

U-Net Lowered with Keras

Original Paper Link : https://arxiv.org/abs/1505.04597

Special Implementations :


The model follows the architecture of the original paper, except for the number of filters per layer, which is reduced to 25% of the original (for example, 16 filters instead of 64 in the first convolution block).
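As a quick illustration of the filter reduction, the sketch below derives the lowered filter counts from the per-level filter counts of the original U-Net paper. The variable names are illustrative only and are not taken from the repository.

```python
# Filters per level in the original U-Net paper (encoder path down to the bottleneck).
original_filters = [64, 128, 256, 512, 1024]

# Keep 25% of the filters at every level.
lowered_filters = [f // 4 for f in original_filters]

print(lowered_filters)  # [16, 32, 64, 128, 256]
```

These values match the channel dimension of the convolution outputs listed in the model summary of this README.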

Original Model Architecture :

Dataset :


The dataset is taken from Kaggle. Its original directory tree made it difficult to build the dataset directly, so I prepared a more usable data directory.

Link : https://www.kaggle.com/azkihimmawan/chest-xray-masks-and-defect-detection

Primary Directory Tree :

.
└── root/
    ├── train_images/
    │   └── id/
    │       ├── images/
    │       │   └── id.png
    │       └── masks/
    │           └── id.png
    └── test_images/
        └── id/
            └── id.png

Given Images :

Image Mask

Supporting Libraries :

NumPy, OpenCV, Matplotlib

Library Versions :

All library versions were up to date as of 14 June 2021.

Dataset Directory Generation :


The primary tree was restructured to create a data directory like this :

              .
              └── root/
                  ├── train/
                  │   ├── images/
                  │   │   └── id.png
                  │   └── masks/
                  │       └── id.png
                  └── test/
                      └── id.png
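The restructuring from the primary tree to the flat layout can be sketched as follows. This is a minimal sketch, not the script used to build the dataset: the helper name `flatten_dataset` is hypothetical, the folder names are assumed from the two directory trees in this README, and files are copied rather than moved.

```python
import shutil
from pathlib import Path


def flatten_dataset(src_root, dst_root):
    """Copy the per-id Kaggle tree into the flat train/test layout.

    Hypothetical helper; the layout is assumed from the directory trees
    shown in this README (train_images/<id>/images/<id>.png, and so on).
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    train_images = dst_root / "train" / "images"
    train_masks = dst_root / "train" / "masks"
    test_dir = dst_root / "test"
    for folder in (train_images, train_masks, test_dir):
        folder.mkdir(parents=True, exist_ok=True)

    copied = {"images": 0, "masks": 0, "test": 0}

    # train_images/<id>/images/*.png -> train/images/
    # train_images/<id>/masks/*.png  -> train/masks/
    for sample_dir in sorted((src_root / "train_images").iterdir()):
        if not sample_dir.is_dir():
            continue
        for kind, target in (("images", train_images), ("masks", train_masks)):
            for png in sorted((sample_dir / kind).glob("*.png")):
                shutil.copy2(png, target / png.name)
                copied[kind] += 1

    # test_images/<id>/*.png -> test/
    for sample_dir in sorted((src_root / "test_images").iterdir()):
        if not sample_dir.is_dir():
            continue
        for png in sorted(sample_dir.glob("*.png")):
            shutil.copy2(png, test_dir / png.name)
            copied["test"] += 1

    return copied
```

Calling `flatten_dataset("root", "data")` would return the number of files copied into each destination folder, which is a convenient check that every image has a matching mask.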

Model Architecture ( U-Net Lowered ):

Model: “UNet-Lowered”

Layer (Type)                    Output Shape           Param    Connected to
input_1 (InputLayer)            [(None, 512, 512, 1)]  0
conv2d (Conv2D)                 (None, 512, 512, 16)   160      input_1[0][0]
conv2d_1 (Conv2D)               (None, 512, 512, 16)   2320     conv2d[0][0]
max_pooling2d (MaxPooling2D)    (None, 256, 256, 16)   0        conv2d_1[0][0]
conv2d_2 (Conv2D)               (None, 256, 256, 32)   4640     max_pooling2d[0][0]
conv2d_3 (Conv2D)               (None, 256, 256, 32)   9248     conv2d_2[0][0]
max_pooling2d_1 (MaxPooling2D)  (None, 128, 128, 32)   0        conv2d_3[0][0]
conv2d_4 (Conv2D)               (None, 128, 128, 64)   18496    max_pooling2d_1[0][0]
conv2d_5 (Conv2D)               (None, 128, 128, 64)   36928    conv2d_4[0][0]
max_pooling2d_2 (MaxPooling2D)  (None, 64, 64, 64)     0        conv2d_5[0][0]
conv2d_6 (Conv2D)               (None, 64, 64, 128)    73856    max_pooling2d_2[0][0]
conv2d_7 (Conv2D)               (None, 64, 64, 128)    147584   conv2d_6[0][0]
dropout (Dropout)               (None, 64, 64, 128)    0        conv2d_7[0][0]
max_pooling2d_3 (MaxPooling2D)  (None, 32, 32, 128)    0        dropout[0][0]
conv2d_8 (Conv2D)               (None, 32, 32, 256)    295168   max_pooling2d_3[0][0]
conv2d_9 (Conv2D)               (None, 32, 32, 256)    590080   conv2d_8[0][0]
dropout_1 (Dropout)             (None, 32, 32, 256)    0        conv2d_9[0][0]
up_sampling2d (UpSampling2D)    (None, 64, 64, 256)    0        dropout_1[0][0]
conv2d_10 (Conv2D)              (None, 64, 64, 128)    131200   up_sampling2d[0][0]
concatenate (Concatenate)       (None, 64, 64, 256)    0        dropout[0][0] & conv2d_10[0][0]
conv2d_11 (Conv2D)              (None, 64, 64, 128)    295040   concatenate[0][0]
conv2d_12 (Conv2D)              (None, 64, 64, 128)    147584   conv2d_11[0][0]
up_sampling2d_1 (UpSampling2D)  (None, 128, 128, 128)  0        conv2d_12[0][0]
conv2d_13 (Conv2D)              (None, 128, 128, 64)   32832    up_sampling2d_1[0][0]
concatenate_1 (Concatenate)     (None, 128, 128, 128)  0        conv2d_5[0][0] & conv2d_13[0][0]
conv2d_14 (Conv2D)              (None, 128, 128, 64)   73792    concatenate_1[0][0]
conv2d_15 (Conv2D)              (None, 128, 128, 64)   36928    conv2d_14[0][0]
up_sampling2d_2 (UpSampling2D)  (None, 256, 256, 64)   0        conv2d_15[0][0]
conv2d_16 (Conv2D)              (None, 256, 256, 32)   8224     up_sampling2d_2[0][0]
concatenate_2 (Concatenate)     (None, 256, 256, 64)   0        conv2d_3[0][0] & conv2d_16[0][0]
conv2d_17 (Conv2D)              (None, 256, 256, 32)   18464    concatenate_2[0][0]
conv2d_18 (Conv2D)              (None, 256, 256, 32)   9248     conv2d_17[0][0]
up_sampling2d_3 (UpSampling2D)  (None, 512, 512, 32)   0        conv2d_18[0][0]
conv2d_19 (Conv2D)              (None, 512, 512, 16)   2064     up_sampling2d_3[0][0]
concatenate_3 (Concatenate)     (None, 512, 512, 32)   0        conv2d_1[0][0] & conv2d_19[0][0]
conv2d_20 (Conv2D)              (None, 512, 512, 16)   4624     concatenate_3[0][0]
conv2d_21 (Conv2D)              (None, 512, 512, 16)   2320     conv2d_20[0][0]
conv2d_22 (Conv2D)              (None, 512, 512, 2)    290      conv2d_21[0][0]
conv2d_23 (Conv2D)              (None, 512, 512, 1)    3        conv2d_22[0][0]
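The parameter counts in the summary above follow the usual Conv2D formula: kernel height × kernel width × input channels × filters, plus one bias per filter. The sketch below recomputes a few rows. The kernel sizes (3×3 for most layers, 2×2 after each up-sampling step, 1×1 for the final layer) are inferred from the listed counts and are not stated in this README, and `conv2d_params` is a hypothetical helper.

```python
def conv2d_params(kernel, in_channels, filters):
    """Trainable parameters of a Conv2D layer with a square kernel and a bias."""
    return kernel * kernel * in_channels * filters + filters


# (layer name, kernel size, input channels, filters, count listed in the summary)
rows = [
    ("conv2d",    3,   1,  16,    160),
    ("conv2d_1",  3,  16,  16,   2320),
    ("conv2d_8",  3, 128, 256, 295168),
    ("conv2d_10", 2, 256, 128, 131200),  # 2x2 convolution after up-sampling
    ("conv2d_11", 3, 256, 128, 295040),  # input is the 256-channel concatenation
    ("conv2d_22", 3,  16,   2,    290),
    ("conv2d_23", 1,   2,   1,      3),  # 1x1 output convolution
]

for name, kernel, c_in, filters, listed in rows:
    computed = conv2d_params(kernel, c_in, filters)
    print(f"{name:<10} computed={computed:>7} listed={listed:>7} match={computed == listed}")
```

Pooling, dropout, up-sampling and concatenation layers have no trainable weights, which is why they are listed with 0 parameters.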

Data Preparation :

A single channel of both the image and the mask is used for training.
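A minimal sketch of this step, assuming the images and masks are already loaded as 8-bit NumPy arrays (for example with OpenCV). The function name `to_single_channel`, the choice of channel 0, the scaling to [0, 1] and the mask threshold of 0.5 are all assumptions; the README does not specify them.

```python
import numpy as np


def to_single_channel(array, binarize=False):
    """Return a float32 array of shape (H, W, 1) scaled to [0, 1].

    Assumes 8-bit input (values 0..255); multi-channel input keeps channel 0.
    """
    array = np.asarray(array)
    if array.ndim == 3:
        array = array[..., 0]                    # keep only the first channel (assumed)
    out = array.astype(np.float32) / 255.0       # scale 0..255 to 0..1
    if binarize:
        out = (out > 0.5).astype(np.float32)     # masks become 0/1 targets
    return out[..., np.newaxis]                  # add the channel axis the model expects


image = np.zeros((512, 512, 3), dtype=np.uint8)  # placeholder for a loaded image
print(to_single_channel(image).shape)            # (512, 512, 1)
```

The resulting shape matches the (512, 512, 1) input of the model summary.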

Hyperparameters :

      Image Shape : (512 , 512 , 1)
      Optimizer : Adam ( Learning Rate : 1e-4 )
      Loss : Binary Cross Entropy 
      Metrics : Accuracy
      Epochs on Training : 100
      Train Validation Ratio : ( 85%-15% )
      Batch Size : 10
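To make the split ratio and batch size concrete, here is a small sketch of an 85%-15% split over sample ids. The function name `split_ids`, the fixed seed, the rounding rule and the example size of 200 samples are assumptions; the README does not say how the split was drawn or how many samples were used.

```python
import math
import random


def split_ids(ids, val_fraction=0.15, seed=0):
    """Shuffle the ids and split them into (train, validation) lists."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)          # seeded shuffle for reproducibility
    n_val = round(len(ids) * val_fraction)    # size of the validation part
    return ids[n_val:], ids[:n_val]


# Hypothetical dataset of 200 samples, batch size 10 as in the list above.
train_ids, val_ids = split_ids(range(200))
steps_per_epoch = math.ceil(len(train_ids) / 10)
print(len(train_ids), len(val_ids), steps_per_epoch)
```

With the batch size of 10, one epoch takes `ceil(n_train / 10)` steps, so the 100 training epochs amount to 100 times that many weight updates.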

Model Evaluation Metrics :

Model Performance on Train Data :

Model Performance on Validation Data :

One task left : Will update the tutorial notebooks soon ;)

Conclusion :

The full model overfitted and gave poor accuracy on the simplified single-channel images, whereas this reduced structure fits the data better and more efficiently.

STAR the repository if this was helpful :) Also follow me on Kaggle and LinkedIn.

THANK YOU for visiting :)

Owner
Sagnik Roy
Kaggle Expert exploring Computer Vision as no one did!