An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

Last update: Feb 21, 2022

Overview

GLOM TensorFlow

This Python package attempts to implement GLOM in TensorFlow. GLOM allows advances made by several different groups (transformers, neural fields, contrastive representation learning, distillation and capsules) to be combined. It was proposed by Geoffrey Hinton in his paper "How to represent part-whole hierarchies in a neural network".

Further, Yannic Kilcher's video and Phil Wang's repo were very helpful to me in implementing this project.

Installation

Run the following to install:

pip install glom-tf

Developing `glom-tf`

To install glom-tf along with the tools you need to develop and test it, run the following in your virtualenv:

git clone https://github.com/Rishit-dagli/GLOM-TensorFlow.git
# or clone your own fork

cd GLOM-TensorFlow
pip install -e .[dev]

A bit about GLOM

The GLOM architecture is composed of a large number of columns which all use exactly the same weights. Each column is a stack of spatially local autoencoders that learn multiple levels of representation for what is happening in a small image patch. Each autoencoder transforms the embedding at one level into the embedding at an adjacent level using a multilayer bottom-up encoder and a multilayer top-down decoder. These levels correspond to the levels in a part-whole hierarchy.

Interactions among the 3 levels in one column

As an example given by the author: when shown a face image, a single column might converge on embedding vectors representing a nostril, a nose, a face, and a person.

At each discrete time and in each column separately, the embedding at a level is updated to be the weighted average of:

bottom-up neural net acting on the embedding at the level below at the previous time
top-down neural net acting on the embedding at the level above at the previous time
embedding vector at the previous time step
attention-weighted average of the embeddings at the same level in nearby columns at the previous time
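
The four contributions listed above can be sketched in NumPy as follows. This is a minimal illustration, not the package's implementation: the function names (`consensus`, `glom_step`), the equal weighting of the contributions, the attention over all columns rather than only nearby ones, and the single-layer stand-ins for the bottom-up and top-down networks are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def consensus(levels):
    # levels: (columns, levels, dim). Attend across columns within each level.
    # Assumption: every column attends to every other column, not just nearby ones.
    dim = levels.shape[-1]
    sim = np.einsum("ild,jld->lij", levels, levels) / np.sqrt(dim)
    attn = softmax(sim, axis=-1)                    # (levels, columns, columns)
    return np.einsum("lij,jld->ild", attn, levels)  # (columns, levels, dim)

def glom_step(levels, tokens, bottom_up, top_down):
    # tokens: (columns, dim), patch embeddings acting as the input below the lowest level.
    below = np.concatenate([tokens[:, None, :], levels[:, :-1, :]], axis=1)
    above = np.concatenate([levels[:, 1:, :], np.zeros_like(levels[:, :1, :])], axis=1)
    bu = bottom_up(below)
    td = np.array(top_down(above), copy=True)
    td[:, -1, :] = 0.0  # the top level has no level above it
    total = levels + bu + td + consensus(levels)
    # Assumption: plain average, with 4 contributions per level and 3 at the top level.
    counts = np.full(levels.shape[1], 4.0)
    counts[-1] = 3.0
    return total / counts[None, :, None]

rng = np.random.default_rng(0)
n_columns, n_levels, dim = 16, 5, 8
w_up = rng.normal(size=(dim, dim)) / np.sqrt(dim)    # hypothetical bottom-up weights
w_down = rng.normal(size=(dim, dim)) / np.sqrt(dim)  # hypothetical top-down weights
tokens = rng.normal(size=(n_columns, dim))
levels = rng.normal(size=(n_columns, n_levels, dim))

new_levels = glom_step(levels, tokens,
                       bottom_up=lambda x: np.tanh(x @ w_up),
                       top_down=lambda x: np.tanh(x @ w_down))
print(new_levels.shape)  # (16, 5, 8)
```

Calling `glom_step` repeatedly on its own output corresponds to running several iterations for a static image.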

For a static image, the embeddings at a level should settle down over time to produce distinct islands of nearly identical vectors.

A picture of the embeddings at a particular time

Usage

import tensorflow as tf
from glomtf import Glom

model = Glom(dim = 512,
             levels = 5,
             image_size = 224,
             patch_size = 14)

img = tf.random.normal([1, 3, 224, 224])
levels = model(img, iters = 12) # (1, 256, 5, 512)
# 1 - batch
# 256 - patches
# 5 - levels
# 512 - dimensions
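
The patch count in the shape comment follows from `image_size` and `patch_size`. A small sketch of that arithmetic, assuming square images split into non-overlapping square patches (the helper name `num_patches` is made up for illustration and is not part of `glomtf`):

```python
def num_patches(image_size, patch_size):
    # Assumption: the image is square and divides evenly into square patches.
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side

print(num_patches(224, 14))  # 256, i.e. a 16 x 16 grid of patches
```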

Use the return_all = True argument to get all the column and level states per iteration. This also gives you access to all the level data across iterations for clustering, from which you can inspect the islands too.

import tensorflow as tf
from glomtf import Glom

model = Glom(dim = 512,
             levels = 5,
             image_size = 224,
             patch_size = 14)

img = tf.random.normal([1, 3, 224, 224])
all_levels = model(img, iters = 12, return_all = True) # (13, 1, 256, 5, 512)
# 13 - time

# top level outputs after iteration 6
top_level_output = all_levels[7, :, :, -1] # (1, 256, 512)
# 1 - batch
# 256 - patches
# 512 - dimensions
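
One possible way to look for islands in such an output is to compare each patch's embedding with that of its horizontal neighbour. The sketch below uses a random NumPy array as a stand-in for one slice of the model output, so it runs without TensorFlow. The helper `neighbor_agreement` is hypothetical, and it assumes the 256 patches are laid out row by row on a 16 x 16 grid.

```python
import numpy as np

def neighbor_agreement(level_embeddings, grid_size):
    """Cosine similarity between horizontally adjacent patches.

    level_embeddings: (patches, dim) for one image, one level, one time step.
    Returns an array of shape (grid_size, grid_size - 1).
    """
    grid = level_embeddings.reshape(grid_size, grid_size, -1)
    norms = np.maximum(np.linalg.norm(grid, axis=-1, keepdims=True), 1e-12)
    unit = grid / norms
    return np.sum(unit[:, :-1] * unit[:, 1:], axis=-1)

rng = np.random.default_rng(0)
# Stand-in for something like np.asarray(all_levels[7, 0, :, -1]), shape (256, 512).
stand_in = rng.normal(size=(256, 512))
agreement = neighbor_agreement(stand_in, grid_size=16)
print(agreement.shape)  # (16, 15)
```

Values close to 1 would indicate neighbouring patches that agree at that level; with random stand-in data the values are close to 0.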

Want to Contribute 🙋‍♂️ ?

Awesome! If you want to contribute to this project, you're always welcome! See Contributing Guidelines. You can also take a look at open issues for getting more information about current or upcoming tasks.

Want to discuss? 💬

Have any questions or doubts, or want to share your opinions and views? You're always welcome to start a discussion.

Citations

@misc{hinton2021represent,
    title   = {How to represent part-whole hierarchies in a neural network}, 
    author  = {Geoffrey Hinton},
    year    = {2021},
    eprint  = {2102.12627},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

License

Copyright 2020 Rishit Dagli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


Comments

Implement Pairwise Distance
Write an algorithm that computes the batched p-norm distance between each pair of row vectors from two collections. We use the Euclidean distance metric. For a matrix A [m, d] and a matrix B [n, d], we expect a matrix of pairwise distances D [m, n].

Arguments:

A: A tf.Tensor object. The first matrix.

B: A tf.Tensor object. The second matrix.

Returns:

The calculated distance.

Reference:

scipy.spatial.distance.cdist

tensorflow/tensorflow#30659

Closes #4
opened by Rishit-dagli 0
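
As a rough illustration of the computation described above, here is a NumPy sketch for the Euclidean case only. It is not the code from this pull request, and the function name `pairwise_euclidean` is made up for the example. It relies on the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b; the same expression could presumably be written with `tf.reduce_sum` and `tf.matmul`.

```python
import numpy as np

def pairwise_euclidean(a, b):
    """a: (m, d), b: (n, d) -> distances of shape (m, n)."""
    a_sq = np.sum(a * a, axis=1)[:, None]  # (m, 1)
    b_sq = np.sum(b * b, axis=1)[None, :]  # (1, n)
    squared = a_sq + b_sq - 2.0 * (a @ b.T)
    # Clamp tiny negative values caused by floating-point round-off.
    return np.sqrt(np.maximum(squared, 0.0))

a = np.array([[0.0, 0.0], [3.0, 4.0]])
b = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
distances = pairwise_euclidean(a, b)
print(distances)  # rows: [0, 3, 4] and [5, 4, 3]
```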
Implement Pairwise Distance
While trying to implement #2 I noticed there is no TensorFlow op for calculating pairwise distances, so I would also need to create an implementation for that.

References

https://github.com/tensorflow/tensorflow/issues/30659

scipy.spatial.distance.cdist
opened by Rishit-dagli 0
GroupedFeedForward Layer

Write a GroupedFeedForward layer that inherits from tf.keras.layers.Layer. This layer should be used for the bottom-up and top-down networks, changing the number of groups in each case.

opened by Rishit-dagli 0
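
The issue does not spell out the layer's internals. One reading is that "grouped" means each group (for example, each level) gets its own feed-forward weights. The NumPy sketch below illustrates that reading only; the function name, the argument layout and the ReLU activation are assumptions, and a real layer would subclass tf.keras.layers.Layer as the issue asks.

```python
import numpy as np

def grouped_feed_forward(x, w1, b1, w2, b2):
    """Apply a separate two-layer MLP to each group.

    x:  (batch, patches, groups, dim)
    w1: (groups, dim, hidden), b1: (groups, hidden)
    w2: (groups, hidden, dim), b2: (groups, dim)
    """
    hidden = np.einsum("bngd,gdh->bngh", x, w1) + b1
    hidden = np.maximum(hidden, 0.0)  # ReLU as a stand-in activation
    return np.einsum("bngh,ghd->bngd", hidden, w2) + b2

rng = np.random.default_rng(0)
batch, patches, groups, dim, hidden_dim = 2, 4, 3, 8, 16
x = rng.normal(size=(batch, patches, groups, dim))
w1 = rng.normal(size=(groups, dim, hidden_dim))
b1 = np.zeros((groups, hidden_dim))
w2 = rng.normal(size=(groups, hidden_dim, dim))
b2 = np.zeros((groups, dim))

out = grouped_feed_forward(x, w1, b1, w2, b2)
print(out.shape)  # (2, 4, 3, 8)
```

Because the weights carry a leading group axis, changing the weights of one group leaves the outputs of the other groups untouched.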

Releases(v0.1.1)

v0.1.1 (Mar 27, 2021)

Add usage examples to better help understand usage.

v0.1.0 (Mar 27, 2021)

Minor changes to usage examples

0.1.0 (Mar 27, 2021)

Fix a major shape error with GroupedFeedForward

Owner

Rishit Dagli
