Ἀνατομή is a PyTorch library to analyze representation of neural networks

Overview

anatome

Ἀνατομή is a PyTorch library to analyze internal representation of neural networks

This project is under active development and the codebase is subject to change.

Installation

anatome requires

Python>=3.9.0
PyTorch>=1.9.0
torchvision>=0.10.0

After the installation of PyTorch, install anatome as follows:

pip install -U git+https://github.com/moskomule/anatome

Available Tools

Representation Similarity

To measure the similarity of learned representation, anatome.SimilarityHook is a useful tool. Currently, the following methods are implemented.

from anatome import SimilarityHook

model = resnet18()
hook1 = SimilarityHook(model, "layer3.0.conv1")
hook2 = SimilarityHook(model, "layer3.0.conv2")
model.eval()
with torch.no_grad():
    model(data[0])
# downsampling to (size, size) may be helpful
hook1.distance(hook2, size=8)

Loss Landscape Visualization

from anatome import landscape2d

x, y, z = landscape2d(resnet18(),
                      data,
                      F.cross_entropy,
                      x_range=(-1, 1),
                      y_range=(-1, 1),
                      step_size=0.1)
imshow(z)

Fourier Analysis

  • Yin et al. NeurIPS 2019 etc.,
from anatome import fourier_map

map = fourier_map(resnet18(),
                  data,
                  F.cross_entropy,
                  norm=4)
imshow(map)

Citation

If you use this implementation in your research, please cite as:

@software{hataya2020anatome,
    author={Ryuichiro Hataya},
    title={anatome, a PyTorch library to analyze internal representation of neural networks},
    url={https://github.com/moskomule/anatome},
    year={2020}
}
Comments
  • CCA is very small between a random net vs a pretrained one, bug?

    CCA is very small between a random net vs a pretrained one, bug?

    I am getting this issue:

    import anatome
    print(anatome)
    # from anatome import CCAHook
    from anatome import SimilarityHook
    model = resnet18(pretrained=True)
    random_model = resnet18()
    # random_model = resnet18().cuda()
    # hook1 = CCAHook(model, "layer1.0.conv1")
    # hook2 = CCAHook(random_model, "layer1.0.conv1")
    cxa_dist_type = 'pwcca'
    layer_name = "layer1.0.conv1"
    hook1 = SimilarityHook(model, layer_name, cxa_dist_type)
    hook2 = SimilarityHook(random_model, layer_name, cxa_dist_type)
    with torch.no_grad():
        model(data[0])
        random_model(data[0])
    distance_btw_nets = hook1.distance(hook2, size=8)
    print(f'{distance_btw_nets=}')
    distance_btw_nets = hook1.distance(hook2, size=None)
    print(f'{distance_btw_nets=}')
    <module 'anatome' from '/Users/brando/anaconda3/envs/metalearning/lib/python3.9/site-packages/anatome/__init__.py'>
    distance_btw_nets=0.3089657425880432
    distance_btw_nets=-2.468004822731018e-08
    

    the second is suppose to use the full features but we see the error is much smaller when I expected it to increase by a lot since we are using more info since we didn't down sample.

    Is this a bug?

    opened by brando90 13
  • do you preprocess the matrices for us?

    do you preprocess the matrices for us?

    I just noticed these two paragraphs from the papers I read and was wondering if you centered the matrices or if the user has to center them for anatome to work before hand.

    Screen Shot 2021-09-15 at 4 31 04 PM Screen Shot 2021-09-15 at 4 30 48 PM

    opened by brando90 7
  • Do hooks create unexpected side effects in anatome?

    Do hooks create unexpected side effects in anatome?

    I will try really hard to make this my last question, and I won't bother you again. Do we need to do some sort of clearing after we call hook.distance in anatome? (or deep copy the models for the code to work properly)?

    e.g. modification based on your tutorial:

    def cxa_dist(mdl1: nn.Module, mdl2: nn.Module, X: Tensor, layer_name: str,
                 downsample_size: Optional[str] = None, iters: int = 1, cxa_dist_type: str = 'pwcca') -> float:
        import copy
        mdl1 = copy.deepcopy(mdl1)
        mdl2 = copy.deepcopy(mdl2)
        # get sim/dis functions
        hook1 = SimilarityHook(mdl1, layer_name, cxa_dist_type)
        hook2 = SimilarityHook(mdl2, layer_name, cxa_dist_type)
        mdl1.eval()
        mdl2.eval()
        for _ in range(iters):  # might make sense to go through multiple is NN is stochastic e.g. BN, dropout layers
            mdl1(X)
            mdl2(X)
        # - size: size of the feature map after downsampling
        dist = hook1.distance(hook2, size=downsample_size)
        # - remove hook, to make sure code stops being stateful (I hope)
        remove_hook(mdl1, hook1)
        remove_hook(mdl2, hook2)
        return float(dist)
    
    def remove_hook(mdl: nn.Module, hook):
        """
        ref: https://github.com/pytorch/pytorch/issues/5037
        """
        handle = mdl.register_forward_hook(hook)
        handle.remove()
    
    opened by brando90 5
  • How should anatome be used if we are comparing nets during training?

    How should anatome be used if we are comparing nets during training?

    Since the code is attaching hooks and it seems stateful (due to using objects instead of pure functions), how should one use anatome to compute CCAs sims etc correctly?

    Is using deep copy of the solution? Or deleting the hook after every use of similarity function?:

    def cxa_dist(mdl1: nn.Module, mdl2: nn.Module, X: Tensor, layer_name: str,
                 downsample_size: Optional[str] = None, iters: int = 1, cxa_dist_type: str = 'pwcca') -> float:
        import copy
        mdl1 = copy.deepcopy(mdl1)
        mdl2 = copy.deepcopy(mdl2)
        # print(cca_size)
        # meta_batch [T, N*K, CHW], [T, K, D]
        from anatome import SimilarityHook
        # get sim/dis functions
        hook1 = SimilarityHook(mdl1, layer_name, cxa_dist_type)
        hook2 = SimilarityHook(mdl2, layer_name, cxa_dist_type)
        mdl1.eval()
        mdl2.eval()
        for _ in range(iters):  # might make sense to go through multiple is NN is stochastic e.g. BN, dropout layers
            # x = torch_uu.torch_uu.distributions.Uniform(low=lb, high=ub).sample((num_samples_per_task, Din))
            # x = torch_uu.torch_uu.distributions.Uniform(low=-1, high=1).sample((15, 1))
            # x = torch_uu.torch_uu.distributions.Uniform(low=-1, high=1).sample((500, 1))
            mdl1(X)
            mdl2(X)
        # - size: size of the feature map after downsampling
        dist = hook1.distance(hook2, size=downsample_size)
        return float(dist)
    
    opened by brando90 4
  • Size Check

    Size Check

    Hello ~ Thank you for the implementation! It is amazing and helps me a lot!

    I have a question: in CCA, you seem to check x.size(0) < x.size(1). I believe the implementation is correct, for I have seen some similar checks in other implementations of CCA. However, I don't understand the rationale behind this. Could you explain a bit? Thanks! Also, something related (it could be other reasons), my data has a larger feature size (x.size(1)) than the number of examples (x.size(0)), and sometimes I get this error: The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values. I was wondering whether they were related? Could you provide any insight on this?

    opened by Yupei-Du 3
  • How do we run the jupyter notebook example?

    How do we run the jupyter notebook example?

    Got error:

    FileNotFoundError: [Errno 2] No such file or directory: '/Users/brando/.torch/data/imagenet/val'
    

    is it possible to download some data set to test anatome such that it doesn't throw an error?

    Perhaps a colab example is better?

    a start: https://colab.research.google.com/drive/1GrhWrWFPmlc6kmxc0TJY0Nb6qOBBgjzX?usp=sharing

    opened by brando90 3
  • error on import

    error on import

    called

    ran

    import anatome

    Traceback (most recent call last):

    File "/home/cody/miniconda3/envs/RepDist/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3437, in run_code exec(code_obj, self.user_global_ns, self.user_ns)

    File "", line 1, in import anatome

    File "/home/cody/miniconda3/envs/RepDist/lib/python3.7/site-packages/anatome/init.py", line 1, in from .similarity import SimilarityHook

    File "", line 1 (x.size(0)=) ^ SyntaxError: invalid syntax

    I have anatome==0.0.1 I can provide my environment info if needed but the error looks pretty clear.

    opened by codestar12 3
  • Correcting CKA similarity to output distance by subtracting 1

    Correcting CKA similarity to output distance by subtracting 1

    corrected what seems to be that anatome is returning similarity when the function name is distance.

    Including the table from the original CKA paper to ensure my suggestion is correct indeed:

    Screen Shot 2021-02-02 at 4 52 02 PM

    paper link: http://proceedings.mlr.press/v97/kornblith19a.html

    opened by brando90 3
  • Is the way to calculate similarity just by doing 1 minus the values your library gives?

    Is the way to calculate similarity just by doing 1 minus the values your library gives?

    asking because

    1. it seems the original paper for CKA gives the values in terms of similarities and not distances so I don't understand why the library here gives them in distances
    2. make sure I don't do 1 - cca and make mistakes/is valid.

    I see that you have a bunch of 1 - value except for CKA, is that a bug?

    https://github.com/moskomule/anatome/blob/57f34fa796ffcff0ba37b5fa5142018e9f6fde61/anatome/similarity.py#L154 https://github.com/moskomule/anatome/blob/57f34fa796ffcff0ba37b5fa5142018e9f6fde61/anatome/similarity.py#L133 https://github.com/moskomule/anatome/blob/57f34fa796ffcff0ba37b5fa5142018e9f6fde61/anatome/similarity.py#L203

    opened by brando90 3
  • cca code for GPU code not working

    cca code for GPU code not working

    small example:

    import torch
    import torch.nn as nn
    from anatome import SimilarityHook
    
    from collections import OrderedDict
    
    #
    Din, Dout = 1, 1
    mdl1 = nn.Sequential(OrderedDict([
        ('fc1_l1', nn.Linear(Din, Dout)),
        ('out', nn.SELU()),
        ('fc2_l2', nn.Linear(Din, Dout)),
    ]))
    mdl2 = nn.Sequential(OrderedDict([
        ('fc1_l1', nn.Linear(Din, Dout)),
        ('out', nn.SELU()),
        ('fc2_l2', nn.Linear(Din, Dout)),
    ]))
    
    print(f'is cuda available: {torch.cuda.is_available()}')
    
    with torch.no_grad():
        mu = torch.zeros(Din)
        # std =  1.25e-2
        std = 10
        noise = torch.distributions.normal.Normal(loc=mu, scale=std).sample()
        # mdl2.fc1_l1.weight.fill_(50.0)
        # mdl2.fc1_l1.bias.fill_(50.0)
        mdl2.fc1_l1.weight += noise
        mdl2.fc1_l1.bias += noise
    
    if torch.cuda.is_available():
        mdl1 = mdl1.cuda()
        mdl2 = mdl2.cuda()
    
    hook1 = SimilarityHook(mdl1, "fc1_l1")
    hook2 = SimilarityHook(mdl2, "fc1_l1")
    mdl1.eval()
    mdl2.eval()
    
    # params for doing "good" CCA
    iters = 10
    num_samples_per_task = 500
    size = 8
    # start CCA comparision
    lb, ub = -1, 1
    
    for _ in range(iters):
        x = torch.torch.distributions.Uniform(low=-1, high=1).sample((num_samples_per_task, 1))
        if torch.cuda.is_available():
            x = x.cuda()
        y1 = mdl1(x)
        y2 = mdl2(x)
        print(f'y1 - y2 = {(y1-y2).norm(2)}')
    print('about to do cca')
    dist = hook1.distance(hook2, size=size)
    print('cca done')
    print(f'cca dist = {dist}')
    print('--> Done!\a')
    

    but it always has a segmentation error:

    (automl-meta-learning) miranda9~/automl-meta-learning $ python test_cca_gpu.py 
    is cuda available: True
    y1 - y2 = 4.561897277832031
    y1 - y2 = 3.7458858489990234
    y1 - y2 = 3.8464999198913574
    y1 - y2 = 4.947702407836914
    y1 - y2 = 5.404015064239502
    y1 - y2 = 4.85843563079834
    y1 - y2 = 4.000360488891602
    y1 - y2 = 4.194643020629883
    y1 - y2 = 4.894904613494873
    y1 - y2 = 4.7721710205078125
    about to do cca
    Segmentation fault
    

    why? how is this fixed?

    opened by brando90 3
  • potential improvement for CNN size=None (or bug?)

    potential improvement for CNN size=None (or bug?)

    I noticed for size=None you assume an activation is a neuron.

    It is possible to have this instead:

                # - convolution layer [M, C, H, W]
                if size is None:
                    # - no downsampling: [M, C, H, W] -> [M, C, H*W]
                    # an activation is an (effective) neuron is an activation (in the spatial dimension)
                    # (effective) data size is
    
                    # flatten(2) -> flatten from 2 to -1 (end)
                    self_tensor = self_tensor.flatten(start_dim=2, end_dim=-1).contiguous()
                    other_tensor = other_tensor.flatten(start_dim=2, end_dim=-1).contiguous()
    
                    # improvement [M, C, H, W] -> [M, C*H*W]
                    self_tensor = self_tensor.flatten(start_dim=3, end_dim=-1).contiguous()
                    other_tensor = other_tensor.flatten(start_dim=3, end_dim=-1).contiguous()
                    return self.cca_function(self_tensor, other_tensor).item()
    

    Original paper Screen Shot 2021-10-28 at 12 42 40 PM

    I am aware you later go and compare them by looping through each data point later, which is not exactly equivalent as the above - though that is a small nuance. That approach assumes C is the effective size of the data and that each activation in the spatial dimension is a filter. But usually a neuron (vector) is considered to have size with respect to the data set so usually it's [M, CHW] or [MHW, C]. So I'm unsure why having the filter size as the effective size of the data set for CCA is justified.

    I will go with [MHW, C] since I think the definition of a neuron per filter makes more sense and each patch seen by a filter as a data point makes more sense. I think due to the nature of CCA, this is fine to apply even across layers. If you want to know why I'm happy to copy paste that section of the background section of my paper here.

    see:

    Screen Shot 2021-10-28 at 12 48 47 PM

    Thanks for your great library and feedback!

    opened by brando90 2
  • bug in lincka?

    bug in lincka?

    isn't a 1- missing?

    https://github.com/moskomule/anatome/blob/393b36df77631590be7f4d23bff5436fa392dc0e/anatome/distance.py#L212

    https://github.com/moskomule/anatome/pull/7

    Screen Shot 2021-11-17 at 10 51 08 AM
    opened by brando90 1
  • why orthonormalize the cca combination instead of the cca vectors/canonical neurons?

    why orthonormalize the cca combination instead of the cca vectors/canonical neurons?

    Why does anatome compute pwcca by orthonormalizing the a vector instead of the CCA vector x_tilde = x @ a? e.g.

    https://github.com/moskomule/anatome/blob/393b36df77631590be7f4d23bff5436fa392dc0e/anatome/distance.py#L161

    the authors do the latter i.e. orthonormalize x_tilde not a: Screen Shot 2021-11-16 at 10 32 03 AM

    https://arxiv.org/abs/1806.05759

    current anatome code:

    def pwcca_distance(x: Tensor,
                       y: Tensor,
                       backend: str
                       ) -> Tensor:
        """ Projection Weighted CCA proposed in Marcos et al. 2018.
        Args:
            x: input tensor of Shape DxH, where D>H
            y: input tensor of Shape DxW, where D>H
            backend: svd or qr
        Returns:
        """
    
        a, b, diag = cca(x, y, backend)
        a, _ = torch.linalg.qr(a)  # reorthonormalize
        alpha = (x @ a).abs_().sum(dim=0)
        alpha /= alpha.sum()
        return 1 - alpha @ diag
    

    related: https://stackoverflow.com/questions/69993768/how-does-one-implement-pwcca-in-pytorch-match-the-original-pwcca-implemented-in

    opened by brando90 6
  • Bug in cca_by_svd computation

    Bug in cca_by_svd computation

    I am computing two random matrices that are different and totally random. Their CCA shouldn't be high since they are different and random. Google's svcca library gives me much lower CCA values but anatome's gives me CCA values that are 1.0...which obviously are wrong. Any idea where the bug might be? Been looking for it for a while:

    Code:

    #%%
    import torch
    from matplotlib import pyplot as plt
    
    from uutils.torch_uu.metrics.cca import cca_core
    
    from torch import Tensor
    from anatome.similarity import svcca_distance, cca_by_svd, cca_by_qr, _compute_cca_traditional_equation
    import numpy as np
    import random
    np.random.seed(0)
    torch.manual_seed(0)
    random.seed(0)
    
    
    # tutorial shapes (500, 10000) (500, 10000) based on MNIST with 500 neurons from a FCNN
    D, N = 500, 10_000
    
    # - creating a random baseline
    # b1 = np.random.randn(*acts1.shape)
    # b2 = np.random.randn(*acts2.shape)
    b1 = np.random.randn(D, N)
    b2 = np.random.randn(D, N)
    print('-- reproducibity finger print')
    print(f'{b1.sum()=}')
    print(f'{b2.sum()=}')
    print(f'{b1.shape=}')
    
    # - get cca values for baseline
    print("\n-- Google's SVCCA -- ")
    baseline = cca_core.get_cca_similarity(b1, b2, epsilon=1e-10, verbose=False)
    # _plot_helper(baseline["cca_coef1"], "CCA coef idx", "CCA coef value")
    # print("Baseline Mean CCA similarity", np.mean(baseline["cca_coef1"]))
    # print("Baseline CCA similarity", baseline["cca_coef1"])
    print(f'{len(baseline["cca_coef1"])=}')
    print("Baseline CCA similarity", baseline["cca_coef1"][:6])
    print(f"{np.mean(baseline['cca_coef1'])=}")
    
    # - get sklern's cca's https://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.CCA.html
    # from sklearn.cross_decomposition import CCA
    # # cca = CCA(n_components=D)
    # cca = CCA(n_components=6)
    # cca.fit(b1, b2)
    
    # -
    print("\n-- Ultimate Anatome's SVCCA --")
    # 'svcca': partial(svcca_distance, accept_rate=0.99, backend='svd')
    # svcca_dist: Tensor = svcca_distance(x, y, accept_rate=0.99, backend='svd')
    b1_t, b2_t = torch.from_numpy(b1), torch.from_numpy(b2)
    # svcca: Tensor = 1.0 - svcca_distance(x=b1_t, y=b2_t, accept_rate=1.0, backend='svd')
    # diag: Tensor = svcca_distance(x=b1_t, y=b2_t, accept_rate=0.99, backend='svd')
    # a, b, diag = cca(x, y, backend='svd')
    a, b, diag = cca_by_svd(b1_t, b2_t)
    # a, b, diag = cca_by_qr(b1_t, b2_t)
    # diag = _compute_cca_traditional_equation(b1_t, b2_t)
    print(f'{diag.size()=}')
    print(f'{diag[:6]=}')
    print(f'{diag.mean()=}')
    # print(f'{svcca=}')
    
    print()
    

    output

    -- reproducibity finger print
    b1.sum()=686.0427476883059
    b2.sum()=2341.981561471438
    b1.shape=(500, 10000)
    -- Google's SVCCA -- 
    len(baseline["cca_coef1"])=500
    Baseline CCA similarity [0.43056735 0.42918498 0.42502398 0.42290128 0.42208184 0.41986944]
    np.mean(baseline['cca_coef1'])=0.19071397156414877
    -- Ultimate Anatome's SVCCA --
    diag.size()=torch.Size([500])
    diag[:6]=tensor([1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000], dtype=torch.float64)
    diag.mean()=tensor(1., dtype=torch.float64)
    

    related: https://stackoverflow.com/questions/69993768/how-does-one-implement-pwcca-in-pytorch-match-the-original-pwcca-implemented-in

    opened by brando90 33
Releases(0.04)
Owner
Ryuichiro Hataya
PhD student at UTokyo and RA at RIKEN AIP / focusing on DL and ML
Ryuichiro Hataya
A PyTorch implementation of "Semi-Supervised Graph Classification: A Hierarchical Graph Perspective" (WWW 2019)

SEAL ⠀⠀⠀ A PyTorch implementation of Semi-Supervised Graph Classification: A Hierarchical Graph Perspective (WWW 2019) Abstract Node classification an

Benedek Rozemberczki 202 Dec 27, 2022
Generalized and Efficient Blackbox Optimization System.

OpenBox Doc | OpenBox中文文档 OpenBox: Generalized and Efficient Blackbox Optimization System OpenBox is an efficient and generalized blackbox optimizatio

DAIR Lab 238 Dec 29, 2022
Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj

Andy Zeng 845 Jan 03, 2023
Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Elias Kassapis 31 Nov 22, 2022
Progressive Coordinate Transforms for Monocular 3D Object Detection

Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,

58 Nov 06, 2022
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

CPT This repository contains code and checkpoints for CPT. CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Gener

fastNLP 341 Dec 29, 2022
Implementation of Wasserstein adversarial attacks.

Stronger and Faster Wasserstein Adversarial Attacks Code for Stronger and Faster Wasserstein Adversarial Attacks, appeared in ICML 2020. This reposito

21 Oct 06, 2022
Implementation of the Swin Transformer in PyTorch.

Swin Transformer - PyTorch Implementation of the Swin Transformer architecture. This paper presents a new vision Transformer, called Swin Transformer,

597 Jan 03, 2023
Wind Speed Prediction using LSTMs in PyTorch

Implementation of Deep-Forecast using PyTorch Deep Forecast: Deep Learning-based Spatio-Temporal Forecasting Adapted from original implementation Setu

Onur Kaplan 151 Dec 14, 2022
Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder

ASEGAN: Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder 中文版简介 Readme with English Version 介绍 基于SEGAN模型的改进版本,使用自主设计的非

Nitin 53 Nov 17, 2022
Pytorch implementation of Implicit Behavior Cloning.

Implicit Behavior Cloning - PyTorch (wip) Pytorch implementation of Implicit Behavior Cloning. Install conda create -n ibc python=3.8 pip install -r r

Kevin Zakka 49 Dec 25, 2022
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021) (PyTorch) - We released the training code!

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution Kai Zhang, Jingyun Liang, Luc Van Gool, Radu Timofte Computer Vision Lab

Kai Zhang 804 Jan 08, 2023
PromptDet: Expand Your Detector Vocabulary with Uncurated Images

PromptDet: Expand Your Detector Vocabulary with Uncurated Images Paper Website Introduction The goal of this work is to establish a scalable pipeline

103 Dec 20, 2022
Stream images from a connected camera over MQTT, view using Streamlit, record to file and sqlite

mqtt-camera-streamer Summary: Publish frames from a connected camera or MJPEG/RTSP stream to an MQTT topic, and view the feed in a browser on another

Robin Cole 183 Dec 16, 2022
EsViT: Efficient self-supervised Vision Transformers

Efficient Self-Supervised Vision Transformers (EsViT) PyTorch implementation for EsViT, built with two techniques: A multi-stage Transformer architect

Microsoft 352 Dec 25, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
A PyTorch Implementation of the Luna: Linear Unified Nested Attention

Unofficial PyTorch implementation of Luna: Linear Unified Nested Attention The quadratic computational and memory complexities of the Transformer’s at

Soohwan Kim 32 Nov 07, 2022
Source code for Transformer-based Multi-task Learning for Disaster Tweet Categorisation (UCD's participation in TREC-IS 2020A, 2020B and 2021A).

Source code for "UCD participation in TREC-IS 2020A, 2020B and 2021A". *** update at: 2021/05/25 This repo so far relates to the following work: Trans

Congcong Wang 4 Oct 19, 2021
Practical tutorials and labs for TensorFlow used by Nvidia, FFN, CNN, RNN, Kaggle, AE

TensorFlow Tutorial - used by Nvidia Learn TensorFlow from scratch by examples and visualizations with interactive jupyter notebooks. Learn to compete

Alexander R Johansen 1.9k Dec 19, 2022