2022-04-23 20:48:00 NuerNuer

import torch
from torch import nn
from einops import rearrange, reduce, repeat
from einops.layers.torch import Rearrange, Reduce

One. rearrange and Rearrange. Purpose: as the names suggest, they rearrange the dimensions of a tensor.

Differences:
1. Rearrange in einops.layers.torch is a layer. It is used when building a network structure, so the tensor is processed "implicitly" as it passes through the model.

For example:

class PatchEmbedding(nn.Module):
    def __init__(self, in_channels: int = 3, patch_size: int = 16, emb_size: int = 768, img_size: int = 224):
        super().__init__()  # required before submodules can be assigned
        self.patch_size = patch_size
        self.projection = nn.Sequential(
            # using a conv layer instead of a linear one -> performance gains
            nn.Conv2d(in_channels, emb_size, kernel_size=patch_size, stride=patch_size),
            # flatten the patch grid: (b, e, h, w) -> (b, h*w, e)
            Rearrange('b e (h) (w) -> b (h w) e'),
        )

Here Rearrange('b e (h) (w) -> b (h w) e') converts a 4-dimensional tensor into a 3-dimensional one: the last two dimensions are merged into a single dimension, which is moved in front of the embedding dimension, e.g. (16, 512, 4, 16) -> (16, 64, 512).
So as long as we know the dimensions of the input tensor, we can rearrange them simply by editing the pattern string.

2. rearrange in einops is a function. It processes a tensor "explicitly", at the point where it is called.

For example:

rearrange(images, 'b h w c -> b (h w) c')

This converts a 4-dimensional tensor into a 3-dimensional one. Likewise, as long as we know the input dimensions, we can rearrange them by editing the pattern string.
Note that any axis sizes passed along with the pattern must match the tensor's actual dimensions; they cannot be changed. For example:

image = torch.randn(1, 2, 3, 2)  # torch.Size([1, 2, 3, 2])

out = rearrange(image, 'b c h w -> b (c h w)', c=2, h=3, w=2)  # torch.Size([1, 12])
# swapping the values of h and w no longer matches the tensor's shape
err1 = rearrange(image, 'b c h w -> b (c h w)', c=2, h=2, w=3)  # raises an error

Two. repeat: repeats a tensor along one of its dimensions in order to enlarge that dimension.

B = 16
emb_size = 768  # same value as the PatchEmbedding default above
cls_token = torch.randn(1, 1, emb_size)
cls_tokens = repeat(cls_token, '() n e -> b n e', b=B)  # an axis of size 1 can be written as ()

This turns a tensor of shape (1, 1, emb_size) into one of shape (B, 1, emb_size).

R = 16
a = torch.randn(2,3,4)
b = repeat(a, 'b n e -> (r b) n e', r = R)
#(2R, 3, 4)
c = repeat(a, 'b n e -> b (r n) e', r = R)
#(2, 3R, 4)

# Incorrect usage: an input axis cannot simply be renamed; b no longer appears in the output
d = repeat(a, 'b n e -> c n e', c=2 * R)  # raises an error

# The valid calls above turn the (2, 3, 4) tensor into (2R, 3, 4), (2, 3R, 4), and so on.
The examples above enlarge an existing dimension. Next, an expansion that adds a new dimension:

R = 5
a = torch.randn(2, 3, 4)
d = repeat(a, 'b n e -> b n c e', c=R)

# This turns the (2, 3, 4) tensor into (2, 3, 5, 4).

Here too, we only need to edit the dimension pattern to carry out the corresponding tensor operation.

Three. Reduce and reduce:

x = torch.randn(100, 32, 64)
# perform max-reduction on the first axis:
y0 = reduce(x, 't b c -> b c', 'max') #(32, 64)
# Specifying h2, w2 is equivalent to specifying the size of the pooling kernel
x = torch.randn(10, 512, 30, 40)
# 2d max-pooling with kernel size = 2 * 2 
y1 = reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'max', h2=2, w2=2)
#(10, 512, 15, 20)
# go back to the original height and width
y2 = rearrange(y1, 'b (c h2 w2) h1 w1 -> b c (h1 h2) (w1 w2)', h2=2, w2=2)
#(10, 128, 30, 40)
# Specifying h1, w1 is equivalent to specifying the size of the tensor after pooling;
# h1 and w1 must divide the height (30) and width (40) exactly
# 2d max-pooling to 6 * 8 grid:
y3 = reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'max', h1=6, w1=8)
#(10, 512, 6, 8)

# 2d average-pooling to 6 * 8 grid:
y4 = reduce(x, 'b c (h1 h2) (w1 w2) -> b c h1 w1', 'mean', h1=6, w1=8)
#(10, 512, 6, 8)

# Global average pooling
y5 = reduce(x, 'b c h w -> b c', 'mean')
#(10, 512)

Reduce, the layer counterpart of reduce, works the same way.

Note: the examples here use torch tensors; einops can also handle NumPy arrays.
