Contrastive Learning Series (3)-----SimCLR
2022-08-11 08:46:00 【Tao Jiang】
SimCLR
SimCLR learns representations by maximizing the agreement between differently augmented views of the same data, via a contrastive loss in a latent space. The SimCLR framework has four main components: data augmentation, an encoder network, a projection head network, and a contrastive loss function.
For a data example $x$, two independent augmentation operators are sampled from the same augmentation family ($t \sim \mathcal{T}$ and $t' \sim \mathcal{T}$), producing two correlated views $\hat{x}_{i}$ and $\hat{x}_{j}$, which form a positive pair. A neural-network encoder $f\left( \cdot \right)$ then extracts representations from the augmented data: $h_{i}=f\left( \hat{x}_{i} \right)$, $h_{j}=f\left( \hat{x}_{j} \right)$. Finally, a small projection head $g\left( \cdot \right)$ maps the representations into the space where the contrastive loss is applied. The projection head is an MLP with one hidden layer: $z_{i} = g\left( h_{i} \right) = W^{\left( 2 \right)} \sigma \left( W^{\left( 1 \right)} h_{i}\right)$, where $\sigma$ is a nonlinearity such as ReLU.
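The projection head above can be sketched in a few lines of numpy. The dimensions (a 2048-d encoder output projected to 128-d) are illustrative assumptions, not mandated by the formula:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def projection_head(h, W1, W2):
    """One-hidden-layer MLP projection head: z = W2 @ relu(W1 @ h)."""
    return W2 @ relu(W1 @ h)

# Hypothetical dimensions: encoder output h is 2048-d, projection z is 128-d.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((2048, 2048)) * 0.01   # W^(1)
W2 = rng.standard_normal((128, 2048)) * 0.01    # W^(2)
h = rng.standard_normal(2048)                   # h_i = f(x_hat_i)
z = projection_head(h, W1, W2)                  # z_i = g(h_i)
print(z.shape)  # (128,)
```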
Given a set $\{ \hat{x}_{k} \}$ that contains a positive pair $\hat{x}_{i}$ and $\hat{x}_{j}$, the contrastive prediction task is, for a given $\hat{x}_{i}$, to identify $\hat{x}_{j}$ among $\{ \hat{x}_{k} \}_{k \neq i}$. A minibatch of $N$ examples is sampled at random, yielding $2N$ augmented data points; for each positive pair, the other $2\left( N - 1\right)$ augmented examples in the minibatch are treated as negatives. Let $\mathrm{sim}\left( u, v\right) = u^{\top}v / \left( \| u\| \| v\| \right)$ denote the dot product between $\ell_{2}$-normalized $u$ and $v$ (i.e., cosine similarity). Then for a positive pair $\left( i, j \right)$, the loss is defined as:
$$
\ell_{i,j} = - \log \frac{\exp\left( \mathrm{sim} \left( z_{i}, z_{j}\right) / \tau \right)}{\sum_{k=1}^{2N} \mathbb{1}_{[ k \neq i]} \exp\left( \mathrm{sim} \left( z_{i}, z_{k}\right) / \tau \right)}
$$
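A minimal numpy sketch of this NT-Xent loss, assuming the $2N$ projections are stacked row-wise with consecutive rows forming positive pairs (a toy illustration, not the paper's implementation):

```python
import numpy as np

def nt_xent_loss(z, tau=0.5):
    """NT-Xent over 2N projections; rows 2k and 2k+1 are a positive pair."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # l2-normalize -> sim is cosine
    sim = z @ z.T / tau                               # pairwise sim(z_i, z_k) / tau
    np.fill_diagonal(sim, -np.inf)                    # enforce the k != i indicator
    n = len(z)
    pos = np.arange(n) ^ 1                            # partner index: 0<->1, 2<->3, ...
    # l_{i,j} = -log( exp(sim_ij) / sum_k exp(sim_ik) ), averaged over all 2N rows
    log_prob = sim[np.arange(n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 128))  # N = 4 pairs -> 2N = 8 views
loss = nt_xent_loss(z)
print(float(loss))
```

Averaging over all $2N$ rows covers both orderings $(i, j)$ and $(j, i)$ of each pair, matching the symmetric loss described below.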
The final loss is computed over all positive pairs in a minibatch, including both $\left( i, j \right)$ and $\left( j,i \right)$. The paper's Algorithm 1 gives the full pseudocode; it shows that the parameters of both the encoder $f\left( \cdot \right)$ and the projection head $g\left( \cdot \right)$ are updated during training, but only the encoder $f\left( \cdot \right)$ is used for downstream tasks.
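The overall training step can be sketched as follows. The augmentation, encoder, and projection head here are toy stand-ins (the actual SimCLR pipeline uses image augmentations and a ResNet encoder); the point is the data flow from a batch of $N$ examples to $2N$ normalized projections:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(x):
    """Toy stand-in for t ~ T: add small Gaussian noise."""
    return x + 0.1 * rng.standard_normal(x.shape)

def encoder_f(x, Wf):
    return np.maximum(Wf @ x, 0.0)  # toy encoder f(.)

def head_g(h, Wg):
    return Wg @ h                   # toy projection head g(.)

def simclr_step(batch, Wf, Wg):
    """One forward pass: two views per example -> 2N projections for the loss."""
    zs = []
    for x in batch:
        for _ in range(2):          # two independent augmentations per example
            h = encoder_f(augment(x), Wf)
            zs.append(head_g(h, Wg))
    z = np.stack(zs)                # shape (2N, proj_dim)
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-8)

Wf = rng.standard_normal((16, 8)) * 0.1   # toy encoder weights
Wg = rng.standard_normal((4, 16)) * 0.1   # toy projection weights
batch = [rng.standard_normal(8) for _ in range(4)]  # N = 4
z = simclr_step(batch, Wf, Wg)
print(z.shape)  # (8, 4)
```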
SimCLR does not train with a memory bank; instead it increases the batch size, up to 8192, so that each positive pair has 16382 negative examples. Increasing the batch size is effectively equivalent to dynamically generating a memory bank from each minibatch. The paper found that training with large batch sizes is unstable under standard SGD/Momentum, so the LARS optimizer is used instead.