R-Dropout
2022-04-22 00:29:00 【hxxjxw】
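R-Drop stands for Regularized Dropout.
It addresses the inconsistency between how Dropout behaves during training and during testing (inference): during training a random subset of units is dropped, while at inference the full network is used.
Dropout is essentially a form of ensemble learning: every random mask selects a different sub model, so many neural networks that share weights are trained at the same time.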
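The ensemble view and the train/inference gap can be illustrated with a small NumPy sketch. The single-layer `forward` helper, the layer sizes, and the seed are assumptions made for illustration and are not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, w, b, p=0.0, rng=None):
    """One ReLU layer; applies inverted dropout when p > 0."""
    h = np.maximum(0.0, x @ w.T + b)
    if p > 0.0:
        mask = rng.binomial(1, 1.0 - p, size=h.shape)  # 1 = keep, 0 = drop
        h = h * mask / (1.0 - p)                       # rescale kept units
    return h

x = rng.standard_normal(20)
w = rng.standard_normal((30, 20))
b = rng.standard_normal(30)

out_a = forward(x, w, b, p=0.5, rng=rng)  # sub model A (first random mask)
out_b = forward(x, w, b, p=0.5, rng=rng)  # sub model B (second random mask)
out_eval = forward(x, w, b)               # full network, as used at inference

# Average over many sampled sub models (an explicit ensemble).
out_avg = np.mean([forward(x, w, b, p=0.5, rng=rng) for _ in range(2000)], axis=0)

print("A vs B:          ", np.abs(out_a - out_b).max())
print("A vs inference:  ", np.abs(out_a - out_eval).max())
print("avg vs inference:", np.abs(out_avg - out_eval).max())
```

Each masked pass produces a different output for the same input, and a single pass can be far from the inference output even though the average over many masks moves toward it. This gap between individual sub models is what R-Drop regularizes.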
R-Drop requires that the output distributions of the different sub models produced by Dropout be consistent with each other.
Specifically, for each training sample, R-Drop minimizes the KL divergence between the output distributions of two sub models.
In each mini-batch training, each data sample goes through the forward pass twice, and each pass is processed by a different sub model by randomly dropping out some hidden units. R-Drop forces the two distributions for the same data sample outputted by the two sub models to be consistent with each other, through minimizing the bidirectional Kullback-Leibler (KL) divergence between the two distributions.
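The description above can be written as a per-sample objective. The notation is an assumption introduced here for illustration: $P_1$ and $P_2$ are the output distributions of the two dropout passes for sample $(x_i, y_i)$, and $\alpha$ is a weighting coefficient for the KL term. Treat this as a sketch consistent with the description, not a quotation from the post.

```latex
\begin{aligned}
\mathcal{L}^{i}_{\mathrm{NLL}} &= -\log P_1(y_i \mid x_i) - \log P_2(y_i \mid x_i) \\
\mathcal{L}^{i}_{\mathrm{KL}}  &= \tfrac{1}{2}\Big[ D_{\mathrm{KL}}\big(P_1(\cdot \mid x_i)\,\|\,P_2(\cdot \mid x_i)\big)
                                 + D_{\mathrm{KL}}\big(P_2(\cdot \mid x_i)\,\|\,P_1(\cdot \mid x_i)\big) \Big] \\
\mathcal{L}^{i} &= \mathcal{L}^{i}_{\mathrm{NLL}} + \alpha\, \mathcal{L}^{i}_{\mathrm{KL}}
\end{aligned}
```

With $\alpha = 1$ this matches the form `loss = nll1 + nll2 + kl_loss` used in the code in this post.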
Code implementation
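A simplified NumPy simulation that demonstrates the principle. The log_softmax, nll and kl helpers are illustrative additions so that the example runs.

import numpy as np


def train(p, x, w1, b1, w2, b2):
    # Training-mode forward pass of a simulated two-layer network with inverted dropout.
    # Works for a single sample of shape (d,) or a batch of shape (bs, d).
    layer1 = np.maximum(0, np.dot(x, w1.T) + b1)
    mask1 = np.random.binomial(1, 1 - p, layer1.shape)
    layer1 = layer1 * mask1 / (1 - p)  # rescale so the expected activation is unchanged
    layer2 = np.maximum(0, np.dot(layer1, w2.T) + b2)
    mask2 = np.random.binomial(1, 1 - p, layer2.shape)
    layer2 = layer2 * mask2 / (1 - p)
    return layer2


def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def nll(logits, label):
    # Mean negative log-likelihood of the integer class labels.
    logp = log_softmax(logits)
    return -logp[np.arange(len(label)), label].mean()


def kl(logits1, logits2):
    # Bidirectional (symmetric) KL divergence, averaged over the batch.
    logp, logq = log_softmax(logits1), log_softmax(logits2)
    kl_pq = (np.exp(logp) * (logp - logq)).sum(axis=-1)
    kl_qp = (np.exp(logq) * (logq - logp)).sum(axis=-1)
    return 0.5 * (kl_pq + kl_qp).mean()


def train_r_dropout(p, x, label, w1, b1, w2, b2):
    # R-Drop training step on a batch x of shape (bs, d); returns the loss.
    bs = x.shape[0]
    # Duplicate the batch so each sample passes through two different dropout masks.
    x = np.concatenate((x, x), axis=0)
    # ----------- the original Dropout part remains the same -----------
    # The layer-2 output is used directly as the logits here; the original
    # pseudocode passed it through an unspecified output head `func`.
    logits = train(p, x, w1, b1, w2, b2)
    # -------------------------------------------------------------------
    logits1, logits2 = logits[:bs], logits[bs:]
    nll1 = nll(logits1, label)
    nll2 = nll(logits2, label)
    kl_loss = kl(logits1, logits2)
    return nll1 + nll2 + kl_loss


def test(x, w1, b1, w2, b2):
    # Inference-mode forward pass: no dropout, the full network is used.
    layer1 = np.maximum(0, np.dot(x, w1.T) + b1)
    layer2 = np.maximum(0, np.dot(layer1, w2.T) + b2)
    return layer2


inputs = np.random.randn(5, 4)
w1 = np.random.rand(30, 20)
b1 = np.random.rand(30)
w2 = np.random.rand(40, 30)
b2 = np.random.rand(40)

output1 = train(p=0.5, x=inputs.reshape(-1), w1=w1, b1=b1, w2=w2, b2=b2)
print(output1)
output2 = test(x=inputs.reshape(-1), w1=w1, b1=b1, w2=w2, b2=b2)
print(output2)

# R-Drop loss on a random batch of 8 samples with 40 classes.
batch = np.random.randn(8, 20)
label = np.random.randint(0, 40, size=8)
loss = train_r_dropout(p=0.5, x=batch, label=label, w1=w1, b1=b1, w2=w2, b2=b2)
print(loss)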
Copyright notice
This article was written by [hxxjxw]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204220025138424.html
