Advantages, disadvantages and selection of activation function
2022-04-23 15:27:00 【moletop】
Activation function:
- Significance: the activation function gives the network its nonlinear modeling capacity. Without one, the network can only express a linear mapping; even with many hidden layers, the whole network is equivalent to a single-layer neural network.
- Desired properties: 1. continuous and differentiable; 2. as simple as possible, to keep the network computationally efficient; 3. an output range within a suitable interval, otherwise training efficiency and stability suffer.
- Saturating activation functions: Sigmoid, Tanh. Non-saturating activation functions: ReLU. For the output layer (classifier): softmax.
- Choosing an activation function: in hidden layers, ReLU > Tanh > Sigmoid. In RNNs: Tanh or Sigmoid. Output layer: softmax (for classification tasks). If neuron death occurs, PReLU can be used. (A sketch illustrating these choices follows this list.)
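As a concrete illustration of these selection guidelines, here is a minimal sketch of a classifier with ReLU hidden layers and a softmax output. The original post contains no code; PyTorch and all layer sizes here are my own assumptions, chosen only for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical architecture: sizes are illustrative, not from the article.
model = nn.Sequential(
    nn.Linear(784, 256),  # hidden layer 1
    nn.ReLU(),            # ReLU is the first choice in hidden layers
    nn.Linear(256, 64),   # hidden layer 2
    nn.ReLU(),
    nn.Linear(64, 10),    # logits for 10 classes
)

x = torch.randn(32, 784)                # a dummy input batch
probs = torch.softmax(model(x), dim=1)  # softmax on the output layer (classifier)
print(probs.shape, probs.sum(dim=1))    # each row sums to 1
```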
1. Sigmoid function
Advantages:
<1> The output range of Sigmoid is (0, 1), which matches a probability, and the function is monotonically increasing, so it is easier to optimize.
<2> The Sigmoid derivative is easy to obtain and can be written directly in terms of the output: σ'(x) = σ(x)(1 − σ(x)).
Disadvantages:
<1> The Sigmoid function converges slowly.
<2> Because Sigmoid saturates softly, it easily produces vanishing gradients, which makes it unsuitable for training deep networks.
<3> The Sigmoid output is not centered at 0; since every output is positive, it distorts the data distribution fed to the next layer.
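To make the derivative and the saturation behaviour concrete, here is a small NumPy sketch (my own addition, not from the original post):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # the derivative written in terms of the output

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x={x:5.1f}  sigmoid={sigmoid(x):.5f}  grad={sigmoid_grad(x):.2e}")
# The gradient peaks at 0.25 (at x = 0) and decays toward 0 as |x| grows:
# this soft saturation is what causes vanishing gradients in deep networks.
```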
2. Tanh function
Advantages: <1> The function output is centered at 0.
Disadvantages: <1> Tanh does not solve Sigmoid's vanishing gradient problem.
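A matching sketch for Tanh (again my own illustration): the output is zero-centered, but the gradient 1 − tanh²(x) still shrinks toward 0 for large |x|, so saturation remains.

```python
import numpy as np

for x in [0.0, 2.0, 5.0]:
    t = np.tanh(x)
    grad = 1.0 - t ** 2  # tanh'(x) = 1 - tanh(x)^2
    print(f"x={x:4.1f}  tanh={t:+.5f}  grad={grad:.2e}")
# tanh(-x) = -tanh(x), so outputs are centered at 0 (unlike Sigmoid),
# but the gradient still vanishes in the saturated tails.
```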
3. ReLU function
Advantages:
<1> With SGD, convergence is much faster than with Sigmoid or Tanh.
<2> It effectively alleviates the vanishing gradient problem.
Disadvantages:
<1> Neuron death can occur during training: for inputs on the negative half-axis the gradient is always 0, so the neuron dies irreversibly.
<2> The derivative is 1 on the positive half-axis, which alleviates vanishing gradients but makes gradient explosion easier.
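The "dying ReLU" problem described above can be shown in a few lines (a NumPy sketch of my own):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)  # 1 on the positive half-axis, 0 on the negative

x = np.array([-3.0, -0.5, 0.5, 3.0])
print(relu(x))       # -> 0.0, 0.0, 0.5, 3.0
print(relu_grad(x))  # -> 0.0, 0.0, 1.0, 1.0
# If a neuron's pre-activation stays negative for every input, its gradient
# is always 0 and its weights can never recover: the neuron has "died".
```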
4. ReLU improvements
As noted in the selection guidelines above, when neuron death occurs a variant such as PReLU can be used: it keeps a small non-zero slope on the negative half-axis, so the gradient there is never exactly 0 and dead neurons can recover.
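A minimal sketch of the idea behind PReLU (in PReLU proper the negative-side slope a is a learned parameter; here it is fixed purely for illustration):

```python
import numpy as np

def prelu(x, a=0.25):
    # Parametric ReLU: slope a on the negative half-axis instead of 0,
    # so the gradient for x < 0 is a rather than 0.
    return np.where(x > 0, x, a * x)

x = np.array([-3.0, -0.5, 0.5, 3.0])
print(prelu(x))  # -> -0.75, -0.125, 0.5, 3.0
```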