Deep learning parameter tuning tips
2022-04-23 15:27:00 【moletop】
How to tune parameters:
- Choose a suitable batch size.
- Choose a suitable number of epochs: watch how training converges and guard against overfitting.
- Decide whether to add batch normalization.
- Decide whether dropout is needed.
- Activation function choice: except in places such as gates, where the output must be limited to the range 0 to 1, try not to use sigmoid; use tanh, relu, or a similar activation instead. There are two reasons. 1. sigmoid only has a noticeable gradient on roughly the interval [-4, 4]; outside that range the gradient is close to 0, which easily leads to vanishing gradients. 2. Even when the input has zero mean, the output of sigmoid does not.
- Loss function: alternate between one round with regularization and one round without.
- Optimizer choice: on small datasets, adam, adadelta, and similar optimizers tend to give worse experimental results than sgd. sgd converges more slowly, but the result it finally converges to is generally better. If you use sgd, you can start with a learning rate of 1.0 or 0.1, check on the validation set after a while, and cut the learning rate in half if the cost is no longer decreasing. Many papers do this and report good results. You can also start with an ada-family optimizer and switch to sgd toward the end of training, which can bring a further improvement. Anecdotally, adadelta tends to work better on classification problems and adam on generation problems.
- Ensemble:
  - Same parameters, different initialization methods.
  - Different parameters: use cross-validation to pick the best few configurations. A detailed explanation of k-fold cross-validation: https://www.cnblogs.com/henuliulei/p/13686046.html
  - Same parameters, different stages of training, that is, models saved at different numbers of iterations.
  - Different models, combined by linear fusion, for example an RNN and a traditional model.
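
As a rough illustration of the linear fusion idea in the last ensemble item, the sketch below averages class probabilities from several models with optional weights. The function names (`fuse_predictions`, `predict_labels`) and the probability values are made up for this example; in practice the fusion weights would be chosen on a validation set.

```python
from typing import List, Optional, Sequence

# One row of class probabilities per sample.
Probabilities = Sequence[Sequence[float]]


def fuse_predictions(
    model_probs: Sequence[Probabilities],
    weights: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Linearly fuse class probabilities from several models.

    model_probs[m][i][c] is model m's probability of class c for sample i.
    Weights are normalized to sum to 1; equal weights are used by default.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    fused = [[0.0] * n_classes for _ in range(n_samples)]
    for weight, probs in zip(weights, model_probs):
        for i, row in enumerate(probs):
            for c, p in enumerate(row):
                fused[i][c] += (weight / total) * p
    return fused


def predict_labels(fused: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class with the highest fused probability for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Hypothetical outputs of an RNN and a traditional model on two samples.
rnn_probs = [[0.6, 0.4], [0.3, 0.7]]
traditional_probs = [[0.2, 0.8], [0.4, 0.6]]

fused = fuse_predictions([rnn_probs, traditional_probs])
labels = predict_labels(fused)
print(fused, labels)
```

The same function covers the other ensemble variants in the list: pass the outputs of models trained from different initializations, or of checkpoints saved at different iterations, instead of two different model types.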
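
The sgd advice in the list above (start from a learning rate such as 0.1, check the validation set periodically, and halve the learning rate when the cost stops decreasing) can be sketched as a small scheduler. This is a minimal, framework-free sketch: the class name `HalveOnPlateau`, the `min_lr` floor, and the validation costs are assumptions made for illustration, not something the original post specifies.

```python
class HalveOnPlateau:
    """Halve the learning rate whenever the validation cost fails to improve."""

    def __init__(self, initial_lr: float = 0.1, factor: float = 0.5,
                 min_lr: float = 1e-6) -> None:
        self.lr = initial_lr
        self.factor = factor      # 0.5 means "cut the learning rate in half"
        self.min_lr = min_lr      # assumed floor so the rate never reaches zero
        self.best_cost = float("inf")

    def step(self, val_cost: float) -> float:
        """Record one validation check and return the learning rate to use next."""
        if val_cost < self.best_cost:
            # The cost decreased: remember it and keep the current rate.
            self.best_cost = val_cost
        else:
            # No decrease since the best check so far: shrink the rate.
            self.lr = max(self.lr * self.factor, self.min_lr)
        return self.lr


# Made-up validation costs, one per periodic check.
val_costs = [1.0, 0.8, 0.8, 0.7, 0.75]

scheduler = HalveOnPlateau(initial_lr=0.1)
history = [scheduler.step(cost) for cost in val_costs]
print(history)
```

In a real training loop the returned rate would be written back into the optimizer after each validation check. Frameworks ship comparable schedulers, for example PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau` with `factor=0.5`.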
Copyright notice
This article was written by [moletop]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204231523160668.html