Deep Learning: Hyperparameter Settings
2022-04-23 15:08:00 【請叫我做雷锋】
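I. Overfitting
1. Definition: Given a hypothesis space H and a hypothesis h in H, h is said to overfit the training data if there exists another hypothesis h' in H such that h has a lower error rate than h' on the training examples, but h' has a lower error rate than h over the entire distribution of instances.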
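The definition can be made concrete with a toy sketch. Everything below is made up for illustration: the "true" rule, the single mislabeled training point, the memorizing hypothesis `h`, and the threshold hypothesis `h_prime`. A clean test set stands in for the "entire distribution of instances", which cannot be enumerated in practice.

```python
# Toy illustration: h beats h' on the training set, but h' beats h on fresh data.
# The data and both hypotheses are assumptions made for this sketch.

def true_label(x):
    """Underlying rule that generates clean labels."""
    return 1 if x >= 5 else 0

# Training set: integers 0..9, with one mislabeled (noisy) example at x = 2.
train = [(float(x), true_label(x)) for x in range(10)]
train[2] = (2.0, 1)  # label noise

# Test set: a finer grid over the same range, with clean labels.
test = [(x / 4, true_label(x / 4)) for x in range(40)]

def h(x):
    """Memorizing hypothesis: copy the label of the nearest training point."""
    nearest = min(train, key=lambda p: abs(p[0] - x))
    return nearest[1]

def h_prime(x):
    """Simpler hypothesis: a single threshold."""
    return 1 if x >= 5 else 0

def error_rate(hypothesis, data):
    """Fraction of examples in data that the hypothesis gets wrong."""
    return sum(hypothesis(x) != y for x, y in data) / len(data)

print("train error: h =", error_rate(h, train), " h' =", error_rate(h_prime, train))
print("test error:  h =", error_rate(h, test), " h' =", error_rate(h_prime, test))
```

Here `h` reproduces the noisy label and so has zero training error, while `h_prime` misses that one point. On the clean test grid the ordering flips, which is exactly the situation the definition describes.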
2. Intuitive explanation

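3. Common causes
The main causes are over-learning and unbalanced sample features. In more detail, they also include (this list is not exhaustive):
(1) The modeling samples are poorly chosen, for example through selection mistakes or wrong labels, so the selected data do not represent the intended classification rule.
(2) Noise in the samples is too strong, so the model learns the noise as if it were a feature, which distorts the intended classification rule.
(3) The assumed model is not reasonable, that is, the conditions under which its assumptions hold are not actually met.
(4) There are too many parameters and the model complexity is too high.
(5) For tree-based models, if depth and splits are not reasonably constrained, nodes may end up containing only event data or only non-event data. The tree then matches (fits) the training data perfectly but cannot adapt to other data sets.
(6) For neural network models: 1) the weights are trained for too many iterations (overtraining); 2) the BP algorithm may make the weights converge to an overly complex decision surface.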
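4. Solutions
-> Model side: for neural networks, add dropout and batch normalization; for tree-based models, limit the depth, add regularization terms, and so on; set early-stopping conditions.
-> Data side: enlarge the data set and apply augmentation to it.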
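Three of the remedies above can be sketched in a few lines of plain Python. This is a minimal illustration, not a framework implementation: the function names `dropout`, `early_stopping`, and `horizontal_flip` are hypothetical, the dropout variant shown is the common "inverted" form, and the validation losses in the usage lines are invented numbers.

```python
import random

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout on a list of activations.

    During training each value is zeroed with probability p_drop and the
    survivors are scaled by 1 / (1 - p_drop); at inference the input is
    returned unchanged.
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; best_epoch is where the lowest loss was seen.
    """
    best, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

def horizontal_flip(image):
    """Simple augmentation: mirror a 2-D list of pixels left to right."""
    return [row[::-1] for row in image]

# Usage with made-up numbers.
print(dropout([1.0, 2.0, 3.0, 4.0], p_drop=0.5, rng=random.Random(0)))
print(early_stopping([1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.1], patience=2))  # (4, 2)
print(horizontal_flip([[1, 2, 3], [4, 5, 6]]))  # [[3, 2, 1], [6, 5, 4]]
```

In the early-stopping call, the loss stops improving after epoch 2, so with a patience of 2 training halts at epoch 4 and the weights from epoch 2 would be the ones to keep.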
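II. Regularization
Prerequisite (gradient descent): https://zhuanlan.zhihu.com/p/113714840
1. Purpose of regularization: a term that accumulates the weights, added to improve the model's generalization.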
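As a rough sketch of what such a weight term does, the code below fits y ≈ w·x by gradient descent on mean squared error plus a penalty. The text does not say which penalty is meant, so the L2 form `lam * w**2` is an assumption here, as are the function name `fit_linear`, the data, and the hyperparameter values.

```python
def fit_linear(xs, ys, lam, lr=0.01, steps=2000):
    """Gradient descent on mean((w*x - y)**2) + lam * w**2 for a single weight w.

    lam is the regularization strength (an assumed L2 penalty); lam = 0
    recovers ordinary least squares.
    """
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the data-fit term (mean squared error).
        grad_mse = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Gradient of the penalty term lam * w**2.
        grad_penalty = 2 * lam * w
        w -= lr * (grad_mse + grad_penalty)
    return w

# Made-up data that lies exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(fit_linear(xs, ys, lam=0.0))  # close to 2.0
print(fit_linear(xs, ys, lam=2.5))  # close to 1.5, shrunk toward zero
```

For this data the minimizer has the closed form w = 15 / (7.5 + lam), so the penalty visibly pulls the weight toward zero. In a larger model the same pull discourages large weights and, with them, overly complex fits.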
Copyright notice
This article was written by [請叫我做雷锋]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/weixin_44646187/article/details/124341309