[Deep Learning] Note 2 - The accuracy of the model in the test set is greater than that in the training set
2022-08-11 11:57:00 【aaaafeng】
Preface
Activity address: CSDN 21-day Learning Challenge
Blogger homepage: Aaaafeng's homepage_CSDN
Keep the input coming, keep the output going! (quoting a friend of mine)
1. Description of the problem
While training a model, I noticed that its accuracy on the test set was actually higher than on the training set. But the model is trained by minimizing the loss on the training set, so it would normally be expected to perform better on the training set.
So, what caused the higher accuracy on the test set?
Model training results:
2. Fix the problem
2.1. Underfitting
Later I asked an expert for advice. She said: "Train for a few more epochs and see; the first few are still underfitting." I immediately thought: good suggestion!
Increase the number of training epochs:
Sure enough! As the number of training epochs increased, the model's accuracy gradually got back on track: accuracy on the training set once again exceeded accuracy on the test set.
2.2. Lag in mini-batch statistics
But I still had some doubts: why, in the underfitting state with only a few training epochs, does the model show higher accuracy on the test set? How are the two related?
One blog post gives part of an explanation that I find very reasonable and that matches the situation I ran into:
Training-set accuracy is computed after each batch, while validation-set accuracy is generally computed after each epoch. The model used for validation has already been trained on all of the epoch's batches, so the training figure lags behind it. In other words, validation is done with a model that has nearly finished training for that epoch, so naturally its accuracy is higher.
That is, the problem lies in how I was computing accuracy on the training set. If training-set accuracy is computed after each training epoch, rather than at the end of each mini-batch, the problem does not occur.
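To make the lag concrete, here is a minimal sketch of the running average that many training loops report during an epoch. The helper `running_accuracy` and the per-batch counts are hypothetical values chosen for illustration, not results from the model in this post.

```python
def running_accuracy(batch_results):
    """Return the running accuracy after each mini-batch.

    batch_results is a list of (num_correct, batch_size) pairs, one per
    mini-batch, each measured with the weights the model had at that moment.
    """
    correct = 0
    seen = 0
    history = []
    for num_correct, batch_size in batch_results:
        correct += num_correct
        seen += batch_size
        history.append(correct / seen)
    return history


# Hypothetical epoch: the model improves from 50% to 100% accuracy per batch.
batch_results = [(16, 32), (20, 32), (28, 32), (32, 32)]
history = running_accuracy(batch_results)

# Accuracy of the final weights on the last batch alone.
last_batch_acc = batch_results[-1][0] / batch_results[-1][1]

print(history[-1], last_batch_acc)  # 0.75 1.0
```

The figure reported at the end of the epoch (0.75) still contains the mistakes made by the early, weaker weights, even though the weights at the end of the epoch are already doing much better.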
Of course, talk alone is not enough; you have to try it. I checked my earlier model code and found that accuracy on the training set was indeed being computed after each mini-batch. So I tried computing training-set accuracy after each epoch instead.
Accuracy on the training set after each training epoch (train acc 2):
It is easy to see that even in the underfitting state, if training-set and test-set accuracy are computed in the same way, the model is still more accurate on the training set.
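The comparison can be reproduced on a toy problem. The sketch below is framework-free and is not the code used in this post: the one-parameter threshold "model", its update rule `train_step`, and the data are all assumptions made up for illustration. It logs the per-batch running accuracy (`train_acc`), the accuracy measured once after the epoch on the whole training set (`train_acc_2`), and the test accuracy.

```python
def predict(threshold, x):
    """Toy classifier: class 1 if x is above the threshold, else class 0."""
    return 1 if x > threshold else 0


def accuracy(threshold, samples):
    """Accuracy of one fixed threshold over a list of (x, label) pairs."""
    return sum(predict(threshold, x) == y for x, y in samples) / len(samples)


def train_step(threshold, batch, step=2.0):
    """Hypothetical update rule: shift the threshold against the batch errors."""
    error = sum(predict(threshold, x) - y for x, y in batch)
    return threshold + step * error


def fit(threshold, train_batches, test_samples, epochs):
    train_samples = [s for batch in train_batches for s in batch]
    log = []
    for _ in range(epochs):
        correct = seen = 0
        for batch in train_batches:
            # Counted before the update, with the weights of this moment.
            correct += sum(predict(threshold, x) == y for x, y in batch)
            seen += len(batch)
            threshold = train_step(threshold, batch)
        log.append({
            "train_acc": correct / seen,                        # per-batch running count
            "train_acc_2": accuracy(threshold, train_samples),  # counted after the epoch
            "test_acc": accuracy(threshold, test_samples),      # counted after the epoch
        })
    return log


# Made-up data: the true boundary is x = 0; one test point is deliberately awkward.
train_batches = [
    [(-3.5, 0), (0.5, 1)],
    [(-2.5, 0), (1.5, 1)],
    [(-1.5, 0), (2.5, 1)],
    [(-0.5, 0), (3.5, 1)],
]
test_samples = [(-2.0, 0), (-1.0, 0), (0.2, 0), (1.0, 1), (2.0, 1)]

log = fit(threshold=4.0, train_batches=train_batches,
          test_samples=test_samples, epochs=2)
for epoch, row in enumerate(log, start=1):
    print(epoch, row)
```

In the first epoch of this toy run, `train_acc` is 0.75, `test_acc` is 0.8 and `train_acc_2` is 1.0: the test set only looks better when the training figure is the per-batch running count. Measured with the same frozen weights, the training set comes out ahead.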
Summary
When you run into a problem, reading other people's ideas can bring a sudden moment of clarity. It does not pay to keep digging into a dead end on your own.