Understanding ML Cross Validation Fast
2022-08-09 04:20:00 【The whole stack O - Jay】
Cross-validation is a statistical method for evaluating the performance of a classifier (the model you train). The basic idea is to partition the original training data into a training set and a validation set. For a large data set, we generally divide the data into a training set, a validation set, and a test set in a 6:2:2 ratio (a simple machine learning workflow may omit the validation set). The training set is used to train the model, and the validation set is then used to evaluate the trained model and get a preliminary estimate of its performance. (Note: cross-validation is still part of the training stage, not the testing stage.)
Common cross-validation methods include simple cross-validation, K-fold cross-validation, leave-one-out cross-validation, and leave-P-out cross-validation.
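As a rough illustration of the 6:2:2 division, the sketch below shuffles a list of samples and cuts it into three parts. The helper name `split_622` and the fixed seed are assumptions made for this example, not something from a library.

```python
import random


def split_622(samples, seed=0):
    """Shuffle samples and split them 6:2:2 into train, validation and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n = len(shuffled)
    n_train = n * 6 // 10  # integer arithmetic avoids float rounding surprises
    n_val = n * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train, val, test = split_622(range(100))
print(len(train), len(val), len(test))
```

The test part is set aside and only touched once the model has been chosen using the training and validation parts.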
Simple cross-validation
This is the simplest form of the idea described above. The training data is divided once into a training set and a validation set. The model is trained on the training set and evaluated on the validation set, and the validation accuracy is taken as the performance indicator of the model.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=.4, random_state=0)
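A minimal end-to-end sketch of simple cross-validation might look like the following. The iris data set and the `LogisticRegression` classifier are assumptions chosen only to make the example runnable; any data and estimator could take their place.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumed example data: 150 iris samples with 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 40% of the data as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.4, random_state=0
)

# Train on the training set only, then score on the held-out validation set.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
val_accuracy = model.score(X_val, y_val)
print(f"validation accuracy: {val_accuracy:.3f}")
```

Because there is only one split, the result depends on which samples happen to land in the validation set; changing `random_state` can change the accuracy.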
K-Fold Cross Validation (K-Fold)
The training data is divided into K groups (usually of equal size). Each group is used once as the validation set, while the remaining K-1 groups together form the training set. This gives K rounds of training and validation and therefore K models; the validation accuracies of the K models are averaged to give the performance indicator of the model. K is usually set to 3 or more.
from sklearn.model_selection import KFold
kf = KFold(n_splits=10)  # K = 10
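The K rounds can be written out as an explicit loop, as in the sketch below. The iris data, the `LogisticRegression` classifier, and `shuffle=True` are assumptions for this example; shuffling is added because the iris samples are ordered by class, so unshuffled folds would be unbalanced.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)  # assumed example data

kf = KFold(n_splits=10, shuffle=True, random_state=0)

fold_accuracies = []
for train_idx, val_idx in kf.split(X):
    # A fresh model is trained in every round on the other K-1 folds.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    # The remaining fold serves as this round's validation set.
    fold_accuracies.append(model.score(X[val_idx], y[val_idx]))

# The averaged validation accuracy is the performance indicator.
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"{len(fold_accuracies)} folds, mean accuracy: {mean_accuracy:.3f}")
```

The same average can be obtained more compactly with `cross_val_score(model, X, y, cv=kf)`; the loop is shown here only to make each round visible.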
Leave One Out Cross Validation (LOO-CV)
Leave-one-out cross-validation is the K=N case of K-fold cross-validation: each validation set consists of exactly one sample. With N samples, N rounds of cross-validation are performed and N models are obtained, and the validation accuracies of the N models are averaged to give the performance indicator of the model.
from sklearn.model_selection import LeaveOneOut
loo = LeaveOneOut()
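As a sketch of how `LeaveOneOut` is typically used, it can be passed as the `cv` argument of `cross_val_score`. The iris data and the `KNeighborsClassifier` are assumptions picked to keep the N model fits fast.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # assumed example data, N = 150

loo = LeaveOneOut()

# One fit per sample: each round validates on a single held-out sample,
# so every individual score is either 0.0 (wrong) or 1.0 (right).
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=loo)

mean_accuracy = scores.mean()
print(f"{len(scores)} rounds, mean accuracy: {mean_accuracy:.3f}")
```

Since the number of fits grows with N, leave-one-out is usually practical only for small data sets or cheap models.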
Leave-P-Out Cross Validation (LPO-CV)
Leave-P-out cross-validation uses every possible group of P samples as a validation set, where P is chosen by the user, and trains on the remaining N-P samples. Unlike K-fold cross-validation, the validation sets overlap, so for N samples there are C(N, P) = N! / (P! (N-P)!) train/validation splits and the same number of models. The validation accuracies of these models are averaged to give the performance indicator of the model. With P=1 this reduces to leave-one-out cross-validation.
from sklearn.model_selection import LeavePOut
lpo = LeavePOut(p=5)  # P = 5
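To get a feel for how many rounds leave-P-out produces, the sketch below enumerates the splits for a tiny toy array. The six-sample array and P=2 are assumptions for illustration; `LeavePOut` generates one split for every size-P subset of the samples, so the count is the binomial coefficient C(N, P).

```python
from math import comb

import numpy as np
from sklearn.model_selection import LeavePOut

# Toy data (assumed for illustration): N = 6 samples, one feature each.
X = np.arange(6).reshape(-1, 1)

lpo = LeavePOut(p=2)
splits = list(lpo.split(X))  # list of (train indices, validation indices)

n_splits = lpo.get_n_splits(X)
print(n_splits, comb(6, 2))

# Show the first few splits: every validation set has exactly P = 2 samples.
for train_idx, val_idx in splits[:3]:
    print("train:", train_idx, "validation:", val_idx)
```

The count grows very quickly with N and P (for example C(100, 5) is over 75 million), so leave-P-out is normally used only on small data sets.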