Understanding ML Cross Validation Fast
2022-08-09 04:20:00 【The whole stack O - Jay】
Cross Validation is a statistical analysis method for evaluating the performance of a classifier (the model you train). The basic idea is to split the original training data into a training set and a validation set. For a large data set, we generally divide it into a training set, a validation set, and a test set in a 6:2:2 ratio (a simple machine learning workflow may omit the validation set). The training set is first used to train the model, and the validation set is then used to test the trained model and give a preliminary evaluation of its performance (note: cross-validation belongs to the training stage, not the testing stage).
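As a minimal sketch of the 6:2:2 split described above (the arrays data and target are placeholders for any feature matrix and label vector), the split can be produced by calling train_test_split twice:

from sklearn.model_selection import train_test_split

# First carve off 20% as the test set, then split the remaining 80% into
# 75% / 25%, which yields an overall 60% / 20% / 20% train / validation / test split.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    data, target, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)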
Common cross-validation methods include simple cross-validation, K-fold cross-validation, leave-one-out cross-validation, and leave-p-out cross-validation.
Simple cross-validation
This is the simplest version of the idea above: the training data is divided into a training set and a validation set, the model is trained on the training set and validated on the validation set, and the validation accuracy serves as the performance indicator of the model.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.4, random_state=0)
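A sketch of how this split is typically used, assuming the iris data as a stand-in for data/target and LogisticRegression as an illustrative estimator (neither appears in the original):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data, target = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(data, target, test_size=0.4, random_state=0)

model = LogisticRegression(max_iter=1000)  # any estimator would do
model.fit(X_train, y_train)                # train on the training split
print(model.score(X_val, y_val))           # validation accuracy as the performance indicator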
K-Fold Cross Validation (K-Fold)
The training data is divided into K groups (usually of equal size). Each subset is used as the validation set in turn while the remaining K-1 groups are combined into the training set, so K rounds of cross-validation are performed and K models are obtained. The average of the K validation accuracies is taken as the performance indicator of the model. Usually K is set to 3 or greater.
from sklearn.model_selection import KFold
kf = KFold(n_splits=10)  # K = 10
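A minimal sketch of running the K folds by hand, again assuming the iris data and LogisticRegression as illustrative choices; shuffle=True and random_state=0 are my additions, not part of the original snippet:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

data, target = load_iris(return_X_y=True)
kf = KFold(n_splits=10, shuffle=True, random_state=0)

scores = []
for train_idx, val_idx in kf.split(data):
    model = LogisticRegression(max_iter=1000)
    model.fit(data[train_idx], target[train_idx])                # train on K-1 folds
    scores.append(model.score(data[val_idx], target[val_idx]))   # validate on the held-out fold
print(np.mean(scores))  # average of the K validation accuracies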
Leave One Out Cross Validation (LOO-CV)
Leave-one-out cross validation is the special case of K-fold cross validation with K = N, i.e. each subset consists of a single sample. With N samples, N rounds of cross-validation are performed and N models are obtained, and the average of the N validation accuracies is taken as the performance indicator of the model.
from sklearn.model_selection import LeaveOneOut
loo = LeaveOneOut()
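A short sketch of using the splitter, again with the iris data and LogisticRegression as illustrative assumptions; cross_val_score simply runs the fit-and-score loop for every split:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

data, target = load_iris(return_X_y=True)
loo = LeaveOneOut()
# One split per sample: each model is validated on a single held-out point,
# so each score is 0 or 1 and the mean is the leave-one-out accuracy.
scores = cross_val_score(LogisticRegression(max_iter=1000), data, target, cv=loo)
print(scores.mean())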
Leave-P-Out Cross Validation (LPO-CV)
Leave-p-out cross validation generalizes leave-one-out: P is chosen by ourselves, each validation set consists of P of the N samples, and the remaining N-P samples form the training set. Every possible subset of P samples is held out exactly once, so C(N, P) rounds of cross-validation are performed and C(N, P) models are obtained, and the average of their validation accuracies is taken as the performance indicator of the model.
from sklearn.model_selection import LeavePOut
lpo = LeavePOut(p=5)  # P = 5
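Because the number of splits is C(N, P), it grows very quickly with N (with p=5 it is already enormous for even a modest dataset), so the sketch below uses a tiny toy array of N = 6 samples and p = 2 just to show the splitter; the array X and the p=2 choice are illustrative assumptions:

from math import comb
import numpy as np
from sklearn.model_selection import LeavePOut

X = np.arange(6).reshape(-1, 1)  # toy data with N = 6 samples
lpo = LeavePOut(p=2)             # every subset of P = 2 samples is held out once
print(lpo.get_n_splits(X), comb(6, 2))  # both print 15, i.e. C(N, P) splits
for train_idx, val_idx in lpo.split(X):
    pass  # fit and score a model on each split, then average, as with K-fold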