Fundamentals of Machine Learning Theory: Some Machine Learning Terminology
2022-04-23 18:39:00 【Capture bamboo shoots 123】
This post references the book *scikit-learn Machine Learning: Common Algorithm Principles and Programming Practice*.
Cost function (error)
The cost function measures how well the model fits the training samples.
The cost is the average error, over all training samples, between the values fitted by the model and the samples' true values.
The cost function expresses this cost as a function of the model parameters:
J_{train}(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^i)-t^i)^2
where h_{\theta}(x^i) is the model's predicted label for the i-th sample, and t^i is that sample's true label.
Training a model means finding the parameter values that make the cost function as small as possible.
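As a minimal sketch of the cost formula above (the data and the linear hypothesis h_θ(x) = θ·x are made up for illustration):

```python
import numpy as np

def cost(theta, x, t):
    # J(theta) = 1/(2m) * sum((h_theta(x_i) - t_i)^2), with h_theta(x) = theta * x
    m = len(x)
    predictions = theta * x
    return np.sum((predictions - t) ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
t = np.array([2.0, 4.0, 6.0])
print(cost(2.0, x, t))  # theta = 2 fits perfectly, so the cost is 0.0
print(cost(1.0, x, t))  # a worse theta gives a larger cost
```

Training amounts to searching for the theta that minimizes this function.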
Model accuracy
Several models can be fitted to the same data set (for example, a first-order polynomial, a second-order polynomial, ..., a higher-order polynomial). We usually want to pick the best-performing one, so how do we evaluate a model's performance?
We often use the cost on the test set as the metric: the smaller J_{test}(\theta) is, the smaller the error between the model's predictions and the samples' actual values, i.e., the better the prediction accuracy on new data.
J_{test}(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^i)-t^i)^2
In sklearn, an estimator's score(X, y) method is often used to evaluate model performance.
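For example (a small illustrative sketch with made-up data; note that for sklearn regressors, score returns the R² coefficient rather than the cost function itself, but the idea is the same: a single number summarizing fit quality):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

model = LinearRegression().fit(X, y)
# score returns R^2 for regressors (accuracy for classifiers); 1.0 is a perfect fit
print(model.score(X, y))
```

In practice you would call score on a held-out test set, not on the training data as in this toy example.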
Cross validation data sets
Given a data set we want to learn from, and several candidate models to choose between, we need to do three things:
1. Train the parameters of each candidate model
2. Select the best model among the candidates
3. Evaluate the prediction accuracy of the selected model
The main purpose of the test set is to measure the model's accuracy, and this requires data the model has never "seen". If step 2 uses the test data, then that data has been "seen". To solve this problem, we split the data set into three parts; the extra one is the cross-validation data set.
Often we do not use a separate cross-validation set, because for most data sets we already know which model to use.
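A common way to produce the three-way split described above is to apply sklearn's train_test_split twice (the 60/20/20 proportions here are just an illustrative choice):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# first split off the test set (20%), then split the remainder
# into training (75% of it) and cross-validation (25% of it)
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_cv, y_train, y_cv = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_cv), len(X_test))  # 6 2 2
```

The model is trained on X_train, model selection uses X_cv, and X_test is touched only once, at the very end, to report accuracy.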
The learning curve
Plot the cost-function (or score) values of the training set and the cross-validation set on the vertical axis against the training-set size on the horizontal axis.
Using the interface provided by sklearn to draw a learning curve:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import learning_curve, ShuffleSplit

def plot_learning_curve(estimator, x, y, cv=None, n_jobs=1, train_sizes=np.linspace(.1, 1.0, 5)):
    train_sizes, train_score, test_score = learning_curve(
        estimator, x, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)
    # mean and standard deviation of the scores across the cv folds
    train_score_mean = np.mean(train_score, axis=1)
    train_score_std = np.std(train_score, axis=1)
    test_score_mean = np.mean(test_score, axis=1)
    test_score_std = np.std(test_score, axis=1)
    plt.plot(train_sizes, train_score_mean, 'o-', c='r', label='training score')
    plt.plot(train_sizes, test_score_mean, 'o-', c='g', label='cross-validation score')
    plt.legend()
    return plt
What the learning curve shows: how the model's fit to the training data and its prediction accuracy on the cross-validation set change as the amount of training data increases.
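A self-contained sketch of calling learning_curve directly, without plotting (the synthetic regression data and the ShuffleSplit settings are illustrative choices):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve, ShuffleSplit

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + rng.normal(0, 0.5, size=100)

cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=0)
train_sizes, train_scores, test_scores = learning_curve(
    LinearRegression(), X, y, cv=cv, train_sizes=np.linspace(0.1, 1.0, 5))

print(train_sizes)        # absolute numbers of training samples used at each point
print(test_scores.shape)  # (5, 5): one row per size, one column per cv split
```

Averaging each row of train_scores and test_scores (as the plotting function above does) gives the two curves.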
Overfitting
The model fits the training samples very well, but its prediction accuracy on the cross-validation set (new data) is low.
Solutions
Get more training data
When overfitting has occurred, increasing the amount of data can effectively improve the model's performance.
Reduce the number of input features
Overfitting indicates that the model is, to some extent, too complex. We can therefore try reducing the number of input features, which lowers both the model's computational cost and its complexity.
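One way to reduce the number of input features is automatic feature selection; this sketch uses sklearn's SelectKBest on synthetic data (the data set and k=3 are illustrative assumptions, not from the original text):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

# 10 features, of which only 3 actually carry information about y
X, y = make_regression(n_samples=100, n_features=10, n_informative=3, random_state=0)

selector = SelectKBest(score_func=f_regression, k=3)
X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)  # (100, 3): only the 3 highest-scoring features remain
```

Domain knowledge (dropping features you know are irrelevant) works just as well and is often preferable.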
Underfitting
The model cannot fit the training samples well, and its prediction accuracy on the cross-validation set (new data) is also low.
Add valuable features
Underfitting indicates that the model is too simple; one possible reason is that there are too few input features. We can mine more new features from the original data.
Add polynomial features
Sometimes it is not easy to mine new features from the original data. In that case we can multiply some of the original features together, or square them, to form new features; this is equivalent to increasing the order of the model:
x_1,x_2\rightarrow x_1^2,x_2^2,x_1x_2
Copyright notice
This article was written by [Capture bamboo shoots 123]. When reprinting, please include the original link. Thanks.
https://yzsam.com/2022/04/202204231835454702.html