
Deep learning -- Summary of Feature Engineering

2022-04-23 19:25:00 Try not to lie flat

For machine learning, the general steps are:

Data collection — data cleaning — feature engineering — data modeling

As we know, feature engineering includes feature construction, feature extraction, and feature selection. In essence, feature engineering is the process of transforming raw data into the data a model is trained on.

Feature construction

Another blogger's explanation of normalization: https://zhuanlan.zhihu.com/p/424518359

      In feature construction, we start with a pile of raw, messy data. The first step is to normalize it so that it follows the distribution we want to work with. After normalization comes data preprocessing, in particular handling missing values, categorical features, and continuous features.
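To make these preprocessing steps concrete, here is a minimal sketch with pandas and scikit-learn; the toy DataFrame and its column names ("age", "city") are assumptions made up for illustration, not data from this post.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy data: "age" is a continuous feature (with a missing value), "city" is categorical.
df = pd.DataFrame({"age": [23.0, None, 35.0, 29.0],
                   "city": ["beijing", "shanghai", "beijing", "shenzhen"]})

# Missing values: fill the continuous column with its mean.
df["age"] = df["age"].fillna(df["age"].mean())

# Categorical feature: one-hot encode "city".
df = pd.get_dummies(df, columns=["city"])

# Continuous feature: standardize "age" to zero mean and unit variance.
df["age"] = StandardScaler().fit_transform(df[["age"]]).ravel()
print(df)
```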

Data normalization methods: min-max normalization and Z-score standardization.

So what is the biggest difference between them? Whether they change the distribution of the feature data.

Min-max normalization: changes the distribution of the feature data

Formula: X_norm = (X - min) / (max - min)

Z-score standardization: does not change the distribution of the feature data

Formula: X' = (X - μ) / σ

Min-max normalization:

  • A linear transformation maps the original data into the range [0, 1]; the result of the formula above is the normalized value, where X is the original value
  • This normalization method is better suited to cases where the values are fairly concentrated
  • Drawback: if max and min are unstable, the normalized results are also unstable, which makes downstream results unstable; empirical constants can be used in place of max and min
  • Application scenarios: when distance measures or covariance calculations are involved, or when the data does not follow a normal distribution, you can use this method or other normalization methods (excluding the Z-score method). For example, in image processing, converting an RGB image to grayscale limits its values to the range [0, 255]. (A small code sketch follows this list.)
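As a minimal sketch of the formula above, assuming a small made-up feature matrix X (the MinMaxScaler comment is just an equivalent library option, not something the post itself uses):

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [4.0, 400.0]])  # made-up feature matrix, one column per feature

# Min-max normalization per column: (X - min) / (max - min) maps each feature into [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Roughly equivalent with scikit-learn:
# from sklearn.preprocessing import MinMaxScaler
# X_norm = MinMaxScaler().fit_transform(X)
```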

Z-score standardization:

  • Here, μ and σ are the mean and standard deviation of the original data set
  • It normalizes the original data set into one with mean 0 and variance 1
  • This method assumes the original data is approximately Gaussian; otherwise the standardization works poorly
  • Application scenarios: in classification and clustering algorithms, when distance is used to measure similarity, or when PCA is used for dimensionality reduction, Z-score standardization performs better. (A small code sketch follows this list.)
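And the matching sketch for Z-score standardization, on the same made-up matrix (np.std here is the population standard deviation; some libraries use the sample version, which differs slightly):

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [4.0, 400.0]])  # made-up feature matrix

# Z-score standardization per column: (X - mean) / std gives mean 0 and variance 1.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_std = (X - mu) / sigma
```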

Feature extraction

      For feature extraction, we first looked at how data is partitioned: what the data sets are, and, given a pile of data, how to split it. There is also the important dimensionality-reduction method PCA; other methods exist, such as ICA, but since they are not on my final exam I won't record them in detail, haha.

      Data sets: training set, validation set, test set

  • Training set: the data used for training; it is used to adjust the model parameters and train the model weights, building the machine learning model
  • Validation set: data split off from the training set, used to check the model's performance during training and serve as an evaluation metric
  • Test set: new data that was not used for training, fed into the trained model to verify how good it is

      Split methods: hold-out method, K-fold cross validation

  • Hold-out method: split the data set into mutually exclusive sets, keeping the data distribution of the splits consistent
  • K-fold cross validation: split the data set into K mutually exclusive subsets of similar size, keeping their data distributions consistent (see the sketch after this list)
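A minimal scikit-learn sketch of both split methods; the toy X and y arrays and the 80/20 and K = 5 choices are assumptions for illustration only:

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold

X = np.arange(20).reshape(10, 2)  # made-up features
y = np.arange(10)                 # made-up labels

# Hold-out method: one mutually exclusive split, e.g. 80% train / 20% test.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# K-fold cross validation: K mutually exclusive folds of similar size;
# each fold takes a turn as the validation set.
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    X_tr, X_val = X[train_idx], X[val_idx]
    # train on (X_tr, y[train_idx]), evaluate on (X_val, y[val_idx]) ...
```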

      To turn the original data into features with clear physical or statistical meaning, new features need to be constructed; the methods usually used are PCA, ICA, LDA, and so on.

      So why do we need to reduce the feature dimensionality?

  • Eliminate noise
  • Data compression
  • Eliminate data redundancy
  • Improve the accuracy of the algorithm
  • Reduce the data to 2 or 3 dimensions so that it can still be visualized

     PCA (principal component analysis): transforms the coordinate axes to find the optimal subspace of the data distribution


  • Take the original data with shape (m, n); it lies in the n-dimensional space spanned by the original n feature vectors
  • Decide the dimensionality after reduction: K
  • Through some transformation (a matrix decomposition), find n new eigenvectors and the new n-dimensional space V*
  • Compute the values of the original data on the n new eigenvectors of the new feature space V*, i.e. map the data into the new space
  • Keep the top K most informative eigenvectors, drop the rest, and the n-dimensional space is reduced to K dimensions (a NumPy sketch follows this list)
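The steps above can be followed almost literally in NumPy. The sketch below is an illustration under the assumption of a random (m, n) = (100, 5) data matrix and K = 2; it is not the post's own code:

```python
import numpy as np

X = np.random.default_rng(0).normal(size=(100, 5))  # made-up data of shape (m, n)
K = 2                                               # target dimensionality

# Center the data, then decompose the covariance matrix of the n-dimensional space.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)              # eigh returns eigenvalues in ascending order

# Keep the K eigenvectors with the largest eigenvalues (the most informative directions).
top_k = eigvecs[:, np.argsort(eigvals)[::-1][:K]]

# Map the original data into the new K-dimensional space.
X_reduced = X_centered @ top_k                      # shape (m, K)
```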


      For feature selection, there are several approaches: filter, wrapper, and embedded methods (a general understanding is enough).
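Just for orientation, here is a minimal filter-style selection sketch with scikit-learn; the iris data, SelectKBest, the f_classif score and k = 2 are arbitrary choices for the example, not something the post prescribes:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Filter method: score each feature independently of any model, then keep the best 2.
X_selected = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)
print(X_selected.shape)  # (150, 2)
```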

      Finally, let's look at the difference between hyperparameters and parameters:

  • Hyperparameters: parameters set before training the model, chosen by hand, such as padding, stride, the k in k-means, network depth, the number and size of convolution kernels, and the learning rate
  • Parameters: values obtained through model training, such as the weight w and the bias b in wx + b (see the sketch below)
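A small illustration of the distinction using scikit-learn's KMeans; the iris data and k = 3 are arbitrary choices for the example:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)

# Hyperparameter: k (n_clusters) is chosen by hand before training.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Parameters: the cluster centers are learned from the data during training.
print(model.cluster_centers_)
```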













Copyright notice
This article was written by [Try not to lie flat]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/04/202204231859372120.html