Introduction to standardization, regularization and normalization
2022-04-23 20:31:00 【zjt597778912】
1. Standardization
Standardization formula (z-score):
$X = \frac{X - mean}{std}$
The calculation is applied to each attribute (each column) separately.
For each column, every value has the column mean subtracted and is then divided by the column standard deviation.
After scaling, the values in each column are centered around 0 (mean 0) with a variance of 1.
Implementation: sklearn.preprocessing.scale()
from sklearn import preprocessing
import numpy as np
X = np.linspace(1,9,9).reshape((3,3))
'''
X = [[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
'''
X = preprocessing.scale(X)
'''
X= [[-1.22474487 -1.22474487 -1.22474487]
[ 0. 0. 0. ]
[ 1.22474487 1.22474487 1.22474487]]
'''
Calculation
Standard deviation: subtract the mean from each number in the column, square the differences, sum them, divide by the number of values, and take the square root: $std = \sqrt{\frac{\sum (x_i - mean)^2}{n}}$
The mean of the first column is $\frac{1+4+7}{3}=4$
The standard deviation of the first column is $\sqrt{\frac{(1-4)^2+(4-4)^2+(7-4)^2}{3}}=\sqrt{6}$
The first number in the first column becomes $\frac{1-4}{\sqrt{6}}=-1.22474487$
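The hand calculation above can be checked with plain NumPy. This is a minimal sketch, not the library's internal code; the names `mean`, `std`, and `X_manual` are illustrative. It relies on `np.std` dividing by n by default (`ddof=0`), which matches the formula used here.

```python
import numpy as np
from sklearn import preprocessing

X = np.linspace(1, 9, 9).reshape((3, 3))

# Column-wise mean and population standard deviation (ddof=0, i.e. divide by n).
mean = X.mean(axis=0)
std = X.std(axis=0)

# z-score: subtract the column mean, then divide by the column standard deviation.
X_manual = (X - mean) / std

# Compare the manual result with sklearn's scale().
X_sklearn = preprocessing.scale(X)
print(np.allclose(X_manual, X_sklearn))
```

Note that a sample standard deviation (`ddof=1`) would give different numbers, so the choice of divisor matters when reproducing the output by hand.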
Implementation: sklearn.preprocessing.StandardScaler()
The algorithms that sklearn wraps as classes must be fitted with fit() before use; the fitted state is what the subsequent API calls rely on.
from sklearn import preprocessing
import numpy as np
X = np.linspace(1,9,9).reshape((3,3))
'''
X = [[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
'''
scaler = preprocessing.StandardScaler().fit(X)
X = scaler.transform(X)
'''
X= [[-1.22474487 -1.22474487 -1.22474487]
[ 0. 0. 0. ]
[ 1.22474487 1.22474487 1.22474487]]
'''
Simply put, fit() computes the inherent properties of the training set X, such as its mean, variance, maximum, and minimum.
Standardization, dimensionality reduction, normalization, and other operations are then carried out on the basis of fit().
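A short sketch of why fit() and transform() are separate steps: the statistics learned from the training data are stored on the fitted object and reused for any later data. The array `X_new` is a made-up sample used only for illustration; `mean_` and `scale_` are the attributes where StandardScaler keeps the per-column mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.linspace(1, 9, 9).reshape((3, 3))
X_new = np.array([[10.0, 11.0, 12.0]])  # hypothetical unseen sample

# fit() learns the per-column statistics from the training data only.
scaler = StandardScaler().fit(X_train)
print(scaler.mean_)   # per-column mean of X_train
print(scaler.scale_)  # per-column standard deviation of X_train

# transform() reuses the stored statistics; it does not recompute them from X_new.
X_new_scaled = scaler.transform(X_new)
print(X_new_scaled)
```

Fitting once and transforming later keeps new data on the same scale as the training data.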
2. Regularization
Regularization:
- Scale each sample to unit norm: compute the p-norm of each sample, then divide every element of that sample by the norm.
p-norm formula: $\|x\|_p = \sqrt[p]{\sum_i |x_i|^p}$
In general, the l1-norm (p=1) or the l2-norm (p=2) is used.
The operation applies to one sample, i.e. one row of data.
Implementation: sklearn.preprocessing.Normalizer()
from sklearn import preprocessing
import numpy as np
X = np.linspace(1,9,9).reshape((3,3))
'''
X = [[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
'''
normalizer = preprocessing.Normalizer().fit(X)
X = normalizer.transform(X)
'''
X= [[0.26726124 0.53452248 0.80178373]
[0.45584231 0.56980288 0.68376346]
[0.50257071 0.57436653 0.64616234]]
'''
Calculation
- The default is the l2-norm.
2-norm of the first row: $\sqrt{1^2+2^2+3^2}=\sqrt{14}$
First number in the first row: $\frac{1}{\sqrt{14}}=0.26726124$
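The row-wise calculation can be reproduced with NumPy for both the l1 and l2 cases. This is a sketch under the p-norm definition given earlier; `normalize_rows` is a hypothetical helper written for this example, not a sklearn function, and it assumes no row is all zeros.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.linspace(1, 9, 9).reshape((3, 3))

def normalize_rows(X, p):
    """Divide each row by its p-norm (illustrative helper, assumes non-zero rows)."""
    norms = np.sum(np.abs(X) ** p, axis=1, keepdims=True) ** (1.0 / p)
    return X / norms

X_l2 = normalize_rows(X, 2)
X_l1 = normalize_rows(X, 1)

# Compare with sklearn's Normalizer for both norms.
print(np.allclose(X_l2, Normalizer(norm="l2").fit_transform(X)))
print(np.allclose(X_l1, Normalizer(norm="l1").fit_transform(X)))
```

With the l2-norm every row ends up with Euclidean length 1; with the l1-norm the absolute values in every row sum to 1.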
3. Normalization
- Scale each attribute to a specified range.
The common min-max normalization is also called deviation standardization:
$X=\frac{X-min}{max-min}$
The operation applies to one attribute, i.e. one column of data.
Implementation: sklearn.preprocessing.MinMaxScaler()
from sklearn import preprocessing
import numpy as np
X = np.linspace(1,9,9).reshape((3,3))
'''
X = [[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]
'''
min_max_scaler = preprocessing.MinMaxScaler().fit(X)
X = min_max_scaler.transform(X)
'''
X= [[0. 0. 0. ]
[0.5 0.5 0.5]
[1. 1. 1. ]]
'''
Calculation
First number in the first column: $\frac{1-1}{7-1}=0$
Second number in the first column: $\frac{4-1}{7-1}=0.5$
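The same column-wise arithmetic can be written out in NumPy, including the case of a target range other than [0, 1]. This is a minimal sketch; `col_min`, `col_max`, `lo`, and `hi` are illustrative names, and the [-1, 1] range is just an example value for MinMaxScaler's `feature_range` parameter. It assumes no column is constant (max equal to min).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.linspace(1, 9, 9).reshape((3, 3))

# Per-column minimum and maximum.
col_min = X.min(axis=0)
col_max = X.max(axis=0)

# Min-max scaling to [0, 1].
X_manual = (X - col_min) / (col_max - col_min)
print(np.allclose(X_manual, MinMaxScaler().fit_transform(X)))

# Rescale the [0, 1] result to another range, here [-1, 1].
lo, hi = -1.0, 1.0
X_range = X_manual * (hi - lo) + lo
print(np.allclose(X_range, MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)))
```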
Copyright notice
This article was written by [zjt597778912]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204210550240962.html