
4.1 - Support Vector Machines

2022-08-11 07:47:00 · A big boa constrictor 6666


Recall the binary classification problem from the previous chapter:

  • Because the ideal loss function (the yellow curve in the figure on the right) is a step function, it cannot be minimized by gradient descent, so we approximate it, replacing $\delta$ with a surrogate loss $l$. Many different functions can be used for $l$, for example: square error, Sigmoid + square error, Sigmoid + cross entropy, and hinge loss (compared in the sketch below).

[Figures: the ideal (0/1) loss and the candidate surrogate loss functions, plotted against the margin $\hat y f(x)$]
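To make the comparison concrete, here is a minimal sketch (mine, not from the original post) that evaluates each of the surrogate losses above as a function of the margin $\hat y f(x)$, with labels $\hat y \in \{-1, +1\}$; the function names are my own.

```python
import numpy as np

def ideal_loss(m):          # 0/1 loss: a step function with no useful gradient
    return (m < 0).astype(float)

def square_loss(m):         # (y_hat * f(x) - 1)^2
    return (m - 1.0) ** 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_square_loss(m): # (sigma(y_hat * f(x)) - 1)^2
    return (sigmoid(m) - 1.0) ** 2

def sigmoid_ce_loss(m):     # Sigmoid + cross entropy: ln(1 + exp(-y_hat * f(x)))
    return np.log(1.0 + np.exp(-m))

def hinge_loss(m):          # max(0, 1 - y_hat * f(x))
    return np.maximum(0.0, 1.0 - m)

m = np.linspace(-3, 3, 7)   # a few sample margins
for name, fn in [("ideal", ideal_loss), ("square", square_loss),
                 ("sigmoid+square", sigmoid_square_loss),
                 ("sigmoid+CE", sigmoid_ce_loss), ("hinge", hinge_loss)]:
    print(f"{name:15s}", np.round(fn(m), 3))
```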

  • When there are outliers in the data, hinge loss often performs better than cross entropy (illustrated numerically below).

[Figures: effect of outliers on hinge loss vs. cross entropy]
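A hedged numeric illustration of this claim (the margin values are made up): once a point's margin reaches 1, its hinge loss is exactly 0 and the point stops influencing training, whereas cross entropy is never exactly 0 and keeps pulling on every point.

```python
import numpy as np

margins = np.array([-5.0, 0.0, 1.0, 3.0])   # -5.0 plays the outlier
hinge = np.maximum(0.0, 1.0 - margins)
ce    = np.log(1.0 + np.exp(-margins))
print("margin :", margins)
print("hinge  :", hinge)   # [6.    1.    0.    0.   ] -> easy points cost nothing
print("CE     :", ce)      # [5.007 0.693 0.313 0.049] -> never exactly zero
```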

1. Hinge Loss

  • Following the derivation in the figures below, the SVM loss function can be minimized by gradient descent; the derivation then rewrites this formulation into the common textbook form of the SVM. A minimal training sketch follows the figures.

[Figures: deriving gradient descent for the hinge loss, and rewriting the formulation into the standard SVM form]
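As a rough sketch of that idea (my own toy code, not the post's), a linear SVM can be trained by (sub)gradient descent on the average hinge loss plus an L2 penalty; the data, learning rate, and regularization strength below are all arbitrary.

```python
import numpy as np

# L(w, b) = (1/N) * sum_n max(0, 1 - y_n (w.x_n + b)) + lam * ||w||^2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)   # toy separable labels

w, b, lam, lr = np.zeros(2), 0.0, 0.01, 0.1
for _ in range(200):
    margins = y * (X @ w + b)
    active = margins < 1   # only points inside the margin contribute a subgradient
    grad_w = -(y[active, None] * X[active]).sum(axis=0) / len(X) + 2 * lam * w
    grad_b = -y[active].sum() / len(X)
    w -= lr * grad_w
    b -= lr * grad_b

print("train accuracy:", np.mean(np.sign(X @ w + b) == y))
```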

2. Kernel Method

  • Dual representation: in the SVM, $\alpha_n^*$ may be sparse, meaning that $\alpha_n^* = 0$ for some $x^n$; the $x^n$ with $\alpha_n^* \neq 0$ are the support vectors. These nonzero points alone determine the final model, which is also why outliers in the data have a hard time affecting the SVM (see the sparsity check after the figure).
  • Kernel function: $K(x^n, x)$ in the right figure is the kernel function, i.e. the inner product of $x^n$ and $x$.

[Figure: the dual representation and the kernel function]
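A quick way to see this sparsity, assuming scikit-learn is available (the dataset and the choice of C are arbitrary): after fitting, only a small subset of training points carries a nonzero coefficient.

```python
import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("training points :", len(X))
print("support vectors :", len(clf.support_))     # typically far fewer than 200
print("dual coef shape :", clf.dual_coef_.shape)  # the nonzero alpha_n * y_n
```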

  • Kernel trick: when the loss function can be written in the form shown by the blue line in the left figure, we only need to compute $K(x^{n'}, x^n)$, without ever needing to know the vector $x$ explicitly. This is the benefit of the kernel method: it applies not only to the SVM but also to linear regression and logistic regression.
  • The derivation in the right figure shows that taking the inner product of $x$ and $z$ after a feature transformation is very tedious; with the kernel trick we skip the transformation and simply square the plain inner product of $x$ and $z$ (verified numerically below).

[Figures: the kernel trick, and expanding $(x \cdot z)^2$ into an explicit feature transform]
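A small numeric check of this point for 2-dimensional vectors: squaring the plain inner product gives the same number as explicitly transforming with $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$ and then taking the inner product.

```python
import numpy as np

def phi(v):
    # explicit feature transform whose inner product matches (x.z)^2 in 2-D
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print((x @ z) ** 2)        # kernel trick: 1.0
print(phi(x) @ phi(z))     # explicit transform: also 1.0
```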

2.1 Radial Basis Function (RBF) Kernel

  • The more similar $x$ and $z$ are, the larger the kernel value: if $x = z$ the value is 1, and if $x$ and $z$ are completely different the value approaches 0.
  • From the derivation in the figure below it is easy to see that the RBF kernel implicitly operates in an infinite-dimensional feature space, so the model's capacity is very high and it overfits very easily. (A numeric check of the two properties above follows the figure.)
[Figure: expanding the RBF kernel into an infinite-dimensional feature transform]
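Both properties are easy to verify numerically; the sketch below assumes the kernel form $K(x,z)=\exp(-\tfrac{1}{2}\lVert x-z\rVert^2)$ used in the derivation.

```python
import numpy as np

def rbf(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2), with gamma = 1/2
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
print(rbf(x, x))           # identical inputs -> 1.0
print(rbf(x, x + 100.0))   # very different inputs -> ~0.0
```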

2.2 Sigmoid Kernel

  • As the left figure shows, applying the sigmoid kernel yields a network with a single hidden layer, in which the weights of each neuron are one data point and the number of neurons equals the number of support vectors (sketched after the figures).
  • The right figure explains how to directly design a kernel function $K(x, z)$ to replace $\Phi(x)$ and $\Phi(z)$, and how to use Mercer's theorem to check whether a proposed kernel function is valid.

[Figures: the sigmoid kernel as a one-hidden-layer network, and designing kernels via Mercer's theorem]
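A minimal sketch of this network interpretation, taking the sigmoid kernel to be $K(x,z)=\tanh(x \cdot z)$ (a simplified form; the support vectors and weights below are made-up numbers): $f(x) = \sum_n \alpha_n K(x^n, x)$ is a one-hidden-layer network whose $n$-th neuron has weights $x^n$.

```python
import numpy as np

support_vectors = np.array([[1.0, 0.5], [-0.3, 2.0]])  # one row per hidden neuron
alpha = np.array([0.7, -0.4])                          # output-layer weights

def f(x):
    hidden = np.tanh(support_vectors @ x)  # each neuron computes tanh(x_n . x)
    return alpha @ hidden                  # weighted sum of hidden activations

print(f(np.array([0.2, -1.0])))
```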

3. SVM-related Methods

  • SVR (Support Vector Regression): when the difference between the predicted value and the true value falls within a certain range, the loss is 0 (see the sketch at the end of this section).

  • Ranking SVM: used when the objects under consideration form a ranked list.

  • One-class SVM: it tries to put all positive examples in one class, with the negative examples scattered elsewhere.

  • The figure below summarizes the similarities between the SVM and deep learning.

[Figure: similarities between the SVM and deep learning]
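As referenced in the SVR bullet above, here is a minimal sketch of the epsilon-insensitive loss behind that idea (the value $\epsilon = 0.5$ and the predictions are assumptions for illustration): errors smaller than $\epsilon$ cost nothing.

```python
import numpy as np

def eps_insensitive_loss(pred, target, eps=0.5):
    # zero loss whenever |pred - target| <= eps, linear beyond that
    return np.maximum(0.0, np.abs(pred - target) - eps)

pred   = np.array([1.0, 1.4, 3.0])
target = np.array([1.2, 1.0, 1.0])
print(eps_insensitive_loss(pred, target))   # [0.  0.  1.5]
```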

Copyright notice: this article was created by [A big boa constrictor 6666]; when reposting, please include the original link: https://yzsam.com/2022/223/202208110650014890.html