SVM Support Vector Machine - Application of MATLAB in Mathematical Modeling

2022-08-09 17:24:00 【YuNlear】

Data Modeling and MATLAB Implementation (Part 3)

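As information technology develops and matures, every industry accumulates more and more data. Data modeling methods are therefore needed to extract useful information from large volumes of seemingly disordered data.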

Support Vector Machine (SVM)

A support vector machine (SVM) is a learning method built on statistical learning theory. It works in a supervised manner: it learns from a labeled training set how to separate the samples into classes, and it can then predict the class of new data points.

Basic idea of SVM

The goal of an SVM is to construct a hyperplane that separates the two classes and makes the gap between them as large as possible. Separating the two classes by a large margin is intended to minimize the expected generalization error; that is, when a new sample appears, the probability of misclassifying it is as small as possible.

In general, a separator that lies midway between the two classes has the smallest probability of misclassification. An SVM therefore uses a hyperplane $\omega^Tx+b=0$ lying between the two classes as the separator, with $\omega^Tx+b=\alpha$ and $\omega^Tx+b=-\alpha$ as two parallel boundary planes. A boundary plane is a hyperplane that is parallel to the separating hyperplane and passes through at least one point of the data set. There are many possible choices of boundary planes; to make the separation as reliable as possible, the distance between the two boundary planes, i.e. the margin, should be maximized. "Learning with an SVM" means finding the hyperplane that maximizes this margin.

Theoretical foundations of SVM

First, suppose a training set of $n$ samples $\{(x_i,y_i),i=1,2,\cdots,n\}$ consists of two classes. If $x_i$ belongs to the first class, set $y_i=1$; if $x_i$ belongs to the second class, set $y_i=-1$.

Suppose there is a separating hyperplane:
$\omega^Tx+b=0$
that divides the samples correctly into the two classes, so that all samples of the same class fall on the same side of the hyperplane. The samples then satisfy
$\left\{ \begin{aligned} \omega^Tx_i+b&\ge\alpha \ \ \ y_i=1\\ \omega^Tx_i+b&\le-\alpha \ \ \ y_i=-1\\ \alpha&>0 \end{aligned} \right.$
Dividing both sides by $\alpha$ and rescaling $\omega$ and $b$ accordingly, this can be written as
$\left\{ \begin{aligned} \omega^Tx_i+b&\ge1 \ \ \ y_i=1\\ \omega^Tx_i+b&\le-1 \ \ \ y_i=-1 \end{aligned} \right.$
The two cases can be combined into the single condition
$y_i(\omega^Tx_i+b)\ge1$
The distance between the two boundary planes, i.e. the margin, is
$\frac{2}{||\omega||}$
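The margin expression can be checked with a short derivation. Take any point $x_-$ on the plane $\omega^Tx+b=-1$ and move a distance $t$ along the unit normal $\omega/||\omega||$ until the plane $\omega^Tx+b=1$ is reached (the symbols $x_-$, $x_+$ and $t$ are introduced here only for illustration):
$\begin{aligned} x_+&=x_-+t\frac{\omega}{||\omega||}\\ \omega^Tx_++b&=(\omega^Tx_-+b)+t\frac{\omega^T\omega}{||\omega||}=-1+t||\omega||=1\\ t&=\frac{2}{||\omega||} \end{aligned}$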
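The optimization problem can then be stated as
$max:\frac{2}{||\omega||}$
Taking the reciprocal turns this into a minimization:
$min:\frac{||\omega||}{2}$
Because minimizing $||\omega||$ is equivalent to minimizing $||\omega||^2$, which gives a quadratic objective, the final optimization problem is:
$min:||\omega||^2\\ s.t.\ \ \ \ y_i(\omega^Tx_i+b)\ge1,\ \ i=1,2,\cdots,n$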
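As a small numerical illustration of the constraint and the margin, the sketch below evaluates $y_i(\omega^Tx_i+b)$ for a hand-made two-dimensional data set and a candidate hyperplane. The data, the values of `w` and `b`, and all variable names are assumptions chosen for illustration; nothing here is taken from the original program.

```matlab
% Toy data (illustrative): rows of X are samples, y holds the labels.
X = [2 2; 3 3; 0 0; -1 0];
y = [1; 1; -1; -1];

% Hypothetical candidate hyperplane w'*x + b = 0.
w = [0.5; 0.5];
b = -1;

g = y .* (X*w + b);                     % y_i*(w'*x_i + b) for every sample
feasible = all(g >= 1 - 1e-12);         % are all constraints satisfied?
margin = 2 / norm(w);                   % distance between the boundary planes
onBoundary = find(abs(g - 1) < 1e-12);  % samples lying on a boundary plane

fprintf('feasible = %d, margin = %.4f\n', feasible, margin);
```

The samples listed in `onBoundary` are the ones that lie exactly on a boundary plane for this candidate; for the optimal hyperplane, such samples are the support vectors.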
Finally, Lagrange duality is used to transform the problem into its dual, which is solved by quadratic programming to obtain the optimal $\omega^*$ and $b^*$ and to construct the optimal classification function $f(x)$.
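For reference, the standard form of this dual can be sketched as follows. The sketch uses the conventional scaling $\frac{1}{2}||\omega||^2$ of the objective, which has the same minimizer, and writes the Lagrange multipliers as $\lambda_i$ to avoid a clash with the $\alpha$ used earlier; both choices are notational assumptions, not part of the original text.
$\begin{aligned} L(\omega,b,\lambda)&=\frac{1}{2}||\omega||^2-\sum_{i=1}^n\lambda_i\left[y_i(\omega^Tx_i+b)-1\right],\ \ \lambda_i\ge0\\ \frac{\partial L}{\partial\omega}=0&\Rightarrow\omega=\sum_{i=1}^n\lambda_iy_ix_i,\ \ \ \frac{\partial L}{\partial b}=0\Rightarrow\sum_{i=1}^n\lambda_iy_i=0\\ max:&\ \sum_{i=1}^n\lambda_i-\frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n\lambda_i\lambda_jy_iy_jx_i^Tx_j\\ s.t.&\ \ \lambda_i\ge0,\ \ \sum_{i=1}^n\lambda_iy_i=0 \end{aligned}$
With the optimal multipliers $\lambda_i^*$, one obtains $\omega^*=\sum_i\lambda_i^*y_ix_i$ and $b^*=y_k-\omega^{*T}x_k$ for any sample $k$ with $\lambda_k^*>0$, and the classification function is $f(x)=sign(\omega^{*T}x+b^*)$.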

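If the data are not linearly separable in the input space, the support vector machine uses a nonlinear mapping $\phi:R^n\rightarrow F$ to map the data into a dot product space $F$ and then runs the linear algorithm above in that space. In practice the mapping is applied implicitly through a function $K(x_i,x_j)=\phi(x_i)^T\phi(x_j)$ that computes the dot products in $F$ directly from the inputs; in the literature, this function is called the "kernel function".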
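To make the kernel idea concrete, the sketch below compares a degree-2 polynomial kernel with the dot product of an explicit feature map for two-dimensional inputs. The choice of kernel, the map `phi`, and the sample vectors are assumptions picked for illustration; the article itself does not specify a kernel.

```matlab
% Two illustrative input vectors in R^2.
x = [1; 2];
z = [3; -1];

% Explicit feature map into R^3 for the degree-2 polynomial kernel.
phi = @(v) [v(1)^2; sqrt(2)*v(1)*v(2); v(2)^2];

% Kernel function: evaluated directly in the input space.
K = @(u, v) (u' * v)^2;

lhs = K(x, z);             % no mapping needed
rhs = phi(x)' * phi(z);    % dot product after mapping into F

fprintf('K(x,z) = %g, phi(x)''*phi(z) = %g\n', lhs, rhs);
```

Both quantities agree up to rounding, which is why the linear algorithm can run in $F$ without ever forming $\phi(x)$ explicitly.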

MATLAB program design for SVM

The MATLAB program for the support vector machine, SVM.m, is as follows:

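function [x,W,R]=SVM(X0)
% SVM  Linear classifier with slack variables, solved as a linear program.
%   X0: m-by-4 matrix; columns 1-3 are features, column 4 is the label (+1/-1).
%   x : LP solution [w1; w2; w3; c; xi_1; ...; xi_m]
%   W : 1-by-3 weight vector
%   R : m-by-2 matrix [decision value, predicted label] for the training samples
% Note: the objective minimizes only the total slack sum(xi), not ||w||,
% so this is an LP formulation rather than the quadratic program derived above.
% Requires linprog (Optimization Toolbox).

% Standardize each feature column to zero mean and unit standard deviation.
X=zeros(size(X0,1),3);
for i=1:3
    X(:,i)=(X0(:,i)-mean(X0(:,i)))/std(X0(:,i));
end
m=size(X,1);
e=ones(m,1);
D=X0(:,4);                 % class labels
B=eye(m);                  % coefficients of the slack variables
% Constraints D_i*(X_i*W'-c)+xi_i>=1, rewritten in the form A*x<=b.
A=[-X(:,1).*D,-X(:,2).*D,-X(:,3).*D,D,-B];
b=-e;
f=[0,0,0,0,ones(1,m)];     % minimize the sum of the slack variables
lb=[-inf,-inf,-inf,-inf,zeros(1,m)]';   % w and c are free, xi>=0
x=linprog(f,A,b,[],[],lb);
W=[x(1,1),x(2,1),x(3,1)];
CC=x(4,1);                 % offset c
R1=X*W'-CC;                % decision values
R2=sign(R1);               % predicted labels
R=[R1,R2];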
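SVM.m standardizes the training features internally, and its output `R` covers only the training samples. To classify new samples, the same column means and standard deviations would have to be applied first. The sketch below shows one way this could look; the training values, the new samples, and the parameters `W` and `CC` (standing in for `W` and `x(4)` returned by SVM.m) are hypothetical numbers chosen for illustration, not results of an actual run.

```matlab
% Training features only (illustrative values); labels are not needed here.
Xtrain = [1 2 3; 2 1 4; 6 7 8; 7 6 9];
mu = mean(Xtrain);         % per-column means of the training data
sigma = std(Xtrain);       % per-column standard deviations

% Hypothetical trained parameters, standing in for W and x(4) from SVM.m.
W = [1, 1, 1];
CC = 0;

% New samples to classify (illustrative values).
Xnew = [1.5 1.5 3.5; 6.5 6.5 8.5];
k = size(Xnew, 1);

% Standardize with the TRAINING statistics, not those of the new data.
Znew = (Xnew - repmat(mu, k, 1)) ./ repmat(sigma, k, 1);

score = Znew*W' - CC;      % decision values, same form as R1 in SVM.m
label = sign(score);       % predicted class labels
disp([score, label]);
```

Reusing the training statistics keeps the new samples in the same coordinate system as the one in which `W` and the offset were fitted.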

Copyright notice
This article was written by [YuNlear]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/221/202208091453041475.html