
16、 Anomaly detection

2022-04-23 02:26:00 【Dragon Fly】

List of articles

1、 Introduction to anomaly detection
2、 Anomaly detection algorithm
3、 Evaluate anomaly detection algorithms
- 3.1 Use anomaly detection or supervised learning
4、 Handling the feature vectors for anomaly detection
5、 Multivariate Gaussian distribution

1、 Introduction to anomaly detection

$\qquad$ Anomaly detection starts from an unlabeled data set $\{x^{(1)}, x^{(2)}, ..., x^{(m)}\}$ and fits a model $p(x)$ to it. The model measures how similar a given sample is to the bulk of the data set (that is, the probability that the sample falls in the central region of the data). If a sample $x_{test}$ is very similar to most of the given data, then $p(x_{test})\geq\epsilon$ and the sample shows no obvious abnormality; otherwise $p(x_{test})<\epsilon$, which means $x_{test}$ differs from most of the data and may be an anomaly.
$\qquad$ In other words, a sample that falls outside the range of most of the data has a relatively small probability under the model, so it is treated as anomalous.

2、 Anomaly detection algorithm

$\qquad$ First, from the $m$ training samples, choose $n$ features $x_j$, $j \in \{1,...,n\}$, that may indicate anomalies. Then compute the mean and variance of each feature over all samples: $\mu_j=\frac{1}{m}\sum_{i=1}^{m}x_j^{(i)}$ and $\sigma_j^2=\frac{1}{m}\sum_{i=1}^{m}(x_j^{(i)}-\mu_j)^2$. For a new sample $x$, compute $p(x)=\prod_{j=1}^{n}p(x_j;\mu_j,\sigma_j^2)=\prod_{j=1}^{n}\frac{1}{\sqrt{2\pi}\sigma_j}\exp\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)$; if $p(x)<\epsilon$, the sample is judged to be anomalous.
$\qquad$ With two-dimensional features, $p(x)$ can be pictured as a surface over the feature plane: it is highest near $(\mu_1,\mu_2)$, and samples in the low region where $p(x)<\epsilon$ are flagged as anomalous.
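$\qquad$ The algorithm in this section can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names (`fit_gaussian`, `density`, `is_anomaly`), the toy data, and the threshold value are all assumptions made for the example.

```python
import math

def fit_gaussian(samples):
    """Estimate the per-feature mean and variance from a list of feature vectors."""
    m = len(samples)
    n = len(samples[0])
    mu = [sum(x[j] for x in samples) / m for j in range(n)]
    # Variance with 1/m normalization, matching the formula in the text.
    sigma2 = [sum((x[j] - mu[j]) ** 2 for x in samples) / m for j in range(n)]
    return mu, sigma2

def density(x, mu, sigma2):
    """p(x) as the product of independent univariate Gaussian densities."""
    p = 1.0
    for x_j, mu_j, s2_j in zip(x, mu, sigma2):
        p *= math.exp(-((x_j - mu_j) ** 2) / (2 * s2_j)) / math.sqrt(2 * math.pi * s2_j)
    return p

def is_anomaly(x, mu, sigma2, epsilon):
    """Flag x as anomalous when p(x) < epsilon."""
    return density(x, mu, sigma2) < epsilon

# Hypothetical toy data: five samples with two features each.
train = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.0], [0.8, 2.2]]
mu, sigma2 = fit_gaussian(train)
epsilon = 1e-3  # illustrative threshold; in practice it is tuned on a CV set

print(is_anomaly([1.0, 2.0], mu, sigma2, epsilon))   # False: close to the training data
print(is_anomaly([5.0, -3.0], mu, sigma2, epsilon))  # True: far from the training data
```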

3、 Evaluate anomaly detection algorithms

$\qquad$ The evaluation method for an anomaly detection algorithm can be illustrated with the example of detecting anomalous aircraft engines.
$\qquad$ First, split the data into a training set, a cross-validation (CV) set, and a test set. The small number of known anomalous samples is divided between the CV set and the test set, so these two sets can be treated as labeled data.
$\qquad$ Next, fit $p(x)$ on the training set with the selected features, using the Gaussian model introduced in Section 2. Then check the fitted model $p(x)$ against the CV set. This check follows the same criteria used for skewed data: the quality of the model is judged by the $F_1$ score computed on the CV set. At this stage the selected features and the value of the threshold $\epsilon$ can also be adjusted. Finally, the test set is used to verify the quality of the trained model.
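$\qquad$ One way to tune $\epsilon$ on the CV set is to scan candidate thresholds and keep the one with the best $F_1$ score. The sketch below assumes the densities $p(x)$ of the CV samples have already been computed; the helper names (`f1_score`, `select_epsilon`), the number of scan steps, and the sample values are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 score for binary labels, where 1 marks an anomaly."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_epsilon(p_cv, y_cv, num_steps=1000):
    """Scan thresholds between min(p_cv) and max(p_cv); return the best one and its F1."""
    lo, hi = min(p_cv), max(p_cv)
    best_eps, best_f1 = lo, 0.0
    for k in range(1, num_steps + 1):
        eps = lo + (hi - lo) * k / num_steps
        y_pred = [1 if p < eps else 0 for p in p_cv]  # p(x) < eps means anomaly
        score = f1_score(y_cv, y_pred)
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1

# Hypothetical densities for seven CV samples; the last two are known anomalies.
p_cv = [0.90, 0.85, 0.80, 0.75, 0.70, 0.02, 0.01]
y_cv = [0, 0, 0, 0, 0, 1, 1]

best_eps, best_f1 = select_epsilon(p_cv, y_cv)
print(best_eps, best_f1)
```

$\qquad$ The same scan can be repeated after adding or removing features, and the candidate model with the higher CV $F_1$ is kept.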

3.1 Use anomaly detection or supervised learning

$\qquad$ Anomaly detection is usually suitable when there are very few anomalous samples (e.g., 0-20) but a large number of normal samples. It is also the usual choice when the characteristics of an anomaly cannot be determined in advance, as in detecting defective parts or monitoring computers in a data center. When both normal and anomalous samples are plentiful, so that the data carries sufficient information about the anomalies, supervised learning is the better fit, as in spam detection, weather forecasting, and disease detection.

4、 Handling the feature vectors for anomaly detection

$\qquad$ Fitting the anomaly detection model with a Gaussian distribution requires each feature to be approximately Gaussian-distributed. If a feature in the raw data is not, it should be transformed so that it approximately follows a Gaussian distribution, which lets the algorithm achieve better results. Possible transformations include taking the logarithm or raising the feature to a power $\alpha \in(0,1)$.
$\qquad$ To select features, first pick a candidate set, train an anomaly detection model on the training data, and validate it on the cross-validation set; then add or remove features according to the validation results. In general, prefer features that take unusually large or unusually small values for anomalous samples.
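$\qquad$ As a rough illustration of the transformations mentioned above, the sketch below compares the skewness of a right-skewed feature before and after a logarithm and a square-root ($\alpha = 0.5$) transform. The data values are invented, the `+ 1` inside the logarithm is an assumption to keep the argument positive, and skewness is used here only as a convenient proxy for "how far from Gaussian" (a symmetric distribution has skewness 0).

```python
import math

def skewness(values):
    """Third standardized moment; 0 for perfectly symmetric data."""
    m = len(values)
    mean = sum(values) / m
    var = sum((v - mean) ** 2 for v in values) / m
    third = sum((v - mean) ** 3 for v in values) / m
    return third / var ** 1.5

# Hypothetical right-skewed feature (a long tail of large values).
raw = [1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 21, 55]

log_x = [math.log(v + 1) for v in raw]  # logarithm transform
root_x = [v ** 0.5 for v in raw]        # power transform with alpha = 0.5

print("raw :", skewness(raw))
print("sqrt:", skewness(root_x))
print("log :", skewness(log_x))
```

$\qquad$ For this toy feature both transforms reduce the skewness, with the logarithm reducing it most; on real data it is worth plotting a histogram of each candidate transform before choosing one.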

5、 Multivariate Gaussian distribution

$\qquad$ Given an $n$-dimensional feature vector $x \in R^n$, the multivariate Gaussian distribution does not model each feature separately as $p(x_1)$, $p(x_2)$, and so on; instead, it models all features jointly with a single probability function $p(x)$. The multivariate Gaussian anomaly detection model uses the parameters $\mu \in R^n$ and $\Sigma \in R^{n \times n}$ (the covariance matrix), and $p(x;\mu, \Sigma)=\frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$
$\qquad$ The shape of the multivariate Gaussian density changes with $\mu$ and $\Sigma$: $\mu$ moves the center of the distribution, the diagonal elements of $\Sigma$ control the spread along each feature axis, and the off-diagonal elements describe the correlation between features, which tilts the contours of the density.

5.1 Using multivariate Gaussian distribution to develop anomaly detection model

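$\qquad$ Given a training set $x^{(1)},x^{(2)},...,x^{(m)}$, an anomaly detection model based on the multivariate Gaussian distribution is built as follows. First fit the parameters: $\mu=\frac{1}{m}\sum_{i=1}^{m}x^{(i)}$ and $\Sigma=\frac{1}{m}\sum_{i=1}^{m}(x^{(i)}-\mu)(x^{(i)}-\mu)^T$. Then, for a new sample $x$, compute $p(x;\mu,\Sigma)$ with the density formula from Section 5 and flag $x$ as anomalous if $p(x)<\epsilon$.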
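$\qquad$ A possible NumPy sketch of this procedure is shown below. It is an illustration under stated assumptions rather than a definitive implementation: the function names, the synthetic correlated data, and the threshold are all made up for the example, and the density is evaluated with `np.linalg.solve` instead of an explicit matrix inverse.

```python
import numpy as np

def fit_multivariate_gaussian(X):
    """Estimate the mean vector and covariance matrix (1/m normalization)."""
    m = X.shape[0]
    mu = X.mean(axis=0)
    centered = X - mu
    Sigma = centered.T @ centered / m
    return mu, Sigma

def multivariate_density(X, mu, Sigma):
    """Evaluate p(x; mu, Sigma) for each row of X."""
    n = mu.shape[0]
    centered = X - mu
    # Solve Sigma z = (x - mu) rather than forming the inverse of Sigma.
    solved = np.linalg.solve(Sigma, centered.T).T
    mahalanobis_sq = np.sum(centered * solved, axis=1)
    norm_const = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * mahalanobis_sq) / norm_const

rng = np.random.default_rng(0)
# Hypothetical training data: the second feature closely tracks the first.
x1 = rng.normal(0.0, 1.0, size=500)
x2 = x1 + rng.normal(0.0, 0.3, size=500)
X_train = np.column_stack([x1, x2])

mu, Sigma = fit_multivariate_gaussian(X_train)

# The first point follows the correlation; the second breaks it.
X_new = np.array([[0.5, 0.5], [1.5, -1.5]])
p = multivariate_density(X_new, mu, Sigma)
epsilon = 1e-3  # illustrative threshold; normally tuned on a CV set
flags = p < epsilon
print(p, flags)
```

$\qquad$ Each coordinate of the second point is unremarkable on its own; only the combination is unusual, and the covariance matrix is what lets the model pick that up.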

5.2 The difference between original model and multivariate Gaussian distribution model

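$\qquad$ If every element of the covariance matrix $\Sigma$ other than the diagonal is set to 0, the multivariate Gaussian model becomes the original model. In that case $\Sigma=\mathrm{diag}(\sigma_1^2,...,\sigma_n^2)$, and $p(x;\mu,\Sigma)$ factors into the product $p(x_1;\mu_1,\sigma_1^2)\cdots p(x_n;\mu_n,\sigma_n^2)$.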
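$\qquad$ This equivalence can be checked numerically. The short NumPy sketch below evaluates both models at one point, using a diagonal covariance matrix for the multivariate model; the parameter values and function names are arbitrary choices for the illustration.

```python
import numpy as np

def product_of_univariate(x, mu, sigma2):
    """Original model: product of independent univariate Gaussian densities."""
    return np.prod(np.exp(-((x - mu) ** 2) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2))

def multivariate(x, mu, Sigma):
    """Multivariate Gaussian density at a single point x."""
    n = mu.shape[0]
    diff = x - mu
    exponent = -0.5 * diff @ np.linalg.solve(Sigma, diff)
    return np.exp(exponent) / ((2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma)))

# Arbitrary illustrative parameters for three features.
mu = np.array([1.0, -2.0, 0.5])
sigma2 = np.array([0.5, 2.0, 1.5])
x = np.array([1.3, -1.0, 0.0])

p_original = product_of_univariate(x, mu, sigma2)
p_multi = multivariate(x, mu, np.diag(sigma2))  # off-diagonal elements are 0

print(np.isclose(p_original, p_multi))  # True
```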

5.3 Choosing between the original model and the multivariate Gaussian model

$\qquad$ The original model requires manually creating features that combine existing features, whereas the multivariate Gaussian model captures the relationships between features automatically. The original model is computationally cheaper than the multivariate Gaussian model. The multivariate Gaussian model requires the number of training samples $m$ to be greater than the number of features $n$; otherwise the covariance matrix $\Sigma$ is not invertible.
$\qquad$ The covariance matrix is non-invertible (singular) in the following two cases: the number of training samples is too small relative to the number of features ($m \le n$); or some features are linearly dependent, that is, the feature set contains redundant features.
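$\qquad$ Both failure cases can be reproduced on synthetic data by checking the rank of the estimated covariance matrix. The sketch below is only a diagnostic illustration; the data, the helper `covariance`, and the absolute tolerance passed to `np.linalg.matrix_rank` are assumptions chosen for the example.

```python
import numpy as np

def covariance(X):
    """Covariance matrix with 1/m normalization."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]

rng = np.random.default_rng(1)
TOL = 1e-10  # singular values below this are treated as zero (illustrative choice)

# Case 1: fewer samples than features (m = 3, n = 5).
X_few = rng.normal(size=(3, 5))
rank_few = np.linalg.matrix_rank(covariance(X_few), tol=TOL)

# Case 2: a redundant feature; the third column is a linear combination of the first two.
base = rng.normal(size=(200, 2))
X_redundant = np.column_stack([base, 2 * base[:, 0] - base[:, 1]])
rank_redundant = np.linalg.matrix_rank(covariance(X_redundant), tol=TOL)

# Dropping the redundant column restores a full-rank (invertible) covariance matrix.
rank_fixed = np.linalg.matrix_rank(covariance(X_redundant[:, :2]), tol=TOL)

print(rank_few, "of 5")        # rank deficient: Sigma is singular
print(rank_redundant, "of 3")  # rank deficient: Sigma is singular
print(rank_fixed, "of 2")      # full rank: Sigma is invertible
```

$\qquad$ When the rank is lower than the number of features, the usual remedies are to collect more training samples or to remove the redundant features.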