
3.2 - Classification - Logistic Regression

2022-08-11 07:51:00 · A big boa constrictor 6666

1. Function Set

  • Posterior probability: the model outputs the posterior $P_{w,b}(C_1 \mid x)$, illustrated in the figure on the right; it takes the form $\sigma(w \cdot x + b)$. The whole procedure is known as logistic regression (Logistic Regression). A minimal sketch of the function set follows.
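As a minimal sketch (assuming NumPy; the names `sigmoid` and `posterior` are illustrative, not from the original post), the function set $f_{w,b}(x) = \sigma(w \cdot x + b)$ can be written as:

```python
import numpy as np

def sigmoid(z):
    """The logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def posterior(x, w, b):
    """f_{w,b}(x) = P(C1 | x) = sigma(w . x + b): one member of the
    function set, indexed by the parameters w and b."""
    return sigmoid(np.dot(w, x) + b)
```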

[Figures: the sigmoid posterior $P_{w,b}(C_1 \mid x)$ and the logistic regression function set]

2. Goodness of a Function

  • The best $w^*$ and $b^*$ are the $w$ and $b$ under which the training set has the greatest probability of being generated; the figure on the right shows the simplified form obtained by an equivalent transformation of the formula, spelled out below.
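Written out (following the standard derivation; $\hat y^n \in \{0,1\}$ denotes the label of training example $x^n$):

$$L(w,b) = \prod_n f_{w,b}(x^n)^{\hat y^n}\,\big(1 - f_{w,b}(x^n)\big)^{1-\hat y^n}$$

$$w^*, b^* = \arg\max_{w,b} L(w,b) = \arg\min_{w,b}\, -\ln L(w,b)$$

Taking the negative logarithm turns the product into a sum, which is exactly the simplification the figure shows.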

[Figures: the likelihood of the training set and its simplified negative-log form]

  • The parts marked in blue are the cross entropy (Cross entropy) of two Bernoulli distributions (Bernoulli distribution). Cross entropy measures how close two distributions $p$ and $q$ are to each other: if $p$ and $q$ are identical, the cross entropy is 0.
  • For logistic regression, the loss function used to measure the quality of the model is the sum of this cross entropy over the training set; the smaller the value, the better the performance on the training set. A code sketch follows.
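As a sketch (NumPy assumed; `cross_entropy_loss` is an illustrative name), the loss over the training set is:

```python
import numpy as np

def cross_entropy_loss(y_hat, f):
    """Sum over the training set of the cross entropy between the
    Bernoulli label distribution y_hat (0/1) and the model output f = P(C1|x)."""
    f = np.clip(f, 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return -np.sum(y_hat * np.log(f) + (1.0 - y_hat) * np.log(1.0 - f))
```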

[Figures: cross entropy of two Bernoulli distributions; the logistic regression loss]

3. Find the Best Function

  • Differentiating and simplifying the loss function gives the result shown in green on the right: the larger the gap between the model's output and the target value, the larger the update should be. The simplified gradient is spelled out below.
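Concretely (standard result; $x_i^n$ is the $i$-th feature of example $x^n$):

$$\frac{\partial(-\ln L)}{\partial w_i} = \sum_n -\big(\hat y^n - f_{w,b}(x^n)\big)\,x_i^n$$

so the update is driven directly by the gap $\hat y^n - f_{w,b}(x^n)$ between the target and the output.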

[Figures: differentiating the loss; the simplified gradient highlighted in green]

  • As the figure on the left shows, logistic regression and linear regression use exactly the same form of parameter update; only the learning rate $\eta$ needs to be adjusted. A code sketch follows.
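A minimal gradient-descent sketch under these definitions (NumPy assumed; names illustrative). The same `(f - y_hat)`-times-input form would appear in linear regression, just with `f = X @ w + b` instead of the sigmoid:

```python
import numpy as np

def gradient_step(X, y_hat, w, b, eta=0.1):
    """One gradient-descent step on the cross-entropy loss.
    X: (N, d) design matrix, y_hat: (N,) array of 0/1 labels."""
    f = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # f_{w,b}(x^n) for every n
    error = f - y_hat                        # -(y_hat^n - f(x^n))
    w = w - eta * (X.T @ error)              # w_i -= eta * sum_n error^n * x_i^n
    b = b - eta * error.sum()
    return w, b
```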

[Figures: logistic regression vs. linear regression parameter updates]

  • The figure below clearly shows the drawback of using square error (Square Error) as the loss function for logistic regression: even far from the optimal solution, the derivative is still very small, which is bad for gradient descent. The derivation after the figure shows why.
[Figure: loss surfaces of cross entropy vs. square error]
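Following the standard derivation: with $f = \sigma(w \cdot x + b)$ and square error $(f - \hat y)^2$, the chain rule gives

$$\frac{\partial (f-\hat y)^2}{\partial w_i} = 2\,(f-\hat y)\,f\,(1-f)\,x_i$$

The factor $f(1-f)$ vanishes whenever $f$ is close to 0 or 1, so at $\hat y = 1$, $f \approx 0$ the model is completely wrong yet the gradient is almost zero; the cross-entropy loss does not have this problem.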
  • Logistic regression is a discriminative (Discriminative) method, unlike the probabilistic generative model of the previous chapter, which is a generative (Generative) method.
  • Although both methods search the same function set for the best model, logistic regression finds $w$ and $b$ directly by gradient descent, while the probabilistic generative model finds $w$ and $b$ by first estimating $\mu^1$, $\mu^2$, and $\Sigma$. Since the approaches differ, the final models can also differ greatly.
  • In the Pokémon example, logistic regression turned out to work better than the probabilistic generative model.

[Figures: discriminative vs. generative results on the Pokémon example]

  • In the example below it is intuitively clear that the test point belongs to class 1, yet the naive Bayes classifier, a probabilistic generative model, ends up telling us the test point comes from class 2. Why?
  • Because the naive Bayes classifier never considers correlations (correlation) between different dimensions: it assumes the two feature dimensions of every example are independent of each other. Probabilistic generative models always make assumptions of this sort, e.g. that the data come from some particular probability distribution; in effect they "imagine" data that was never observed. The numeric sketch below shows the mechanism.
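A numeric sketch of the mechanism. The concrete counts below are hypothetical stand-ins for the figure's example, chosen so that class 1 contains a single $(1,1)$ example while class 2 contains only examples with at most one active feature:

```python
# Hypothetical counts (not the original figure's numbers):
# class 1: one example (1,1); class 2: four each of (1,0), (0,1), (0,0).
p_c1, p_c2 = 1 / 13, 12 / 13

# Naive Bayes treats the two dimensions as independent:
p_x1_c1, p_x2_c1 = 1.0, 1.0          # feature = 1 in every class-1 example
p_x1_c2, p_x2_c2 = 4 / 12, 4 / 12    # feature = 1 in 4 of 12 class-2 examples

# Unnormalized posteriors for the test point (1, 1):
score_c1 = p_c1 * p_x1_c1 * p_x2_c1  # ~0.077
score_c2 = p_c2 * p_x1_c2 * p_x2_c2  # ~0.103 -> naive Bayes picks class 2
print(score_c1, score_c2)
```

Even though $(1,1)$ was only ever observed in class 1, the independence assumption lets the model "imagine" class-2 examples with both features active.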

[Figures: the toy example where naive Bayes predicts class 2]

4. Advantages of Probabilistic Generative Models

  • Logistic regression is more strongly influenced by the data: because it makes no assumptions, its error keeps dropping as the amount of data grows.
  • Probabilistic generative models are less influenced by the data: because they carry assumptions of their own, they sometimes ignore the data and follow those assumptions instead. With a small dataset, a probabilistic generative model can therefore outperform logistic regression.
  • When the dataset is noisy, e.g. some of the labels are wrong, the generative model's weaker dependence on the data means the final result may filter out these bad factors.
  • Take speech recognition as an example: although a neural network, i.e. a logistic-regression-style method, is used, the system as a whole is in fact a probabilistic generative model, and the DNN is just one piece of it.

5. Multi-class Classification

  • Softmax strengthens the maximum: an exponential (exponential) operation is applied in the middle to amplify the gaps between the outputs before they are normalized. A sketch follows.
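A minimal sketch (NumPy assumed; subtracting the max is a standard numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(z):
    """Exponentiate to amplify the gaps between outputs, then normalize."""
    e = np.exp(z - np.max(z))  # subtracting the max avoids overflow
    return e / e.sum()

print(softmax(np.array([3.0, 1.0, -3.0])))  # ~[0.88, 0.12, 0.00]
```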

[Figures: softmax for multi-class classification]

  • Limitation of Logistic Regression:

    Below is a problem that logistic regression cannot solve on its own, because the boundary it draws is linear and the two classes here are not linearly separable; we first need to perform a feature transformation (Feature Transformation). A hand-crafted sketch follows the figure.

[Figures: a problem logistic regression cannot solve; the data after feature transformation]
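A sketch of one hand-crafted transformation, assuming the XOR-style layout of the figure ($(0,0)$ and $(1,1)$ in one class, $(0,1)$ and $(1,0)$ in the other); the distance-based features are illustrative:

```python
import numpy as np

# XOR-style data that no single logistic regression can separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Transformed features: distance to (0,0) and distance to (1,1).
X_new = np.stack([
    np.linalg.norm(X - np.array([0.0, 0.0]), axis=1),
    np.linalg.norm(X - np.array([1.0, 1.0]), axis=1),
], axis=1)
print(X_new)  # in the new space the two classes are linearly separable
```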

  • Feature transformation (Feature Transformation): to let the machine generate the transformation rules autonomously, we can cascade (Cascading) multiple logistic regressions. The figure on the right nicely shows the two stages: feature transformation followed by classification.
  • Each box in the middle of the last figure is a neuron (Neuron), and the whole network is called a neural network (Neural Network), which is also known as deep learning (Deep Learning). A minimal sketch of the cascade follows.
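A minimal sketch of the cascade (NumPy assumed; in practice the parameters `W1, b1, w2, b2` would all be learned jointly by gradient descent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cascaded_logistic(x, W1, b1, w2, b2):
    """Two logistic regressions in cascade: the first layer learns the
    feature transformation, the second classifies in the new space."""
    hidden = sigmoid(W1 @ x + b1)      # learned feature transformation
    return sigmoid(w2 @ hidden + b2)   # plain logistic regression on top
```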

[Figures: cascading logistic regressions for feature transformation and classification]

[Figure: a neural network built from cascaded logistic regressions]