Machine learning practice - naive Bayes
2022-04-23 18:34:00 【Xuanche_】
Naive Bayes
I. Overview
Bayesian classification is a statistical, probability-based approach to classification, and naive Bayes is the simplest of the Bayesian classifiers. Its principle is to use Bayes' formula to compute an object's posterior probability from its prior probability, and then select the class with the largest posterior probability as the class the object belongs to. It is called "naive" because it makes the most primitive and simplest assumption: all features are statistically independent.

Suppose a sample $X$ has attributes $a_1, a_2, a_3, \dots, a_n$. If

$$P(X) = P(a_1, a_2, \dots, a_n) = P(a_1)\,P(a_2)\cdots P(a_n)$$

holds, then the features are statistically independent.
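For example, with made-up numbers: if a sample has three independent attributes whose individual probabilities are $P(a_1) = 0.5$, $P(a_2) = 0.2$ and $P(a_3) = 0.3$, the independence assumption lets us compute the joint probability directly as

$$P(X) = P(a_1)\,P(a_2)\,P(a_3) = 0.5 \times 0.2 \times 0.3 = 0.03 .$$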
1. Conditional probability formula
Conditional probability is the probability that event $A$ occurs given that event $B$ has occurred, written $P(A \mid B)$.
From the Venn diagram: given that $B$ has occurred, the probability that $A$ occurs is $P(A \cap B)$ divided by $P(B)$:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \;\Rightarrow\; P(A \mid B)\,P(B) = P(A \cap B)$$

Similarly: $P(B \mid A)\,P(A) = P(A \cap B)$

Therefore:

$$P(B \mid A)\,P(A) = P(A \mid B)\,P(B) \;\Rightarrow\; P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
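A quick numeric check may help (the probabilities below are made up purely for illustration): take $P(A) = 0.4$, $P(B) = 0.5$ and $P(A \cap B) = 0.1$. Then

$$P(A \mid B) = \frac{0.1}{0.5} = 0.2, \qquad P(B \mid A) = \frac{0.1}{0.4} = 0.25,$$

and indeed $P(B \mid A)\,P(A) = 0.25 \times 0.4 = 0.1 = 0.2 \times 0.5 = P(A \mid B)\,P(B)$.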
Next, look at the law of total probability: if the events $A_1, A_2, \dots, A_n$ have positive probability and together form a partition of the sample space, then for any event $B$:
$$P(B) = P(BA_1) + P(BA_2) + \cdots + P(BA_n) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i)$$
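For example (again with made-up numbers): if $A_1$ and $A_2$ partition the sample space with $P(A_1) = 0.6$, $P(A_2) = 0.4$, and $P(B \mid A_1) = 0.2$, $P(B \mid A_2) = 0.5$, then

$$P(B) = 0.6 \times 0.2 + 0.4 \times 0.5 = 0.32 .$$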
2. Bayesian inference
From the conditional probability formula and the law of total probability, Bayes' formula can be derived:
$$P(A \mid B) = P(A)\,\frac{P(B \mid A)}{P(B)}$$

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$$
- $P(A)$ is called the "prior probability": our estimate of the probability of event $A$ before event $B$ occurs.
- $P(A \mid B)$ is called the "posterior probability": our re-assessment of the probability of event $A$ after event $B$ has occurred.
- $P(B \mid A)/P(B)$ is called the "likelihood function" (likelihood): an adjustment factor that brings the estimated probability closer to the true probability.
So Bayes' formula can be understood as: posterior probability = prior probability × adjustment factor.
- If the "likelihood function" > 1, the "prior probability" is strengthened, and event $A$ becomes more likely to occur;
- If the "likelihood function" = 1, event $B$ gives no help in judging how likely event $A$ is;
- If the "likelihood function" < 1, the "prior probability" is weakened, and event $A$ becomes less likely to occur.
II. Types of naive Bayes
In scikit-learn there are three naive Bayes classification algorithms: GaussianNB, MultinomialNB, and BernoulliNB.
1. GaussianNB
GaussianNB is **naive Bayes with a Gaussian (normal distribution) prior**: it assumes that, within each label, the data of each feature follows a simple normal distribution,

$$P(X_j = x_j \mid Y = C_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}}\,\exp\!\left(-\frac{(x_j - \mu_k)^2}{2\sigma_k^2}\right)$$

where $C_k$ is the $k$-th class of $Y$, and $\mu_k$ and $\sigma_k^2$ are the values that need to be estimated from the training set.

Below, scikit-learn is used for a simple GaussianNB implementation.
```python
# Import packages
import pandas as pd
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Import the dataset
from sklearn import datasets
iris = datasets.load_iris()

# Split the dataset
Xtrain, Xtest, ytrain, ytest = train_test_split(iris.data,
                                                iris.target,
                                                random_state=12)

# Build the model
clf = GaussianNB()
clf.fit(Xtrain, ytrain)

# Predict on the test set; predict_proba gives the probability that
# each sample belongs to each class
clf.predict(Xtest)
clf.predict_proba(Xtest)

# Accuracy on the test set
accuracy_score(ytest, clf.predict(Xtest))
```
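Once the model is fitted, the parameters GaussianNB estimated from the training set can be inspected. A small sketch (note that the attribute holding the per-class variances is named `var_` in recent scikit-learn releases and `sigma_` in older ones, so check your version):

```python
# Per-class prior probabilities estimated from the class frequencies
print(clf.class_prior_)          # shape (n_classes,)

# Per-class, per-feature means used in the Gaussian likelihood
print(clf.theta_)                # shape (n_classes, n_features)

# Per-class, per-feature variances: clf.var_ in recent scikit-learn,
# clf.sigma_ in older releases
print(getattr(clf, "var_", getattr(clf, "sigma_", None)))
```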
2. MultinomialNB
MultinomialNB is naive Bayes with a multinomial prior: it assumes the features are generated by a simple multinomial distribution. The multinomial distribution describes the probability of occurrence of each kind of sample, so multinomial naive Bayes is well suited to features that represent occurrence counts or occurrence proportions.
The model is often used in text classification, where a feature is a count such as the number of times a word appears.
The multinomial formula is as follows:
$$P(X_j = x_{jl} \mid Y = C_k) = \frac{x_{jl} + \xi}{m_k + n\xi}$$

where $P(X_j = x_{jl} \mid Y = C_k)$ is the conditional probability that the $j$-th feature of the $k$-th class takes its $l$-th value, and $m_k$ is the number of training samples belonging to the $k$-th class. $\xi$ is a constant greater than 0, often taken as 1, which gives Laplace smoothing; other values can also be used.
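As a sketch of the typical text-classification use, here is a minimal MultinomialNB example on a made-up toy corpus (the documents, labels, and alpha value are assumptions for illustration; in scikit-learn the smoothing constant $\xi$ above corresponds to the `alpha` parameter):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up toy corpus: 1 = spam, 0 = normal mail
docs = ["win free money now", "free prize waiting",
        "meeting agenda attached", "project schedule update"]
labels = [1, 1, 0, 0]

# Word-count features are exactly what MultinomialNB expects
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

clf = MultinomialNB(alpha=1.0)   # alpha=1.0 is Laplace smoothing
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["free money meeting"])))
print(clf.predict_proba(vectorizer.transform(["free money meeting"])))
```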
3. BernoulliNB
BernoulliNB is naive Bayes with a Bernoulli prior: it assumes each feature follows a binary Bernoulli distribution, as in the following:

$$P(X_j = x_{jl} \mid Y = C_k) = P(j \mid Y = C_k)\,x_{jl} + \bigl(1 - P(j \mid Y = C_k)\bigr)\bigl(1 - x_{jl}\bigr)$$

Here $l$ has only two possible values, and $x_{jl}$ can only take the value 0 or 1.
In the Bernoulli model, each feature value is Boolean, i.e. true or false, or 1 or 0.
In text classification, this corresponds to whether or not a word appears in a document.
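A minimal BernoulliNB sketch with hand-made 0/1 features (each column stands for "does word j appear in the document"; the matrix and labels are invented for illustration). Note that BernoulliNB's `binarize` parameter (default 0.0) thresholds count features into 0/1 automatically, so it can also be fed raw counts:

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Invented binary document-term matrix: row = document, column = word present?
X = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 1]])
y = np.array([1, 1, 0, 0])   # 1 = spam, 0 = normal mail

clf = BernoulliNB(alpha=1.0, binarize=None)  # features are already 0/1
clf.fit(X, y)

print(clf.predict([[1, 0, 0, 1]]))
print(clf.predict_proba([[1, 0, 0, 1]]))
```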
Summary
- Generally speaking, if the sample features are mostly continuous values, GaussianNB tends to work better.
- If the sample features are mostly multi-valued discrete counts, MultinomialNB is more appropriate.
- If the sample features are binary discrete values, or very sparse multi-valued discrete values, BernoulliNB should be used.
Copyright notice
This article was written by [Xuanche_]. Please include a link to the original when reposting.
https://yzsam.com/2022/04/202204231824246309.html