Machine learning -- naive Bayes
2022-04-23 13:16:00 【DCGJ666】
Advantages:
- The algorithm's logic is simple and easy to implement.
- Classification costs little time and space, so prediction is fast, and classification accuracy is generally high.
- The naive Bayesian model is grounded in classical mathematical theory, and its classification performance is stable.
- It is not very sensitive to missing data, and it is often used for text classification.
- It performs well on small datasets, handles multi-class tasks, and is suited to incremental training.
Disadvantages:
- In theory, the naive Bayesian model has the smallest error rate compared with other classification methods. In practice this does not always hold, because the model assumes that attributes are independent of each other. That assumption is often false in real applications, so classification quality suffers when there are many attributes or when the attributes are strongly correlated.
- It requires prior probabilities, which usually come from assumptions or from the existing training data. In some cases a poorly chosen prior leads to wrong classification decisions.
Naive Bayes
Naive Bayes is a classification algorithm based on the conditional independence assumption for features and on Bayes' theorem. It learns the joint distribution of X and y from the training data; then, for an X to be predicted, it applies Bayes' formula and outputs the y with the largest posterior probability.
Naive Bayes is a generative learning algorithm: it learns the joint distribution of X and Y, under the assumption that the features are independent of one another given y.
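As a sketch of how these two ideas combine, the prediction rule can be written out as follows. The notation is an assumption made here for illustration and does not come from the original post: x_1, ..., x_n are the features of X, and \hat{y} is the predicted class.

$$P(y \mid x_1, \dots, x_n) = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)} = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, \dots, x_n)}$$

$$\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

The second equality in the first line uses the conditional independence assumption. The denominator does not depend on y, so it can be dropped when taking the maximum.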
Bayes' formula
$$P(B \mid A)=\frac{P(B)\,P(A \mid B)}{P(A)}$$
In the formula, P(B) is the probability of event B, P(A|B) is the probability of event A given that event B has occurred, and P(B|A) is the probability of event B given that event A has occurred.
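A small numeric illustration of the formula may help. The events and all probabilities below are made up for this example: B stands for "the message is spam" and A for "the message contains the word 'free'". P(A) is not given directly, so the sketch computes it with the law of total probability.

```python
def posterior(prior_b: float, likelihood_a_given_b: float, evidence_a: float) -> float:
    """Return P(B|A) = P(B) * P(A|B) / P(A)."""
    if evidence_a <= 0:
        raise ValueError("P(A) must be positive")
    return prior_b * likelihood_a_given_b / evidence_a


# Hypothetical numbers, chosen only for illustration.
p_spam = 0.2              # P(B): a message is spam
p_free_given_spam = 0.5   # P(A|B): a spam message contains "free"
p_free_given_ham = 0.05   # P(A|not B): a non-spam message contains "free"

# Law of total probability: P(A) = P(B)P(A|B) + P(not B)P(A|not B)
p_free = p_spam * p_free_given_spam + (1 - p_spam) * p_free_given_ham

p_spam_given_free = posterior(p_spam, p_free_given_spam, p_free)
print(round(p_spam_given_free, 4))
```

With these numbers, seeing the word raises the probability of spam from the prior 0.2 to a posterior of 5/7, roughly 0.71.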
How to understand the "naive" in Naive Bayes
The "naive" in Naive Bayes can be read as "simple" or "simple-minded": the method assumes that all features are equally important and independent of each other, with no feature affecting another. In the real world, however, attributes are not always independent of each other.
What is Laplace smoothing
Laplace smoothing is the correction Naive Bayes uses for the zero-probability problem. During classification, an attribute value may never appear together with some class in the training set. Computing directly from the naive Bayes expression then gives a probability of zero, and because the probabilities are multiplied together, that zero erases the information carried by all the other attributes. The Laplace estimator corrects for this by adding 1 to the numerator of each estimate. For the prior probability, the number of possible classes in the training set is added to the denominator; for the conditional probability, the number of possible values of the i-th attribute is added to the denominator.
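The rule described above can be sketched in a few lines of Python. The function names, the `alpha` parameter (set to 1 for Laplace smoothing), and the toy weather data are assumptions made for this illustration and are not taken from the original post.

```python
from collections import Counter


def smoothed_prior(labels, label, alpha=1):
    """P(y) = (count(y) + alpha) / (N + alpha * K), where K is the number of classes."""
    counts = Counter(labels)
    n_classes = len(counts)
    return (counts[label] + alpha) / (len(labels) + alpha * n_classes)


def smoothed_conditional(values, labels, value, label, n_values, alpha=1):
    """P(x_i = value | y = label) = (count(value, label) + alpha) / (count(label) + alpha * S_i),
    where S_i (n_values) is the number of possible values of attribute i."""
    in_class = [v for v, y in zip(values, labels) if y == label]
    joint = sum(1 for v in in_class if v == value)
    return (joint + alpha) / (len(in_class) + alpha * n_values)


# Hypothetical toy data: "rainy" never occurs together with "yes".
weather = ["sunny", "sunny", "rainy", "sunny", "rainy"]
play = ["yes", "yes", "no", "yes", "no"]

p_yes = smoothed_prior(play, "yes")                                     # (3 + 1) / (5 + 2)
p_rainy_given_yes = smoothed_conditional(weather, play, "rainy", "yes", n_values=2)  # (0 + 1) / (3 + 2)
print(p_yes, p_rainy_given_yes)
```

Without smoothing, the estimate of P(rainy | yes) would be 0/3 = 0, and any product containing it would vanish. With smoothing it becomes 1/5, so the other attributes still contribute to the decision.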
Applications of Naive Bayes
Naive Bayes is most widely used in document classification, spam filtering, sentiment analysis, recommendation systems, and spelling correction.
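To make the spam filtering use case concrete, here is a minimal sketch of a word-count (multinomial) Naive Bayes text classifier with add-one smoothing. It is one possible way to set this up, not the author's implementation: the `train` and `predict` functions, the whitespace tokenization, and the four-document corpus are all assumptions chosen for illustration. Log probabilities are summed instead of multiplying raw probabilities, which avoids numeric underflow on long texts.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (text, label). Returns class counts, per-class word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(text, class_counts, word_counts, vocab, alpha=1.0):
    """Return the label with the largest smoothed log posterior (up to a constant)."""
    total_docs = sum(class_counts.values())
    n_classes = len(class_counts)
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Smoothed log prior of the class.
        score = math.log((n_docs + alpha) / (total_docs + alpha * n_classes))
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            count = word_counts[label][word]
            # Smoothed log likelihood of the word given the class.
            score += math.log((count + alpha) / (total_words + alpha * len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus.
training_docs = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project report due monday", "ham"),
]
model = train(training_docs)
print(predict("free money prize", *model))
print(predict("monday project meeting", *model))
```

On this toy corpus the first message is labeled "spam" and the second "ham". A real filter would need a much larger corpus and better tokenization.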
Naive Bayes is not sensitive to outliers
Naive Bayes is insensitive to outliers. During data processing we can therefore leave outliers in: keeping them preserves the overall accuracy of the algorithm, whereas removing them may reduce the model's generalization ability, because the information they carry is lost at prediction time.
Prior probability and posterior probability
Prior probability: the probability that an event occurs, taken directly, before any other information is considered.
Posterior probability: the probability that an event occurs, given that some other event is known to have occurred.
Copyright notice
This article was written by [DCGJ666]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204230611343284.html