Machine learning -- naive Bayes
2022-04-23 13:16:00 【DCGJ666】
Advantages:
- The algorithm's logic is simple and easy to implement.
- Classification has low time and space cost, and it is fast and often accurate.
- The naive Bayes model is grounded in classical mathematical theory, and its classification performance is stable.
- It is relatively insensitive to missing data, and it is often used in text classification.
- It works well on small datasets, handles multi-class tasks, and is suited to incremental training.
Disadvantages:
- In theory, the naive Bayes model has the lowest error rate among classification methods. In practice this does not always hold, because the model assumes that attributes are independent of each other. That assumption often fails in real applications, so classification quality degrades when there are many attributes or when the attributes are strongly correlated.
- Prior probabilities must be known, and they usually come from assumptions or from the available training data. A poorly chosen prior can therefore lead to wrong classification decisions.
Naive Bayes
Naive Bayes is a classification algorithm based on the conditional independence assumption for features and on Bayes' theorem. It learns the joint distribution of X and y from the training data; then, for an input X to be predicted, it applies Bayes' formula and outputs the y with the largest posterior probability.
Naive Bayes is a generative learning algorithm: it models how the data is generated by learning the joint distribution of X and Y. It assumes that the features are mutually independent given y.
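The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a reference implementation: it assumes categorical features, estimates every probability by plain counting, and the toy weather data and the function names `train` and `predict` are made up for the example.

```python
from collections import Counter, defaultdict

def train(samples, labels):
    """Estimate P(y) and the counts needed for P(x_i | y) by counting."""
    n = len(labels)
    class_counts = Counter(labels)
    prior = {y: c / n for y, c in class_counts.items()}
    # cond_counts[(i, y)][v] = number of class-y samples whose feature i equals v
    cond_counts = defaultdict(Counter)
    for x, y in zip(samples, labels):
        for i, v in enumerate(x):
            cond_counts[(i, y)][v] += 1
    return prior, cond_counts, class_counts

def predict(x, prior, cond_counts, class_counts):
    """Return the class maximizing P(y) * prod_i P(x_i | y), plus all scores."""
    scores = {}
    for y, p_y in prior.items():
        score = p_y
        for i, v in enumerate(x):
            # Conditional independence: multiply the per-feature estimates.
            score *= cond_counts[(i, y)][v] / class_counts[y]
        scores[y] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data: (outlook, windy) -> decision
samples = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
           ("rainy", "no"), ("sunny", "no")]
labels = ["play", "play", "stay", "stay", "play"]

model = train(samples, labels)
label, scores = predict(("sunny", "yes"), *model)
print(label, scores)
```

The scores are unnormalized posteriors: dividing by P(X) would not change which class wins, so the sketch skips it. Note that the score for "stay" is exactly zero here, because "sunny" never occurs with "stay" in the toy data.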
Bayes' formula
$$P(B \mid A) = \frac{P(B)\, P(A \mid B)}{P(A)}$$
In the formula, P(B) is the probability of event B, P(A|B) is the probability of event A given that event B has occurred, and P(B|A) is the probability of event B given that event A has occurred.
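A small numeric example may make the formula more concrete. The numbers below are hypothetical, and the helper `bayes_posterior` is an illustrative name; it also assumes P(A) is obtained from the law of total probability over B and not-B.

```python
def bayes_posterior(p_b, p_a_given_b, p_a_given_not_b):
    """Return P(B | A) = P(B) P(A | B) / P(A)."""
    # P(A) via the law of total probability over B and not-B.
    p_a = p_b * p_a_given_b + (1 - p_b) * p_a_given_not_b
    return p_b * p_a_given_b / p_a

# Hypothetical numbers: B = "email is spam", A = "email contains the word 'free'"
posterior = bayes_posterior(p_b=0.2, p_a_given_b=0.5, p_a_given_not_b=0.05)
print(round(posterior, 4))  # 0.7143
```

With these assumed numbers, P(A) = 0.2 × 0.5 + 0.8 × 0.05 = 0.14, so observing A raises the probability of B from 0.2 to about 0.71.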
How to understand the "naive" in naive Bayes
The "naive" in naive Bayes means simple or simplistic. The model is naive because it assumes that all features are equally important and mutually independent, with no influence on one another. In the real world, attributes are not always independent of each other.
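Written out as an equation, the independence assumption says that the class-conditional likelihood factorizes over the features. The symbols $x_1, \dots, x_n$ for the individual features and $\hat{y}$ for the predicted class are notation introduced here for illustration:

$$P(x_1, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y), \qquad \hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)$$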
What is Laplace smoothing
Laplace smoothing is the correction naive Bayes uses to handle the zero-probability problem. During classification, some attribute value may never appear together with some class in the training set. Computing directly from the naive Bayes expression then yields a probability of zero, and that single zero "erases" the information carried by all the other attributes. The Laplace estimator corrects for this: add 1 to the numerator; for the prior probability, add the number of possible classes in the training set to the denominator; for the conditional probability of the i-th attribute, add the number of possible values of that attribute to the denominator.
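The two corrections can be sketched as follows. This is a minimal illustration, assuming categorical attributes; the function names and the toy data are hypothetical, and the number of classes and the number of possible attribute values are passed in explicitly rather than inferred.

```python
from collections import Counter

def smoothed_prior(labels, y, num_classes):
    """P(y) = (count(y) + 1) / (N + K), where K is the number of possible classes."""
    return (Counter(labels)[y] + 1) / (len(labels) + num_classes)

def smoothed_conditional(values_in_class, v, num_values):
    """P(x_i = v | y) = (count(v within class y) + 1) / (N_y + S_i),
    where S_i is the number of possible values of attribute i."""
    return (Counter(values_in_class)[v] + 1) / (len(values_in_class) + num_values)

# Hypothetical toy data
labels = ["play", "play", "stay", "stay", "play"]
outlook_in_stay = ["rainy", "rainy"]  # outlook values of the "stay" samples

p_stay = smoothed_prior(labels, "stay", num_classes=2)
p_sunny_given_stay = smoothed_conditional(outlook_in_stay, "sunny", num_values=2)
print(p_stay, p_sunny_given_stay)
```

Here "sunny" never occurs with "stay", so the unsmoothed estimate would be 0/2 = 0; with smoothing it becomes (0 + 1) / (2 + 2) = 0.25, and the prior for "stay" becomes (2 + 1) / (5 + 2) = 3/7.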
Applications of naive Bayes
Naive Bayes is most widely used for document classification, spam filtering, sentiment analysis, recommendation systems, spelling correction, and similar tasks.
Naive Bayes is not sensitive to outliers
Naive Bayes is insensitive to outliers, so they do not need to be removed during data preprocessing. Keeping outliers preserves the overall accuracy of the algorithm, whereas removing them may reduce the model's generalization ability, because similar outlying samples can still appear at prediction time.
Prior probability and posterior probability
Prior probability: the probability of an event on its own, before any evidence is taken into account.
Posterior probability: the probability of an event given that some other event is known to have occurred.
Copyright notice
This article was written by [DCGJ666]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230611343284.html