An Intuitive Understanding of Entropy
2022-04-23 10:48:00 【qq1033930618】
1. Information Entropy
$$H(X) = -\sum_{i=1}^{n} p(x_i)\log p(x_i)$$
The larger the information entropy, the more disordered the random variable: its uncertainty is higher, its distribution is closer to uniform, and the less prior information we have about its outcome.

- n: the number of values the random variable can take
- x_i: one possible value of the random variable
- p(x_i): the probability that the random variable takes the value x_i

The base of the logarithm does not affect any of these conclusions (changing it only rescales the entropy by a constant factor); base 10 is a common choice.
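As a quick numerical check of the claims above, here is a minimal NumPy sketch (the helper name `entropy` and the example distributions are ours; log base 10 matches the convention used here):

```python
import numpy as np

def entropy(p, base=10):
    """Shannon entropy H(X) = -sum p_i * log(p_i); 0*log(0) is treated as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                              # skip zero-probability outcomes
    return -np.sum(p * np.log(p)) / np.log(base)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # uniform: maximal entropy, log10(4) ≈ 0.602
print(entropy([0.7, 0.1, 0.1, 0.1]))      # skewed: lower entropy, ≈ 0.408
print(entropy([1.0, 0.0, 0.0, 0.0]))      # deterministic: entropy 0
```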
2. Relative Entropy (KL Divergence)
$$D_{KL}(p\,\|\,q) = \sum_{i=1}^{n} p(x_i)\log\frac{p(x_i)}{q(x_i)}$$
KL divergence is an asymmetric measure of the difference between two probability distributions: it quantifies the gap between two different distributions P and Q of the same random variable.

- Asymmetry: in general D_KL(p||q) ≠ D_KL(q||p); the two coincide only when the distributions P and Q are exactly the same.
- Non-negativity: D_KL(p||q) ≥ 0, and it equals 0 only when the distributions P and Q are exactly the same.

It can be rewritten as cross entropy minus information entropy:
$$D_{KL}(p\,\|\,q) = \sum_{i=1}^{n} p(x_i)\log\frac{p(x_i)}{q(x_i)} = \sum_{i=1}^{n} p(x_i)\log p(x_i) - \sum_{i=1}^{n} p(x_i)\log q(x_i) = H(P,Q) - H(P)$$
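A minimal NumPy sketch (the arrays p and q are our own examples; natural log) that checks both the identity D_KL = H(P,Q) - H(P) and the asymmetry property numerically:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # "true" distribution P
q = np.array([0.4, 0.4, 0.2])   # "approximating" distribution Q

kl            = np.sum(p * np.log(p / q))   # D_KL(p || q)
cross_entropy = -np.sum(p * np.log(q))      # H(P, Q)
entropy_p     = -np.sum(p * np.log(p))      # H(P)

print(kl, cross_entropy - entropy_p)        # both ≈ 0.0253
print(np.sum(q * np.log(q / p)))            # D_KL(q || p) ≈ 0.0258: not symmetric
```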
3. Cross Entropy
Cross entropy measures the gap between the predicted distribution Q of a random variable and its true distribution P; the smaller it is, the closer the two distributions are. With a one-hot true label it depends only on the predicted probability of the true class, because every non-true class has p(x_i) = 0, and zero times anything is zero.
$$H(P,Q) = -\sum_{i=1}^{n} p(x_i)\log q(x_i) = \sum_{x} p(x)\log\frac{1}{q(x)}$$
Most simplified form: with a one-hot true label, only the prediction for the true class c_i is evaluated:
$$\mathrm{CrossEntropy}(p,q) = -\log q(c_i)$$
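A small sketch (the example arrays are ours) showing that with a one-hot label the full sum collapses to -log q(c_i):

```python
import numpy as np

q  = np.array([0.7, 0.2, 0.1])   # predicted distribution Q
p  = np.array([1.0, 0.0, 0.0])   # one-hot true distribution P
ci = 0                           # index of the true class

full       = -np.sum(p[p > 0] * np.log(q[p > 0]))  # full sum, skipping p = 0 terms
simplified = -np.log(q[ci])                        # -log q(c_i)
print(full, simplified)                            # both ≈ 0.3567
```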
Binary classification formula
$$H(P,Q) = \sum_{x} p(x)\log\frac{1}{q(x)} = -\big(p(x_1)\log q(x_1) + p(x_2)\log q(x_2)\big) = -\big(p\log q + (1-p)\log(1-q)\big)$$

where $p(x_1) = p$, $p(x_2) = 1-p$, $q(x_1) = q$, and $q(x_2) = 1-q$.
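A minimal binary cross entropy sketch under these definitions (p is the true probability of class 1, q the predicted one; the function name is ours):

```python
import numpy as np

def binary_cross_entropy(p, q):
    """H(P,Q) = -(p*log(q) + (1-p)*log(1-q)) for a two-outcome variable."""
    return -(p * np.log(q) + (1 - p) * np.log(1 - q))

print(binary_cross_entropy(1.0, 0.9))  # confident correct prediction: ≈ 0.105
print(binary_cross_entropy(1.0, 0.1))  # confident wrong prediction:   ≈ 2.303
```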
When the true distribution is one-hot, its information entropy H(P) is 0, so the KL divergence equals the cross entropy and minimizing either is equivalent. If the true distribution is not one-hot, H(P) > 0, and the KL divergence is the appropriate measure of the remaining gap (cross entropy then differs from it by the constant H(P)).
PyTorch CrossEntropyLoss()

```python
import torch
import torch.nn as nn

entropy = nn.CrossEntropyLoss()                       # combines log-softmax and NLL loss
input = torch.tensor([[-0.7715, -0.6205, -0.2562]])   # raw logits, shape (batch=1, classes=3)
target = torch.tensor([0])                            # index of the true class
output = entropy(input, target)
print(output)                                         # tensor(1.3447)
```
$$\mathrm{loss}(x, class) = -\log\frac{\exp(x[class])}{\sum_{j}\exp(x[j])} = -x[class] + \log\sum_{j}\exp(x[j])$$
Note that the logarithm here is base e (the natural logarithm).
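To confirm the formula (and the base-e logarithm), a quick sketch reproducing the loss by hand from the same logits:

```python
import torch

x = torch.tensor([-0.7715, -0.6205, -0.2562])  # logits from the example above
cls = 0                                        # true class index

manual = -x[cls] + torch.logsumexp(x, dim=0)   # -x[class] + log(sum_j exp(x[j]))
print(manual)                                  # tensor(1.3447), matches CrossEntropyLoss
```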
4. KL Divergence Between Normal Distributions
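For two univariate normal distributions, the KL divergence has a standard closed form:

$$D_{KL}\big(\mathcal{N}(\mu_1,\sigma_1^2)\,\|\,\mathcal{N}(\mu_2,\sigma_2^2)\big) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$$

A minimal sketch implementing it (the function name and example parameters are ours):

```python
import numpy as np

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form D_KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ), in nats."""
    return (np.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
            - 0.5)

print(kl_normal(0.0, 1.0, 0.0, 1.0))  # identical distributions: 0.0
print(kl_normal(0.0, 1.0, 1.0, 2.0))  # ≈ 0.443
```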