An Intuitive Understanding of Entropy
2022-04-23 10:48:00 【qq1033930618】
1. Information Entropy
$$H(X) = -\sum_{i=1}^{n} p(x_i)\log p(x_i)$$
The larger the information entropy, the more disordered the random variable: its uncertainty is higher, its distribution is closer to uniform, and the less prior information we have about its outcome.

- n: the number of values the random variable can take
- x_i: one possible value of the random variable
- p(x_i): the probability that the random variable takes the value x_i

The base of the logarithm does not affect any of these conclusions (changing it only rescales the entropy by a constant factor); base 10 is a common choice.
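As a quick numerical check of the claims above, here is a minimal NumPy sketch (the helper name `entropy` and the example distributions are ours; log base 10 matches the convention used here):

```python
import numpy as np

def entropy(p, base=10):
    """Shannon entropy H(X) = -sum p_i * log(p_i); 0*log(0) is treated as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                              # skip zero-probability outcomes
    return -np.sum(p * np.log(p)) / np.log(base)

print(entropy([0.25, 0.25, 0.25, 0.25]))  # uniform: maximal entropy, log10(4) ≈ 0.602
print(entropy([0.7, 0.1, 0.1, 0.1]))      # skewed: lower entropy, ≈ 0.408
print(entropy([1.0, 0.0, 0.0, 0.0]))      # deterministic: entropy 0
```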
2. Relative Entropy (KL Divergence)
$$D_{KL}(p\,\|\,q) = \sum_{i=1}^{n} p(x_i)\log\frac{p(x_i)}{q(x_i)}$$
KL divergence is an asymmetric measure of the difference between two probability distributions: it quantifies the gap between two different distributions P and Q of the same random variable.

- Asymmetry: in general D_KL(p||q) ≠ D_KL(q||p); the two coincide only when the distributions P and Q are exactly the same.
- Non-negativity: D_KL(p||q) ≥ 0, and it equals 0 only when the distributions P and Q are exactly the same.

It can be rewritten as cross entropy minus information entropy:
$$D_{KL}(p\,\|\,q) = \sum_{i=1}^{n} p(x_i)\log\frac{p(x_i)}{q(x_i)} = \sum_{i=1}^{n} p(x_i)\log p(x_i) - \sum_{i=1}^{n} p(x_i)\log q(x_i) = H(P,Q) - H(P)$$
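A minimal NumPy sketch (the arrays p and q are our own examples; natural log) that checks both the identity D_KL = H(P,Q) - H(P) and the asymmetry property numerically:

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])   # "true" distribution P
q = np.array([0.4, 0.4, 0.2])   # "approximating" distribution Q

kl            = np.sum(p * np.log(p / q))   # D_KL(p || q)
cross_entropy = -np.sum(p * np.log(q))      # H(P, Q)
entropy_p     = -np.sum(p * np.log(p))      # H(P)

print(kl, cross_entropy - entropy_p)        # both ≈ 0.0253
print(np.sum(q * np.log(q / p)))            # D_KL(q || p) ≈ 0.0258: not symmetric
```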
3. Cross Entropy
Cross entropy measures the gap between the predicted distribution Q of a random variable and its true distribution P; the smaller it is, the closer the two distributions are. With a one-hot true label it depends only on the predicted probability of the true class, because every non-true class has p(x_i) = 0, and zero times anything is zero.
$$H(P,Q) = -\sum_{i=1}^{n} p(x_i)\log q(x_i) = \sum_{x} p(x)\log\frac{1}{q(x)}$$
Most simplified form: with a one-hot true label, only the prediction for the true class c_i is evaluated:
$$\mathrm{CrossEntropy}(p,q) = -\log q(c_i)$$
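A small sketch (the example arrays are ours) showing that with a one-hot label the full sum collapses to -log q(c_i):

```python
import numpy as np

q  = np.array([0.7, 0.2, 0.1])   # predicted distribution Q
p  = np.array([1.0, 0.0, 0.0])   # one-hot true distribution P
ci = 0                           # index of the true class

full       = -np.sum(p[p > 0] * np.log(q[p > 0]))  # full sum, skipping p = 0 terms
simplified = -np.log(q[ci])                        # -log q(c_i)
print(full, simplified)                            # both ≈ 0.3567
```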
Binary classification formula
$$H(P,Q) = \sum_{x} p(x)\log\frac{1}{q(x)} = -\big(p(x_1)\log q(x_1) + p(x_2)\log q(x_2)\big) = -\big(p\log q + (1-p)\log(1-q)\big)$$

where $p(x_1) = p$, $p(x_2) = 1-p$, $q(x_1) = q$, and $q(x_2) = 1-q$.
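A minimal binary cross entropy sketch under these definitions (p is the true probability of class 1, q the predicted one; the function name is ours):

```python
import numpy as np

def binary_cross_entropy(p, q):
    """H(P,Q) = -(p*log(q) + (1-p)*log(1-q)) for a two-outcome variable."""
    return -(p * np.log(q) + (1 - p) * np.log(1 - q))

print(binary_cross_entropy(1.0, 0.9))  # confident correct prediction: ≈ 0.105
print(binary_cross_entropy(1.0, 0.1))  # confident wrong prediction:   ≈ 2.303
```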
When the true distribution is one-hot, its information entropy H(P) is 0, so the KL divergence equals the cross entropy and minimizing either is equivalent. If the true distribution is not one-hot, H(P) > 0, and the KL divergence is the appropriate measure of the remaining gap (cross entropy then differs from it by the constant H(P)).
PyTorch CrossEntropyLoss()

```python
import torch
import torch.nn as nn

entropy = nn.CrossEntropyLoss()                       # combines log-softmax and NLL loss
input = torch.tensor([[-0.7715, -0.6205, -0.2562]])   # raw logits, shape (batch=1, classes=3)
target = torch.tensor([0])                            # index of the true class
output = entropy(input, target)
print(output)                                         # tensor(1.3447)
```
$$\mathrm{loss}(x, class) = -\log\frac{\exp(x[class])}{\sum_{j}\exp(x[j])} = -x[class] + \log\sum_{j}\exp(x[j])$$
Note that the logarithm here is base e (the natural logarithm).
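To confirm the formula (and the base-e logarithm), a quick sketch reproducing the loss by hand from the same logits:

```python
import torch

x = torch.tensor([-0.7715, -0.6205, -0.2562])  # logits from the example above
cls = 0                                        # true class index

manual = -x[cls] + torch.logsumexp(x, dim=0)   # -x[class] + log(sum_j exp(x[j]))
print(manual)                                  # tensor(1.3447), matches CrossEntropyLoss
```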
4. KL Divergence Between Normal Distributions
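For two univariate normal distributions, the KL divergence has a standard closed form:

$$D_{KL}\big(\mathcal{N}(\mu_1,\sigma_1^2)\,\|\,\mathcal{N}(\mu_2,\sigma_2^2)\big) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1-\mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$$

A minimal sketch implementing it (the function name and example parameters are ours):

```python
import numpy as np

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Closed-form D_KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ), in nats."""
    return (np.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
            - 0.5)

print(kl_normal(0.0, 1.0, 0.0, 1.0))  # identical distributions: 0.0
print(kl_normal(0.0, 1.0, 1.0, 2.0))  # ≈ 0.443
```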