Relative entropy (KL divergence)
Measures how one probability distribution diverges from a second, reference probability distribution.
For discrete probability distributions P and Q defined on the same sample space $\mathcal{X}$:

$D_{\mathrm{KL}}(P \| Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}$
For continuous random variables with probability densities p and q:

$D_{\mathrm{KL}}(P \| Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx$
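As a small worked example (the two distributions here are made up for illustration): for $P = (\tfrac{1}{2}, \tfrac{1}{2})$ and $Q = (\tfrac{3}{4}, \tfrac{1}{4})$ on two outcomes,

$D_{\mathrm{KL}}(P \| Q) = \tfrac{1}{2}\log\tfrac{1/2}{3/4} + \tfrac{1}{2}\log\tfrac{1/2}{1/4} = \log 2 - \tfrac{1}{2}\log 3 \approx 0.144$ nats (using the natural logarithm).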
Despite the name, KL divergence is not a "distance" in the usual sense. A conventional distance (metric) has the following properties:
1. Non-negativity: a distance is a magnitude, so it is never negative.
2. Symmetry: the distance from A to B equals the distance from B to A.
3. Triangle inequality: the sum of any two sides is at least as large as the third side.
KL divergence satisfies only the first property (non-negativity); it is neither symmetric nor does it satisfy the triangle inequality.
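A quick numerical check of the asymmetry, reusing the made-up distributions from the worked example above (scipy.stats.entropy, described below, computes the KL divergence when given two distributions):

```python
from scipy.stats import entropy

p = [0.5, 0.5]
q = [0.75, 0.25]

# Both directions are non-negative, but they are not equal.
print(entropy(p, q))  # D_KL(P || Q) ~ 0.1438
print(entropy(q, p))  # D_KL(Q || P) ~ 0.1308
```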
Two common ways to compute it in Python (a usage sketch follows this list):

1. A manual NumPy implementation. Note that KL divergence (and any other such measure) expects the input distributions to sum to 1:

```python
import numpy as np

def KL(a, b):
    a = np.array(a, dtype=float)
    b = np.array(b, dtype=float)
    # treat 0 * log(0/q) as 0 by summing only over entries where a != 0
    return np.sum(np.where(a != 0, a * np.log(a / b), 0))

# If b can contain zeros, a tiny constant can be added to the denominator,
# e.g. np.log(a / (b + np.spacing(1))); np.spacing(1) is the machine
# epsilon (~2.2e-16), not infinity.
```
2. scipy.stats.entropy(pk, qk=None, base=None). When qk is not None, it computes the KL divergence D_KL(pk || qk), and it automatically normalizes both pk and qk to sum to 1.
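A minimal usage sketch for both approaches, assuming the KL function from item 1 is already defined and reusing the same made-up distributions as above:

```python
from scipy.stats import entropy

p = [0.5, 0.5]
q = [0.75, 0.25]

print(KL(p, q))       # manual NumPy version, ~0.1438
print(entropy(p, q))  # scipy gives the same result, ~0.1438

# scipy normalizes its inputs, so raw counts work as well
print(entropy([5, 5], [75, 25]))  # still ~0.1438
```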
Applications:
Text similarity: count the word frequencies of each document, then compute the KL divergence between the resulting frequency distributions (see the sketch below).
User profiling.
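A minimal sketch of the text-similarity idea. The two example sentences, the whitespace tokenization, and the add-one smoothing (used so that no Q(x) is zero) are assumptions for illustration:

```python
from collections import Counter
from scipy.stats import entropy

def word_counts(text, vocab):
    # word-frequency counts over a shared vocabulary, with add-one smoothing
    counts = Counter(text.lower().split())
    return [counts[w] + 1 for w in vocab]

doc_a = "the cat sat on the mat"
doc_b = "the dog sat on the log"
vocab = sorted(set(doc_a.split()) | set(doc_b.split()))

p = word_counts(doc_a, vocab)
q = word_counts(doc_b, vocab)

# scipy normalizes the counts to distributions before computing D_KL(P || Q);
# a smaller value means the two word distributions are more similar
print(entropy(p, q))
```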
References:
https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
http://www.cnblogs.com/charlotte77/p/5392052.html