Entropy

The entropy of a discrete distribution \(p\) (probability mass function) is defined as \[ H(p) = -\mathrm{E}_{x \sim p}\log p(x) \] The entropy reaches its maximum when \(p\) is the uniform distribution; if the support is finite with \(k\) distinct values, the maximum value is \(\log k\). This can be derived with Jensen's inequality, and it matches the intuition that entropy measures the level of disorder of a distribution. From an analysis point of view, the entropy is a continuous function of the \(k\)-tuple \((p_1, \dots, p_k)\) on the probability simplex, which is compact, so a maximum must exist; and whenever two of \(p_1, \dots, p_k\) are unequal, replacing both with their average strictly increases the entropy, so the maximum can only be attained at the uniform distribution.
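As a quick numerical sanity check, the following sketch computes \(H(p)\) (in nats, matching the natural logarithm used above) for a uniform and a skewed distribution over \(k = 4\) outcomes; the helper function and example values are only illustrative.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i) in nats, skipping zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

k = 4
uniform = np.full(k, 1.0 / k)
skewed = np.array([0.7, 0.1, 0.1, 0.1])

print(entropy(uniform), np.log(k))  # both ~1.386: the uniform distribution attains log k
print(entropy(skewed))              # ~0.940, strictly below log k
```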

The entropy of a continuous distribution \(q\) (probability density function) is similarly defined as \[ H(q) = -\mathrm{E}_{x\sim q}\log q(x) = -\int q(x)\log q(x)\d x \] This is the differential entropy introduced by Shannon. It is not, however, a faithful continuous analog of discrete entropy: it was written down by analogy rather than derived as a limit of the discrete case, and, for example, it can be negative. So when talking about entropy it is safer to keep the random variable discrete, despite the wide usage of differential entropy.
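A brief illustration of the negativity, assuming scipy is available (its `entropy` methods return differential entropy in nats): a Gaussian with small enough variance, or a uniform distribution on an interval shorter than 1, already has negative differential entropy.

```python
import numpy as np
from scipy.stats import norm, uniform

# Differential entropy of N(0, sigma^2) is 0.5 * log(2*pi*e*sigma^2),
# which turns negative once sigma < 1/sqrt(2*pi*e) ≈ 0.242.
sigma = 0.1
print(norm(scale=sigma).entropy())                # ≈ -0.884
print(0.5 * np.log(2 * np.pi * np.e * sigma**2))  # same value, from the closed form

# Uniform on [0, 1/2] has differential entropy log(1/2) < 0.
print(uniform(loc=0.0, scale=0.5).entropy())      # ≈ -0.693
```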

Gaussian Case

The entropy of an \(n\)-dimensional Gaussian distribution \(p(x) = \frac{e^{-\frac 1 2 (x-\mu)^T \Sigma^{-1}(x-\mu)}}{\sqrt{|2\pi\Sigma|}}\) can be derived as follows: \[ \begin{aligned} H(p) &\triangleq -\int p(x) \log p(x) \d x = -\int p(x) [-\frac 1 2 (x-\mu)^T \Sigma^{-1} (x-\mu) - \frac 1 2 \log |2\pi\Sigma|] \d x \\ &= \frac 1 2 \int p(x) (x-\mu)^T \Sigma^{-1}(x-\mu) \d x + \frac 1 2 \log |2\pi\Sigma| \\ &= \frac 1 2 \int p(x) x^T \Sigma^{-1} x \d x + \frac 1 2 \int p(x) \mu^T \Sigma^{-1} \mu \d x \\ &\quad - \frac 1 2 \int p(x) \mu^T \Sigma^{-1} x \d x - \frac 1 2 \int p(x) x^T \Sigma^{-1} \mu \d x \\ &\quad + \frac 1 2 \log |2\pi\Sigma| \\ &= [\frac 1 2 \tr(\Sigma^{-1} \Sigma) + \frac 1 2 \mu^T \Sigma^{-1} \mu] + \frac 1 2 \mu^T \Sigma^{-1} \mu \\ &\quad - \frac 1 2 \mu^T \Sigma^{-1} \mu - \frac 1 2 \mu^T \Sigma^{-1} \mu \\ &\quad + \frac 1 2 \log |2\pi\Sigma| \\ &= \frac 1 2 n + \frac 1 2 \log |2\pi\Sigma| \end{aligned} \]
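Since \(|2\pi\Sigma| = (2\pi)^n |\Sigma|\), the result can also be written as \(\frac n 2 (1 + \log 2\pi) + \frac 1 2 \log|\Sigma|\). As a sanity check (not part of the derivation), the sketch below compares this closed form with scipy's built-in entropy for a multivariate normal and with a Monte Carlo estimate of \(-\mathrm{E}_{x\sim p}\log p(x)\); the random covariance construction is only for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)   # a random positive-definite covariance matrix
mu = rng.normal(size=n)

# Closed form: H = n/2 + (1/2) * log|2*pi*Sigma|.
_, logdet = np.linalg.slogdet(2 * np.pi * Sigma)
closed_form = 0.5 * n + 0.5 * logdet

dist = multivariate_normal(mean=mu, cov=Sigma)
samples = dist.rvs(size=100_000, random_state=0)
monte_carlo = -dist.logpdf(samples).mean()        # estimate of -E[log p(x)]

print(closed_form, dist.entropy(), monte_carlo)   # all three should roughly agree
```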
