
Entropic Uncertainty Principle


Following up on the previous article about $L^p$ spaces, I want to discuss an interesting topic: the Uncertainty Principle. When we talk about the “Uncertainty Principle”, we usually mean the Heisenberg Uncertainty Principle, which was deduced from quantum mechanical observations. But in this article we are going to derive it from the bottom up, from just a minimal set of assumptions/axioms.

Start from “information”

When we observe something, we gain information. So if we have a function, we can gain information from observing that function. Let’s say we encode information from an event that has probability distribution $\rho$, and that it depends on a parameter $x$, which is usually a state (discrete or continuous) in the sense of statistics or random variables.

We write it as $\rho(x)$.

If this function is measurable and behaves nicely enough (it can be discrete or continuous; it doesn’t matter as long as it is measurable), then it has a corresponding dual in the Fourier frequency domain. This is a property of the Fourier transform, which is just a mathematical fact. Let’s call the dual $\gamma(\xi)$.

But we have several ways to build this dual function. We can Fourier-transform $\rho(x)$ directly, or we can apply some other transform first and then take the Fourier transform. What I want to suggest here is: what if we use the Plancherel theorem, because it lets us further constrain the relation between the function and its dual. This is also discussed in my previous article about local time entropy.

So we say $\rho(x)$ comes from a square-integrable function in $L^p$ (meaning $p=2$): there exists a function $\psi(x)$ such that $\rho(x)=|\psi(x)|^2$.

We then perform the Fourier transform on $\psi(x)$ and get a dual function $\phi(\xi)$.

This way, we can apply the Plancherel theorem like this:

$$\begin{align*} \int |\psi(x)|^2 \, dx &= \int |\phi(\xi)|^2 \, d\xi \\ \int \rho(x) \, dx &= \int \gamma(\xi) \, d\xi = 1 \end{align*}$$

The last line is just the total probability, so it must equal 1. Since the Plancherel theorem applies, $\gamma(\xi)$ also integrates to 1, meaning it is also some sort of probability distribution.

Now here’s the crucial insight: because $\rho$ and $\gamma$ are both probability distributions, each has a Shannon entropy. So each carries information.
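To make this concrete, here is a minimal numerical sketch (the Gaussian test function, grid, and FFT convention are my own choices for illustration) showing that when $|\psi|^2$ integrates to 1, its Fourier dual $|\phi|^2$ does too:

```python
import numpy as np

# Sketch: Plancherel on a grid. If rho = |psi|^2 integrates to 1,
# then gamma = |phi|^2 integrates to 1 as well.
x = np.linspace(-10, 10, 4096)
dx = x[1] - x[0]
psi = 2**0.25 * np.exp(-np.pi * x**2)  # example function with |psi|^2 normalized

# Discrete approximation of the unitary Fourier transform
# (convention: exp(-2*pi*i*x*xi), which makes Plancherel constant-free)
phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi))) * dx
xi = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))
dxi = xi[1] - xi[0]

print(np.sum(np.abs(psi)**2) * dx)   # ~1.0
print(np.sum(np.abs(phi)**2) * dxi)  # ~1.0, by Plancherel
```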

Stating the inequalities from $L^p$ space

As we discussed in the previous article on $L^p$ space, it turns out that $p=2$ is the only choice that gives a proper “circle” concept, where distance and angle behave uniformly. In this space the triangle inequality is a little bit special: it becomes the tightest possible bound. You can’t get any better than that!

Loosely speaking, in 2D Cartesian coordinates (flat Euclidean geometry, which is where $L^2$ lives), any right triangle satisfies a natural inequality: the hypotenuse has to be shorter than the other two sides combined.

This simple idea underlies another theorem, the Hausdorff-Young inequality, which I will call HY from now on. It is closely tied to the Plancherel theorem, reducing to it at $p=2$.

The full general statement of HY is like this:

Suppose that $p$ (from the $L^p$ space) lies in the inclusive range $[1,2]$. There is a corresponding conjugate exponent $q$ with the relation:

$$\frac{1}{p}+\frac{1}{q} = 1$$

Then the inequality has this form in $n$ dimensions:

$$\Big(\int_{\mathbb{R}^n} \big|\phi(\xi)\big|^q \, d\xi\Big)^{1/q} \leq \Big(\int_{\mathbb{R}^n} \big|\psi(x)\big|^p \, dx\Big)^{1/p}$$

However, in our case we already know that $p=2$. Let’s simplify the notation, hiding the argument and the integral by using norms:

$$\begin{align*} \|\psi\|_p &= \Big(\int_{\mathbb{R}^n} \big|\psi(x)\big|^p \, dx\Big)^{1/p} \\ \|\phi\|_q &= \Big(\int_{\mathbb{R}^n} \big|\phi(\xi)\big|^q \, d\xi\Big)^{1/q} \end{align*}$$

In this notation, using the functions we already defined above:

$$\|\phi\|_q \leq \|\psi\|_p$$

You might be wondering: since we can choose an arbitrary function for $\psi(x)$, shouldn’t the inequality also hold with the two sides flipped? So how come this is correct?

The reason we can swap the roles of the two functions (the Fourier pair is symmetric, after all) is that the inequality always puts the larger exponent $q$ on the transformed side, and the Fourier transform’s normalization ensures that side is the smaller one. In the case of $p=2$ it is optimal, and the inequality becomes an equality due to the Plancherel theorem.

The important insight from this inequality is that it is not possible to make both functions as concentrated as possible. If you make one function narrow and dense, the inequality forces its Fourier dual to be wider.
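Here is a small numerical sketch of that trade-off (the test function and grid are arbitrary choices of mine), evaluating both sides of HY for a few values of $p$:

```python
import numpy as np

# Sketch: check ||phi||_q <= ||psi||_p for p in [1, 2] with 1/p + 1/q = 1.
x = np.linspace(-20, 20, 8192)
dx = x[1] - x[0]
psi = np.exp(-np.pi * x**2)  # example function (this Gaussian is its own transform)

phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi))) * dx
xi = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))
dxi = xi[1] - xi[0]

def lp_norm(f, p, d):
    return (np.sum(np.abs(f)**p) * d) ** (1.0 / p)

for p in [1.25, 1.5, 1.75, 2.0]:
    q = p / (p - 1)
    print(f"p={p}: ||phi||_q = {lp_norm(phi, q, dxi):.4f}"
          f" <= ||psi||_p = {lp_norm(psi, p, dx):.4f}")
# At p = 2 the two sides coincide, as Plancherel promises.
```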

In addition, Beckner found an even tighter bound. We can introduce a constant:

$$A(p)=\left( \frac{p^{\frac{1}{p}}}{q^{\frac{1}{q}}} \right)^\frac{1}{2}$$

So that the inequality becomes:

$$\|\phi\|_q \leq A(p)^n \|\psi\|_p$$

In our case we could immediately set $p=2$ to apply the Plancherel theorem, but we will defer that until later. We are going to find the entropy first.

Linking inequalities to entropy

We restrict our class of functions $\psi$ to those with a corresponding probability distribution $\rho$, living in $L^2$. We can calculate its entropy. We will use a general notion of entropy from information theory called the Rényi entropy.

You will see why this is a natural choice, just from the form of it.

The Rényi entropy $R_\alpha(P)$ of a probability distribution $P$ for a random variable $X$ is defined as:

$$R_\alpha(P)=\frac{1}{1-\alpha}\ln\left(\|P\|_\alpha^\alpha\right)$$

In the formula above, $\|P\|_\alpha$ is the $L^\alpha$ norm of the density $P(x)$ taken over every state $x$ of the random variable $X$. As a reminder:

$$\begin{align*} \|P\|_\alpha &= \Big( \int P(x)^\alpha \, dx \Big)^\frac{1}{\alpha} \\ \|P\|_\alpha^\alpha &= \int P(x)^\alpha \, dx \end{align*}$$

A very natural link to the $L^p$-space HY inequality we used before.
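As a sketch, here is a small helper (the name `renyi_entropy` and the grid are my own) that computes the Rényi entropy of a sampled density, falling back to the Shannon entropy in the $\alpha \to 1$ limit:

```python
import numpy as np

# Sketch: Renyi entropy of a density p sampled on a grid with spacing d.
# As alpha -> 1 it approaches the Shannon (differential) entropy.
def renyi_entropy(p, d, alpha):
    p = np.asarray(p)
    if np.isclose(alpha, 1.0):       # Shannon limit
        m = p > 1e-300               # skip points where the density vanishes
        return -np.sum(p[m] * np.log(p[m])) * d
    return np.log(np.sum(p**alpha) * d) / (1.0 - alpha)

x = np.linspace(-10, 10, 4096)
dx = x[1] - x[0]
rho = np.sqrt(2) * np.exp(-2 * np.pi * x**2)  # the Gaussian density used later
for a in [0.5, 0.9, 0.99, 1.0]:
    print(a, renyi_entropy(rho, dx, a))       # -> 0.5*ln(e/2) ~ 0.1534 as a -> 1
```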

A little bit of algebra

Because $|\psi(x)|^2 = \rho(x)$ and the Rényi entropy uses $P(x)^\alpha$, we substitute $p=2\alpha$, so that:

$$\begin{align*} |\psi(x)|^{2\alpha} &= \rho(x)^\alpha \\ \|\psi\|_{2\alpha}^{2\alpha} &= \int \rho(x)^\alpha \, dx \\ R_\alpha(\rho) &= \frac{1}{1-\alpha} \ln\left(\|\rho\|_\alpha^\alpha\right) = \frac{1}{1-\alpha} \ln\left(\|\psi\|_{2\alpha}^{2\alpha}\right) \end{align*}$$

We do the same thing for $\phi(\xi)$, choosing $q=2\beta$, so that we have the following Rényi entropy $R_\beta(\gamma)$:

$$R_\beta(\gamma) = \frac{1}{1-\beta} \ln\left(\|\gamma\|_\beta^\beta\right) = \frac{1}{1-\beta} \ln\left(\|\phi\|_{2\beta}^{2\beta}\right)$$

We have a symmetric relationship between $\psi$ and its Fourier dual $\phi$, so we can apply the HY inequality in either direction. Let’s start with this:

$$\|\phi\|_q \leq A(p)^n \|\psi\|_p$$

We are going to massage this inequality into a statement about entropy. Focus first on turning $\|\phi\|_q$ into $R_\beta(\gamma)$.

Substitute $p=2\alpha$ and $q=2\beta$:

$$\|\phi\|_{2\beta} \leq A(2\alpha)^n \|\psi\|_{2\alpha}$$

Raise both sides to the power of $2\beta$:

$$\|\phi\|_{2\beta}^{2\beta} \leq A(2\alpha)^{2\beta n} \|\psi\|_{2\alpha}^{2\beta}$$

Take the logarithm:

$$\ln(\|\phi\|_{2\beta}^{2\beta}) \leq 2\beta \left(n \ln(A(2\alpha)) + \ln(\|\psi\|_{2\alpha})\right)$$

Because $\ln(\|\phi\|_{2\beta}^{2\beta}) = (1-\beta) R_\beta(\gamma)$:

$$(1-\beta) R_\beta(\gamma) \leq 2\beta \left(n \ln(A(2\alpha)) + \ln(\|\psi\|_{2\alpha})\right)$$

Now we want to rewrite $\ln(\|\psi\|_{2\alpha})$ in terms of $R_\alpha(\rho)$.

Use $\frac{1-\alpha}{2\alpha} R_\alpha(\rho)=\ln(\|\psi\|_{2\alpha})$:

$$\begin{align*} (1-\beta) R_\beta(\gamma) &\leq 2\beta n \ln(A(2\alpha)) + \frac{\beta}{\alpha}(1-\alpha) R_\alpha(\rho) \\ \frac{1-\beta}{\beta} R_\beta(\gamma) &\leq 2 n \ln(A(2\alpha)) + \frac{1-\alpha}{\alpha} R_\alpha(\rho) \end{align*}$$

We have the relation $\frac{1}{p}+\frac{1}{q} = 1$,

meaning $\frac{1}{\alpha}+\frac{1}{\beta} = 2$.

So that:

$$\begin{align*} \frac{1}{\alpha}+\frac{1}{\beta} &= 2 \\ \frac{1}{\alpha} - 1 + \frac{1}{\beta} - 1 &= 0 \\ \frac{1-\alpha}{\alpha} + \frac{1-\beta}{\beta} &= 0 \end{align*}$$

Using this relation, we add $\frac{1-\alpha}{\alpha} R_\beta(\gamma)$ to both sides of the inequality:

$$\begin{align*} \frac{1-\beta}{\beta} R_\beta(\gamma) + \frac{1-\alpha}{\alpha} R_\beta(\gamma) &\leq 2 n \ln(A(2\alpha)) + \frac{1-\alpha}{\alpha} R_\alpha(\rho) + \frac{1-\alpha}{\alpha} R_\beta(\gamma) \\ R_\beta(\gamma)\left(\frac{1-\alpha}{\alpha} + \frac{1-\beta}{\beta}\right) &\leq 2 n \ln(A(2\alpha)) + \frac{1-\alpha}{\alpha}\left(R_\alpha(\rho) + R_\beta(\gamma)\right) \\ 0 &\leq 2 n \ln(A(2\alpha)) + \frac{1-\alpha}{\alpha}\left(R_\alpha(\rho) + R_\beta(\gamma)\right) \\ \frac{1-\alpha}{\alpha}\left(R_\alpha(\rho) + R_\beta(\gamma)\right) &\ge -2 n \ln(A(2\alpha)) \end{align*}$$

To move the factor $\frac{1-\alpha}{\alpha}$ to the right-hand side, remember that $p$ is in the range $[1,2]$, so $\alpha \in [\frac{1}{2},1]$ and $1-\alpha$ is never negative. Multiplying both sides by $\frac{\alpha}{1-\alpha}$ (for $\alpha<1$), we get:

$$R_\alpha(\rho) + R_\beta(\gamma) \ge n \cdot \frac{2\alpha}{\alpha-1}\ln(A(2\alpha))$$

On the right-hand side we have $n$, the number of dimensions of the space.

We also have $\frac{2\alpha}{\alpha-1}\ln(A(2\alpha))$, which is a constant that depends on our choice of $\alpha$.

Because we established that $p=2$, i.e. $\alpha=1$, we can’t evaluate the right-hand side immediately: it involves division by $\alpha-1=0$.
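Before taking that limit, we can sanity-check the Rényi form of the bound numerically. The sketch below (grid, test function, and the choice $\alpha=0.75$ are all my own assumptions) evaluates both sides for a Gaussian, which should sit right at the bound:

```python
import numpy as np

def renyi_entropy(p, d, alpha):      # same helper as sketched earlier
    return np.log(np.sum(p**alpha) * d) / (1.0 - alpha)

x = np.linspace(-20, 20, 8192)
dx = x[1] - x[0]
psi = 2**0.25 * np.exp(-np.pi * x**2)
phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi))) * dx
xi = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))
dxi = xi[1] - xi[0]
rho, gamma = np.abs(psi)**2, np.abs(phi)**2

alpha = 0.75
beta = alpha / (2 * alpha - 1)             # conjugate: 1/alpha + 1/beta = 2
p, q = 2 * alpha, 2 * beta
A = (p**(1 / p) / q**(1 / q)) ** 0.5       # Beckner's constant
bound = 2 * alpha / (alpha - 1) * np.log(A)

lhs = renyi_entropy(rho, dx, alpha) + renyi_entropy(gamma, dxi, beta)
print(lhs, ">=", bound)  # the Gaussian saturates the bound (up to grid error)
```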

Taking the limit into Shannon Entropy

The constant in the bound, $C(\alpha)= \frac{2\alpha}{\alpha-1}\ln(A(2\alpha))$, has to be calculated by taking the limit $\alpha \to 1$. Conveniently, as $\alpha\to 1$, both Rényi entropies $R_\alpha$ and $R_\beta$ become Shannon entropies.

$$\begin{align*} \lim_{\alpha\to 1} C(\alpha) &= \lim_{\alpha\to 1}\frac{2\alpha}{\alpha-1}\ln(A(2\alpha)) \\ &= \lim_{\alpha\to 1} \frac{2\alpha \ln(A(2\alpha))}{\alpha-1} \end{align*}$$

We already know from matching the HY inequality to the Plancherel theorem that $\lim_{\alpha\to 1}A(2\alpha)=1$. This puts the limit in the indeterminate form $\frac{0}{0}$, so let’s apply L’Hôpital’s rule.

But before that, let’s expand the expression by substituting $A(2\alpha) = \frac{(2\alpha)^{\frac{1}{4\alpha}}}{(2\beta)^{\frac{1}{4\beta}}}$:

$$\begin{align*} C(\alpha) &= \frac{2\alpha}{\alpha-1}\ln(A(2\alpha)) \\ &= \frac{2\alpha}{\alpha-1}\ln\Big(\frac{(2\alpha)^{\frac{1}{4\alpha}}}{(2\beta)^{\frac{1}{4\beta}}}\Big) \\ &= \frac{1}{2}\cdot\frac{\ln(2\alpha)}{\alpha-1} - \frac{1}{2}\cdot\frac{\alpha}{\beta}\cdot\frac{\ln(2\beta)}{\alpha-1} \\ &= \frac{1}{2}\cdot\frac{\ln(2\alpha)}{\alpha-1} + \frac{1}{2}\cdot\frac{\ln(2\beta)}{\beta-1} \\ &= \frac{1}{2}\left[ \frac{\ln(\alpha)}{\alpha-1} + \frac{\ln(\beta)}{\beta-1} \right] + \frac{1}{2}\ln(2) \left[\frac{1}{\alpha-1}+\frac{1}{\beta-1}\right] \\ &= \frac{1}{2}\left[ \frac{\ln(\alpha)}{\alpha-1} + \frac{\ln(\beta)}{\beta-1} \right] + \frac{1}{2}\ln(2) \left[\frac{1}{\alpha-1}+\frac{1}{\beta-1} -\frac{\alpha}{\alpha-1}-\frac{\beta}{\beta-1}\right] \\ &= \frac{1}{2}\left[ \frac{\ln(\alpha)}{\alpha-1} + \frac{\ln(\beta)}{\beta-1} \right] + \frac{1}{2}\ln(2)\left[-2\right] \\ &= \frac{1}{2}\left[ \frac{\ln(\alpha)}{\alpha-1} + \frac{\ln(\beta)}{\beta-1} \right] - \ln(2) \end{align*}$$

We then apply L’Hôpital’s rule to the two terms that still contain variables, since each has the indeterminate form $\frac{0}{0}$. (Note that $\beta\to 1$ as $\alpha\to 1$, because $\frac{1}{\alpha}+\frac{1}{\beta}=2$.)

$$\begin{align*} \lim_{\alpha\to 1} C(\alpha) &= \lim_{\alpha\to 1,\, \beta\to 1} \frac{1}{2}\left[ \frac{\ln(\alpha)}{\alpha-1} + \frac{\ln(\beta)}{\beta-1} \right] - \ln(2) \\ &= \frac{1}{2}\left[\lim_{\alpha\to 1} \frac{\ln(\alpha)}{\alpha-1} + \lim_{\beta\to 1} \frac{\ln(\beta)}{\beta-1} \right] - \ln(2) \\ &= \frac{1}{2}\left[ \lim_{\alpha\to 1} \frac{\frac{1}{\alpha}}{1} + \lim_{\beta\to 1} \frac{\frac{1}{\beta}}{1} \right] - \ln(2) \\ &= 1 - \ln(2) = \ln\Big(\frac{e}{2}\Big) \end{align*}$$
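A quick numerical check of this limit (a throwaway sketch; the sample values of $\alpha$ are arbitrary):

```python
import numpy as np

# Sketch: C(alpha) should approach ln(e/2) ~ 0.30685 as alpha -> 1.
def C(alpha):
    beta = alpha / (2 * alpha - 1)        # conjugate from 1/alpha + 1/beta = 2
    p, q = 2 * alpha, 2 * beta
    A = (p**(1 / p) / q**(1 / q)) ** 0.5  # Beckner's constant
    return 2 * alpha / (alpha - 1) * np.log(A)

for a in [0.9, 0.99, 0.999]:
    print(a, C(a))
print("ln(e/2) =", 1 - np.log(2))
```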

The final form of the Entropic Uncertainty Principle

Combining all that, we substitute all the results we have so far:

$$\begin{align*} \lim_{\alpha\to 1,\, \beta\to 1} R_\alpha(\rho) + R_\beta(\gamma) &\ge \lim_{\alpha\to 1} n \cdot \frac{2\alpha}{\alpha-1}\ln(A(2\alpha)) \\ H(\rho) + H(\gamma) &\ge n \ln\Big(\frac{e}{2}\Big) \end{align*}$$

This is our re-derivation of the Entropic Uncertainty Principle.

This is a profound connection. Any function with a corresponding probability distribution in $L^2$ carries information that is subject to a fundamental entropic uncertainty limit.

From mathematics alone, we know that no function can have its information pinned down in both its original domain and its Fourier dual domain. You can make it certain in one domain, but not in both at the same time. There is a mathematical limit on how certain you can be if you want to know both, and the total entropy is at least $\ln\frac{e}{2}$ per dimension.

Testing the Entropic Uncertainty Principle

In a previous article, we derived that the normal/Gaussian distribution is the distribution requiring the least possible amount of information to guess. We figured that out using the action principle and a Lagrangian method to derive the normal distribution.

The (suitably normalized) Gaussian is also its own Fourier transform.

So a normal distribution is a good candidate for testing the Uncertainty Principle.

Suppose our function is a Gaussian, with $N$ as the normalization constant:

$$\psi(x) = N \exp(-\pi x^2)$$

The corresponding Fourier dual has exactly the same expression:

$$\phi(\xi) = N \exp(-\pi\xi^2)$$

We have the distribution

$$\rho(x)=|\psi(x)|^2 = N^2 \exp(-\pi x^2)^2 = N^2 \exp(-2\pi x^2)$$

Because the probability density has to integrate to 1, we need $N=2^{\frac{1}{4}}$.

And

$$\gamma(\xi)=|\phi(\xi)|^2 = N^2 \exp(-\pi \xi^2)^2 = N^2 \exp(-2\pi\xi^2)$$

The standard formula for a normal distribution is:

$$p(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{1}{2} \left(\frac{x-\mu}{\sigma}\right)^2\right)$$

By matching parameters, we find $\mu_\rho=0$ and $\sigma_\rho=\frac{1}{2\sqrt{\pi}}$.

The corresponding standard deviation for $\gamma$ is the same: $\sigma_\gamma=\frac{1}{2\sqrt{\pi}}$.

Then, the Shannon entropy of a normal distribution is given by:

$$H(p(x))=\frac{1}{2} \ln(2 \pi e \sigma^2)$$

Inserting the value of $\sigma$, we get:

$$H(\rho)=H(\gamma)=\frac{1}{2} \ln\Big(2 \pi e \cdot \frac{1}{4\pi}\Big)=\frac{1}{2} \ln\Big(\frac{e}{2}\Big)$$

Substituting into the uncertainty principle (with $n=1$), we hit the smallest possible bound exactly:

$$\begin{align*} H(\rho)+H(\gamma) &\ge n \ln\Big(\frac{e}{2}\Big) \\ \frac{1}{2} \ln\Big(\frac{e}{2}\Big) + \frac{1}{2} \ln\Big(\frac{e}{2}\Big) &\ge n \ln\Big(\frac{e}{2}\Big) \\ \ln\Big(\frac{e}{2}\Big) &\ge \ln\Big(\frac{e}{2}\Big) \end{align*}$$
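We can also check this on a grid. The sketch below (test functions, grid, and helper names are my own assumptions) computes $H(\rho)+H(\gamma)$ for the Gaussian, which should land exactly on the bound, and for a non-Gaussian “two bump” function, which should land strictly above it:

```python
import numpy as np

def shannon(p, d):                 # differential entropy of a sampled density
    m = p > 1e-300                 # skip points where the density vanishes
    return -np.sum(p[m] * np.log(p[m])) * d

x = np.linspace(-20, 20, 16384)
dx = x[1] - x[0]
xi = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))
dxi = xi[1] - xi[0]
bound = np.log(np.e / 2)           # ~0.30685 for n = 1

gaussian = 2**0.25 * np.exp(-np.pi * x**2)   # should sit exactly at the bound
bumps = np.exp(-np.pi * (x - 2)**2) + np.exp(-np.pi * (x + 2)**2)
bumps /= np.sqrt(np.sum(bumps**2) * dx)      # normalize so |psi|^2 integrates to 1

for name, psi in [("gaussian", gaussian), ("two bumps", bumps)]:
    phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi))) * dx
    total = shannon(np.abs(psi)**2, dx) + shannon(np.abs(phi)**2, dxi)
    print(f"{name}: H(rho) + H(gamma) = {total:.5f} >= {bound:.5f}")
```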

One remarkable thing about this Entropic Uncertainty Principle is that you can say, for any Gaussian/normal probability distribution, the total entropy of the function and its Fourier dual always has to be exactly $\ln(\frac{e}{2})$.

By the maximum entropy principle, the Gaussian/normal distribution is the best distribution you can guess with very limited information: it has the highest entropy possible for a given variance. But by the Entropic Uncertainty Principle, it also attains the smallest possible total entropy bound. Combining both principles, the inequality has to become an equality for every Gaussian.

It’s easy to verify this mathematically too.

Let’s say our probability distribution is a normal distribution of the form (here $\alpha$ is just the Gaussian width parameter, not the Rényi order):

$$\rho(x)=N^2 \exp(-2\alpha x^2)$$

Such that

$$\psi(x)=N \exp(-\alpha x^2)$$

This gives:

$$\phi(\xi)= N \sqrt{\frac{\pi}{\alpha}} \exp\Big(-\frac{\pi^2}{\alpha} \xi^2\Big)$$

And

$$\gamma(\xi) = N^2 \frac{\pi}{\alpha}\exp\Big(-\frac{2\pi^2}{\alpha} \xi^2\Big)$$

The variances are:

$$\sigma_x^2 =\frac{1}{4\alpha}, \qquad \sigma_\xi^2 = \frac{\alpha}{4\pi^2}$$

So whatever the original function is, if it is a Gaussian, then the product of the variances is:

$$\sigma_x^2 \sigma_\xi^2 = \frac{1}{4\alpha} \cdot \frac{\alpha}{4\pi^2} = \frac{1}{16\pi^2}$$

The $\alpha$ parameter, which controls how concentrated the Gaussian is, simply cancels out.

We can then use this product to calculate the entropy:

$$\begin{align*} H(\rho)+H(\gamma) &= \frac{1}{2} \ln(2 \pi e \sigma_x^2) + \frac{1}{2} \ln(2 \pi e \sigma_\xi^2) \\ &= \frac{1}{2} \ln(4 \pi^2 e^2 \sigma_x^2 \sigma_\xi^2) \\ &= \frac{1}{2} \ln\Big(4 \pi^2 e^2 \cdot \frac{1}{16\pi^2}\Big) \\ &= \frac{1}{2} \ln\Big(\frac{e^2}{4}\Big) \\ &= \ln\Big(\frac{e}{2}\Big) \end{align*}$$

So whatever the original parameters $N$ and $\alpha$ are, the total entropy is always $\ln(\frac{e}{2})$ for a one-dimensional Gaussian/normal distribution.
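The same cancellation can be checked symbolically. A short SymPy sketch (the variable names are mine):

```python
import sympy as sp

# Sketch: for any Gaussian width alpha > 0, H(rho) + H(gamma) = ln(e/2).
a = sp.symbols('alpha', positive=True)
sigma_x2 = 1 / (4 * a)            # variance of rho(x)    ~ exp(-2*alpha*x^2)
sigma_xi2 = a / (4 * sp.pi**2)    # variance of gamma(xi) ~ exp(-2*pi^2*xi^2/alpha)

H_rho = sp.log(2 * sp.pi * sp.E * sigma_x2) / 2
H_gamma = sp.log(2 * sp.pi * sp.E * sigma_xi2) / 2

total = sp.logcombine(H_rho + H_gamma, force=True)
print(sp.simplify(total))                              # log(E/2): alpha drops out
print([sp.N(total.subs(a, v)) for v in (0.5, 1, 10)])  # all ~0.30685
```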

In essence, this is the lowest total entropy we can get out of a function and its Fourier dual. You can’t get any lower than this.
