Information-based approaches to probabilistic approximation.

Normal approximation

Unlike most classical proofs of the Central Limit Theorem (CLT), the information-theoretic approach comes with a very appealing intuition. Consider the sequence of standardized sums S_n of IID continuous random variables X_i, where the standardization (centering, followed by division by the square root of the number of summands) is precisely the one that preserves the variance. Then the entropy of S_n increases monotonically in n and converges to the maximum entropy achievable under the variance constraint, which is the entropy of the Gaussian. In addition to giving the CLT the flavour of a "second law of thermodynamics" and motivating the appearance of the Gaussian as the limiting distribution, this approach yields convergence in the sense of relative entropy, which is stronger than classical weak convergence.
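
In symbols (the notation here is mine; precise regularity conditions are in the cited papers): for IID X_i with mean \mu and variance \sigma^2, the standardized sums and the relevant entropy quantities are

\[
S_n = \frac{1}{\sqrt{n\sigma^2}} \sum_{i=1}^n (X_i - \mu), \qquad
h(S_n) \le h(Z) = \tfrac{1}{2}\log(2\pi e), \qquad
D(S_n \,\|\, Z) = h(Z) - h(S_n),
\]

where Z is a standard Gaussian and h denotes differential entropy. Monotonicity is the statement that h(S_n) is non-decreasing in n, and the entropic CLT is the statement that D(S_n \| Z) tends to 0 whenever it is finite for some n.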

The convergence in the entropic CLT was proved by Barron [Bar86], building on work of Linnik and Brown. Artstein, Ball, Barthe and Naor [ABBN04] obtained the monotonicity of entropy as a consequence of a new "entropy power inequality", proved using a functional analytic technique. In joint work with Andrew Barron [MB07], we gave a simpler proof of a more general inequality, elucidating along the way interesting connections to statistics and information theory.
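
For orientation, and as I recall the statements (see [ABBN04, MB07] for the precise formulations), the inequalities involved can be written in terms of the entropy power N(X) = e^{2h(X)}/(2\pi e). For independent random variables X_1, \ldots, X_n, the inequality of [ABBN04] is

\[
N\Big(\sum_{i=1}^n X_i\Big) \;\ge\; \frac{1}{n-1} \sum_{j=1}^n N\Big(\sum_{i \ne j} X_i\Big),
\]

which in the IID case is equivalent to the monotonicity of h(S_n). The generalization in [MB07] replaces the leave-one-out sums by sums over an arbitrary collection of subsets of the indices: if no index belongs to more than r of the subsets, the factor 1/(n-1) becomes 1/r.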

Poisson approximation

A natural question is whether a similarly compelling picture obtains for other limit laws, such as the "law of small numbers" (the Poisson limit law). Interestingly, however, although the mean is the appropriate (approximate) invariant for sums that converge to a Poisson distribution, the distribution on the non-negative integers that maximizes entropy for a fixed mean is not the Poisson but the geometric distribution. It turns out that the Poisson distribution does have a maximum entropy property under the additional constraint of ultra-log-concavity (log-concavity relative to the Poisson), as shown by Johnson [Joh07] (see also Harremoes [Har01]). Furthermore, Poisson approximation bounds of optimal order can be proved for sums by developing the properties of a discrete analogue of Fisher information, sometimes called the "scaled Fisher information", as shown by Kontoyiannis, Harremoes and Johnson [KHJ05]. In joint work with Ioannis Kontoyiannis and Oliver Johnson [MJK07], we added to this developing picture by demonstrating connections between the scaled Fisher information and minimum mean-square estimation in the "Poisson channel" (a model for certain communication systems); this was further developed in the more general compound Poisson setting described below.
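
For the reader's convenience, here is the functional in question, in one common normalization (see [KHJ05] for the precise statements and constants). For a random variable X on the non-negative integers with mean \lambda and probability mass function P, the scaled score and the scaled Fisher information are

\[
\rho_X(x) = \frac{(x+1)\,P(x+1)}{\lambda\, P(x)} - 1, \qquad
K(X) = \lambda\, E\big[\rho_X(X)^2\big].
\]

K(X) vanishes if and only if X is Poisson(\lambda), and it controls the relative entropy D(P_X \| \mathrm{Po}(\lambda)) up to constants (and hence, via Pinsker's inequality, the total variation distance), with the right order in the number of summands for sums of independent variables.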

Compound Poisson approximation

In one part of my Ph.D. dissertation, I explored the question of extending the above picture to a much more general situation: approximating the distribution of a sum of random variables taking non-negative integer values by a compound Poisson distribution. As in the simple Poisson case, it is easy to obtain relative entropy approximation bounds whose order in the number of summands is suboptimal, but much harder to obtain bounds of the right order. For the special case of independent summands, we proposed various notions of "local information functionals" that can play the role that Fisher information plays in normal approximation, and used them to obtain approximation bounds in total variation and relative entropy. Those results have seen significant further development in collaboration with Andrew Barbour, Oliver Johnson, and Ioannis Kontoyiannis [BJKM10]. With a subset of these authors, we also showed in [JKM08, JKM11] that there is a maximum entropy characterization of certain compound Poisson distributions generalizing Johnson's maximum entropy characterization of the Poisson; this non-obvious extension also has some interesting connections to combinatorial questions.
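
For completeness, recall the target family (this is the standard definition, not specific to the papers above): for \lambda > 0 and a probability distribution Q on the positive integers, the compound Poisson distribution CP(\lambda, Q) is the law of

\[
\sum_{j=1}^{N} V_j, \qquad N \sim \mathrm{Po}(\lambda) \ \text{independent of the IID } V_j \sim Q,
\]

with the empty sum interpreted as 0; the simple Poisson case corresponds to Q being a point mass at 1. The approximation problem is then to bound the distance between the law of a sum of random variables taking non-negative integer values and a suitably matched CP(\lambda, Q).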

References

[ABBN04]    S. Artstein, K. M. Ball, F. Barthe, and A. Naor. Solution of Shannon’s problem on the monotonicity of entropy. J. Amer. Math. Soc., 17(4):975–982, 2004.
[BJKM10]    A. D. Barbour, O. Johnson, I. Kontoyiannis, and M. Madiman. Compound Poisson approximation via information functionals. Electron. J. Probab., 15, paper no. 42, 1344–1368, 2010.
[Bar86]    A. R. Barron. Entropy and the central limit theorem. Ann. Probab., 14:336–342, 1986.
[Har01]    P. Harremoes. Binomial and Poisson distributions as maximum entropy distributions. IEEE Trans. Inform. Theory, 47(5):2039–2041, 2001.
[JKM08]    O. Johnson, I. Kontoyiannis, and M. Madiman. On the entropy and log-concavity of compound Poisson measures. Preprint (not for publication), 2008.
[JKM11]    O. Johnson, I. Kontoyiannis, and M. Madiman. Log-concavity, ultra-log-concavity, and a maximum entropy property of compound Poisson measures. Discrete Appl. Math. (special issue for the Jubilee Conference on Discrete Mathematics, JCDM 2009), 2011.
[Joh07]    O. Johnson. Log-concavity and the maximum entropy property of the Poisson distribution. Stochastic Process. Appl., 117(6):791–802, 2007.
[KHJ05]    I. Kontoyiannis, P. Harremoes, and O. Johnson. Entropy and the law of small numbers. IEEE Trans. Inform. Theory, 51(2):466–472, 2005.
[MB07]    M. Madiman and A. R. Barron. Generalized entropy power inequalities and monotonicity properties of information. IEEE Trans. Inform. Theory, 53(7):2317–2329, 2007.
[MJK07]    M. Madiman, O. Johnson, and I. Kontoyiannis. Fisher information, compound Poisson approximation, and the Poisson channel. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, July 2007.

Read the relevant publications for more details, or go back to research summaries.