Computational Information Geometry

The application of geometry to statistical theory and practice has produced a number of different approaches. The first is the application of differential geometry to statistics, often called information geometry. It focuses largely on multivariate, invariant and higher-order asymptotic results in full and curved exponential families, obtained through differential geometry and tensor analysis. This approach also covers considerations of curvature, dimension reduction and information loss. The second important, but completely separate, approach arises in the inferentially demanding area of mixture modelling, where convex geometry gives great insight into the fundamental problems of inference in these models and helps in the design of corresponding algorithms. The third approach is the geometric study of graphical models, contingency tables, (hierarchical) log-linear models and related topics involving the geometry of extended exponential families. Important results here have already been established, mainly in the field known as algebraic statistics. In practice, a single statistical problem can involve more than one of these geometries, potentially all three, and our unifying framework should handle this plurality naturally. The key idea is to represent statistical models (sample spaces, together with probability distributions on them) and the associated inference problems inside adequately large but finite-dimensional spaces. In these embedding spaces the building blocks of the three geometries described above can be computed numerically and explicitly, and the results used for algorithm development.
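As a minimal illustration of the embedding idea, and not the framework's own implementation, the sketch below represents distributions on a finite sample space as probability vectors, i.e. points of a simplex. The choice of a binomial model with n = 4, the function names, and the two interpolation paths are all assumptions made for this example. The mixture path is a convex combination of probabilities; the exponential path is a straight line in log-probabilities, renormalised.

```python
import math

def binomial_pmf(n, p):
    """Binomial(n, p) embedded as a vector of n + 1 probabilities (illustrative choice)."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mixture_path(p, q, t):
    """Convex combination of two probability vectors (mixture geometry)."""
    return [(1 - t) * a + t * b for a, b in zip(p, q)]

def exponential_path(p, q, t):
    """Linear interpolation of log-probabilities, renormalised to sum to one."""
    w = [a**(1 - t) * b**t for a, b in zip(p, q)]
    z = sum(w)
    return [x / z for x in w]

# Two points of the binomial model inside the 4-dimensional simplex.
p = binomial_pmf(4, 0.3)
q = binomial_pmf(4, 0.6)

mid_m = mixture_path(p, q, 0.5)      # midpoint in the mixture sense
mid_e = exponential_path(p, q, 0.5)  # midpoint in the exponential sense

print("mixture midpoint:    ", [round(x, 4) for x in mid_m])
print("exponential midpoint:", [round(x, 4) for x in mid_e])
```

Because the binomial family with fixed n is a full exponential family, the exponential midpoint is again a binomial distribution, whereas the mixture midpoint generally leaves the family. Both are nonetheless ordinary points of the same finite-dimensional embedding space, which is what makes explicit numerical computation possible.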