### Yale University

Department of Statistics

Seminar

#### Friday, March 22, 1996

David van Dyk

Department of Mathematics

Kalamazoo College

#### Augmenting Data Wisely to Speed up the EM Algorithm

This talk is a preview of a paper tentatively scheduled for reading to
JRSS-B, titled "The EM Algorithm --- An Old Folk Song Sung to a Fast New
Tune" (Meng and van Dyk, 1996), which has the following abstract:

It has been (almost) 20 years since the publication of Dempster,
Laird, and Rubin (1977), the paper that popularized the EM algorithm. To
celebrate this event, after a brief historical account, we discuss strategies
that aim to make EM converge much faster while maintaining its simplicity and
stability (e.g., monotone convergence in likelihood).

First, we introduce the
idea of constructing efficient data-augmentation schemes and thus, fast EM
implementations, by minimizing the augmented-data Fisher information as a
function of a working parameter that indexes a class of data-augmentation
schemes.

Second, we establish a theoretical framework for a general EM-type
algorithm that couples flexible data augmentation and model reduction to achieve
more efficient computations, in terms of computer and human time. This general
algorithm, which we term the Alternating Expectation/Conditional Maximization
(AECM) algorithm, uses conditional maximization to simplify the M-step of EM
and, at the same time, allows the data-augmentation scheme to vary within and
between iterations, which can substantially improve the rate of convergence.

We illustrate these methods using multivariate t-models with known or unknown
degrees of freedom and Poisson models for image reconstruction. We show, through
both empirical and theoretical evidence, the potential for a dramatic reduction
in computational time with little increase in human effort. A similarly
dramatic reduction (e.g., by a factor of 100) in computational time was also
found with random-effects models, an application which, owing to space
limitations, we present in an accompanying paper. We also discuss the
intrinsic connection between EM-type
algorithms and the Gibbs sampler, and how our techniques may provide useful
insight for speeding up the latter. The main conclusion of the paper is that it
is possible to construct algorithms that are simple, stable, and fast, a
triplet that is appreciated by every user.
The talk will focus on t-models and random-effects models and
demonstrate that it is possible to have a "free lunch": algorithms that are
simple, stable, and fast.
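The efficient data-augmentation idea can be illustrated with a small numerical sketch (this is illustrative code, not the authors' implementation, and all function and variable names are hypothetical). For a univariate t location-scale model with known degrees of freedom, standard EM divides the scale update by n, while the efficient augmentation scheme, which corresponds to an optimal working parameter, divides by the sum of the current weights. Both iterations share the same fixed point (the observed-data MLE), but the second typically converges in fewer iterations:

```python
# Hedged sketch: EM for a univariate t location-scale model with known
# degrees of freedom nu, comparing two data-augmentation schemes.
import numpy as np

def em_t(y, nu, efficient, tol=1e-10, max_iter=5000):
    """Fit mu, sigma^2 of a t_nu model by EM; returns (mu, sigma2, iterations)."""
    mu, sigma2 = np.mean(y), np.var(y)
    it = 0
    for it in range(1, max_iter + 1):
        # E-step: expected latent precision weights given current parameters
        w = (nu + 1.0) / (nu + (y - mu) ** 2 / sigma2)
        # M-step: weighted location update
        mu_new = np.sum(w * y) / np.sum(w)
        # Scale update: the standard scheme divides by n; the efficient
        # scheme divides by the sum of the weights (both fixed points
        # satisfy the same likelihood equations at the MLE)
        denom = np.sum(w) if efficient else len(y)
        sigma2_new = np.sum(w * (y - mu_new) ** 2) / denom
        if abs(mu_new - mu) + abs(sigma2_new - sigma2) < tol:
            mu, sigma2 = mu_new, sigma2_new
            break
        mu, sigma2 = mu_new, sigma2_new
    return mu, sigma2, it

rng = np.random.default_rng(0)
y = 5.0 + 2.0 * rng.standard_t(df=2.0, size=500)  # simulated heavy-tailed data
mu_s, s2_s, it_s = em_t(y, nu=2.0, efficient=False)
mu_f, s2_f, it_f = em_t(y, nu=2.0, efficient=True)
print(it_s, it_f)  # the efficient scheme typically needs fewer iterations
```

Both runs arrive at (numerically) the same maximum likelihood estimates; the difference is purely in the speed of the iteration, which is the "free lunch" the abstract refers to.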

#### Seminar to be held in Room 107, 24 Hillhouse @ 11:00 am