Statistics 251/551 (Spring 2013)
This page will be updated as I think of other possible projects.
I am also open to suggestions for other possible projects.
Last revised: 1 April 2013
Most projects should involve about the equivalent of two or three homeworks' worth of work.
The write-up should not be more than 20 pages.
At most two persons may work together on a single project.
- I need to figure out some systematic way of adjusting credit for harder, more ambitious projects as opposed to more routine exercises. At the moment, I intend just to assign some crude "degree of difficulty" to each proposed project.
- Consult with DP about the scope of the project before
the end of the first week of April. In particular, you must
arrange beforehand if you want the project to count for a larger
proportion of your final grade.
- You must properly cite all materials that you use for the project.
You must also acknowledge anyone (apart from DP) who helps you with the project.
- Grades for the course are due by May 9, 2013 (for seniors),
May 14, 2013 (for other undergraduates), and
May 10 (for candidates for graduate degrees).
It will probably take me more than a week to read carefully through all the projects
before the deadline for grades.
You should plan on handing in your final draft before the start of reading period.
Here are the topics that students have already chosen: project choices
Some possible projects
Random walks and electric networks
There are close connections between Markov chains and the study of the flow of
electricity in networks of resistors. In the past, several students have found interesting
topics in the manuscript Random walks and electric networks by
Peter G. Doyle and J. Laurie Snell (available for free).
A possible project: Write a self-contained account of the probabilistic proof of
Rayleigh's Monotonicity Law (Section 1.4). Try to use notation and ideas from the Stat 251/551 course.
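To get a feel for what the Monotonicity Law asserts before writing out the proof, a small numerical check may help. The following is only a sketch: the five-edge "bridge" network, the node names, and the function names are my own hypothetical choices, not taken from Doyle and Snell. It finds the voltages by repeated local averaging (with v(a) = 1 and v(b) = 0), computes the effective conductance between a and b, and converts it into the probability that the walk started at a reaches b before returning to a.

```python
def effective_conductance(conductances, a, b, tol=1e-13, max_sweeps=100_000):
    """Effective conductance between nodes a and b.

    conductances maps an undirected edge (x, y) to its conductance c_xy > 0.
    """
    neighbours = {}
    for (x, y), c in conductances.items():
        neighbours.setdefault(x, {})[y] = c
        neighbours.setdefault(y, {})[x] = c

    # Voltages with v(a) = 1 and v(b) = 0, harmonic at every other node.
    v = {x: 0.0 for x in neighbours}
    v[a] = 1.0
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for x, nbrs in neighbours.items():
            if x == a or x == b:
                continue
            # Replace v(x) by the conductance-weighted average of its neighbours.
            new = sum(c * v[y] for y, c in nbrs.items()) / sum(nbrs.values())
            biggest_change = max(biggest_change, abs(new - v[x]))
            v[x] = new
        if biggest_change < tol:
            break

    # Current flowing out of a when a unit voltage is applied across a and b.
    return sum(c * (v[a] - v[y]) for y, c in neighbours[a].items())


def escape_probability(conductances, a, b):
    """P(walk started at a hits b before returning to a) = C_eff / C_a."""
    total_at_a = sum(c for (x, y), c in conductances.items() if a in (x, y))
    return effective_conductance(conductances, a, b) / total_at_a


# Hypothetical example: a bridge of five unit resistors.
bridge = {("a", "x"): 1.0, ("a", "y"): 1.0, ("x", "b"): 1.0,
          ("y", "b"): 1.0, ("x", "y"): 1.0}
# Double the resistance of one edge (halve its conductance).
weaker = dict(bridge)
weaker[("x", "b")] = 0.5

if __name__ == "__main__":
    print(effective_conductance(bridge, "a", "b"))
    print(effective_conductance(weaker, "a", "b"))
    print(escape_probability(bridge, "a", "b"))
```

Increasing one resistance should never increase the effective conductance, and hence never increase the escape probability from a. A numerical check on one network is of course no substitute for the probabilistic proof.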
The Propp-Wilson algorithm provides a very clever Markov chain method to generate observations
from a stationary distribution by "coupling from the past". An exposition of a nontrivial application would make a good project.
I would suggest you start by reading Chapters 10 through 12 (only 22 pages) of Häggström's Finite Markov Chains and Algorithmic Applications
[ORBIS]. Then move on to one of the papers listed at the
Web Site for Perfectly Random Sampling with Markov Chains. The paper
An interruptible algorithm for perfect sampling via Markov chains by James A. Fill (The Annals of Applied Probability, 8(1):131--162, 1998) seems promising.
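As a warm-up for the reading, here is a minimal sketch of monotone coupling from the past for a toy chain. The chain (a lazy reflecting random walk on {0, ..., n}) and all function names are hypothetical choices of mine, picked because the update rule is monotone and, for p = 1/2, the stationary distribution is uniform. Two details matter: the chains are started further and further back in the past, and the random numbers already generated for times closer to 0 are reused rather than redrawn.

```python
import random


def update(x, u, n, p=0.5):
    """One step of the walk driven by the uniform variate u (monotone in x)."""
    return min(x + 1, n) if u < p else max(x - 1, 0)


def cftp(n, rng, p=0.5):
    """Return one exact draw from the stationary distribution on {0, ..., n}."""
    us = []   # us[k] drives the step from time -(k + 1) to time -k
    T = 1
    while True:
        # Extend the randomness further into the past; keep the old values.
        while len(us) < T:
            us.append(rng.random())
        lo, hi = 0, n            # bottom and top states at time -T
        for k in range(T - 1, -1, -1):
            lo = update(lo, us[k], n, p)
            hi = update(hi, us[k], n, p)
        if lo == hi:             # every starting state has coalesced by time 0
            return lo
        T *= 2                   # otherwise start twice as far back


if __name__ == "__main__":
    rng = random.Random(2013)
    draws = [cftp(4, rng) for _ in range(10_000)]
    print([draws.count(s) / len(draws) for s in range(5)])
```

Because the update is monotone, it is enough to track the chains started from the smallest and largest states. A nontrivial application would need a more interesting state space and partial order than this one.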
Markov chain algorithms
Algorithms based on Markov chains have become a growth industry in theoretical Computer Science.
See, for example, the web site of
I am most familiar with
The analysis of a list-coloring algorithm on a random graph
by Dimitris Achlioptas and Michael Molloy (
Foundations of Computer Science, 1997. Proceedings., 38th Annual Symposium on
A summary of the ideas in this paper (translated into Stat 251/551 notation), with perhaps
some sketches of proofs of the main ideas, would be a fairly challenging project.
Write out a complete explanation of the argument
for the upper bound on total variation distance from uniformity for the riffle shuffle.
- Start by reading Chang Notes Section 2.8 then read Aldous & Diaconis (1986),
Shuffling cards and stopping times,
American Mathematical Monthly 93, 333--348
- For a harder project, also explain the role of a-shuffles
in finding the exact total variation distance.
Reference: Mann, "How many times should you shuffle a deck of cards?",
- For a much harder project, explain the duality construction of strong stationary times.
Additional reference: Diaconis & Fill (1990),
Strong stationary times via a new form of duality,
Annals of Probability 18, 1483--1522. [JSTOR]
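For the basic version of the shuffling project, the upper bound can be evaluated numerically. The sketch below assumes the stopping-time argument in the form I remember it from Aldous & Diaconis: in the inverse-shuffle description each card receives an independent random bit at each shuffle, and the deck is uniform at the first time T when all the bit strings are distinct, so that the total variation distance after n shuffles is at most P(T > n), a birthday-problem probability. The function name is my own.

```python
def riffle_tv_upper_bound(deck_size, shuffles):
    """Birthday-problem bound P(T > n) on the total variation distance.

    Assumes each of deck_size cards carries an independent uniform label
    from {0, ..., 2**shuffles - 1}; T <= n when all labels are distinct.
    """
    prob_all_distinct = 1.0
    for i in range(1, deck_size):
        # The (i + 1)-th card must avoid the i labels already taken.
        prob_all_distinct *= max(0.0, 1.0 - i / 2 ** shuffles)
    return 1.0 - prob_all_distinct


if __name__ == "__main__":
    for n in range(5, 16):
        print(n, round(riffle_tv_upper_bound(52, n), 3))
```

The bound is trivial (equal to 1) until 2^n is at least the deck size. The harder versions of the project explain why the exact total variation distance drops noticeably earlier than this bound does.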
Hidden Markov models
See Chang (Chap 3).
The description at the end of a handout EM.pdf from 2004 describes a possible project.
See also the discussion paper The EM Algorithm--An Old Folk-Song Sung to a Fast New
Tune by Xiao-Li Meng and David van Dyk (J. Royal
Statistical Society B 59 (1997) 511-567) and the references it cites.
Chang (Section 2.5) describes the general method. A computational exploration
of a nontrivial application (not the one described by Chang) of the method would make
a relatively easy project. I would be expecting to see some serious attention paid
to the choice of "cooling schedule".
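The mention of a cooling schedule suggests that the method in question is simulated annealing; the sketch below is written under that assumption. The state space (a cycle of integers), the energy function, and the two schedules are hypothetical examples of mine, meant only to show how the schedule enters as a parameter that can be varied in a computational study.

```python
import math
import random


def anneal(energy, n_states, start, temperature, steps, rng):
    """Simulated annealing on the cycle {0, ..., n_states - 1}.

    temperature(k) gives the temperature used at step k = 1, 2, ...
    Returns (final state, best state visited).
    """
    x = best = start
    for k in range(1, steps + 1):
        y = (x + rng.choice((-1, 1))) % n_states      # symmetric proposal
        delta = energy(y) - energy(x)
        # Metropolis acceptance rule at the current temperature.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature(k)):
            x = y
        if energy(x) < energy(best):
            best = x
    return x, best


def rugged(x):
    # Hypothetical test landscape with several local minima.
    return 0.002 * (x - 70) ** 2 + 3.0 * math.cos(x / 3.0)


def geometric(k, t0=5.0, rate=0.999):
    # Fast cooling; the floor avoids dividing by an underflowed zero.
    return max(t0 * rate ** k, 1e-12)


def logarithmic(k, c=5.0):
    # Slow cooling of the form c / log(k + 1).
    return c / math.log(k + 1)


if __name__ == "__main__":
    for schedule in (geometric, logarithmic):
        rng = random.Random(251)
        final, best = anneal(rugged, 100, 5, schedule, 20_000, rng)
        print(schedule.__name__, final, best, round(rugged(best), 3))
```

A serious project would compare schedules over many independent runs, and on a problem where the answer is not obvious from a plot of the energy.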
Write a detailed technical review of:
Besag, "On the statistical
analysis of dirty pictures", Journal of the Royal Statistical Society,
Series B, vol 48 (1986) pp 259--302.
Geman and Geman, "Stochastic relaxation, Gibbs distributions, and the
Bayesian restoration of images", IEEE-PAMI, 6, 1984, 721-741. (ask DP for paper or download from
S. Geman website)
MCMC in Bayesian statistics
Write a short account of resampling methods for calculating Bayesian
posterior distributions, illustrated by some calculations for an
- Smith and Gelfand, "Bayesian Statistics without tears: a
sampling-resampling perspective", American Statistician 46 (1992), pp
- Smith and Roberts, "Bayesian computation via the Gibbs sampler and
related Markov chain Monte Carlo methods", Journal of the Royal Statistical Society,
Series B, vol 55 (1993) pp 3--23.
- (See also two other papers in JRSSB vol 55 on MCMC topics, followed by
fifty pages of discussion.)
- Boundary Detection by Constrained Optimization
by Donald Geman, Stuart Geman, Christine Graffigne, and
Ping Dong (IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 12, no. 7, July 1990, p. 609) [available at S. Geman web site]
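The resampling idea for posterior calculations mentioned above can be sketched in a few lines. This is my own minimal version of a weighted-bootstrap (sampling-importance-resampling) scheme, which I believe is in the spirit of the Smith and Gelfand paper: draw from the prior, weight each draw by its likelihood, then resample in proportion to the weights. The binomial example and the function names are assumptions chosen because the exact posterior, Beta(8, 4), is known.

```python
import random


def sir_posterior_sample(prior_draw, likelihood, n_prior, n_post, rng):
    """Approximate posterior sample via sampling-importance-resampling."""
    thetas = [prior_draw(rng) for _ in range(n_prior)]   # draws from the prior
    weights = [likelihood(t) for t in thetas]            # unnormalised weights
    # Resample with replacement, with probability proportional to likelihood.
    return rng.choices(thetas, weights=weights, k=n_post)


# Hypothetical data: 7 successes in 10 Bernoulli trials, Uniform(0, 1) prior.
def uniform_prior(rng):
    return rng.random()


def binomial_likelihood(theta, successes=7, trials=10):
    return theta ** successes * (1 - theta) ** (trials - successes)


if __name__ == "__main__":
    rng = random.Random(1992)
    sample = sir_posterior_sample(uniform_prior, binomial_likelihood,
                                  20_000, 5_000, rng)
    print(sum(sample) / len(sample))   # compare with the exact mean 8/12
```

The approximation degrades when the prior and the posterior are concentrated in different places, which is one of the motivations for the MCMC methods in the other references.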
Give a clear exposition of the Princess/Secretary problem,
as described on the optimal stopping handout.
There is no need to rederive results proved in class.
You should explain the connections between the supermartingale (Snell envelope)
and Markov chain
approaches (as described in the Billingsley text.)
Alternatively you could apply the supermartingale
approach to the pricing of an American option or some other optimality problem.
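For the Princess/Secretary problem, the backward induction behind the Snell envelope is short enough to compute directly. The sketch assumes the standard formulation, which may differ in detail from the handout: n candidates arrive in random order, only relative ranks are observed, and the payoff is 1 if the overall best candidate is chosen and 0 otherwise. Write w_k for the optimal win probability after k candidates have been seen and rejected. Candidate k is the best so far with probability 1/k, in which case stopping wins with probability k/n.

```python
def secretary_backward_induction(n):
    """Return (optimal win probability, first index at which stopping is optimal).

    Uses w_n = 0 and
        w_{k-1} = (1/k) * max(k/n, w_k) + (1 - 1/k) * w_k.
    """
    cont = 0.0        # w_n: nothing left to choose after the last candidate
    first_stop = n
    for k in range(n, 0, -1):
        stop = k / n  # win probability if we stop at a best-so-far candidate k
        if stop >= cont:
            first_stop = k
        cont = (1 / k) * max(stop, cont) + (1 - 1 / k) * cont
    return cont, first_stop


if __name__ == "__main__":
    value, first_stop = secretary_backward_induction(100)
    print(round(value, 4), first_stop)
```

For n = 100 the rule is to reject the first 37 candidates and then accept the first candidate who is better than all predecessors. The value is close to 1/e, as the asymptotic theory predicts.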
Message passing algorithms on graphs
I have not yet decided how much to say in class about Markov random fields (Chang Chap 3).
Nevertheless, I am sure I won't be saying everything there is to know about message passing. The paper
Graphical Models by
Michael I. Jordan (Statistical Science, Vol. 19, No. 1 (Feb., 2004), pp. 140-155) would be a good place to start looking. If you are interested in the topic, try exploring
Jordan's web site.
Option pricing via arbitrage arguments
Some of you have expressed an interest in this topic even though
it is a potentially dangerous project. I am slightly
fearful, because I have never heard of some of the options mentioned
as possible topics. I don't know how much math is needed to run the
Nevertheless, I am prepared to let you try. Some of the following
might be helpful.
- Wilmott, Howison, and Dewynne The Mathematics of Financial
Derivatives: A student introduction. [I find this book
quite helpful. It reduces many problems to partial differential
equations (not one of my strengths). Nothing about Girsanov or
- Duffie Dynamic Asset Pricing Theory. [Written slightly
above the level of Stat 251.]
- There are many web sites containing notes for courses on
stochastic calculus or finance. See, for example, Per Mykland's pages
for two courses, Stat 390 and
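For anyone unsure how much machinery an arbitrage argument needs, the discrete-time binomial model is the simplest place to see one. The sketch below is not taken from any of the references above; the function name, the parameter names, and the numbers are hypothetical. It prices an option by backward induction under the risk-neutral probability q = (1 + r - d) / (u - d), and for an American option it takes the larger of the exercise value and the continuation value at each node, which is the Snell envelope calculation in this setting.

```python
def binomial_price(s0, up, down, rate, steps, payoff, american=False):
    """Price an option in a binomial tree with per-step interest rate `rate`."""
    if not down < 1 + rate < up:
        raise ValueError("need down < 1 + rate < up to rule out arbitrage")
    q = (1 + rate - down) / (up - down)     # risk-neutral up-probability
    # Option values at expiry, indexed by the number j of up-moves.
    values = [payoff(s0 * up ** j * down ** (steps - j))
              for j in range(steps + 1)]
    for n in range(steps - 1, -1, -1):
        new_values = []
        for j in range(n + 1):
            # Discounted risk-neutral expectation of next-step values.
            cont = (q * values[j + 1] + (1 - q) * values[j]) / (1 + rate)
            if american:
                # Early exercise is allowed: take the better of the two.
                cont = max(cont, payoff(s0 * up ** j * down ** (n - j)))
            new_values.append(cont)
        values = new_values
    return values[0]


def put(strike):
    return lambda s: max(strike - s, 0.0)


if __name__ == "__main__":
    print(binomial_price(100.0, 1.1, 0.9, 0.02, 50, put(100.0)))
    print(binomial_price(100.0, 1.1, 0.9, 0.02, 50, put(100.0), american=True))
```

The American price can never be below the European price for the same payoff, since not exercising early is always an available strategy.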
Markov chains on general state spaces
A hard project, requiring some understanding of measure-theoretic probability (at the level of Stat 330/600). Explain how the theory for countable state spaces (as developed in 251/551)
can be extended to more general state spaces. References:
- Steven Orey, "Limit theorems for Markov chain transition probabilities". (My copy was published by Van Nostrand in 1971.)
[Very concise. Contains many of the main ideas, but without the recent refinements.]
- Esa Nummelin, "General irreducible Markov chains and non-negative operators",
Cambridge University Press 1984. (Paperback 2004.)
[Less concise than Orey. Describes the splitting technique for creating an artificial atom.]
- S.P. Meyn and R.L. Tweedie, "Markov chains and stochastic stability", Springer 1993.
[Clear but it takes a lot of reading to reach the main ideas.
Many examples. I started with this book then moved back to Nummelin then Orey.]
- Persi Diaconis & David Freedman,
Technical reports from http://www.stat.berkeley.edu/tech-reports/index.html:
501 (December 1, 1997), On Markov Chains with Continuous State Space, and
497 (November 24, 1997), On the Hit & Run Process.