Projects

This page will be updated as I think of other possible projects.

I am also open to suggestions for other possible projects.

Last revised: 1 April 2013

Most projects should involve about the equivalent of two or three homeworks' worth of work. The write-up should not be more than 20 pages.
At most two persons may work together on a single project.
I need to figure out some systematic way of adjusting credit for harder, more ambitious projects as opposed to more routine exercises. At the moment, I intend just to assign some crude "degree of difficulty" to each proposed project.
Consult with DP about the scope of the project before the end of the first week of April. In particular, you must arrange beforehand if you want the project to count for a larger proportion of your final grade.
You must properly cite all materials that you use for the project. You must also acknowledge anyone (apart from DP) who helps you with the project.
Grades for the course are due by May 9, 2013 (for seniors), May 14, 2013 (for other undergraduates), and May 10 (for candidates for graduate degrees). It will probably take me more than a week to read carefully through all the projects before the deadline for grades. You should plan on handing in your final draft before the start of reading period.

Here are the topics that students have already chosen: project choices

Some possible projects

Random walks and electric networks
There are close connections between Markov chains and the study of the flow of electricity in networks of resistors. In the past, several students have found interesting topics in the manuscript Random walks and electric networks by Peter G. Doyle and J. Laurie Snell (available for free online).
A possible project: Write a self-contained account of the probabilistic proof of Rayleigh's Monotonicity Law (Section 1.4). Try to use notation and ideas from the Stat 251/551 course.
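To get a feel for the Monotonicity Law before writing about it, here is a minimal sketch (not taken from Doyle and Snell) that computes the effective conductance between two nodes of a small resistor network. It solves the harmonic equations for the voltage v with v(a) = 1 and v(b) = 0; the same v(x) is the probability that the random walk started at x reaches a before b. The function name, the dictionary encoding of the network, and the five-edge example are all my own assumptions.

```python
from fractions import Fraction

def effective_conductance(conductances, a, b):
    """Effective conductance between a and b.

    conductances maps frozenset({x, y}) to the conductance of edge xy
    (a hypothetical encoding chosen for this sketch).
    """
    nodes = sorted({x for edge in conductances for x in edge})
    c = lambda x, y: Fraction(conductances.get(frozenset((x, y)), 0))
    interior = [x for x in nodes if x not in (a, b)]
    n = len(interior)
    # Harmonic equations at interior x: sum_y c(x,y) * (v(x) - v(y)) = 0,
    # with boundary values v(a) = 1 and v(b) = 0.  Augmented matrix [A | rhs].
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for i, x in enumerate(interior):
        A[i][i] = sum(c(x, y) for y in nodes if y != x)
        for j, y in enumerate(interior):
            if i != j:
                A[i][j] = -c(x, y)
        A[i][n] = c(x, a)
    # Gauss-Jordan elimination in exact arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [entry / A[col][col] for entry in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [u - factor * w for u, w in zip(A[r], A[col])]
    v = {a: Fraction(1), b: Fraction(0)}
    v.update({x: A[i][n] for i, x in enumerate(interior)})
    # Total current flowing out of a when one volt is applied.
    return sum(c(a, y) * (v[a] - v[y]) for y in nodes if y != a)

# Five unit resistors on four nodes, then the same network with edge {1, 3} strengthened.
unit = {frozenset(e): 1 for e in [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]}
base = effective_conductance(unit, 0, 3)
boosted = effective_conductance({**unit, frozenset((1, 3)): 2}, 0, 3)
print(base, boosted)
```

Raising one conductance cannot lower the effective conductance, which is what the Monotonicity Law asserts. Dividing the effective conductance by the total conductance at a gives the escape probability from a to b, which is the quantity a probabilistic proof works with.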
Propp-Wilson algorithm
The algorithm provides a very clever Markov chain method to generate observations from a stationary distribution by "coupling from the past". An exposition of a nontrivial application would make a good project.
I would suggest you start by reading Chapters 10 through 12 (only 22 pages) of Häggström's Finite Markov Chains and Algorithmic Applications [ORBIS]. Then move on to one of the papers listed at the Web Site for Perfectly Random Sampling with Markov Chains. The paper An interruptible algorithm for perfect sampling via Markov chains by James A. Fill (The Annals of Applied Probability, 8(1):131--162, 1998) seems promising.
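As a warm-up, here is a minimal sketch of coupling from the past for a toy monotone chain: a random walk on {0, ..., n} that steps up with probability p, steps down otherwise, and holds at the two ends. The chain, the function names, and the parameter values are assumptions chosen for illustration; none of them come from the references above. The two points that matter are that the random numbers for a given time step are reused, never redrawn, when the starting time is pushed further into the past, and that only the top and bottom chains need to be tracked because the update rule is monotone.

```python
import random

def update(x, u, n, p):
    """Monotone update rule: step up if u < p, else down, holding at the ends."""
    return min(x + 1, n) if u < p else max(x - 1, 0)

def cftp(n, p, rng):
    """One draw from the stationary distribution of the walk on {0, ..., n}."""
    us = []   # us[t] drives the step from time -(t + 1) to time -t
    T = 1
    while True:
        while len(us) < T:
            us.append(rng.random())      # extend into the past; old values are kept
        lo, hi = 0, n                    # bottom and top chains started at time -T
        for t in range(T - 1, -1, -1):   # run forward from time -T to time 0
            lo = update(lo, us[t], n, p)
            hi = update(hi, us[t], n, p)
        if lo == hi:                     # every starting state has coalesced
            return lo
        T *= 2                           # otherwise start further back

rng = random.Random(0)
draws = [cftp(3, 1 / 3, rng) for _ in range(20000)]
print([draws.count(k) / len(draws) for k in range(4)])
```

For this walk the stationary probabilities are proportional to (p/(1-p))^k, so with p = 1/3 and n = 3 the empirical frequencies should be close to 8/15, 4/15, 2/15, 1/15.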

Markov chain algorithms
Algorithms based on Markov chains have become a growth industry in theoretical Computer Science. See, for example, the web site of Michael Molloy. I am most familiar with The analysis of a list-coloring algorithm on a random graph by Dimitris Achlioptas and Michael Molloy (Proceedings of the 38th Annual Symposium on Foundations of Computer Science, 1997 [WWW]). A summary of the ideas in this paper (translated into Stat 251/551 notation), with perhaps some sketches of proofs of the main ideas, would be a fairly challenging project.

Card shuffling
Write out a complete explanation of the argument for the upper bound on total variation distance from uniformity for the riffle shuffle.
- Start by reading Chang Notes Section 2.8, then read Aldous & Diaconis (1986), Shuffling cards and stopping times, American Mathematical Monthly 93, 333--348 [JSTOR].
- For a harder project, also explain the role of a-shuffles in finding the exact total variation distance.
  Reference: Mann, "How many times should you shuffle a deck of cards?", [WWW]
- For a much harder project, explain the duality construction of strong stationary times.
  Additional reference: Diaconis & Fill (1990), Strong stationary times via a new form of duality, Annals of Probability 18, 1483--1522. [JSTOR]
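For the a-shuffle version of the project, the exact total variation distance can be computed directly. The sketch below assumes the standard a-shuffle formula (the one discussed in Mann's paper): after an a-shuffle of n cards, a permutation with r rising sequences has probability C(a + n - r, n) / a^n, and k riffle shuffles amount to one 2^k-shuffle. The number of permutations with r rising sequences is an Eulerian number. The function names are mine.

```python
from fractions import Fraction
from math import comb, factorial

def eulerian_row(n):
    """row[d] = number of permutations of n cards with exactly d descents."""
    row = [1]
    for m in range(2, n + 1):
        new = [0] * m
        for d in range(m):
            stay = (d + 1) * row[d] if d < len(row) else 0
            grow = (m - d) * row[d - 1] if d >= 1 else 0
            new[d] = stay + grow
        row = new
    return row

def riffle_tv(n, k):
    """Exact total variation distance from uniform after k riffle shuffles."""
    a = 2 ** k
    uniform = Fraction(1, factorial(n))
    total = Fraction(0)
    # r rising sequences corresponds to r - 1 descents.
    for r, count in enumerate(eulerian_row(n), start=1):
        prob = Fraction(comb(a + n - r, n), a ** n)
        total += count * abs(prob - uniform)
    return total / 2

for k in range(1, 11):
    print(k, round(float(riffle_tv(52, k)), 3))
```

For a 52-card deck the distance stays essentially at 1 for the first four shuffles and is about 0.334 after seven, which is the source of the "seven shuffles" slogan. A project write-up would need to explain why the formula holds, not just evaluate it.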

Hidden Markov models
See Chang (Chap 3). The end of the handout EM.pdf from 2004 describes a possible project. See also the discussion paper The EM Algorithm--An Old Folk-Song Sung to a Fast New Tune by Xiao-Li Meng and David van Dyk (J. Royal Statistical Society B 59 (1997) 511-567) and the references it cites.
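Any EM calculation for a hidden Markov model rests on being able to compute the likelihood of an observed sequence. A minimal sketch of the forward recursion for a finite-state model follows; the function name, the list-of-lists encoding, and the two-state numbers are assumptions for illustration and are not taken from the handout.

```python
def forward_likelihood(init, trans, emit, obs):
    """P(observations) for a finite hidden Markov model via the forward recursion.

    init[i]     = P(first hidden state is i)
    trans[i][j] = P(next hidden state is j | current hidden state is i)
    emit[i][y]  = P(observe symbol y | hidden state is i)
    """
    states = range(len(init))
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [init[i] * emit[i][obs[0]] for i in states]
    for y in obs[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in states) * emit[j][y]
                 for j in states]
    return sum(alpha)

# Hypothetical two-state, two-symbol model.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.3, 0.7]]
print(forward_likelihood(init, trans, emit, [0, 1, 1, 0]))
```

The cost grows linearly in the length of the sequence, whereas summing over all hidden paths grows exponentially. For long sequences the alpha values underflow, so a serious implementation would rescale them at each step.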

Simulated annealing
Chang (Section 2.5) describes the general method. A computational exploration of a nontrivial application (not the one described by Chang) of the method would make a relatively easy project. I would expect to see some serious attention paid to the choice of "cooling schedule".
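To make the role of the cooling schedule concrete, here is a minimal sketch in which the schedule is passed in as a function, so that different choices can be compared on the same problem. The toy problem (splitting a short list of weights into two groups with nearly equal sums), the two schedules, and all constants are hypothetical; a real project would need a much harder problem.

```python
import math
import random

def anneal(energy, neighbors, x0, schedule, steps, rng):
    """Metropolis-style annealing; returns the lowest-energy state visited."""
    x = best = x0
    for t in range(1, steps + 1):
        temp = schedule(t)                # temperature at step t
        y = rng.choice(neighbors(x))      # propose a uniformly chosen neighbor
        delta = energy(y) - energy(x)
        # Always accept downhill moves; accept uphill moves with prob exp(-delta/temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x = y
            if energy(x) < energy(best):
                best = x
    return best

# Hypothetical toy problem: signs +1/-1 assign each weight to one of two groups.
WEIGHTS = [8, 7, 6, 5, 4]

def energy(signs):
    return abs(sum(s * w for s, w in zip(signs, WEIGHTS)))

def neighbors(signs):
    # Move one weight to the other group.
    return [signs[:i] + (-signs[i],) + signs[i + 1:] for i in range(len(signs))]

def log_schedule(t, c=5.0):
    return c / math.log(t + 1)

def geometric_schedule(t, t0=10.0, alpha=0.999):
    return t0 * alpha ** t

start = (1,) * len(WEIGHTS)
for name, schedule in [("logarithmic", log_schedule), ("geometric", geometric_schedule)]:
    best = anneal(energy, neighbors, start, schedule, 5000, random.Random(0))
    print(name, best, energy(best))
```

Logarithmic schedules are the ones for which convergence theorems are usually stated, while geometric schedules are common in practice because they cool much faster. Comparing the two over many random seeds, rather than the single seed used here, is the kind of exploration a project could report.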

Image analysis
Write a detailed technical review of:
Besag, "On the statistical analysis of dirty pictures", Journal of the Royal Statistical Society, Series B, vol 48 (1986) pp 259--302.
or of:
Geman and Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images", IEEE-PAMI, 6, 1984, 721-741. (ask DP for paper or download from S. Geman website)

MCMC in Bayesian statistics
Write a short account of resampling methods for calculating Bayesian posterior distributions, illustrated by some calculations for an example. References:
- Smith and Gelfand, "Bayesian Statistics without tears: a sampling-resampling perspective", American Statistician 46 (1992), pp 84--88.
- Smith and Roberts, "Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods", Journal of the Royal Statistical Society, Series B, vol 55 (1993) pp 3--23.
- (See also two other papers in JRSSB vol 55 on MCMC topic, followed by fifty pages of discussion.)
- Boundary Detection by Constrained Optimization by Donald Geman, Stuart Geman, Christine Graffigne, and Ping Dong (IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, July 1990, p. 609) [available at S. Geman web site]
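The sampling-resampling idea can be sketched in a few lines: draw many values from the prior, weight each by the likelihood of the data, then resample in proportion to the weights. The example below is my own (a uniform prior on a success probability, with 7 successes observed in 10 trials), chosen because the exact posterior is Beta(8, 4) and so the approximation can be checked. The function names and sample sizes are assumptions.

```python
import random

def sample_resample(prior_draw, likelihood, m, k, rng):
    """Approximate posterior sample: m prior draws, resampled down to k values."""
    thetas = [prior_draw(rng) for _ in range(m)]
    weights = [likelihood(theta) for theta in thetas]
    # Weighted resampling with replacement, proportional to the likelihood.
    return rng.choices(thetas, weights=weights, k=k)

SUCCESSES, TRIALS = 7, 10   # hypothetical data

def binomial_likelihood(theta):
    return theta ** SUCCESSES * (1 - theta) ** (TRIALS - SUCCESSES)

rng = random.Random(0)
posterior = sample_resample(lambda r: r.random(), binomial_likelihood, 20000, 5000, rng)
print(sum(posterior) / len(posterior))   # exact posterior mean is 8/12
```

The method degrades when the prior puts little mass where the likelihood is large, because a handful of draws then carry almost all the weight. That weakness is one motivation for the Markov chain methods in the other references.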

Optimal stopping
Give a clear exposition of the Princess/Secretary problem, as described on the optimal stopping handout.
There is no need to rederive results proved in class. You should explain the connections between the supermartingale (Snell envelope) and Markov chain approaches (as described in the Billingsley text).
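The backward induction behind the supermartingale approach is easy to carry out numerically for the classical version of the problem (n candidates in random order, and a win only if the overall best is chosen). This sketch assumes that formulation, which may differ in detail from the handout; the function name is mine. V_k denotes the optimal probability of winning given that k candidates have been seen and passed over.

```python
from fractions import Fraction

def secretary(n):
    """Return (optimal win probability, first index at which stopping is optimal)."""
    V = Fraction(0)      # V_n: nobody left to choose, so no chance of winning
    first_stop = n
    for k in range(n, 0, -1):
        # Here V holds V_k.  If candidate k is the best seen so far,
        # the chance that it is the overall best is k/n.
        stop_value = Fraction(k, n)
        if stop_value >= V:
            first_stop = k
        # Candidate k is the best so far with probability 1/k.
        V = Fraction(1, k) * max(stop_value, V) + (1 - Fraction(1, k)) * V
    return V, first_stop

value, threshold = secretary(100)
print(float(value), threshold)
```

With 100 candidates the rule is to pass over the first 37 and then accept the first candidate who beats everyone seen so far, which wins with probability about 0.371, close to 1/e.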

Alternatively you could apply the supermartingale approach to the pricing of an American option or some other optimality problem.

Message passing algorithms on graphs
I have not yet decided how much to say in class about Markov random fields (Chang Chap 3). Nevertheless, I am sure I won't be saying everything there is to know about message passing. The paper Graphical Models by Michael I. Jordan (Statistical Science, Vol. 19, No. 1 (Feb., 2004), pp. 140-155) would be a good place to start looking. If you are interested in the topic, try exploring Jordan's web site.

Option pricing via arbitrage arguments
Some of you have expressed an interest in this topic even though it is a potentially dangerous project. I am slightly fearful, because I have never heard of some of the options mentioned as possible topics. I don't know how much math is needed to run the arbitrage arguments. Nevertheless, I am prepared to let you try. Some of the following might be helpful. References:
- Wilmott, Howison, and Dewynne, The Mathematics of Financial Derivatives: A Student Introduction. [I find this book quite helpful. It reduces many problems to partial differential equations (not one of my strengths). Nothing about Girsanov or martingale measures.]
- Duffie Dynamic Asset Pricing Theory. [Written slightly above the level of Stat 251.]
- There are many web sites containing notes for courses on stochastic calculus or finance. See, for example, Per Mykland's pages for two courses, Stat 390 and Stat 391.

Markov chains on general state spaces
A hard project, requiring some understanding of measure theoretic probability (at the level of Stat 330/600). Explain how the theory for countable state spaces (as developed in 251/551) can be extended to more general state spaces. References:
- Steven Orey, "Limit theorems for Markov chain transition probabilities". (My copy was published by Van Nostrand in 1971.) [Very concise. Contains many of the main ideas, but without the recent refinements.]
- Esa Nummelin, "General irreducible Markov chains and non-negative operators", Cambridge University Press 1984. (Paperback 2004.) [Less concise than Orey. Describes the splitting technique for creating an artificial atom.]
- S.P. Meyn and R.L. Tweedie, "Markov chains and stochastic stability", Springer 1993. [Clear but it takes a lot of reading to reach the main ideas. Many examples. I started with this book then moved back to Nummelin then Orey.]
- Persi Diaconis & David Freedman, Technical reports from http://www.stat.berkeley.edu/tech-reports/index.html: 501. (December 1, 1997) On Markov Chains with Continuous State Space and 497. (November 24, 1997) On the Hit & Run Process.