This page is about drawing a density histogram and overlaying a theoretical density. We will see how to do these things by doing an example in which we:

  1. Simulate a many draws from an Exponential distribution with rate parameter 3.
  2. Draw a density histogram for the resulting data.
  3. On top of the histogram plot, overlay the theoretical exponential probability density function, that is, \(f(t) = 3e^{-3t}\) for \(t>0\) (and 0 for \(t\le 0\)). This should coincide closely with the density histogram.

The whole example in one little chunk

Here it is in just 4 lines – one line to load the MASS package (to be able to use truehist, and then one line for each of the 3 steps listed above:

library(MASS)
mysample <- rexp(n = 100000, rate = 3)
truehist(mysample)
curve(3*exp(-3*x), from = 0, to = 3, add=TRUE, col='red', lwd=1.5)

We generated 100,000 points from the Exp(3) distribution, drew a density histogram in blue, and overlaid the pdf for the Exp(3) distribution in red.

A slow, detailed development and explanation of the code

First a note about plotting functions

Probably the most convenient way to plot a function of a single variable is with R’s curve function. You give it a formula to plot, as a function of a variable “x”, and a range to plot over. For example, to plot \(f(x)=\sin(3x+5)\) from \(x=-2\) to \(x=12\):

curve(sin(3*x+5), -2, 12)

Note you need to write it as above even if you are thinking of another variable in your head, like \(f(t)=\sin(3t+5)\), so the following won’t work:

curve(sin(3*t+5), -2, 12)
## Error in curve(sin(3 * t + 5), -2, 12): 'expr' must be a function, or a call or an expression containing 'x'

See, we got an error. curve is set up so it needs a function of “x”, not “t” or any other letter!

So we can plot the Exp(3) density this way:

curve(3 * exp(-3*x), 0, 5)

Simulate some data.

We’ll use the rexp function to sample random draws from an exponential density.

mysample <- rexp(n = 100000, rate = 3)

OK, that was easy. Now we are ready to draw a histogram of the data in mysample.

Draw a density histogram

The truehist function in the MASS library gives a nice density histogram:

truehist(mysample)

Overlay the exponential density curve on the histogram

We use the curve command with argument add=TRUE to say that we want to add the curve to the current plot already drawn (the histogram), not start a new plot.

truehist(mysample)
curve(3*exp(-3*x), from = 0, to = 3, add=TRUE)

That’s nice agreement! The density histogram of the randomly distributed data indeed does match the theoretical probability density function closely.

To be able to see the f curve better, let’s make it red, and also a bit thicker with the argument lwd=1.5 (lwd is for “line width”):

truehist(mysample)
curve(3*exp(-3*x), from = 0, to = 3, add=TRUE, col='red', lwd=1.5)