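The binomial distribution describes the behavior of a count variable X if the following conditions apply:
1. The number of observations n is fixed.
2. Each observation is independent.
3. Each observation represents one of two outcomes ("success" or "failure").
4. The probability of "success" p is the same for each observation.
If these conditions are met, then X has a binomial distribution with parameters n and p, abbreviated B(n,p).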
Example
Suppose individuals with a certain gene have a 0.70 probability of eventually contracting a certain disease. If 100 individuals with the gene participate in a lifetime study, then the number of individuals who will contract the disease is a random variable with distribution B(100,0.7).
Note: The sampling distribution of a count variable is only well-described by the binomial distribution in cases where the population size is significantly larger than the sample size. As a general rule, the binomial distribution should not be applied to observations from a simple random sample (SRS) unless the population size is at least 10 times larger than the sample size.
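The effect of population size in the note above can be illustrated numerically. The sketch below is not part of the original text: it assumes that sampling without replacement from a finite population follows the hypergeometric distribution, and it compares that distribution with B(n,p) for populations 2, 10, and 100 times the sample size. The function names and the choice n = 20, p = 0.5 are illustrative assumptions.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, n, successes, population):
    """P(k successes in a sample of n drawn without replacement)."""
    return (comb(successes, k) * comb(population - successes, n - k)
            / comb(population, n))

def max_gap(n, p, population):
    """Largest pointwise difference between the two distributions."""
    successes = round(population * p)  # number of "successes" in the population
    return max(
        abs(hypergeom_pmf(k, n, successes, population) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )

n, p = 20, 0.5  # illustrative values, not taken from the text
gaps = {ratio: max_gap(n, p, ratio * n) for ratio in (2, 10, 100)}
for ratio, gap in gaps.items():
    print(f"population = {ratio:3d} x sample: max difference = {gap:.4f}")
```

Under these assumptions the gap shrinks as the population grows relative to the sample, which is the behavior the rule of thumb relies on.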
To find probabilities from a binomial distribution, one may either calculate them directly, use a binomial table, or use a computer. The number of sixes rolled by a single die in 20 rolls has a B(20,1/6) distribution. The probability of rolling more than 2 sixes in 20 rolls, P(X>2), is equal to 1 - P(X≤2) = 1 - (P(X=0) + P(X=1) + P(X=2)). Using the MINITAB command "cdf" with subcommand "binomial n=20 p=0.166667" gives the cumulative distribution function as follows:
Binomial with n = 20 and p = 0.166667

    x    P( X <= x)
    0    0.0261
    1    0.1304
    2    0.3287
    3    0.5665
    4    0.7687
    5    0.8982
    6    0.9629
    7    0.9887
    8    0.9972
    9    0.9994

The corresponding graphs for the probability density function and cumulative distribution function for the B(20,1/6) distribution are shown below:
Since the probability of 2 or fewer sixes is equal to 0.3287, the probability of rolling more than 2 sixes = 1 - 0.3287 = 0.6713.
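The same calculation can be done without MINITAB. The following is a minimal sketch in Python using only the standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative assumptions, not part of any package mentioned in the text. It sums the individual binomial probabilities to obtain the cumulative values and then takes the complement.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p), summed term by term."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 20, 1 / 6  # 20 rolls, probability of a six on each roll

# Cumulative probabilities for x = 0..9, as in the table above
for x in range(10):
    print(f"{x:2d}  {binom_cdf(x, n, p):.4f}")

# P(X > 2) = 1 - P(X <= 2)
p_more_than_2 = 1 - binom_cdf(2, n, p)
print(f"P(X > 2) = {p_more_than_2:.4f}")
```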
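The probability that a random variable X with binomial distribution B(n,p) is
equal to the value k, where k = 0, 1, ..., n, is given by
\[ P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}, \quad \text{where} \quad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \]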
The latter expression is known as the binomial coefficient, stated as
"n choose k," or the number of possible ways to choose k "successes"
from n observations. For example, the number of ways to achieve
2 heads in a set of four tosses is "4 choose 2", or 4!/(2!2!) = (4*3)/(2*1) =
6. The possibilities are {HHTT, HTHT, HTTH, TTHH, THHT, THTH}, where "H" represents
a head and "T" represents a tail. The binomial coefficient multiplies the probability
of one of these possibilities (which is (1/2)²(1/2)² = 1/16 for a fair coin)
by the number of ways the outcome may be achieved, for a total probability of 6/16.
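The coin example can be checked by brute force. The sketch below, written in Python as an illustration rather than as part of the original text, lists every sequence of four tosses with exactly two heads, compares the count with the binomial coefficient, and computes the resulting probability for a fair coin. Variable names are illustrative.

```python
from itertools import product
from math import comb

tosses, heads = 4, 2

# Enumerate all 2**4 sequences and keep those with exactly two heads
outcomes = [
    "".join(seq)
    for seq in product("HT", repeat=tosses)
    if seq.count("H") == heads
]
print(outcomes)

# The binomial coefficient counts the same sequences without listing them
ways = comb(tosses, heads)
print(ways == len(outcomes))

# Probability of one such sequence times the number of sequences
probability = ways * 0.5**heads * 0.5**(tosses - heads)
print(probability)  # 6/16
```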
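The mean and variance of a count X with distribution B(n,p) are np and np(1-p), respectively. For the sample proportion X/n, the mean is p and the variance is p(1-p)/n. These definitions are intuitively logical. Imagine, for example, 8 flips of a coin. If the coin is fair, then p = 0.5. One would expect the mean number of heads to be half the flips, or np = 8*0.5 = 4. The variance is equal to np(1-p) = 8*0.5*0.5 = 2.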
In the example of rolling a six-sided die 20 times, the probability p of rolling a six on any roll is 1/6, and the count X of sixes has a B(20, 1/6) distribution. The mean of this distribution is 20/6 = 3.33, and the variance is 20*1/6*5/6 = 100/36 = 2.78. The mean of the proportion of sixes in the 20 rolls, X/20, is equal to p = 1/6 = 0.167, and the variance of the proportion is equal to (1/6*5/6)/20 = 0.007.
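As a check on the die example, the mean and variance can also be computed directly from the probabilities P(X = k) and compared with np and np(1-p). This is a small Python sketch with illustrative names; it assumes nothing beyond the binomial probability formula.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 1 / 6

# Mean and variance of the count X, computed from the definition
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
variance = sum((k - mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))
print(f"count:      mean = {mean:.2f}, variance = {variance:.2f}")
print(f"formulas:   np   = {n * p:.2f}, np(1-p)  = {n * p * (1 - p):.2f}")

# Mean and variance of the proportion X/n follow by rescaling
prop_mean = mean / n
prop_variance = variance / n**2
print(f"proportion: mean = {prop_mean:.3f}, variance = {prop_variance:.3f}")
```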
Note: For large values of n, the distribution of the count X is approximately normal with mean np and variance np(1-p). Because this normal approximation is not accurate for small values of n, a good rule of thumb is to use the normal approximation only if np>10 and np(1-p)>10.
For example, consider a population of voters in a given state. The true proportion of voters who favor candidate A is equal to 0.40. Given a sample of 200 voters, what is the probability that more than half of the voters in the sample support candidate A?
The count X of voters in the sample of 200 who support candidate A is distributed B(200,0.4). The mean of the distribution is equal to 200*0.4 = 80, and the variance is equal to 200*0.4*0.6 = 48. The standard deviation is the square root of the variance, 6.93. The probability that more than half of the voters in the sample support candidate A is equal to the probability that X is greater than 100, which is equal to 1 - P(X ≤ 100).
To use the normal approximation to calculate this probability, we should first acknowledge that the normal distribution is continuous and apply the continuity correction. This means that the probability for a single discrete value, such as 100, is extended to the probability of the interval (99.5,100.5). Because we are interested in the probability that X is less than or equal to 100, the normal approximation applies to the upper limit of the interval, 100.5. If we were interested in the probability that X is strictly less than 100, then we would apply the normal approximation to the lower end of the interval, 99.5.
So, applying the continuity correction and standardizing the variable X gives the following:
1 - P(X ≤ 100)
≈ 1 - P(X < 100.5)
= 1 - P(Z < (100.5 - 80)/6.93)
= 1 - P(Z < 20.5/6.93)
= 1 - P(Z < 2.96) = 1 - 0.9985 = 0.0015.
Since the value 100 is nearly three standard deviations away from the mean 80, the probability of observing a count this high is extremely small.
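The approximation can be compared with the exact binomial probability. The sketch below uses Python's standard library (`statistics.NormalDist` for the normal distribution); the variable names are illustrative, and the exact sum is included only as a cross-check that is not part of the original calculation.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.4
mean = n * p                # 80
sd = sqrt(n * p * (1 - p))  # square root of 48, about 6.93

# Normal approximation with continuity correction: P(X > 100) ~ 1 - P(X < 100.5)
z = (100.5 - mean) / sd
approx = 1 - NormalDist(mean, sd).cdf(100.5)
print(f"z = {z:.2f}, normal approximation = {approx:.4f}")

# Exact binomial probability: sum of P(X = k) for k = 101, ..., 200
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(101, n + 1))
print(f"exact binomial probability = {exact:.4f}")
```

Because the code does not round z to two decimal places, its result may differ slightly in the last digit from a value read off a normal table.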