# Comparison of Two Means

In many cases, a researcher is interested in gathering information about
two populations in order to compare them. As in statistical inference for
one population parameter, confidence intervals and tests of significance
are useful statistical tools for the difference between two population
parameters.

## Confidence Interval for the Difference Between Two Means

A confidence interval for the difference between two means specifies a
range of values within which the difference between the means of the
two populations may lie. These intervals may be calculated by, for example,
a producer who wishes to estimate the difference in mean daily output
from two machines; a medical researcher who wishes to estimate the
difference in mean response by patients who are receiving two
different drugs; etc.
The confidence interval for the difference between two means contains
all the values of $\mu_1 - \mu_2$
(the difference between the two population means) which would not be
rejected in the two-sided hypothesis test of

$H_0: \mu_1 = \mu_2$ against $H_a: \mu_1 \neq \mu_2$, i.e.

$H_0: \mu_1 - \mu_2 = 0$ against $H_a: \mu_1 - \mu_2 \neq 0$.

If the confidence interval includes 0, we can say that there is no significant difference
between the means of the two populations, at a given level of confidence.

(*Definition taken from Valerie J. Easton and John H. McColl's
Statistics Glossary v1.1*)

## Tests of Significance for Two Unknown Means and Known Standard Deviations

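Given samples of sizes $n_1$ and $n_2$ from two normal populations with
unknown means $\mu_1$ and $\mu_2$ and
known standard deviations $\sigma_1$ and $\sigma_2$,
the test statistic comparing the means is known as
the **two-sample z statistic**

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the sample means. This statistic
has the standard normal distribution (*N(0,1)*).
The null hypothesis always assumes that the means are equal, so that $\mu_1 - \mu_2 = 0$
in the statistic, while the alternative hypothesis may be one-sided or two-sided.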
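
The two-sample z statistic can be sketched in a few lines of Python. This is only an illustration under the null hypothesis of equal means: the function name `two_sample_z` and all input values are hypothetical, and the two-sided P-value is taken from the standard normal distribution provided by the standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """Return (z, two-sided P-value) for H0: mu1 = mu2 with known sigmas."""
    # Standard error of the difference between the two sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Under H0 the hypothesized difference mu1 - mu2 is 0.
    z = (mean1 - mean2) / se
    # Two-sided P-value from the standard normal distribution N(0,1).
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical inputs, chosen only to illustrate the call.
z, p = two_sample_z(mean1=10.4, mean2=10.0, sigma1=1.2, sigma2=1.0, n1=40, n2=50)
print(f"z = {z:.3f}, two-sided P = {p:.4f}")
```

For a one-sided alternative, the P-value would instead be a single tail area, `1 - NormalDist().cdf(z)` or `NormalDist().cdf(z)` depending on the direction of the alternative.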

## Tests of Significance for Two Unknown Means and Unknown Standard Deviations

In general, the population standard deviations are not known, and are estimated by the
calculated values $s_1$ and $s_2$. In this case, the test statistic
is defined by the **two-sample t statistic**

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}.$$

Although the two-sample statistic does not exactly follow the *t* distribution (since
two standard deviations are estimated in the statistic), conservative *P*-values may be
obtained using the $t(k)$ distribution, where $k$ represents the *smaller* of
$n_1 - 1$ and $n_2 - 1$. Another option is to estimate the degrees of
freedom via a calculation from the data, which is the general method used by statistical software
such as MINITAB.

The confidence interval for the difference in means $\mu_1 - \mu_2$
is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t^*$ is the upper $(1-C)/2$ critical value for the *t* distribution
with $k$ degrees of freedom (with $k$ equal to either the smaller of $n_1 - 1$ and
$n_2 - 1$ or the calculated degrees of freedom).
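
The two choices of degrees of freedom can be compared numerically. The sketch below assumes that the "calculation from the data" is the Welch-Satterthwaite approximation; the text does not name the formula, so treat that as an assumption. The function names are hypothetical, and the summary values are those of the body temperature example later in this section.

```python
def conservative_df(n1, n2):
    """Conservative choice: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, s2, n1, n2):
    """Degrees of freedom estimated from the data.

    Assumption: the Welch-Satterthwaite approximation is used here;
    the source text only says 'a calculation from the data'.
    """
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Sample standard deviations and sizes from the body temperature example.
s1, s2, n1, n2 = 0.699, 0.743, 65, 65
df_cons = conservative_df(n1, n2)
df_welch = welch_df(s1, s2, n1, n2)
print(f"conservative df = {df_cons}, estimated df = {df_welch:.1f}")
```

The estimated value is generally not an integer; software may truncate or round it before reporting.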

### Example

The dataset "Normal Body Temperature, Gender, and Heart Rate" contains 130 observations of
body temperature, along with the gender of each individual and his or her heart rate. In the
dataset, the first column gives body temperature and the second column gives the value "1" (male)
or "2" (female) to describe the gender of each subject. Using the
MINITAB "DESCRIBE" command with the "BY" subcommand to separate the two genders provides the
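following information:

Descriptive Statistics

| Variable | C2 | N | Mean | Median | Tr Mean | StDev | SE Mean |
|---|---|---|---|---|---|---|---|
| C1 | 1 | 65 | 98.105 | 98.100 | 98.114 | 0.699 | 0.087 |
| | 2 | 65 | 98.394 | 98.400 | 98.390 | 0.743 | 0.092 |

| Variable | C2 | Min | Max | Q1 | Q3 |
|---|---|---|---|---|---|
| C1 | 1 | 96.300 | 99.500 | 97.600 | 98.600 |
| | 2 | 96.400 | 100.800 | 98.000 | 98.800 |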

Is there a significant difference between the mean body temperatures for men and women?
To test $H_0: \mu_1 - \mu_2 = 0$ against $H_a: \mu_1 - \mu_2 \neq 0$,
compute the test statistic $(98.105 - 98.394)/\sqrt{0.699^2/65 + 0.743^2/65}
= -0.289/0.127 = -2.276$. Using the $t(64)$ distribution, estimated in Table E in Moore
and McCabe by the $t(60)$ distribution, we see that $2P(t \geq 2.276)$ is between 0.02 and 0.04, indicating
a significant difference between the means at the 0.05 level (although not at the 0.01 level).
To compute a 95% confidence interval, we first note that the 0.025 critical value $t^*$
for the $t(60)$ distribution is 2.000, giving the interval $(98.105 - 98.394) \pm
2.000 \times 0.127 = (-0.289 - 0.254, -0.289 + 0.254) = (-0.543, -0.035)$. The value 0 is not included in
the interval, again indicating a significant difference at the 0.05 level.
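
The hand calculation above can be reproduced from the summary statistics alone. This is a minimal sketch: the variable names are illustrative, and the critical value 2.000 is copied from the text rather than computed. Because the standard error is not rounded to 0.127 here, the statistic comes out near -2.28 and the interval endpoints differ slightly in the third decimal place.

```python
from math import sqrt

# Summary statistics from the MINITAB "DESCRIBE" output (1 = male, 2 = female).
mean_m, sd_m, n_m = 98.105, 0.699, 65
mean_f, sd_f, n_f = 98.394, 0.743, 65

# Two-sample t statistic under H0: mu1 - mu2 = 0.
diff = mean_m - mean_f
se = sqrt(sd_m**2 / n_m + sd_f**2 / n_f)
t_stat = diff / se

# Conservative 95% interval; t* = 2.000 is the t(60) table value quoted in the text.
t_star = 2.000
margin = t_star * se
ci = (diff - margin, diff + margin)

print(f"difference = {diff:.3f}, SE = {se:.4f}, t = {t_stat:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```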

Performing this test in MINITAB using the "TWOT" command gives the results

Two Sample T-Test and Confidence Interval

Two sample T for C1

| C2 | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| 1 | 65 | 98.105 | 0.699 | 0.087 |
| 2 | 65 | 98.394 | 0.743 | 0.092 |

95% CI for mu (1) - mu (2): ( -0.540, -0.039)

T-Test mu (1) = mu (2) (vs not =): T= -2.29 P=0.024 DF= 127

Although the MINITAB calculated degrees of freedom (127) are much higher than the conservative
estimate of 64, we see that the results are much the same.

*Data source: Data presented in Mackowiak, P.A., Wasserman, S.S., and Levine, M.M. (1992),
"A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and
Other Legacies of Carl Reinhold August Wunderlich," Journal of the American Medical
Association, 268, 1578-1580. Dataset available through the
JSE Dataset Archive.*

## Pooled *t* Procedures

If it is reasonable to assume that two populations have the same standard deviation, then an
alternative procedure known as the **pooled t procedure** may be used instead of the
general two-sample *t* procedure. Since only one standard deviation is to be estimated in
this case, the resulting test statistic will exactly follow a *t* distribution with
$n_1 + n_2 - 2$ degrees of freedom. The **pooled estimator of the variance**

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

is used in the **pooled two-sample t statistic**

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{1/n_1 + 1/n_2}}$$

which has a $t(n_1 + n_2 - 2)$ distribution.
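
As an illustration, the pooled quantities can be computed directly from summary statistics. The sketch below applies the usual pooled-variance formula to the body temperature summary values from the earlier example, under the null hypothesis of equal means; the function name `pooled_t` is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (pooled SD, pooled t statistic, degrees of freedom) for H0: mu1 = mu2."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    sp = sqrt(sp2)
    # Pooled two-sample t statistic with hypothesized difference 0.
    t = (mean1 - mean2) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t, df


# Summary statistics from the body temperature example (1 = male, 2 = female).
sp, t_stat, df = pooled_t(98.105, 98.394, 0.699, 0.743, 65, 65)
print(f"pooled SD = {sp:.3f}, t = {t_stat:.3f}, df = {df}")
```

When the two sample sizes are equal, as they are here, the pooled and unpooled standard errors coincide, so the two statistics differ only in their degrees of freedom.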

### Example

In the body temperature example above, the sample standard deviations for the male and female
subjects are reasonably close. Using the MINITAB subcommand "POOLED" with the two-sample
*t* test gives the following results:

Two Sample T-Test and Confidence Interval

Two sample T for C1

| C2 | N | Mean | StDev | SE Mean |
|---|---|---|---|---|
| 1 | 65 | 98.105 | 0.699 | 0.087 |
| 2 | 65 | 98.394 | 0.743 | 0.092 |

95% CI for mu (1) - mu (2): ( -0.540, -0.039)

T-Test mu (1) = mu (2) (vs not =): T= -2.29 P=0.024 DF= 128

Both use Pooled StDev = 0.721

The test results were nearly identical in this case.