# Stem and Leaf Plot

A stem and leaf plot is a way of summarizing a set of data measured on
an interval scale. It is often used in exploratory data analysis to
illustrate the major features of the distribution of the data in a
convenient and easily drawn form. A stem and leaf plot is similar to a
histogram but is usually a more informative display for relatively small
data sets (<100 data points). It provides a table as well as a picture
of the data and from it we can readily write down the data in order of
magnitude, which is useful for many statistical procedures.
(*Definition taken from Valerie J. Easton and John H. McColl's
Statistics Glossary v1.1*)

### Example

The following data represent measurements of carbon monoxide content (in mg) for 25 brands of
cigarettes:

13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9.

A MINITAB stemplot for these data (created using the
"STEM" command) is shown to the left. MINITAB first truncates the data by rounding down
to integers, then sorts the data. The resulting dataset is:

1, 4, 5, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13, 13, 13, 14, 14, 15, 15, 15, 16, 16, 17, 18,
23.
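The truncate-then-sort step can be reproduced outside MINITAB. The following is a minimal sketch in Python, not MINITAB's own code; the variable names `co_mg` and `truncated` are illustrative only.

```python
import math

# Carbon monoxide content (mg) for the 25 cigarette brands listed above.
co_mg = [13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
         10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9]

# Truncate each value by rounding down to an integer, then sort.
truncated = sorted(math.floor(x) for x in co_mg)

print(truncated)
```

Note that truncation is not the same as rounding to the nearest integer: 16.6 becomes 16, not 17.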

The first column of the MINITAB stemplot counts the number of values from the top down
and from the bottom up to the middle value (the median).
The number in parentheses represents the count of values in the row containing
the median, which is the thirteenth ordered value in this example, 13.0.
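As a quick check of the median's position, the sketch below sorts the original (untruncated) measurements and picks out the middle value. It assumes an odd number of observations, as in this example; the names `median_rank` and `median_value` are illustrative.

```python
import statistics

# Carbon monoxide content (mg) for the 25 cigarette brands listed above.
co_mg = [13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
         10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9]

ordered = sorted(co_mg)
n = len(ordered)

# For an odd number of observations the median is the value at rank (n + 1) / 2.
median_rank = (n + 1) // 2
median_value = ordered[median_rank - 1]  # ranks are 1-based, list indices 0-based

print(median_rank, median_value)

# Cross-check against the standard library.
print(statistics.median(co_mg))
```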

The second column plots the stems, tens of milligrams of carbon monoxide content. Because
the range of the data is small (the values for the stems are 0, 1, and 2), MINITAB splits
each stem into five rows, so that the third column, which plots milligrams as leaves, is
divided into fifths. In other words, the first row for a stem includes the leaf values 0 and 1,
the second row includes the leaf values 2 and 3, and so on. Since there are
no values between 1.5 and 4.9, the second row contains no data points. The stemplot illustrates
that the majority of the measurements (18 of the 25) lie in the teens, with only 6 of the 25
values less than 10 and only 1 value greater than 20.
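The three-column layout described above (depth, stem, leaves, with five rows per stem) can be sketched in Python as follows. This is an illustration of the layout under the stated assumptions, not MINITAB's implementation: it assumes an odd number of observations and single-digit leaves, and all names (`co_mg`, `lines`, and so on) are hypothetical.

```python
import math
from collections import Counter

# Carbon monoxide content (mg) for the 25 cigarette brands listed above.
co_mg = [13.6, 16.6, 23.5, 10.2, 5.4, 15.0, 9.0, 12.3, 16.3, 15.4, 13.0, 14.4, 10.0,
         10.2, 9.5, 1.5, 18.5, 12.6, 17.5, 4.9, 15.9, 8.5, 10.6, 13.9, 14.9]

values = sorted(math.floor(x) for x in co_mg)
n = len(values)
median_rank = (n + 1) // 2  # assumes n is odd

# Each row covers two consecutive integers (leaves 0-1, 2-3, ...),
# so v // 2 identifies the row and row // 5 recovers the stem.
counts = Counter(v // 2 for v in values)
leaves = {}
for v in values:
    leaves.setdefault(v // 2, []).append(v % 10)

lines = []
seen = 0  # number of values in this row and all rows above it
for row in range(min(counts), max(counts) + 1):  # include empty rows
    count = counts.get(row, 0)
    above = seen
    seen += count
    if above < median_rank <= seen:
        depth = f"({count})"    # row containing the median
    elif seen < median_rank:
        depth = str(seen)       # counted from the top down
    else:
        depth = str(n - above)  # counted from the bottom up
    leaf_text = "".join(str(d) for d in leaves.get(row, []))
    lines.append(f"{depth:>4} {row // 5} {leaf_text}".rstrip())

print("\n".join(lines))
```

With these data the sketch prints twelve rows; the row containing the median reads `(5) 1 22333`, and the second row has a depth and a stem but no leaves.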

Data source: Mendenhall, William, and Sincich, Terry (1992), *Statistics for
Engineering and the Sciences* (3rd ed.), New York: Dellen Publishing
Co. (ISBN: 0 02380552 8) (Original source: Federal Trade Commission, USA).
Dataset available through the
JSE Dataset Archive.
