# Sampling

Since it is generally impossible to study an entire population (every individual in a country, all college students, every geographic area, etc.), researchers typically rely on sampling to obtain a subset of the population on which to perform an experiment or observational study. It is important that the group selected be representative of the population and not biased in a systematic manner. For example, a group composed of the wealthiest individuals in a given area probably would not accurately reflect the opinions of the entire population in that area. For this reason, randomization is typically employed to achieve an unbiased sample. The most common sampling designs are simple random sampling, stratified random sampling, and multistage random sampling.

### Simple Random Sampling

Simple random sampling is the basic sampling technique where we select a group of subjects (a sample) for study from a larger group (a population). Each individual is chosen entirely by chance and each member of the population has an equal chance of being included in the sample. Every possible sample of a given size has the same chance of selection.
(Definition taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
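
As a minimal sketch of the definition above, the following Python snippet draws a simple random sample without replacement. The population of 100 numbered individuals, the sample size, and the helper name `simple_random_sample` are assumptions made purely for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), n)


# Hypothetical population: individuals numbered 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Because `random.sample` selects without replacement, no individual can appear twice in the sample.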

### Stratified Random Sampling

There may often be factors which divide up the population into sub-populations (groups / strata) and we may expect the measurement of interest to vary among the different sub-populations. This has to be accounted for when we select a sample from the population in order that we obtain a sample that is representative of the population. This is achieved by stratified sampling.

A stratified sample is obtained by taking samples from each stratum or sub-group of a population.

When we sample a population with several strata, we generally require that the proportion of each stratum in the sample should be the same as in the population.
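
One way to meet this requirement is to give each stratum a share of the sample equal to its share of the population, then draw a simple random sample within each stratum. The sketch below assumes a made-up population of 1,000 units in three strata ("urban", "suburban", "rural"); the function name and data layout are hypothetical.

```python
import random
from collections import defaultdict


def proportional_stratified_sample(units, stratum_of, n, seed=None):
    """Sample about n units, giving each stratum its population share."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    total = len(units)
    sample = {}
    for name, members in strata.items():
        # Rounding means the stratum sizes may not sum to exactly n.
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 600 urban, 300 suburban, 100 rural units.
units = (
    [("urban", i) for i in range(600)]
    + [("suburban", i) for i in range(300)]
    + [("rural", i) for i in range(100)]
)
strat_sample = proportional_stratified_sample(units, lambda u: u[0], n=50, seed=1)
sizes = {name: len(members) for name, members in strat_sample.items()}
print(sizes)
```

With these numbers the strata make up 60%, 30%, and 10% of the population, so a sample of 50 contains 30, 15, and 5 units respectively.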

Stratified sampling techniques are generally used when the population is heterogeneous (dissimilar) but can be divided into sub-populations, or strata, that are each homogeneous (similar). Simple random sampling is most appropriate when the entire population from which the sample is taken is homogeneous. Some reasons for using stratified sampling over simple random sampling are:

a) the cost per observation in the survey may be reduced;
b) estimates of the population parameters may be wanted for each sub-population;
c) increased accuracy at given cost.

Example

Suppose a farmer wishes to work out the average milk yield of each cow type in his herd, which consists of Ayrshire, Friesian, Galloway and Jersey cows. He could divide his herd into the four sub-groups and take a sample from each.
(Definition and example taken from Valerie J. Easton and John H. McColl's Statistics Glossary v1.1)
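
The farmer's approach might be sketched as follows. The herd sizes and milk yields are randomly generated placeholders, not real data, and `stratum_means` is a hypothetical helper; the point is only to show one sample being taken per breed and one estimate being produced per breed.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical herd sizes per breed (made up for illustration).
herd_sizes = {"Ayrshire": 40, "Friesian": 60, "Galloway": 20, "Jersey": 30}

# Made-up daily yields in litres, one value per cow.
herd = {
    breed: [round(rng.uniform(10, 30), 1) for _ in range(size)]
    for breed, size in herd_sizes.items()
}


def stratum_means(strata, per_stratum, rng):
    """Estimate each stratum's mean from a simple random sample within it."""
    return {
        name: mean(rng.sample(values, min(per_stratum, len(values))))
        for name, values in strata.items()
    }


estimates = stratum_means(herd, per_stratum=5, rng=rng)
for breed, estimate in estimates.items():
    print(f"{breed}: {estimate:.1f} litres")
```

Here the same number of cows is sampled from every breed, which suits the goal of a separate estimate per sub-population; it differs from the proportional allocation used when a single population-wide estimate is wanted.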

### Multistage Random Sampling

A multistage random sample is constructed by taking a series of simple random samples in stages. This type of sampling is often more practical than simple random sampling for studies requiring "on location" analysis, such as door-to-door surveys. In a multistage random sample, a large area, such as a country, is first divided into smaller regions (such as states), and a random sample of these regions is selected. In the second stage, a random sample of smaller areas (such as counties) is taken from within each of the regions chosen in the first stage. Then, in the third stage, a random sample of even smaller areas (such as neighborhoods) is taken from within each of the areas chosen in the second stage. If these areas are sufficiently small for the purposes of the study, the researcher might stop at the third stage. If not, the researcher may continue to sample from the areas chosen in the third stage, and so on, until appropriately small areas have been chosen.
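
The staged procedure can be sketched recursively: draw a simple random sample at the current level, then repeat inside each chosen area. The nested `country` structure (6 states, 4 counties per state, 5 neighborhoods per county), the per-stage sample sizes, and the function name below are all assumptions for illustration.

```python
import random


def multistage_sample(regions, sizes, rng, path=()):
    """Sample sizes[0] areas at this stage, then recurse into each one.

    `regions` is a dict of sub-areas at the upper stages and a list of
    names at the final stage. Returns the path to each selected area.
    """
    k = min(sizes[0], len(regions))
    chosen = rng.sample(sorted(regions), k)
    if len(sizes) == 1:  # final stage: nothing further to subdivide
        return [path + (name,) for name in chosen]
    selected = []
    for name in chosen:
        selected.extend(
            multistage_sample(regions[name], sizes[1:], rng, path + (name,))
        )
    return selected


# Hypothetical country: states -> counties -> neighborhoods.
country = {
    f"state{s}": {
        f"county{s}.{c}": [f"neighborhood{s}.{c}.{n}" for n in range(1, 6)]
        for c in range(1, 5)
    }
    for s in range(1, 7)
}

# Stage 1: 2 states; stage 2: 2 counties each; stage 3: 3 neighborhoods each.
selected = multistage_sample(country, [2, 2, 3], random.Random(1))
for state, county, neighborhood in selected:
    print(state, county, neighborhood)
```

Only areas inside previously chosen regions are ever considered, which is what keeps fieldwork geographically concentrated compared with a simple random sample of the whole country.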