Sampling Techniques

In order to collect unbiased data, it is important that the sample be representative of the population.

When a study is done with faulty data, the results are questionable.

Usually only a part of the population can be analyzed.

How do you choose your sample?

The process is called sampling.

Random Sample: Each member of the population has an equal chance of being selected.

Simple Random Sample: Every possible sample of size n from a population of size N is equally likely to be chosen.

A simple (but not foolproof) method

Write each individual’s name on a separate piece of paper

Put all the papers into a hat

Draw a random paper from the hat

Physical methods have some issues

Are the papers sufficiently mixed?

Are some of the papers folded?

Assign a number to each member of the population.

Random numbers can be generated with a random number table, a software program, or a calculator.

The members of the population whose numbers are generated become the members of the sample.

Simple random sampling requires that we have a list of all the individuals within a population

This list is called a frame

If we do not have a frame, then a different sampling method must be used
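The numbering procedure above can be sketched in a few lines of Python. This is only an illustration: the function name simple_random_sample, the eight-name frame, and the seed are assumptions made for the example, not part of the original notes.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct members of the frame, without replacement."""
    rng = random.Random(seed)
    # Assign the numbers 1..N to the members of the frame.
    numbers = range(1, len(frame) + 1)
    # Draw n distinct random numbers; every set of n numbers is equally likely.
    chosen = rng.sample(numbers, n)
    # The members that correspond to the drawn numbers form the sample.
    return [frame[number - 1] for number in chosen]

# Hypothetical frame: a complete list of the population.
frame = ["Ana", "Ben", "Cara", "Dev", "Eli", "Fay", "Gus", "Hana"]
sample = simple_random_sample(frame, 3, seed=1)
print(sample)
```

Note that the function needs the whole frame as input, which is exactly why this method fails when no frame exists.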

There are other effective ways to collect data

Stratified sampling

Systematic sampling

Cluster sampling

Each of these is particularly appropriate in certain specific circumstances

A stratified sample is obtained when we divide the population into subgroups and choose a simple random sample from each subgroup

This is appropriate when the population is made up of nonoverlapping (distinct) groups called strata

Within each stratum, the individuals are likely to share a common attribute

Between strata, the individuals are likely to differ in that attribute

Example – polling a population about a political issue

It is reasonable to divide up the population into Democrats, Republicans, and Independents

It is reasonable to believe that the opinions of individuals within each party are similar

It is reasonable to believe that the opinions differ from group to group

Therefore it makes sense to consider each stratum separately
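The polling example might look like the following sketch in Python. The voter lists, the group sizes, and the choice of taking the same number of people from every stratum are assumptions made for illustration; taking a number proportional to each stratum's size is another common choice.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical population, already split into nonoverlapping strata.
voters = {
    "Democrat": [f"D{i}" for i in range(1, 41)],
    "Republican": [f"R{i}" for i in range(1, 41)],
    "Independent": [f"I{i}" for i in range(1, 21)],
}
poll = stratified_sample(voters, 5, seed=7)
for party, people in poll.items():
    print(party, people)
```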

A systematic sample is obtained when we choose every kth individual in a population

The first individual selected corresponds to a random number between 1 and k

Systematic sampling is appropriate

When we do not have a frame, that is, a list of all the individuals in a population

Example – polling customers about satisfaction with service

We do not have a list of customers arriving that day

We do not even know how many customers will arrive that day

Simple random sampling (and stratified sampling) cannot be implemented
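A minimal Python sketch of systematic sampling on a stream of arrivals follows. It never needs the full list or the total count, only the individuals as they arrive. The function name systematic_sample, the value k = 10, and the 100 simulated customers are assumptions made for the example.

```python
import random
from itertools import islice

def systematic_sample(arrivals, k, seed=None):
    """Select every kth individual, starting at a random position from 1 to k."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # randint includes both endpoints
    # islice counts positions from 0, so position start is index start - 1.
    return islice(arrivals, start - 1, None, k)

# Hypothetical stream of customers; a generator, so no list exists up front.
customers = (f"customer {i}" for i in range(1, 101))
surveyed = list(systematic_sample(customers, 10, seed=3))
print(surveyed)
```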

A cluster sample is obtained when we choose a random set of groups and then select all individuals within those groups

We can obtain a sample of size 50 by choosing 10 groups of 5

Cluster sampling is appropriate when it is very time consuming or expensive to choose the individuals one at a time

Example – testing the fill of bottles

It is time consuming to pull individual bottles

It is expensive to waste an entire carton of 12 bottles just to test one bottle

If we would like to test 240 bottles, we could

Randomly select 20 cartons

Test all 12 bottles within each carton

This reduces the time and expense required
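The bottle example can be sketched in Python as follows. The total of 200 cartons and the labels are made up for illustration; only the 20 cartons of 12 bottles come from the example above.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for cluster_id in chosen for item in clusters[cluster_id]]

# Hypothetical production run: 200 cartons, each holding 12 bottles.
cartons = {
    c: [f"carton {c}, bottle {b}" for b in range(1, 13)]
    for c in range(1, 201)
}
bottles = cluster_sample(cartons, 20, seed=11)
print(len(bottles))  # 20 cartons x 12 bottles = 240
```

Only 20 random selections are made instead of 240, which is where the savings in time and expense come from.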

Convenience sampling often leads to a biased study since the sample consists only of people who are readily available.

Convenience sampling has little statistical validity

The design is poor

The results are suspect

However, there are times when convenience sampling could be useful as a rough guess

If none of your co-workers are concerned about a particular issue, then it is possible that the employees as a whole are not concerned about that issue either