Two-sample hypothesis testing is used to test whether there is a difference between the means of two different populations.

For example, a two-sample hypothesis test could be used to test whether there is a difference in the average salary between males and females in a profession.

A two-sample hypothesis test could also be used to test if the average number of defective parts produced in one assembly line is more or less than the average number of defective parts produced in another.

Similar to one-sample hypothesis tests, a one-tailed or two-tailed test of the null hypothesis can be performed in two-sample hypothesis testing.

The two-sample hypothesis test of no difference between the average salaries of males and females is an example of a two-tailed test.

The test of whether the average number of defective parts produced on one assembly line is more than the average number produced on another assembly line (or, alternatively, whether it is less) is an example of a one-tailed test.

It is important to note that when we perform a two-sample hypothesis test, we are testing a claim concerning the difference between the parameters of two populations, not the values of the parameters themselves. We are interested in the difference between two averages, not the averages themselves.

The following requirements must be met.

The samples must be randomly selected.

The samples must be independent. Two samples are independent if the sample values selected from one population are not related to the sample values selected from the other population.

Each sample size must be at least 30 or, if not, each population must have a normal distribution with a known standard deviation.

If these requirements are met, then the sampling distribution for x̄_{1} – x̄_{2} (the difference of the sample means) is a normal distribution with mean

μ_{x̄_{1} – x̄_{2}} = μ_{1} – μ_{2}

and standard error

σ_{x̄_{1} – x̄_{2}} = √(σ_{1}²/n_{1} + σ_{2}²/n_{2}).

Notice that the variance of the sampling distribution is the sum of the variances of the individual sampling distributions.
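The sum-of-variances rule can be checked with a small simulation. The sketch below is only an illustration: the population means, standard deviations, sample sizes, and the function names `standard_error` and `simulate_differences` are all hypothetical choices, not values from the text. It draws many pairs of independent samples and compares the observed spread of the differences of sample means with the formula.

```python
import math
import random
import statistics

def standard_error(sigma1, n1, sigma2, n2):
    """Standard error of xbar1 - xbar2 for two independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2, trials, seed=0):
    """Draw `trials` pairs of independent normal samples and return
    the list of differences of their sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        mean1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return diffs

# Hypothetical populations: mu1 = 10, sigma1 = 3; mu2 = 8, sigma2 = 4.
diffs = simulate_differences(10, 3, 40, 8, 4, 50, trials=2000)
theoretical_se = standard_error(3, 40, 4, 50)
empirical_se = statistics.stdev(diffs)

print(f"theoretical standard error: {theoretical_se:.4f}")
print(f"empirical standard error:   {empirical_se:.4f}")
print(f"mean of the differences:    {statistics.fmean(diffs):.4f}")
```

With enough trials, the empirical standard error should land close to the theoretical one, and the mean of the differences close to μ_{1} – μ_{2} = 2.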

Because the sampling distribution for x̄_{1} – x̄_{2} is a normal distribution, we can use the z-test to test the difference between two population means.

A two-sample z-test can be used to test the difference between two population means when a large sample, at least 30, is randomly selected from each population and the samples are independent.

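The test statistic is x̄_{1} – x̄_{2} and the standardized test statistic is

z = ((x̄_{1} – x̄_{2}) – (μ_{1} – μ_{2})) / σ_{x̄_{1} – x̄_{2}}, where σ_{x̄_{1} – x̄_{2}} = √(σ_{1}²/n_{1} + σ_{2}²/n_{2}).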

When the samples are large, we can use s_{1} and s_{2} in place of σ_{1} and σ_{2}.

If the samples are not large, we can still use a two-sample z-test provided the populations are normally distributed and the population standard deviations are known.
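When only raw data are available, the sample standard deviations can be computed first and then substituted for σ_{1} and σ_{2}. The following is a minimal sketch under that large-sample assumption. The function name `z_from_samples`, the parameter `null_diff` (the hypothesized value of μ_{1} – μ_{2}), and the randomly generated data are all hypothetical.

```python
import math
import random
import statistics

def z_from_samples(sample1, sample2, null_diff=0.0):
    """Standardized test statistic computed from raw data, using the
    sample standard deviations s1 and s2 in place of sigma1 and sigma2."""
    n1, n2 = len(sample1), len(sample2)
    if n1 < 30 or n2 < 30:
        # With small samples, s cannot simply stand in for sigma.
        raise ValueError("each sample needs at least 30 values")
    s1 = statistics.stdev(sample1)
    s2 = statistics.stdev(sample2)
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    difference = statistics.fmean(sample1) - statistics.fmean(sample2)
    return (difference - null_diff) / standard_error

# Hypothetical measurements for two groups (generated, not real data).
rng = random.Random(1)
line_a = [rng.gauss(5.0, 1.2) for _ in range(40)]
line_b = [rng.gauss(5.5, 1.0) for _ in range(45)]

z = z_from_samples(line_a, line_b)
print(f"z = {z:.3f}")
```

The size check mirrors the "at least 30" rule above. The other case, small samples from normal populations with known standard deviations, would need the known σ values passed in instead.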

One more thing. If the null hypothesis states μ_{1} = μ_{2}, μ_{1} ≤ μ_{2}, or μ_{1} ≥ μ_{2}, then μ_{1} = μ_{2} is assumed and the expression μ_{1} – μ_{2} is equal to 0.
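The standardized test statistic can be written as a short function. This is a sketch, not a library routine: the name `two_sample_z` and its parameters are hypothetical, and the inputs are made-up numbers chosen so the arithmetic is easy to verify by hand.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Standardized test statistic for the difference of two means.

    null_diff is the value of mu1 - mu2 assumed under the null hypothesis.
    It is 0 when H0 states mu1 = mu2, mu1 <= mu2, or mu1 >= mu2.
    """
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - null_diff) / standard_error

# Hypothetical summary statistics:
# standard error = sqrt(9/36 + 16/64) = sqrt(0.5), so z = 2 / sqrt(0.5).
z = two_sample_z(mean1=10, sd1=3, n1=36, mean2=8, sd2=4, n2=64)
print(round(z, 4))  # 2.8284
```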

Example. The amount of a certain element in blood is different for men and women. A sample of 75 male blood donors was tested, and their blood was found to contain a mean concentration of 28 parts per million (ppm) of this element with a standard deviation of 14.1 ppm. A smaller sample of 50 female blood donors was also tested, and their blood was found to contain a mean concentration of 33 ppm with a standard deviation of 9.5 ppm. Do these data support the claim that the population means of concentrations of the element are the same for men and women? Use α = 0.01 and α = 0.05.

Claim → μ_{1} = μ_{2}. This is the null hypothesis, H_{0}.

H_{a}: μ_{1} ≠ μ_{2}. This is a two-tailed test.

Sample 1 – Male blood donors.

n_{1} = 75

x̄_{1} = 28

s_{1} = 14.1

Sample 2 – Female blood donors.

n_{2} = 50

x̄_{2} = 33

s_{2} = 9.5

Find μ_{1} – μ_{2} and σ_{x̄_{1} – x̄_{2}}.

1. μ_{1} – μ_{2} = 0

2. σ_{x̄_{1} – x̄_{2}} = √(s_{1}²/n_{1} + s_{2}²/n_{2}) = √(14.1²/75 + 9.5²/50) ≈ 2.1109

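Method 1. Find z-score and use rejection regions.

z = ((x̄_{1} – x̄_{2}) – (μ_{1} – μ_{2})) / σ_{x̄_{1} – x̄_{2}} = ((28 – 33) – 0) / √(14.1²/75 + 9.5²/50) ≈ –2.3687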

Rejection regions.

If α = 0.01, two-tailed test, the critical values are ±2.576 (positive and negative because it’s a two-tailed test). So if our z-score is less than –2.576 or greater than 2.576, we will reject the null. Our z-score is –2.3687. This is not in our rejection regions; therefore, at α = 0.01, we will fail to reject the null.

If α = 0.05, two-tailed test, the critical values are ±1.960. Our z-score is –2.3687, which is inside our rejection region (less than –1.960); therefore, at α = 0.05, we will reject our null and accept the alternate hypothesis.
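The rejection-region method for this example can be reproduced in a few lines. This sketch assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`. The helper names `two_tailed_critical_value` and `in_rejection_region` are hypothetical.

```python
import math
from statistics import NormalDist

# Summary statistics from the blood-donor example.
n1, mean1, s1 = 75, 28.0, 14.1   # male donors
n2, mean2, s2 = 50, 33.0, 9.5    # female donors

standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = ((mean1 - mean2) - 0.0) / standard_error   # H0 assumes mu1 - mu2 = 0

def two_tailed_critical_value(alpha):
    """Positive critical value z_c that leaves area alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def in_rejection_region(z_score, alpha):
    """True when z_score lies below -z_c or above +z_c."""
    return abs(z_score) > two_tailed_critical_value(alpha)

for alpha in (0.01, 0.05):
    z_c = two_tailed_critical_value(alpha)
    decision = "reject H0" if in_rejection_region(z, alpha) else "fail to reject H0"
    print(f"alpha = {alpha}: critical values ±{z_c:.3f}, z = {z:.4f}, {decision}")
```

The two decisions differ because –2.3687 lies between the two negative critical values, –2.576 and –1.960.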

Method 2. Find z-score and then use P-value.

Find P-value. The area to the left of z = –2.3687 under the standard normal curve is about 0.0089.

Remember to double this value in order to get the P-value for two tails: P ≈ 0.018.

Compare P-value to α.

If α = 0.01, since P > α, we will fail to reject the null.

If α = 0.05, P ≤ α, therefore we will reject our null and accept the alternate hypothesis.
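The P-value method can be sketched the same way. The code below assumes Python 3.8+ for `statistics.NormalDist`. The function name `two_tailed_p_value` is hypothetical. It doubles the one-tail area, as described above, and then compares P with each α.

```python
import math
from statistics import NormalDist

def two_tailed_p_value(z_score):
    """Total area in both tails beyond |z_score| under the standard normal curve."""
    one_tail_area = NormalDist().cdf(-abs(z_score))
    return 2 * one_tail_area   # double the one-tail area for a two-tailed test

# z-score for the blood-donor example (mu1 - mu2 = 0 under H0).
z = (28.0 - 33.0) / math.sqrt(14.1**2 / 75 + 9.5**2 / 50)
p_value = two_tailed_p_value(z)

print(f"z = {z:.4f}, P = {p_value:.4f}")
for alpha in (0.01, 0.05):
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {decision}")
```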

Method 3. Use the two-sample z-test on a TI graphing calculator (2-SampZTest on the TI-83/84). Take care entering the values.

Compare P-value to α.

If α = 0.01, since P > α, we will fail to reject the null.

If α = 0.05, P ≤ α, therefore we will reject our null and accept the alternate hypothesis.

Nice.

Example. The American Automobile Association claims that the average daily meal and lodging costs while vacationing in Florida are greater than the same average costs while vacationing in Delaware. The two samples are independent. At α = 0.05, is there enough evidence to support the claim?

Florida: mean = $252, standard deviation = $22, number in sample = 150

Delaware: mean = $242, standard deviation = $18, number in sample = 200

Claim → μ_{1} > μ_{2}. This is the alternate hypothesis, H_{a}. Right-tail test.

H_{0}: μ_{1} ≤ μ_{2}

Sample 1 – Florida.

n_{1} = 150

x̄_{1} = 252

s_{1} = 22

Sample 2 – Delaware.

n_{2} = 200

x̄_{2} = 242

s_{2} = 18

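Find the z-score and P-value.

z = ((252 – 242) – 0) / √(22²/150 + 18²/200) ≈ 4.54. The area to the right of z = 4.54 is less than 0.0001, so P < 0.0001.

Compare P-value to α.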

If α = 0.05, P ≤ α, therefore we will reject our null and accept the alternate hypothesis.
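A single helper can cover left-tail, right-tail, and two-tailed tests by choosing which tail area to report. This is a sketch with hypothetical names (`z_test_from_stats`, `tail`, `null_diff`), assuming Python 3.8+ for `statistics.NormalDist`. It is applied here to the Florida and Delaware figures.

```python
import math
from statistics import NormalDist

def z_test_from_stats(mean1, s1, n1, mean2, s2, n2, tail, null_diff=0.0):
    """Return (z, p_value) for a two-sample z-test from summary statistics.

    tail is 'left' (Ha: mu1 < mu2), 'right' (Ha: mu1 > mu2),
    or 'two' (Ha: mu1 != mu2).
    """
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    z = ((mean1 - mean2) - null_diff) / standard_error
    std_normal = NormalDist()
    if tail == "left":
        p_value = std_normal.cdf(z)            # area to the left of z
    elif tail == "right":
        p_value = std_normal.cdf(-z)           # area to the right of z, by symmetry
    elif tail == "two":
        p_value = 2 * std_normal.cdf(-abs(z))  # both tails
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value

# Sample 1 is Florida, sample 2 is Delaware; the claim mu1 > mu2 is Ha.
z, p_value = z_test_from_stats(252, 22, 150, 242, 18, 200, tail="right")
alpha = 0.05
print(f"z = {z:.2f}, P = {p_value:.2e}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

Using `cdf(-z)` for the right tail, rather than `1 - cdf(z)`, avoids losing precision when the tail area is very small, as it is here.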

Nice.

Example. A study of seat belt use involved children who were hospitalized as a result of motor vehicle crashes. For a group of 123 children who were wearing seat belts, the number of days spent in the ICU has a mean of 0.83 and a standard deviation of 1.77. For a group of 290 children who were not wearing seat belts, the number of days spent in the ICU has a mean of 1.39 and a standard deviation of 3.06. At the 0.01 significance level, test the claim that the population of children wearing seat belts has a lower mean number of days spent in the ICU.

Claim → μ_{1} < μ_{2}. This is the alternate hypothesis, H_{a}. Left-tail test.

H_{0}: μ_{1} ≥ μ_{2}

Sample 1 – Wearing seatbelts.

n_{1} = 123

x̄_{1} = 0.83

s_{1} = 1.77

Sample 2 – Not wearing seatbelts.

n_{2} = 290

x̄_{2} = 1.39

s_{2} = 3.06

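Find the z-score and P-value.

z = ((0.83 – 1.39) – 0) / √(1.77²/123 + 3.06²/290) ≈ –2.33. The area to the left of z = –2.33 is about 0.0099, so P ≈ 0.0099.

Compare P-value to α.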

If α = 0.01, P ≤ α, therefore we will reject our null and accept the alternate hypothesis.
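For a left-tail test the rejection region lies entirely below a single negative critical value. The sketch below assumes Python 3.8+ for `statistics.NormalDist`, and its variable names are illustrative. It checks the seat-belt example both ways, by rejection region and by P-value. The result is close to the boundary, so the two checks are worth seeing side by side.

```python
import math
from statistics import NormalDist

# Summary statistics from the seat-belt example.
n1, mean1, s1 = 123, 0.83, 1.77   # wearing seat belts
n2, mean2, s2 = 290, 1.39, 3.06   # not wearing seat belts
alpha = 0.01

standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
z = ((mean1 - mean2) - 0.0) / standard_error   # H0 assumes mu1 - mu2 = 0

# Left-tail test: all of alpha sits in the left tail.
critical_value = NormalDist().inv_cdf(alpha)   # a negative number
p_value = NormalDist().cdf(z)                  # area to the left of z

reject_by_region = z < critical_value
reject_by_p_value = p_value <= alpha

print(f"z = {z:.4f}, critical value = {critical_value:.4f}, P = {p_value:.4f}")
print("reject H0" if reject_by_region and reject_by_p_value else "fail to reject H0")
```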

α = 0.01 is considered a stringent test of a claim.

Stringent means rigorous: demanding strict attention to rules and procedures, as in “stringent safety measures.”

Nice.

Motor vehicle crashes are the leading cause of death for 15- to 20-year-olds in the United States.

Teens have a higher fatality rate in motor vehicle crashes than any other age group. There are many reasons: while teens are learning the new skills needed for driving, many frequently engage in high-risk behaviors, such as speeding or driving after using alcohol or drugs. Studies have also shown that teens may be easily distracted while driving. One key reason for high traffic fatalities among this age group is that they have lower safety belt use rates than adults.