Bowman’s Website

April 29, 2009

Statistics Notes — Linear Regression

Filed under: Statistics — bowman @ 8:28 pm

A regression line, or line of best fit, allows us to use the explanatory variable x to make predictions for the response variable y.

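The equation of a regression line for an independent variable x and a dependent variable y is

\[ \hat{y} = mx + b \]

where \(\hat{y}\) is the predicted y-value for a given x-value.

The slope m and y-intercept b are given by

\[ m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} \]

and

\[ b = \bar{y} - m\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n} \]

where \(\bar{y}\) is the mean of the y-values in the data set, \(\bar{x}\) is the mean of the x-values, and n is the number of data pairs. The regression line always passes through the point \((\bar{x}, \bar{y})\).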
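For anyone who wants to try the slope and y-intercept formulas on a computer, here is a minimal Python sketch. The function name `regression_line` is my own, and the four test points are made up (they lie exactly on y = 2x + 1), so this is only a sanity check, not part of the example data.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: y-bar minus m times x-bar
    b = sum_y / n - m * sum_x / n
    return m, b


# Made-up points on the line y = 2x + 1 (hypothetical, for checking only).
m, b = regression_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)
```

Because the points sit exactly on a line, the function should hand back that line's slope and intercept. Plugging the mean of x into the result should also give the mean of y, which is the "passes through the means" property.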

Example. Find the equation of the regression line for the advertising expenditures and company sales data.

| Advertising expenses (1000s of $), x | Company sales (1000s of $), y |
|---|---|
| 2.4 | 225 |
| 1.6 | 184 |
| 2.0 | 220 |
| 2.6 | 240 |
| 1.4 | 180 |
| 1.6 | 184 |
| 2.0 | 186 |
| 2.2 | 215 |

Let’s look at the scatter plot.


Guidelines.

1. Find the sum of the x values.

2. Find the sum of the y values.

3. Multiply each x value by its corresponding y value and find the sum.

4. Square each x value and find the sum.

5. Square each y value and find the sum. Actually, we don’t need this one for the regression line, but it was fun finding it.

6. Use the first four sums to calculate the slope and y-intercept.
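The guidelines above can be sketched in a few lines of Python. The variable names are my own; the data are the eight advertising and sales pairs from the example.

```python
# Advertising expenses and company sales, both in 1000s of dollars.
x = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
y = [225, 184, 220, 240, 180, 184, 186, 215]

sum_x = sum(x)                               # step 1
sum_y = sum(y)                               # step 2
sum_xy = sum(a * b for a, b in zip(x, y))    # step 3
sum_x2 = sum(a * a for a in x)               # step 4
sum_y2 = sum(b * b for b in y)               # step 5

# Round the decimal sums to hide tiny floating-point noise.
print(round(sum_x, 2), sum_y, round(sum_xy, 2), round(sum_x2, 2), sum_y2)
```

The printed values should agree with the sums in the bottom row of the table.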

| Advertising expenses (1000s of $), x | Company sales (1000s of $), y | xy | x² | y² |
|---|---|---|---|---|
| 2.4 | 225 | 540 | 5.76 | 50625 |
| 1.6 | 184 | 294.4 | 2.56 | 33856 |
| 2.0 | 220 | 440 | 4 | 48400 |
| 2.6 | 240 | 624 | 6.76 | 57600 |
| 1.4 | 180 | 252 | 1.96 | 32400 |
| 1.6 | 184 | 294.4 | 2.56 | 33856 |
| 2.0 | 186 | 372 | 4 | 34596 |
| 2.2 | 215 | 473 | 4.84 | 46225 |
| ∑x = 15.8 | ∑y = 1634 | ∑xy = 3289.8 | ∑x² = 32.44 | ∑y² = 337,558 |

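Using these sums and n = 8, the slope is

\[ m = \frac{8(3289.8) - (15.8)(1634)}{8(32.44) - (15.8)^2} = \frac{501.2}{9.88} \approx 50.72874 \]

and the y-intercept is

\[ b = \frac{1634}{8} - m\,\frac{15.8}{8} = 204.25 - 1.975\,m \approx 104.06073 \]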
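Here is the same arithmetic as a short Python sketch, starting from the sums rather than the raw data. The variable names are mine; the numbers come straight from the bottom row of the table.

```python
n = 8
sum_x, sum_y, sum_xy, sum_x2 = 15.8, 1634, 3289.8, 32.44

# Slope: (n*Sxy - Sx*Sy) / (n*Sx2 - Sx^2)
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept: mean of y minus m times mean of x
b = sum_y / n - m * sum_x / n

print(round(m, 3), round(b, 3))
```

Note that the sum of the squared y-values never appears, which is why guideline 5 was optional here.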

Here’s what we have…

The regression line is

\[ \hat{y} = 50.729x + 104.061 \]

Now check this out.


We can also use our LinReg feature. In this screen a is the slope m, and b is the y-intercept.


Example. Use the regression line in the previous example to predict the expected company sales (in 1000s of dollars) for the following advertising expenditures (in 1000s of dollars).

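a. $1500, which is 1.5 thousand dollars

Carrying a few extra decimal places in m and b,

\[ \hat{y} = 50.72874(1.5) + 104.06073 \approx 180.154 \]

When the advertising expenditures are $1500, the company sales are about $180,154.

b. $2800, which is 2.8 thousand dollars

\[ \hat{y} = 50.72874(2.8) + 104.06073 \approx 246.101 \]

When the advertising expenditures are $2800, the company sales are about $246,101.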
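The two predictions can be reproduced with a tiny Python function. This is only a sketch: the name `predict_sales` is my own, and the slope and intercept are the values from the worked example, rounded to five decimal places.

```python
M = 50.72874   # slope from the worked example (rounded)
B = 104.06073  # y-intercept from the worked example (rounded)


def predict_sales(advertising):
    """Predicted sales (1000s of $) for advertising spend (1000s of $)."""
    return M * advertising + B


for spend in (1.5, 2.8):
    print(spend, round(predict_sales(spend), 3))
```

Multiply the results by 1000 to get dollars.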

An important consideration… Prediction values are meaningful only for x-values in or close to the range of the data. The x-values in the original data set range from 1.4 to 2.6. So, it would not be appropriate to use the regression line to predict company sales for advertising expenditures such as 0.5 ($500) or 5.0 ($5000).
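One way to keep ourselves honest about this is to check an x-value against the data range before predicting. The sketch below is my own illustration: the function name and the `margin` of 0.25 (what counts as "close to" the range) are assumptions, not something the textbook defines.

```python
# Advertising expenses from the example, in 1000s of dollars.
x = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]


def in_prediction_range(value, data=x, margin=0.25):
    """True if value is inside the data range, give or take margin.

    margin is an arbitrary, hypothetical allowance for "close to".
    """
    return min(data) - margin <= value <= max(data) + margin


for value in (1.5, 2.8, 0.5, 5.0):
    print(value, in_prediction_range(value))
```

With this margin, 1.5 and 2.8 pass the check while 0.5 and 5.0 do not, which matches the judgment calls made in the examples.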


A long alternative…


April 28, 2009

Statistics Notes — Correlation

Filed under: Statistics — bowman @ 6:56 pm

In statistics, correlation (often measured as a correlation coefficient) indicates the strength and direction of a linear relationship between two random variables.

The data can be represented by the ordered pairs (x, y) where x is the independent, or explanatory, variable and y is the dependent, or response, variable.

A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables. In a scatter plot, the ordered pairs (x, y) are graphed as points in a coordinate plane.

The following scatter plots show several types of correlations.


Interpreting correlation using a scatter plot can be subjective. A more precise way to measure the type and strength of a linear correlation between two variables is to calculate the correlation coefficient.

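The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is

\[ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} \]

where n is the number of pairs of data.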

The population correlation coefficient is represented by ρ (the lowercase Greek letter rho).

The range of the correlation coefficient is –1 to 1.

If x and y have a strong positive linear correlation, r is close to 1.

If x and y have a strong negative linear correlation, r is close to –1.

If there is no linear correlation or a weak linear correlation, r is close to 0.
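To see the ends of that range in action, here is a minimal Python sketch of the sample correlation coefficient. The function name is my own, and the two small data sets are made up: one lies exactly on a rising line and the other exactly on a falling line.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r computed from the five sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # One square root of the product equals the product of the two roots.
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical points: y = 2x (rising) and y = 10 - 2x (falling).
r_up = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
r_down = correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Perfectly linear data should give r at the extremes, 1 for the rising line and –1 for the falling one. Real data land somewhere in between.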


Example. Calculate the correlation coefficient for the advertising expenditures and company sales data. What can you conclude?

| Advertising expenses (1000s of $), x | Company sales (1000s of $), y |
|---|---|
| 2.4 | 225 |
| 1.6 | 184 |
| 2.0 | 220 |
| 2.6 | 240 |
| 1.4 | 180 |
| 1.6 | 184 |
| 2.0 | 186 |
| 2.2 | 215 |

Let’s look at the scatter plot.


Guidelines.

1. Find the sum of the x values.

2. Find the sum of the y values.

3. Multiply each x value by its corresponding y value and find the sum.

4. Square each x value and find the sum.

5. Square each y value and find the sum.

6. Use these five sums to calculate the correlation coefficient.

| Advertising expenses (1000s of $), x | Company sales (1000s of $), y | xy | x² | y² |
|---|---|---|---|---|
| 2.4 | 225 | 540 | 5.76 | 50625 |
| 1.6 | 184 | 294.4 | 2.56 | 33856 |
| 2.0 | 220 | 440 | 4 | 48400 |
| 2.6 | 240 | 624 | 6.76 | 57600 |
| 1.4 | 180 | 252 | 1.96 | 32400 |
| 1.6 | 184 | 294.4 | 2.56 | 33856 |
| 2.0 | 186 | 372 | 4 | 34596 |
| 2.2 | 215 | 473 | 4.84 | 46225 |
| ∑x = 15.8 | ∑y = 1634 | ∑xy = 3289.8 | ∑x² = 32.44 | ∑y² = 337,558 |

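Using these sums and n = 8, the correlation coefficient is

\[ r = \frac{8(3289.8) - (15.8)(1634)}{\sqrt{8(32.44) - (15.8)^2}\,\sqrt{8(337{,}558) - (1634)^2}} = \frac{501.2}{\sqrt{9.88}\,\sqrt{30{,}508}} \approx 0.913 \]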

Because r is close to 1, there is a strong positive linear correlation. As the amount of spending on advertising increases, the company sales also increase.
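The same calculation, starting from the five sums, looks like this in Python. It is a quick sketch with my own variable names; the numbers are the sums from the table.

```python
import math

n = 8
sum_x, sum_y, sum_xy = 15.8, 1634, 3289.8
sum_x2, sum_y2 = 32.44, 337558

numerator = n * sum_xy - sum_x * sum_y
denominator = (
    math.sqrt(n * sum_x2 - sum_x ** 2)
    * math.sqrt(n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator
print(round(r, 3))
```

This time all five sums are used, including the sum of the squared y-values.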


Find the sums.

These are your 5 sums. Use these and n = 8 to determine the correlation coefficient.


If your calculator is like mine, you need to turn on a feature first. I didn’t know until I saw that r was missing from the LinReg screen. r should be below the a and b.

Here’s what we need to do. Turn on DiagnosticOn, which is within our CATALOG. Now we have the ability to find r using our lists.


Example. Calculate the correlation coefficient for the income level and donating percent data. What can you conclude?

| Income level (in 1000s of $), x | Donating percent, y |
|---|---|
| 42 | 9 |
| 48 | 10 |
| 50 | 8 |
| 59 | 5 |
| 65 | 6 |
| 72 | 3 |

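Find the five sums and use the correlation coefficient formula.

∑x = 336, ∑y = 41, ∑xy = 2159, ∑x² = 19,458, ∑y² = 315

Using these sums and n = 6, the correlation coefficient is

\[ r = \frac{6(2159) - (336)(41)}{\sqrt{6(19{,}458) - (336)^2}\,\sqrt{6(315) - (41)^2}} = \frac{-822}{\sqrt{3852}\,\sqrt{209}} \approx -0.916 \]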

Use the Linear Regression feature to check. Nice work.

Because r is close to –1, there is a strong negative linear correlation. As the level of income rises, the percentage of donating decreases.

April 27, 2009

Statistics Assignment — Handout 8410

Filed under: Statistics — bowman @ 9:28 am

stats-8410


Lesson Plans — Week 34 — April 27 – May 1, 2009

Filed under: Lesson Plans — bowman @ 7:18 am


Pre-Calculus
Standards: Models for Real World Phenomena; Algebraic Functions; Trigonometric Functions; Sequences and Series
Monday, April 27: Test
Tuesday, April 28: 12.2 Solve counting problems using the multiplication principle; Solve counting problems using permutations; Solve counting problems using combinations; Solve counting problems using permutations involving n non-distinct objects
Wednesday, April 29: 12.2 Solve counting problems using the multiplication principle; Solve counting problems using permutations; Solve counting problems using combinations; Solve counting problems using permutations involving n non-distinct objects
Thursday, April 30: 12.3 Construct probability models; Compute probabilities of equally likely outcomes; Utilize the addition rule to find probabilities; Utilize the complement rule to find probabilities; Compute probabilities using permutations and combinations
Friday, May 1: 12.3 Construct probability models; Compute probabilities of equally likely outcomes; Utilize the addition rule to find probabilities; Utilize the complement rule to find probabilities; Compute probabilities using permutations and combinations

Geometry
Standards: Number and Operations; Algebra; Geometry; Measurement; Data Analysis and Probability
Monday, April 27: 10.4 and 10.6 Find the surface area of a pyramid; Find the surface area of a cone; Find the volume of a pyramid; Find the volume of a cone
Tuesday, April 28: 10.7 Find the surface area and volume of a sphere
Wednesday, April 29: 10.8 Find relationships between the ratios of the areas and volumes of similar solids; Review Perimeter, Area, Volume
Thursday, April 30: Test
Friday, May 1: Test

Statistics
Standards: Experimental Design; Data Analysis
Monday, April 27: Review Chapter 8
Tuesday, April 28: Test
Wednesday, April 29: 9.1 An introduction to linear correlation, independent and dependent variables, and the types of correlation; How to find a correlation coefficient; How to perform a hypothesis test for a population correlation coefficient
Thursday, April 30: 9.1 An introduction to linear correlation, independent and dependent variables, and the types of correlation; How to find a correlation coefficient; How to perform a hypothesis test for a population correlation coefficient
Friday, May 1: 9.1 An introduction to linear correlation, independent and dependent variables, and the types of correlation; How to find a correlation coefficient; How to perform a hypothesis test for a population correlation coefficient

April 24, 2009

Girl, 12, retires 18 boys in perfect game

Filed under: Uncategorized — bowman @ 12:50 pm

April 23, 2009

Statistics Assignment — Book 8.4

Filed under: Statistics — bowman @ 7:39 am


April 22, 2009

TMTA Geometry 2009

Filed under: TMTA — bowman @ 12:37 pm


TMTA Pre-Calculus 2009

Filed under: TMTA — bowman @ 12:37 pm


TMTA Statistics 2009

Filed under: TMTA — bowman @ 12:37 pm


Statistics Assignment — Book 8.3

Filed under: Statistics — bowman @ 11:13 am

