Bowman’s Website

November 24, 2009

Statistics Notes — Multinomial Distribution

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 8:07 am
Multinomial Distribution
A multinomial experiment is a statistical experiment that has the following properties:
1. The experiment consists of n repeated trials.
2. Each trial has a discrete number of possible outcomes.
3. On any given trial, the probability that a particular outcome will occur is constant.
4. The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
.
Multinomial Formula. Suppose a multinomial experiment consists of n trials, and each trial can result in any of r possible outcomes: E1, E2, . . . , Er.
Each possible outcome can occur with probabilities p1, p2, . . . , pr.
Then, the probability that E1 occurs n1 times, E2 occurs n2 times, . . . , and Er occurs nr times is
.
P = \sf \frac {n!}{ n_1! * n_2! * ... * n_r! } * ( p_1^{n_1} * p_2^{n_2} * ... * p_r^{n_r} )
where n = n1 + n2 + . . . + nr.
.
Example: Suppose a card is drawn randomly from an ordinary deck of playing cards, and then put back in the deck. This experiment is repeated five times.
What is the probability of drawing 1 spade, 1 heart, 1 diamond, and 2 clubs?
To solve this problem, we apply the multinomial formula. We know the following:
The experiment consists of 5 trials, so n = 5.
The 5 trials produce 1 spade, 1 heart, 1 diamond, and 2 clubs; so n1 = 1, n2 = 1, n3 = 1, and n4 = 2.
On any particular trial, the probability of drawing a spade, heart, diamond, or club is 0.25, 0.25, 0.25, and 0.25, respectively.
Thus, p1 = 0.25, p2 = 0.25, p3 = 0.25, and p4 = 0.25.
We plug these inputs into the multinomial formula, as shown below:
P = \sf \frac {n!}{ n_1! * n_2! * ... * n_r! } * ( p_1^{n_1} * p_2^{n_2} * ... * p_r^{n_r} )
P = \sf \frac {5!}{1! * 1! * 1! * 2!} * [ (0.25)^1 * (0.25)^1 * (0.25)^1 * (0.25)^2 ]
P = 0.05859
Thus, if we draw five cards with replacement from an ordinary deck of playing cards, the probability of drawing 1 spade, 1 heart, 1 diamond, and 2 clubs is 0.05859.
.
Example: Suppose we have a bowl with 10 marbles – 2 red marbles, 3 green marbles, and 5 blue marbles. We randomly select 4 marbles from the bowl, with replacement. What is the probability of selecting 2 green marbles and 2 blue marbles?
Solution: To solve this problem, we apply the multinomial formula. We know the following:
The experiment consists of 4 trials, so n = 4.
The 4 trials produce 0 red marbles, 2 green marbles, and 2 blue marbles; so n_red = 0, n_green = 2, and n_blue = 2.
On any particular trial, the probability of drawing a red, green, or blue marble is 0.2, 0.3, and 0.5, respectively. Thus, p_red = 0.2, p_green = 0.3, and p_blue = 0.5.
We plug these inputs into the multinomial formula, as shown below:
P = \sf \frac {n!}{ n_1! * n_2! * ... * n_r! } * ( p_1^{n_1} * p_2^{n_2} * ... * p_r^{n_r} )
P = \sf \frac {4!}{0! * 2! * 2!} * [ (0.2)^0 * (0.3)^2 * (0.5)^2 ]
P = 0.135
Thus, if we draw 4 marbles with replacement from the bowl, the probability of drawing 0 red marbles, 2 green marbles, and 2 blue marbles is 0.135.
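Both examples above can be double-checked with a few lines of Python. This is only a sketch: the function name multinomial_probability is made up for illustration, and it assumes Python 3.8 or later (for math.prod).

```python
from math import factorial, prod

def multinomial_probability(counts, probs):
    """Probability that outcome i occurs counts[i] times, given per-trial probabilities probs[i]."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)  # n = n1 + n2 + ... + nr
    # Number of distinct orderings: n! / (n1! * n2! * ... * nr!)
    arrangements = factorial(n)
    for k in counts:
        arrangements //= factorial(k)
    # Probability of any single ordering: p1^n1 * p2^n2 * ... * pr^nr
    one_ordering = prod(p ** k for p, k in zip(probs, counts))
    return arrangements * one_ordering

# Card example: 1 spade, 1 heart, 1 diamond, 2 clubs in 5 draws with replacement
cards = multinomial_probability([1, 1, 1, 2], [0.25, 0.25, 0.25, 0.25])
# Marble example: 0 red, 2 green, 2 blue in 4 draws with replacement
marbles = multinomial_probability([0, 2, 2], [0.2, 0.3, 0.5])

print(round(cards, 5))    # 0.05859
print(round(marbles, 3))  # 0.135
```

Splitting the calculation into "number of orderings" times "probability of one ordering" mirrors the two factors in the multinomial formula.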
.
Example: From Baseball Reference . com, the hitting numbers for Dustin Pedroia (2008), Go Boston!
726 Plate Appearances
653 Official At Bats
213 Hits; Batting Average .326; The probability of Dustin getting on base because of a hit in a plate appearance is p = 213 / 726 = 0.293
57 Walks (50 BB and 7 HBP); p = 57 / 726 = 0.079
140 Singles; p = 140 / 726 = 0.193
54 Doubles; p = 54 / 726 = 0.074
2 Triples; p = 2 / 726 = 0.003
17 Home Runs; p = 17 / 726 = 0.023
456 Batting Events that might not have gone the way Dustin would have liked (this includes 16 Sacrifices); p = 456 / 726 = 0.628
P(Walk) + P(Single) + P(Double) + P(Triple) + P(Home Run) + P(Unfortunate Batting Experience) = 0.079 + 0.193 + 0.074 + 0.003 + 0.023 + 0.628 = 1
The probability that Dustin Pedroia, Go Boston, hits for the cycle (gets a single, double, triple, and home run, in any order) in his next four plate appearances is
P = \sf \frac {n!}{ n_1! * n_2! * ... * n_r! } * ( p_1^{n_1} * p_2^{n_2} * ... * p_r^{n_r} )
P = \sf \frac {4!}{0! * 1! * 1! * 1! * 1! * 0!} * [ (0.079)^0 * (0.193)^1 * (0.074)^1 * (0.003)^1 * (0.023)^1 * (0.628)^0 ] = 0.0000236510

November 23, 2009

Statistics Notes — Hypergeometric Distribution

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 8:01 am

The hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement.
.
A hypergeometric distribution is a statistical distribution that has the following properties:
1. A sample of size n is randomly selected without replacement from a population of N items.
2. In the population, M items can be classified as successes, and N – M items can be classified as failures.
.
Consider the following statistical experiment.
We draw cards from a well-shuffled deck with replacement, one card per draw.
We do this 5 times and record whether the outcome is a spade or not. Then this is a binomial experiment.
If we do the same thing without replacement, then it is no longer a binomial experiment.
However, if we are drawing from 100 decks of cards without replacement and record only the first 5 outcomes, this is approximately a binomial experiment. Recall, we talked about this before. If the population is large, then even without replacement, an experiment will be approximately a binomial experiment.
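To see how close "approximately" is, here is a rough Python sketch that compares the 100-deck experiment without replacement against the binomial model. It borrows the hypergeometric formula given later in these notes; the function names are illustrative, and math.comb assumes Python 3.8 or later.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def hypergeometric_pmf(N, M, n, x):
    """Probability of exactly x successes in n draws without replacement."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

decks = 100
N = 52 * decks   # 5200 cards in the combined population
M = 13 * decks   # 1300 of them are spades

rows = []
for x in range(6):
    without_replacement = hypergeometric_pmf(N, M, 5, x)
    with_replacement = binomial_pmf(5, 0.25, x)
    rows.append((x, without_replacement, with_replacement))
    print(x, round(without_replacement, 5), round(with_replacement, 5))

# Largest disagreement between the two models over x = 0..5
max_gap = max(abs(a - b) for _, a, b in rows)
print("largest gap:", max_gap)
```

With 5200 cards, removing a handful barely changes the proportion of spades, which is why the two columns nearly agree.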
.
Consider the following statistical experiment. You have an urn of 10 marbles – 5 red and 5 green. You randomly select 2 marbles without replacement and count the number of red marbles you have selected. This would be a hypergeometric experiment, because the population is too small to treat the draws as approximately binomial.
With the above experiment, the probability of a success changes considerably on every trial. In the beginning, the probability of selecting a red marble is 5/10. If you select a red marble on the first trial, the probability of selecting a red marble on the second trial is 4/9. And if you select a green marble on the first trial, the probability of selecting a red marble on the second trial is 5/9.
.
Notation
The following notation is helpful, when we talk about hypergeometric distributions and hypergeometric probability.
N: The number of items in the population.
M: The number of items in the population that are classified as successes.
n: The number of items in the sample.
x: The number of items in the sample that are classified as successes.
h(N, M, n, x):  hypergeometric probability – the probability that an n-trial hypergeometric experiment results in exactly x successes, when the population consists of N items, M of which are classified as successes.
.
Hypergeometric Formula.
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
.
The hypergeometric distribution has the following properties:
The mean of the distribution is equal to \sf \dfrac {n * M}{N} .
The variance is σ² = \sf \frac {n * M * ( N - M ) * ( N - n )}{[ N^2 * ( N - 1 ) ]} .
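The formula and the two properties above translate directly into Python. The sketch below is not tied to any particular library; the function names are invented for illustration, and math.comb assumes Python 3.8 or later. It is tried out on the urn of 10 marbles (5 red, 5 green, 2 drawn) described earlier.

```python
from math import comb

def hypergeometric_pmf(N, M, n, x):
    """h(N, M, n, x): probability of exactly x successes in a sample of n."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def hypergeometric_mean(N, M, n):
    return n * M / N

def hypergeometric_variance(N, M, n):
    return n * M * (N - M) * (N - n) / (N**2 * (N - 1))

# Urn example: N = 10 marbles, M = 5 red, n = 2 drawn without replacement
urn = [hypergeometric_pmf(10, 5, 2, x) for x in range(3)]
for x, prob in enumerate(urn):
    print(f"P({x} red) = {prob:.4f}")

urn_mean = hypergeometric_mean(10, 5, 2)
urn_variance = hypergeometric_variance(10, 5, 2)
print("mean:", urn_mean, "variance:", round(urn_variance, 4))
```

The three urn probabilities should add to 1, which is a quick sanity check on any probability function like this one.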
.
Example:  Suppose we randomly select 5 cards without replacement from an ordinary deck of playing cards. What is the probability of getting exactly 2 red cards?
Solution: This is a hypergeometric experiment in which we know the following:
N = 52; since there are 52 cards in a deck.
M = 26; since there are 26 red cards in a deck.
n = 5; since we randomly select 5 cards from the deck.
x = 2; since 2 of the cards we select are red.
We plug these values into the hypergeometric formula as follows:
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
h(52, 26, 5, x=2) = \sf \frac {[ _{26}C_2 ] [ _{26}C_{3} ]}{[ _{52}C_5 ]} = \sf \frac {[ 325 ] [ 2600 ]}{[ 2,598,960 ]} = 0.32513
Thus, the probability of randomly selecting 2 red cards is 0.32513.
.
Example:  A batch of 100 computer chips contains 10 defective chips. Five chips are chosen at random, without replacement.
a. Compute the probability mass function of the number of defective chips in the sample.
N = 100
n = 5
M = 10
x = x
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
h(100, 10, 5, x=x) = \sf \frac {[ _{10}C_x ] [ _{90}C_{5-x} ]}{[ _{100}C_5 ]}
.
b. Compute the mean and variance of the number of defective chips in the sample
µ = \sf \dfrac {n * M}{N} = \sf \dfrac {5 * 10}{100} = .5
σ² = \sf \frac {n * M * ( N - M ) * ( N - n )}{[ N^2 * ( N - 1 ) ]} = \sf \frac {5 * 10 * ( 100 - 10 ) * ( 100 - 5 )}{[ 100^2 * ( 100 - 1 ) ]} = 0.4318181818
.
c. Find the probability that the sample contains one defective chip.
N = 100
n = 5
M = 10
x = 1
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
h(100, 10, 5, x=1) = \sf \frac {[ _{10}C_1 ] [ _{90}C_4 ]}{[ _{100}C_5 ]} = 0.339390911
.
d. Find the probability that the sample contains at least one defective chip.
N = 100
n = 5
M = 10
x = 1, 2, 3, 4, 5
h(100, 10, 5, x ≥ 1) = h(100, 10, 5, x=1) + h(100, 10, 5, x=2)  + h(100, 10, 5, x=3)  + h(100, 10, 5, x=4)  + h(100, 10, 5, x=5)
or
h(100, 10, 5, x ≥ 1) = 1 -  h(100, 10, 5, x=0) = 1 – \sf \frac {[ _{10}C_0 ] [ _{90}C_5 ]}{[ _{100}C_5 ]} = 1 – .5837523669 = 0.4162
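Parts a, c, and d of the chip example can be tabulated in one go. A small sketch, assuming Python 3.8 or later for math.comb; the variable names are my own.

```python
from math import comb

N, M, n = 100, 10, 5   # 100 chips, 10 defective, 5 sampled

# Part a: the whole distribution, x = 0, 1, ..., 5 defective chips
pmf = [comb(M, x) * comb(N - M, n - x) / comb(N, n) for x in range(n + 1)]
for x, prob in enumerate(pmf):
    print(x, round(prob, 6))

# Part c: exactly one defective chip
exactly_one = pmf[1]

# Part d: at least one defective chip, computed both ways
direct = sum(pmf[1:])        # h(x=1) + h(x=2) + ... + h(x=5)
complement = 1 - pmf[0]      # 1 - h(x=0)
print(round(exactly_one, 6), round(direct, 4), round(complement, 4))
```

The direct sum and the complement agree, but the complement needs only one term, which is why it is usually the easier route for "at least one" questions.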
.
Example: A club contains 50 members; 20 are men and 30 are women. A committee of 10 members is chosen at random.
a. Compute the probability mass function of the number of women on the committee.
N = 50
n = 10
M = 30
x = x
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
h(50, 30, 10, x=x) = \sf \frac {[ _{30}C_x ] [ _{20}C_{10-x} ]}{[ _{50}C_{10} ]}
.
b. Give the mean and variance of the number of women on the committee.
µ = \sf \dfrac {n * M}{N} = \sf \dfrac {10 * 30}{50} = 6
σ² = \sf \frac {n * M * ( N - M ) * ( N - n )}{[ N^2 * ( N - 1 ) ]} = \sf \frac {10 * 30 * ( 50 - 30 ) * ( 50 - 10 )}{[ 50^2 * ( 50 - 1 ) ]} = 1.959183673
.
c. Give the mean and variance of the number of men on the committee.
µ = \sf \frac {n * M}{N} = \sf \frac {10 * 20}{50} = 4
σ² = \sf \frac {n * M * ( N - M ) * ( N - n )}{[ N^2 * ( N - 1 ) ]} = \sf \frac {10 * 20 * ( 50 - 20 ) * ( 50 - 10 )}{[ 50^2 * ( 50 - 1 ) ]} = 1.959183673
.
d. Find the probability that the committee members are all women.
N = 50
n = 10
M = 30
x = 10
h(N, M, n, x) = \sf \frac {[ _MC_x ] [ _{N-M}C_{n-x} ]}{[ _NC_n ]}
h(50, 30, 10, x=10) = \sf \frac {[ _{30}C_{10} ] [ _{20}C_{0} ]}{[ _{50}C_{10} ]} = \sf \frac {[ 30,045,015 ] [ 1 ]}{[ 10,272,278,170 ]} = 0.0029248638
.
e. Find the probability that the committee members are all the same gender.
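You must find the sum of the probability of all men and the probability of all women.
P(men) + P(women) = h(50, 20, 10, x=10) + h(50, 30, 10, x=10) = \sf \frac {[ _{20}C_{10} ] [ _{30}C_{0} ]}{[ _{50}C_{10} ]} + \sf \frac {[ _{30}C_{10} ] [ _{20}C_{0} ]}{[ _{50}C_{10} ]} = 0.0000179859 + 0.0029248638 = 0.0029428497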
.
Example:  A jar of jellybeans contains 20 yellow jellybeans and 25 red jellybeans. If 5 jellybeans were drawn from the jar randomly, what is the expected number of red jellybeans drawn?
N = 45
n = 5
M = 25
x = x
µ = \sf \frac {n * M}{N} = \sf \frac {5 * 25}{45} = 2.7778
Therefore, if you were to withdraw 5 jellybeans you would expect about 3 to be red.

Statistics Notes — Negative Binomial Distribution

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 8:00 am

We all know what the World Series is in baseball. It is the finals, when the best of the National League plays the best of the American League. Go Boston.
The winner of the series is the first to win four games.
The probability that a team wins its fourth game in the fifth, sixth, or seventh game of the World Series follows a negative binomial distribution.
We are not finding the probability that a team wins four out of five, four out of six, or four out of seven games; we are interested in the probability that its final win comes in a specified game.
This means, if we are interested in the probability of winning their fourth game in the sixth game of the series, we are assuming that they won three of the first five and then they finished by winning the sixth game.
.
The negative binomial distribution is a discrete probability distribution. It can be used to describe the distribution arising from an experiment consisting of a sequence of independent trials, subject to several constraints.
1. The experiment consists of n repeated trials.
2. Each trial can result in just two possible outcomes. We call one of these outcomes a success and the other, a failure.
3. The probability of success is the same on every trial.
4. The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
5. The experiment continues until the desired number of successes has been observed.
.
Suppose we flip a coin repeatedly and count the number of tails (successes). If we continue flipping the coin until it has landed 2 times on tails, we are conducting a negative binomial experiment.
.
Formula:
P(x=x) = \sf [ _{n-1}C_{x-1} ] * p^x * q^{n-x}
where,
n = Number of trials.
x = Number of successes.
p = Probability of success on a single trial.
q = Probability of failure on a single trial (q = 1 - p).
b-(n, p, x=x) = \sf [ _{n-1}C_{x-1} ] * p^x * q^{n-x}
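As a rough illustration, the formula can be written as a short Python function and applied to the World Series question from the top of these notes. The function name is invented, math.comb assumes Python 3.8 or later, and the value p = 0.5 (two evenly matched teams) is an assumption made here, not a figure from the notes.

```python
from math import comb

def negative_binomial_pmf(n, p, x):
    """b-(n, p, x): probability that the x-th success lands exactly on trial n."""
    q = 1 - p
    return comb(n - 1, x - 1) * p**x * q**(n - x)

# ASSUMPTION: evenly matched teams, so the chance of winning any game is 0.5.
p_win = 0.5

# Probability that a team gets its fourth win in game 4, 5, 6, or 7
clinch = {game: negative_binomial_pmf(game, p_win, 4) for game in range(4, 8)}
for game, prob in clinch.items():
    print(f"fourth win comes in game {game}: {prob:.5f}")

series_win = sum(clinch.values())
print("wins the series at all:", series_win)
```

For game 6, the function counts the ways to win three of the first five games and then multiplies by the chance of winning the sixth, which is exactly the reasoning described above.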
.
Example: Find the probability that a man flipping a coin gets the fourth tail on the ninth flip.
n = 9, x = 4, p = .5, q = .5
P(x=4) = b-(9, .5, x=4) = \sf [ _{n-1}C_{x-1} ] * p^x * q^{n-x} = \sf [ _8C_3 ] * (.5)^4 * (.5)^5 = 0.109375
The probability that the coin will land on tails for the fourth time on the ninth coin flip is 0.109375.
.
Example: Find the probability that a person selects a third heart from a standard deck of cards on the tenth selection. Each card is replaced before the next draw.
n = 10, x = 3, p = .25, q = .75
P(x=3) = b-(10, .25, x=3) = \sf [ _9C_2 ] * (.25)^3 * (.75)^7 = 0.0750846863
.
Example: Bobby is a high school basketball player. He is a 70% free throw shooter. That means his probability of making a free throw is 0.70. During the season, what is the probability that Bobby makes his third free throw on his fifth shot?
This assumes he made two of his first four, and then he made the fifth.
n = 5, x = 3, p = .70, q = .30
P(x=3) = b-(5, .70, x=3) = \sf [ _4C_2 ] * (.70)^3 * (.30)^2 = 0.18522
.
The mean and variance are determined using the following formulas.
µ = \sf \dfrac {x}{p}
σ² = \sf \dfrac {xq}{p^2}
.
Example: A standard, fair die is thrown until 3 sixes occur. Let T denote the number of throws.
n = T, x = 3, p = \sf \dfrac{1}{6}, q = \sf \dfrac{5}{6}
a. Find the probability function of T.
P(x=3) = b-(T, \sf \dfrac{1}{6}, x=3) = \sf [ _{T-1}C_2 ] \left ( \dfrac{1}{6} \right )^3 \left ( \dfrac{5}{6} \right )^{T - 3}
b. Find the mean of T.
µ = \sf \dfrac {x}{p} = \sf \dfrac {3}{\dfrac{1}{6}} = 18
c. Find the variance of T.
σ² = \sf \dfrac {xq}{p^2} = \sf \dfrac {3 * \dfrac{5}{6}}{\left ( \dfrac{1}{6} \right )^2} = 90
d. Find the probability that 20 throws will be needed.
n = 20, x = 3, p = \sf \dfrac{1}{6}, q = \sf \dfrac{5}{6}
P(x=3) = b-(20, \sf \dfrac{1}{6}, x=3) = \sf [ _{19}C_2 ] \left ( \dfrac{1}{6} \right )^3 \left ( \dfrac{5}{6} \right )^{17} = 0.0356829849
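The mean and variance formulas for the die example can be checked by brute force: add up T times its probability over a long range of T. This is a sketch only; cutting the sum off at 2000 throws is an arbitrary choice made here (the probabilities beyond that point are negligibly small), and math.comb assumes Python 3.8 or later.

```python
from math import comb

x = 3          # number of sixes we are waiting for
p = 1 / 6      # probability of a six on one throw
q = 1 - p

def pmf(t):
    """Probability that the third six arrives exactly on throw t."""
    return comb(t - 1, x - 1) * p**x * q**(t - x)

# ASSUMPTION: truncating at 2000 throws loses a negligible amount of probability.
throws = range(x, 2000)

total = sum(pmf(t) for t in throws)
mean = sum(t * pmf(t) for t in throws)
variance = sum((t - mean) ** 2 * pmf(t) for t in throws)

print("total probability:", round(total, 6))
print("mean:", round(mean, 4), " formula x/p:", x / p)
print("variance:", round(variance, 4), " formula xq/p^2:", x * q / p**2)
print("P(T = 20):", round(pmf(20), 10))
```

If the formulas are right, the summed values land on 18 and 90, matching parts b and c.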

November 20, 2009

Statistics Assignment — Computer Lab — Chapter 4

Filed under: Statistics, Statistics Assignment — Tags: — bowman @ 8:00 am

Go to the following link and do the required work. Do this work on your own paper and turn this in.

Quiz 1: Problems 1 – 15

November 11, 2009

Statistics Notes — Poisson Distribution Table (partial) — Not Cumulative

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 3:15 pm

This is a partial Poisson Distribution Table. I will use it only to point out that it exists and for one or two examples. I will always use my outstanding graphing utility. Thanks TI.

[Image: partial Poisson distribution table]

November 10, 2009

Statistics — Biography — Simeon Denis Poisson

Filed under: History of Math, Statistics, Statistics Notes — Tags: — bowman @ 3:16 pm


Simeon Denis Poisson developed many novel applications of mathematics for statistics and physics. He was born at Pithiviers on June 21, 1781, and died at Paris on April 25, 1840. His father had been a private soldier, and on his retirement was given a small administrative post in his native village. When the French Revolution broke out, his father assumed the government of the village, and soon became a local dignitary.

He was educated by his father, who prodded him to be a doctor. His uncle offered to teach him medicine, and began by making him prick the veins of cabbage leaves with a lancet. When he had perfected this, he was allowed to practice on humans, but in the first bloodletting he performed by himself, the patient died within a few hours. Although the other physicians assured him that this was not an uncommon occurrence, he vowed he would have nothing more to do with the medical profession. His main weakness was the lack of coordination that had made a career as a surgeon impossible. This weakness followed him in some respects, for drawing mathematical diagrams was quite beyond him.

Upon returning home, he discovered a copy of a question set from the Polytechnic school among the official papers sent to his father. This chance event determined his career. At the age of seventeen he entered the Polytechnic. A memoir on finite differences, which he wrote when only eighteen, was so impressive that it was rapidly published in a prestigious journal. As soon as he had finished his studies he was appointed as a lecturer. Throughout his life he held various scientific posts and professorships. He made the study of mathematics his hobby as well as his business.

Over his life Simeon Poisson wrote between 300 and 400 manuscripts and books on a variety of mathematical topics, including pure mathematics, the application of mathematics to physical problems, the probability of random events, the theory of electrostatics and magnetism (which led the forefront of the new field of quantum mechanics), physical astronomy, and wave theory.

One of Simeon Poisson’s contributions was the development of equations to analyze random events, later dubbed the Poisson Distribution. The fame of this distribution is often attributed to the following story. Many soldiers in the Prussian Army died due to kicks from horses. To determine whether this was due to a random occurrence or the wrath of God, the Czar commissioned the Russian mathematician Ladislaus Bortkiewicz to determine the statistical significance of the events. Fourteen corps were examined, each for twenty years. For over half the corps-year combinations there were no deaths from horse kicks; for the other combinations the number of deaths ranged up to four. Presumably the risk of lethal horse kicks varied over years and corps, yet the overall distribution fit remarkably well to a Poisson distribution.

Statistics Notes — Poisson Distribution

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 3:15 pm

A type of probability distribution that is often useful in describing the number of events that will occur in a specific period of time or in a specific area or volume is the Poisson distribution, named after the 19th-century mathematician Simeon Poisson. Typical examples of random variables for which the Poisson probability distribution provides a good model are

  1. The number of traffic accidents per month at a busy intersection
  2. The number of noticeable surface defects (scratches, dents, etc.) found by quality inspectors on a new automobile
  3. The parts per million of some toxin found in the water or air emission from a manufacturing plant
  4. The number of diseased trees per acre in certain woodland
  5. The number of death claims received per day by an insurance company
  6. The number of unscheduled admissions per day to a hospital

Characteristics of a Poisson Distribution
The Poisson distribution is a discrete probability distribution of a random variable x that satisfies the following conditions.

  1. The experiment consists of counting the number of times, x, that an event occurs in a given interval. The interval can be an interval of time, area, or volume.
  2. The probability of the event occurring is the same for each interval.
  3. The number of occurrences in one interval is independent of the number of occurrences in other intervals.

.

The following notation is helpful, when we talk about the Poisson distribution.
e: A constant equal to approximately 2.71828.
µ: The mean number of successes that occur in a specified region.
x: The actual number of successes that occur in a specified region.
p(µ, x=x): The Poisson probability that exactly x successes occur in a Poisson experiment, when the mean number of successes is µ.

The probability of exactly x occurrences in an interval is

P(x=x) = p(µ, x=x) = \dfrac {\mu^x e^{-\mu}}{x!}

where e is the base of the natural logarithm (approximately 2.71828) and µ is the mean number of occurrences per interval unit.
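For anyone without a graphing utility handy, the Poisson formula is short enough to sketch in Python. The function name poisson_pmf is made up for illustration; the table below uses a mean of 3, the same mean as the intersection example that follows.

```python
from math import exp, factorial

def poisson_pmf(mu, x):
    """p(mu, x): probability of exactly x occurrences when the mean is mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 3
table = [poisson_pmf(mu, x) for x in range(11)]   # x = 0, 1, ..., 10
for x, prob in enumerate(table):
    print(x, round(prob, 4))
```

The printed probabilities do not quite add to 1, because a Poisson random variable has no upper limit and the table stops at x = 10.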

.

Example. The mean number of accidents per month at a certain intersection is 3. What is the probability that in any given month 4 accidents will occur at this intersection?

µ = 3 and x = 4

P(x=4) = p(3, x=4) = \dfrac {3^4 e^{-3}}{ 4!} = 0.1680313557


.

The Poisson probability distribution also provides a good approximation to a binomial probability distribution with its mean equal to the product of n and p, when n is large, p is small, and np ≤ 7.

μ = np
σ² = μ
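The quality of this approximation is easy to inspect numerically. In the sketch below, n = 1000 and p = 0.003 are values picked purely for illustration (they do not come from these notes), which gives μ = np = 3; math.comb assumes Python 3.8 or later.

```python
from math import comb, exp, factorial

# ASSUMPTION: illustrative values only, chosen so that n is large and p is small.
n, p = 1000, 0.003
mu = n * p   # Poisson mean used for the approximation

rows = []
for x in range(8):
    binomial = comb(n, x) * p**x * (1 - p)**(n - x)
    poisson = mu**x * exp(-mu) / factorial(x)
    rows.append((x, binomial, poisson))
    print(x, round(binomial, 5), round(poisson, 5))

# Largest disagreement between the two columns
largest_gap = max(abs(b - q) for _, b, q in rows)
print("largest gap:", largest_gap)
```

Trying a larger p with the same n (so that np grows) is a quick way to watch the two columns drift apart.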

Oftentimes with Poisson Distribution the Greek letter lambda λ is used for the mean.

.

Example: Ecologists often use the number of reported sightings of a rare species of animal to estimate the remaining population size. For example, suppose the number x of reported sightings per week of blue whales is recorded. Assume that x has approximately a Poisson probability distribution. Furthermore, assume that the average number of weekly sightings is 2.6.

a. Find the mean and standard deviation of x, the number of blue whale sightings per week.

μ = 2.6
σ² = 2.6
σ = 1.61

Remember that the mean measures the central tendency of the distribution and does not necessarily equal a possible value of x. In this example, the mean is 2.6 sightings, and although there cannot be 2.6 sightings during a given week, the average number of weekly sightings is 2.6. Similarly, the standard deviation of 1.61 measures the variability of the number of sightings per week. Perhaps a more helpful measure is the interval µ ± 2σ, which in this case stretches from -.62 to 5.82. We expect the number of sightings to fall in this interval most of the time – with at least 75% relative frequency, according to Chebyshev’s Rule, and with about 95% relative frequency, according to the Empirical Rule.

b. Find the probability that fewer than two sightings are made during a given week.

P(fewer than 2) =  P(x<2) = P(x=0) + P(x=1) = p(2.6, x=0) + p(2.6, x=1)  =  \dfrac {2.6^0 e^{-2.6}}{ 0!}  +  \dfrac {2.6^1 e^{-2.6}}{ 1!}  =  0.2673848816


c. Find the probability that exactly five sightings are made during a given week.

P(x=5) = p(2.6, x=5) =  \dfrac {2.6^5 e^{-2.6}}{ 5!} =  0.0735393591


d. Find the probability that more than five sightings are made during a given week.

P(x > 5) = p(2.6, x=6) + p(2.6, x=7) + p(2.6, x=8) + p(2.6, x=9) + p(2.6, x=10) + ….

= 1 - [ p(2.6, x=0) + p(2.6, x=1) + p(2.6, x=2) + p(2.6, x=3) + p(2.6, x=4) + p(2.6, x=5) ] = 1 - 0.9509628481 = 0.0490371519


Remember, when you are finding probabilities in the right tail of a probability distribution, work it as a complement event.
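The blue whale answers above can be reproduced without a calculator. A minimal sketch in Python; the helper name poisson_pmf is my own, not a library function.

```python
from math import exp, factorial, sqrt

def poisson_pmf(mu, x):
    """Probability of exactly x occurrences when the mean is mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 2.6   # average number of weekly sightings

# Part a: standard deviation is the square root of the variance, and the variance equals mu
sigma = sqrt(mu)

# Part b: fewer than two sightings means x = 0 or x = 1
fewer_than_two = sum(poisson_pmf(mu, x) for x in range(2))

# Part c: exactly five sightings
exactly_five = poisson_pmf(mu, 5)

# Part d: more than five sightings, worked as a complement: 1 - P(x <= 5)
more_than_five = 1 - sum(poisson_pmf(mu, x) for x in range(6))

print(round(sigma, 2))
print(round(fewer_than_two, 6), round(exactly_five, 6), round(more_than_five, 6))
```

Part d shows why the complement is the practical route: the direct sum has infinitely many terms, while the complement needs only six.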

Statistics Notes — Geometric Distribution

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 3:14 pm

Many actions in life are repeated until a success occurs.

You may take your driver’s exam several times before you pass and acquire your driver’s license.
You may attempt to dial your internet connection many times before successfully logging on.

Situations such as these can be represented by a geometric distribution.

Suppose we want to know how long we will have to wait for the first success. More precisely, we want to know what the chance is that the first success will occur on the first trial, on the second trial, etc.

The geometric distribution is a probability model that helps us determine how many trials it takes to reach the first success.

Example: Suppose that I am roller skating back in the 70’s and I start asking girls to slow skate with me during a slow smooth outstanding love ballad that I had recorded earlier that day in my studio at Slow Smooth Sounding Cool Records. Outstandingly smooth. Let x be the number of girls that I need to ask in order to find a slow dance skate partner. Of course, I would only have to ask one because I’m smooth and cool and amazingly outstanding. So let’s consider someone else. Let’s consider you.

So, if the first person accepts, then x = 1.
If the first person declines but the next person accepts, then x = 2. And so on.
When x = n, it means that you failed on the first n – 1 tries and succeeded on the nth try.

Your probability of failing on the first try is (1 – p).
Your probability of failing on the first two tries is (1 – p)(1 – p).
Your probability of failing on the first three tries is (1 – p)(1 – p)(1 – p). And so on.
Your probability of failing on the first n – 1 tries is (1 - p)^{n - 1}.

Then, your probability of succeeding on the nth try is p. Thus, we have P(n) = (1 - p)^{n - 1} p

This is known as the geometric distribution.

A geometric distribution is a discrete probability distribution of a random variable x that satisfies the following conditions.
1. A trial is repeated until a success occurs.
2. The repeated trials are independent of each other.
3. The probability of success p is constant for each trial.

The probability that the first success will occur on trial number x is P(x=x) = g(p, x=x) = q^{x - 1} p
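Here is how the formula might look in Python, applied to the skating example. The function name is invented for illustration, and the acceptance probability of 0.25 is an assumed value (it is the one used further down in these notes when the mean is discussed).

```python
def geometric_pmf(p, x):
    """g(p, x): probability that the first success occurs on trial number x."""
    q = 1 - p
    return q ** (x - 1) * p

# ASSUMPTION: each person you ask accepts with probability 0.25.
p_accept = 0.25

running_total = 0.0
for x in range(1, 7):
    prob = geometric_pmf(p_accept, x)
    running_total += prob
    print(f"first yes on ask {x}: {prob:.4f}   (by ask {x}: {running_total:.4f})")
```

Each probability is the previous one multiplied by q, which is where the name "geometric" comes from.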

.

Example: Bobby is a high school basketball player. He is a 70% free throw shooter. That means his probability of making a free throw is 0.70. During one of his games, what is the probability that Bobby makes his first free throw on his fifth attempt?
This assumes that he misses the first four and succeeds on his fifth attempt. Highly unlikely, but this is the problem I’m asking.

P(x=5) = g(.70, x=5) = (.30)^4 × .70 = 0.00567


.

Example: Bobby is a high school basketball player. He is a 70% free throw shooter. That means his probability of making a free throw is 0.70. During one of his games, what is the probability that Bobby makes his first free throw on either his first or second attempt? Much more likely.

P(either the first or second attempt) = P(x=1) + P(x=2) = g(.70, x=1) + g(.70, x=2) = (.30)^0 × .70 + (.30)^1 × .70 = 0.91


.

The mean of the geometric distribution is equal to \dfrac{1}{p}.

µ = \dfrac{1}{p}

In the opening example, if we are trying to estimate how many people you will have to ask to skate until you find a partner, and p, the probability of someone accepting, is .25, then on average you will have to ask \dfrac{1}{.25} or four people.

In the free throw example, the mean is \dfrac{1}{.7} or 1.4286, meaning that on average Bobby should have success on either his first or second attempt of the night.

The variance of the geometric distribution is  \dfrac{1 - p}{p^2} or  \dfrac{q}{p^2} .

σ² = \dfrac{1 - p}{p^2} = \dfrac{q}{p^2}
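The mean and variance formulas can be checked against a direct sum over the distribution. A sketch for the free throw example (p = .70); stopping the sum at 200 attempts is an arbitrary cutoff chosen here, since the probabilities past that point are effectively zero.

```python
p = 0.70
q = 1 - p

# ASSUMPTION: 200 attempts is far enough out that the remaining probability is negligible.
attempts = range(1, 200)
pmf = {x: q ** (x - 1) * p for x in attempts}

mean = sum(x * pmf[x] for x in attempts)
variance = sum((x - mean) ** 2 * pmf[x] for x in attempts)

print("mean:", round(mean, 4), " formula 1/p:", round(1 / p, 4))
print("variance:", round(variance, 4), " formula q/p^2:", round(q / p**2, 4))

# Chance the first make comes on attempt 1 or 2
first_or_second = pmf[1] + pmf[2]
print("first or second attempt:", round(first_or_second, 2))
```

Note that a mean of about 1.43 attempts does not guarantee success within two attempts; the last line shows that probability is 0.91, not 1.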

Statistics Assignment — Handout 4220

Filed under: Statistics, Statistics Assignment — Tags: — bowman @ 9:09 am

[Images: stats 4220 1, stats 4220 2, stats 4220 3]

November 8, 2009

Statistics Notes — Binomial Probability Distribution — How to Use That Calculator to Construct a Binomial Probability Distribution Histogram

Filed under: Statistics, Statistics Notes — Tags: — bowman @ 4:17 pm

Let’s make a histogram of our data.

A certain surgical procedure has an 85% chance of success. A doctor performs the procedure on eight patients. Find the mean, variance, and standard deviation for the number of successful surgical procedures.
n = 8, p = .85, q = .15, and x = 0, 1, 2, 3, 4, 5, 6, 7, 8

Here’s the table we created earlier.

[Image: calculator table with x in L1 and P(x) in L2]

I am going to construct a histogram using the random variable x, List 1, and the probability of the random variable, P(x), List 2.

Let’s turn on the STAT PLOT feature of our wonderful graphing utility. You will have to scroll and change some things. Press the histogram icon. Change your Xlist to the random variable, which was in L1. And change the Freq to the probability of the random variable, which was L2.


Let’s adjust our WINDOW. Adjust the X to fit our random variable. Our list had 9 values (recall: 0 through 8). Adjust the Y to a probability type scale. Plus our graph is going to be only in the First Quadrant. Outstanding.
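If no graphing calculator is available, the same two lists and a rough text histogram can be produced in Python. This is only a sketch of the idea, not the calculator procedure itself; the bar scale (one # per 0.02 of probability) is an arbitrary choice, and math.comb assumes Python 3.8 or later.

```python
from math import comb, sqrt

n, p = 8, 0.85   # eight patients, 85% chance of success each
q = 1 - p

xs = list(range(n + 1))                                  # List 1: the random variable x
probs = [comb(n, x) * p**x * q**(n - x) for x in xs]     # List 2: P(x)

# Text histogram: one '#' for every 0.02 of probability
for x, prob in zip(xs, probs):
    bar = "#" * round(prob * 50)
    print(f"{x}  {prob:.4f}  {bar}")

mean = n * p
variance = n * p * q
std_dev = sqrt(variance)
print("mean:", round(mean, 2), "variance:", round(variance, 2), "std dev:", round(std_dev, 2))
```

The bars pile up at the right-hand end, around 7 and 8 successes, which is what a success rate as high as 85% should produce.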

