
Experimental unit = the object on which a variable is measured.

Population=set of all measurements of interest

Sample=subset of measurements selected from the population.
Univariate data = result when a single variable is measured on a single experimental unit.
Bivariate data = result when 2 variables are measured on a single experimental unit.
Multivariate data = result when more than 2 variables are measured.
Qualitative variables measure a quality or characteristic on each experimental unit. Quantitative variables measure a
numerical quantity or amount on each experimental unit.
Discrete = variable can assume only a finite or countable number of values; continuous = variable can assume the
infinitely many values corresponding to the points on a line interval.
Frequency = number of measurements in each category
Relative frequency= proportion of measurements in each category

Unimodal = distribution has one peak; bimodal = distribution has 2 peaks.

Parameters = numerical descriptive measures associated with a population of measurements.
Statistics = numerical descriptive measures computed from sample measurements.
Random variable = a variable x is random if the value that it assumes, corresponding to the outcome of an
experiment, is a chance or random event (e.g., a measurement on randomly selected ppl).
Requirements for a discrete probability distrib. = 0 <= p(x) <= 1 for every x, and Sigma p(x) = 1.
Binomial experiment = (1) n identical trials; (2) each trial has 2 outcomes, success and failure; (3) probability
of success p is the same on every trial, and 1 - p = q; (4) trials are independent; (5) we are interested in x,
the # of successes observed during the n trials, x = 0, 1, 2, ..., n.
Simple random sampling is a commonly used plan = every sample of size n has an equal chance of being selected.
Stratified random sample = pop. consists of 2 or more sub pops called strata; a sampling plan that ensures that
each sub pop is represented in the sample by taking a simple random sample from each stratum.
Cluster sample= simple random sample of clusters from the available clusters in the pop.
1 in k systematic random sample involves the random selection of one of the first k elements in an ordered
population then the systematic collection of every kth element thereafter.
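A minimal sketch of the four sampling plans above, using only Python's standard `random` module. The function names and the toy population of 100 numbered units are illustrative assumptions, not part of the notes.

```python
import random

def simple_random_sample(population, n, rng):
    """Every sample of size n has an equal chance of being selected."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample from each stratum (dict: stratum name -> list of units)."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every unit in the chosen clusters."""
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]

def systematic_sample(population, k, rng):
    """1 in k: random start among the first k elements, then every kth element."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical ordered population of 100 units
print(simple_random_sample(population, 5, rng))
print(systematic_sample(population, 10, rng))
```

Passing the random generator in explicitly keeps each plan easy to re-run with the same seed.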
Calculating probabilities for the sample mean xbar = 1) find u and calculate SE(xbar) = stnd/sqrt(n).
2) Write down the event of interest in terms of xbar and locate the appropriate area on the normal curve.
3) Convert the necessary values of xbar to z-values: z = (xbar - u) / (stnd/sqrt(n)). Then use the normal table
in the appendix.
Calculating probabilities for the sample proportion phat = 1) find n and p. 2) Check whether the normal
approximation to the binomial distribution is appropriate (np > 5 and nq > 5). 3) Write down the event of interest
in terms of phat, and locate the area on the normal curve. 4) Convert the necessary values of phat to z-values:
z = (phat - p) / sqrt(pq/n). 5) Use the normal table in the appendix.
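The two procedures above can be sketched in Python, with `statistics.NormalDist` standing in for the appendix table. The helper names are assumptions for illustration; the numbers come from the soda-machine example in these notes (u = 12.1, stnd = .2, n = 6).

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, stnd 1

def z_for_mean(xbar, u, stnd, n):
    """z-value for a sample mean: (xbar - u) / (stnd / sqrt(n))."""
    return (xbar - u) / (stnd / sqrt(n))

def z_for_proportion(phat, p, n):
    """z-value for a sample proportion, after the np > 5 and nq > 5 check."""
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("normal approximation not appropriate (need np > 5 and nq > 5)")
    return (phat - p) / sqrt(p * q / n)

# P(sample mean < 12) for a 6 pack when u = 12.1 and stnd = .2
z = z_for_mean(12, 12.1, 0.2, 6)
print(round(z, 2), Z.cdf(z))
```

The unrounded z is used here, so the area can differ in the third decimal from a table lookup at the rounded z.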

Range = difference between the largest and smallest measurements. Deviation=

Interquartile range (IQR) = Q3 - Q1. Arrange data from smallest to largest; position of Q1 = .25(n+1),
position of Q3 = .75(n+1).

Number of ways we can arrange n distinct objects taking them r at a time = P(n, r) = n! / (n - r)!. Counting
rule for combinations = the # of distinct combinations of n objects that can be formed, taking them r at a
time, = C(n, r) = n! / (r! (n - r)!).
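A quick check of both counting rules against Python's built-in `math.perm` and `math.comb` (available in Python 3.8+). The helper names and the choice n = 5, r = 3 are illustrative assumptions.

```python
from math import comb, factorial, perm

def permutations_count(n, r):
    """Ordered arrangements of n distinct objects taken r at a time: n! / (n - r)!"""
    return factorial(n) // factorial(n - r)

def combinations_count(n, r):
    """Unordered selections of n objects taken r at a time: n! / (r! (n - r)!)"""
    return factorial(n) // (factorial(r) * factorial(n - r))

n, r = 5, 3
print(permutations_count(n, r), perm(n, r))  # both 60
print(combinations_count(n, r), comb(n, r))  # both 10
```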
Probabilities: (A and B) independent events = heads on a coin 2x in a row? (Example) =
P[A ^ B] = P[A] P[B] = 1/2 x 1/2 = 1/4.
Probabilities: (A and B) dependent events = draw 2 cards, both aces? (Example) = P[A ^ B] = P[A] P[B | A].
4/52 = first card ace, 3/51 = second card ace given the first was an ace; multiply those = 1/221.
Probabilities: (A or B) mutually exclusive events (one event occurs, other cannot) = P[A V B] = P[A] + P[B];
not mutually exclusive events = P[A V B] = P[A] + P[B] - P[A ^ B].
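The multiplication and addition rules above can be checked with exact fractions. This is a small sketch; the function names are assumptions, and the ace-or-king case is an added illustration of mutually exclusive events that does not appear in the notes.

```python
from fractions import Fraction

def p_and_independent(p_a, p_b):
    """P[A and B] when A and B are independent."""
    return p_a * p_b

def p_and_dependent(p_a, p_b_given_a):
    """P[A and B] = P[A] P[B | A] when B depends on A."""
    return p_a * p_b_given_a

def p_or(p_a, p_b, p_a_and_b=0):
    """P[A or B]; leave p_a_and_b at 0 for mutually exclusive events."""
    return p_a + p_b - p_a_and_b

print(p_and_independent(Fraction(1, 2), Fraction(1, 2)))      # two heads in a row: 1/4
print(p_and_dependent(Fraction(4, 52), Fraction(3, 51)))      # two aces: 1/221
print(p_or(Fraction(4, 52), Fraction(4, 52)))                 # ace or king (exclusive): 2/13
print(p_or(Fraction(13, 52), Fraction(4, 52), Fraction(1, 52)))  # spade or ace: 4/13
```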
Binomial distribution (k successes in n trials)
P[x = k] = n! / (k! (n - k)!) x p^k x q^(n-k). Mean = u = np, variance = s^2 = npq, s = sqrt(npq).
Cumulative binomial probabilities = example = 3 successes: P(x <= 3) = p(0) + p(1) + p(2) + p(3) and
P(x <= 2) = p(0) + p(1) + p(2), so p(x = 3) = P(x <= 3) - P(x <= 2). To find 3 or more,
P(x >= 3) = 1 - P(x < 3) = 1 - P(x <= 2).
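A short sketch of the binomial formula and the cumulative shortcuts above. The function names and the values n = 10, p = .5 are assumptions chosen only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P[x = k] = C(n, k) p^k q^(n-k), with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Cumulative probability P(x <= k) = p(0) + p(1) + ... + p(k)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.5  # illustrative values, not from the notes
exactly_3 = binom_cdf(3, n, p) - binom_cdf(2, n, p)  # P(x = 3) from two cumulative values
at_least_3 = 1 - binom_cdf(2, n, p)                  # P(x >= 3) = 1 - P(x <= 2)
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))  # should agree with np
print(exactly_3, binom_pmf(3, n, p), at_least_3, mean)
```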
Poisson random variable = u = average # of times that an event occurs in a given period of time. Probability of
k occurrences = P(x = k) = u^k e^(-u) / k!
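The Poisson formula above as a small Python function. The name `poisson_pmf` and the average u = 2 are assumptions for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, u):
    """P(x = k) = u^k e^(-u) / k! for an average of u occurrences per period."""
    return u**k * exp(-u) / factorial(k)

u = 2  # hypothetical average number of occurrences per period
for k in range(4):
    print(k, round(poisson_pmf(k, u), 4))

# The probabilities sum to (essentially) 1 once enough terms are included.
total = sum(poisson_pmf(k, u) for k in range(60))
print(total)
```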
Hypergeometric probability distribution (m&m problem) = pop. contains M successes and N-M failures; the
probability of k successes in a random sample of size n is: P(x = k) = C(M, k) C(N-M, n-k) / C(N, n).
Example: case of wine has 12 bottles, 3 of which are spoiled. 4 bottles are randomly selected.
N = 12, n = 4, M = 3, (N-M) = 9. Solve for p(0), p(1), p(2), p(3); mean = n(M/N) = 1,
variance = n(M/N)((N-M)/N)((N-n)/(N-1)) = .5455.
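The wine example can be worked exactly with fractions. This is a sketch; the function name `hypergeom_pmf` is an assumption, while N, M and n are the values from the example.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(x = k) = C(M, k) C(N-M, n-k) / C(N, n)."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

N, M, n = 12, 3, 4  # 12 bottles, 3 spoiled, 4 selected
dist = {k: hypergeom_pmf(k, N, M, n) for k in range(min(M, n) + 1)}
mean = sum(k * p for k, p in dist.items())
variance = sum((k - mean) ** 2 * p for k, p in dist.items())

for k, p in dist.items():
    print(k, p)                  # 14/55, 28/55, 12/55, 1/55
print(mean, float(variance))     # mean 1, variance 6/11 (about .5455)
```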
Standard normal random variable = z = (x - u)/stnd. If x < mean, z is negative; if x > mean, z is positive; if x = mean, z = 0.
Rule of thumb=np>5 and nq>5 = normal approximation to the binomial probabilities will be adequate.

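Statistical process control: xbar chart = centerline is the grand mean of the sample means, control limits =
grand mean +- 3 x s/sqrt(n). p chart = centerline pbar = Sigma(phat_i) / k, upper and lower control
limits = pbar +- 3 sqrt(pbar(1 - pbar)/n).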
Examples of samples = a random sample of n = 50 city blocks is selected and a census is done for each single
family dwelling = cluster sample. Highway patrol stops every 10th vehicle on a given city artery between 9am and
3pm etc. = 1 in 10 systematic sample. One hundred households in each of 4 city wards are surveyed
concerning blah = stratified sample.

Sample problems
7.60 A finite pop. consists of four elements: 6, 1, 3, 2. How many different samples of size n = 2 can be selected
from this pop. if you sample without replacement? List the possible samples of size n = 2. Compute the sample mean
for each sample. Find the sampling distribution of xbar. Do any sample means equal the population mean?
a) 4!/((2!)(2!)) = 6
b) (6,1) (6,3) (6,2) (1,3) (1,2) (3,2)
c) sample means = 3.5, 4.5, 4, 2, 1.5, 2.5
d) each sample mean has probability 1/6; mean of xbar = (3.5 + 4.5 + 4 + 2 + 1.5 + 2.5) / 6 = 3 (the histogram of
these 6 means is the sampling distribution)
s^2 = (Sigma xi^2 - (Sigma xi)^2 / n) / (n - 1) = (61 - 54) / 5 = 7/5 = 1.4, standard deviation = s = sqrt(s^2) = 1.18
e) population mean = (6+1+3+2)/4 = 3, so no sample produces a mean equal to the population mean.
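Problem 7.60 is small enough to enumerate directly. The sketch below uses `itertools.combinations` for sampling without replacement; the variable names are assumptions.

```python
from itertools import combinations
from statistics import mean, stdev, variance

population = [6, 1, 3, 2]

# All samples of size 2 drawn without replacement (order does not matter).
samples = list(combinations(population, 2))
sample_means = [sum(s) / len(s) for s in samples]
for s, m in zip(samples, sample_means):
    print(s, m)

mean_of_means = mean(sample_means)   # center of the sampling distribution
s2 = variance(sample_means)          # sample variance of the 6 means (n - 1 divisor)
s = stdev(sample_means)
pop_mean = mean(population)

print(len(samples), mean_of_means, s2, round(s, 2))
print(pop_mean in sample_means)      # does any single sample mean equal the population mean?
```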
7.70 The proportion of individuals with Rh positive blood is 85%. Sample of n = 500. What are the mean and stnd of
phat, the sample proportion? Is the distrib. of phat approximately normal? What's the prob. that the sample
proportion phat exceeds 82%? What is the prob. that the sample proportion lies between 83% and 88%? 99% of the
time the sample proportion would lie between what 2 limits?
n = 500 and p = .85, q = .15
a) mean of phat = p = .85, SE(phat) = sqrt(p x q / n) = .0160
b) np = 500 x .85 = 425 which is greater than 5, and nq = 500 x .15 = 75 which is greater than 5 as well.
Therefore it is appropriate to use the normal distribution because np > 5 as well as nq > 5.
c) phat > .82: z = (phat - p) / sqrt(p x q / n) = (.82 - .85) / .0160 = -1.88 (used calculator)
using the table in the back of the book, I get 1 - .0301 = .9699 = the probability that the sample proportion
exceeds 82%
d) 0.83 < phat < 0.88:
phat = .83 (use same formula above) gives z = -1.25, and phat = .88 (same formula) gives z = 1.88
using the back of the book, we get .9699 - .1056 = .8643
e) z_0 = (p_0 - p) / sqrt(p x q / n) = +- 2.58
p_0 = 0.85 +- (2.58 x 0.0160) = (0.81, 0.89)
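A hedged check of problem 7.70 that models phat directly as a normal variable with mean p and stnd sqrt(pq/n), using `statistics.NormalDist` in place of the table. Variable names are assumptions. Because z is not rounded to two decimals here, results can differ from the table answers in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

n, p = 500, 0.85
q = 1 - p
se = sqrt(p * q / n)                      # stnd of phat
phat_dist = NormalDist(mu=p, sigma=se)    # normal approximation (np and nq both > 5)

p_exceeds_82 = 1 - phat_dist.cdf(0.82)
p_between = phat_dist.cdf(0.88) - phat_dist.cdf(0.83)

# Middle 99%: cut off .005 in each tail, so z is about 2.58.
z99 = NormalDist().inv_cdf(0.995)
limits = (p - z99 * se, p + z99 * se)

print(round(se, 4), round(p_exceeds_82, 4), round(p_between, 4))
print(tuple(round(x, 2) for x in limits))
```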
Marksman hits a target 80% of the time; he fires 5 shots. What is the probability that exactly 3 are hit? = P(x = 3) = C(5, 3) p^3 q^(5-3) = 10(.8^3)(.2^2) = .2048.
Hitting more than 3 = P(x > 3) = C(5, 4) p^4 q^(5-4) + C(5, 5) p^5 q^(5-5) = .4096 + .3277 = .7373. Hitting 3 or more = P(x >= 3) = .2048 + .7373 = .9421.

Soda machine fills cans of soda with 12 fluid ounces. Suppose the fills are actually normally distributed with a mean of 12.1 oz and a stnd of .2 oz.
What is the probability that the average for a 6 pack is less than 12 oz? = P(xbar < 12) = P(z < (12 - 12.1) / (.2/sqrt(6))) = P(z < -1.22) = .1112.

Soda bottler in the example claims that only 5% of the soda cans are underfilled. Randomly sampled 200 cans; what is the probability that
more than 10% of the cans are underfilled? = n = 200, S = underfilled can, p = P(S) = 0.05, q = .95, np = 10, nq = 190.
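The notes stop at the setup for this problem. One way to finish it, assuming the same sample-proportion steps used elsewhere in these notes (normal approximation, z = (phat - p) / sqrt(pq/n)), is sketched below; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.05
q = 1 - p
assert n * p > 5 and n * q > 5    # np = 10 and nq = 190, so the approximation is adequate

se = sqrt(p * q / n)              # stnd of phat
z = (0.10 - p) / se               # z-value for phat = .10
prob = 1 - NormalDist().cdf(z)    # P(phat > .10)

print(round(se, 4), round(z, 2), prob)
```

Under these assumptions z comes out a little above 3, so the probability is well under 1 in 1000: seeing more than 10% underfilled would be strong evidence against the 5% claim.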

Equations: normal distribution = f(x) = 1/(stnd x sqrt(2 pi)) x e^(-(x - u)^2 / (2 stnd^2))

Make a box plot = lower fence |-------[Q1 | median | Q3]-------| upper fence
Hypergeometric distr. = mean = n(M/N), variance = n(M/N)((N-M)/N)((N-n)/(N-1)). Normal distr. = f(x) = 1/(s sqrt(2 pi)) x e^(-.5((x-u)/s)^2)
Empirical rule: for a distribution that is approx. mound shaped: the interval m +- s contains approx. 68% of the
measurements
The interval m +- 2s contains approx. 95% of the measurements
The interval m +- 3s contains approx. 99.7% of the measurements.
Tchebysheff's theorem can be used for xbar & s or u & stnd: given a number k greater than or equal to 1 and a set
of n measurements, at least 1 - 1/k^2 of the measurements lie within k stnd of the mean.
If k = 2, at least 1 - 1/2^2 = 3/4 of the measurements are within 2 stnd of the mean; if k = 3, at least 8/9 of the
measurements are within 3 stnd of the mean.
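A small sketch comparing the two rules above. The Tchebysheff bound holds for any data set, while the empirical-rule percentages are the normal-curve areas. The function names and the skewed data list are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def tchebysheff_bound(k):
    """At least 1 - 1/k^2 of the measurements lie within k stnd of the mean (k >= 1)."""
    if k < 1:
        raise ValueError("k must be >= 1")
    return 1 - 1 / k**2

def empirical_rule_area(k):
    """Area under the normal curve within k stnd of the mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

def fraction_within(data, k):
    """Observed fraction of the data within k sample stnd of the sample mean."""
    m, s = mean(data), stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]  # hypothetical skewed data, not mound shaped
for k in (2, 3):
    print(k, tchebysheff_bound(k), round(empirical_rule_area(k), 4), fraction_within(data, k))
```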

Position of q1 = .25(n+1) , q3 = .75(n+1), lower fence: q1 – 1.5(IQR), and upper fence: q3 + 1.5(IQR), IQR
=q3 – q1.
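The quartile positions and fences above can be sketched as follows. Two assumptions are made here: when a position such as .25(n+1) is not a whole number, the value is interpolated between the two neighboring measurements, and the data list is hypothetical. Function names are illustrative.

```python
def value_at_position(sorted_data, pos):
    """Value at a 1-based position; interpolate between neighbors if pos is fractional."""
    lower = int(pos)  # whole-number part of the position
    frac = pos - lower
    if lower < 1:
        return sorted_data[0]
    if lower >= len(sorted_data):
        return sorted_data[-1]
    return sorted_data[lower - 1] + frac * (sorted_data[lower] - sorted_data[lower - 1])

def quartiles_and_fences(data):
    xs = sorted(data)  # arrange data from smallest to largest
    n = len(xs)
    q1 = value_at_position(xs, 0.25 * (n + 1))
    q3 = value_at_position(xs, 0.75 * (n + 1))
    iqr = q3 - q1
    return {"q1": q1, "q3": q3, "iqr": iqr,
            "lower_fence": q1 - 1.5 * iqr, "upper_fence": q3 + 1.5 * iqr}

data = [16, 25, 4, 18, 11, 13, 20, 8, 11, 9]  # hypothetical measurements, n = 10
result = quartiles_and_fences(data)
print(result)
outliers = [x for x in data if x < result["lower_fence"] or x > result["upper_fence"]]
print(outliers)
```

Any measurement outside the fences is flagged as a possible outlier.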

Draw a card from a deck. 10 bucks if it's an ace or a spade. Probability? Let S = the event that the card is a
spade, and let A = the event that the card is an ace. We know the following: there are 52 cards in the deck.
There are 13 spades, so P(S) = 13/52. There are 4 aces, so P(A) = 4/52. There is 1 ace that is also a spade,
so P(S ∩ A) = 1/52. Therefore, based on the rule of addition:
P(S ∪ A) = P(S) + P(A) - P(S ∩ A) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13.
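As a cross-check on the addition rule, the deck is small enough to list every card and count the winners directly. The rank and suit labels below are assumptions chosen for readability.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# A card wins if it is an ace or a spade; the ace of spades is counted once.
wins = [card for card in deck if card[0] == "A" or card[1] == "spades"]
prob = Fraction(len(wins), len(deck))

print(len(deck), len(wins), prob)   # 52 16 4/13
```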