http://www.psych.uiuc.edu/~jrfinley/p235/
Random Variables
Random Variable:
a variable that takes on a particular numerical value based on the outcome of a random experiment
Random Experiment:
a trial that will result in one of several possible outcomes
can't predict the outcome of any specific trial
can predict the pattern in the LONG RUN
Random Variables
Example: Random Experiment:
flip a coin 3 times
Random Variable:
# of heads
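To make the example concrete, here is a small Python sketch that lists every possible outcome of the 3-flip experiment and tallies the random variable (# of heads). It assumes a fair coin, so all 8 sequences are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of flipping a fair coin 3 times (assumption: fair coin).
outcomes = list(product("HT", repeat=3))

# The random variable maps each outcome to a number: its count of heads.
heads_counts = Counter(seq.count("H") for seq in outcomes)

# Probability of each value = (# of outcomes giving that value) / (total # of outcomes).
distribution = {
    k: Fraction(count, len(outcomes)) for k, count in sorted(heads_counts.items())
}

for k, prob in distribution.items():
    print(f"P(X = {k}) = {prob}")
```

No single run of the experiment can be predicted, but the long-run pattern is fixed: 1 or 2 heads each turn up three times as often as 0 or 3 heads.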
Random Variables
Discrete vs Continuous
countable vs uncountably infinite # of possible values
Scales of Measurement
Categorical/Nominal
Ordinal
Interval
Ratio
So far...
Graphing & summarizing sample distributions (DESCRIPTIVE)
Counting Rules
Probability
Random Variables
One more key concept is needed to start doing INFERENTIAL statistics:
SAMPLING DISTRIBUTION
Binomial Situation
Bernoulli Trial
a random experiment having exactly two possible outcomes, generically called "Success" and "Failure"
probability of Success = p
probability of Failure = q = (1 - p)
Examples:
Coin toss: Success = Heads, p = .5
Robot Factory: Success = Good Robot, p = .75
[Figure: the two possible outcomes in each example: Heads / Tails, Good Robot / Bad Robot]
Binomial Situation
Binomial Situation:
n: # of Bernoulli trials
trials are independent
p (probability of success) remains constant across trials
[Figure: one sample of n trials and its value of X, above a bar chart of probability (y-axis) against # of successes (x-axis, 0 to 10)]
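The binomial situation can be sketched in a few lines of Python: one sample is n independent Bernoulli trials with the same p, and X is the number of successes. The function names, the seed, and the choice n = 10 are illustrative assumptions; p = .75 is the Robot Factory value.

```python
import random


def bernoulli_trial(p, rng):
    """One Bernoulli trial: 1 (Success) with probability p, otherwise 0 (Failure)."""
    return 1 if rng.random() < p else 0


def binomial_sample(n, p, rng):
    """One sample: n independent trials with constant p; returns X = # of successes."""
    return sum(bernoulli_trial(p, rng) for _ in range(n))


rng = random.Random(1)             # seeded only so the sketch is repeatable
x = binomial_sample(10, 0.75, rng)  # e.g. 10 robots, each Good with p = .75
print(f"X = {x} good robots out of 10")
```

Each call to `binomial_sample` gives a single value of X; a different sample would usually give a different value.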
Sampling Distribution
Sampling Distribution:
Distribution of values that your sample statistic would take on if you kept taking samples of the same size, from the same population, FOREVER (infinitely many times).
Note: this is a THEORETICAL PROBABILITY DISTRIBUTION
Bernoulli Trial: one coin toss
Success = Heads, p = .5
Let's do 10 tosses: n = 10 (sample size)
Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic)
[Figure: Sampling Distribution of X; bar chart of probability (y-axis) against # of successes (x-axis, 0 to 10), with one sample and its X value shown]
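Nobody can sample FOREVER, but a simulation can approximate the idea. The sketch below takes many samples of n = 10 fair-coin tosses and records how often each value of X shows up. The number of samples (20,000), the seed, and the function names are arbitrary assumptions; the result is only an approximation of the theoretical distribution.

```python
import random
from collections import Counter


def count_heads(n, p, rng):
    """One sample of n tosses; returns the sample statistic X = # of heads."""
    return sum(rng.random() < p for _ in range(n))


def simulate_sampling_distribution(n=10, p=0.5, n_samples=20_000, seed=0):
    """Approximate the sampling distribution of X by repeated sampling."""
    rng = random.Random(seed)
    counts = Counter(count_heads(n, p, rng) for _ in range(n_samples))
    return [counts[k] / n_samples for k in range(n + 1)]


proportions = simulate_sampling_distribution()
for k, prop in enumerate(proportions):
    # Crude text bar chart: one '#' per percentage point.
    print(f"X = {k:2d}: {prop:.3f} {'#' * round(prop * 100)}")
```

With more samples the proportions settle closer and closer to the theoretical probabilities.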
Binomial Formula
P(X = k) = P(exactly k successes)
$$P(X = k) = \binom{n}{k}\, p^{k} (1 - p)^{n - k}$$
X: Binomial Random Variable
k: specific # of successes
p: probability of success
(1 - p): probability of failure
n - k: # of failures
$\binom{n}{k}$: combination called the Binomial Coefficient, $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$
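The binomial formula translates almost directly into Python. This is a minimal sketch using the standard library's `math.comb` for the binomial coefficient; the function name `binomial_pmf` is an illustrative choice, and n = 10, p = .5 are the coin-toss values used in these slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Full theoretical distribution for n = 10 tosses of a fair coin (p = .5).
pmf = [binomial_pmf(k, 10, 0.5) for k in range(11)]
for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.4f}")
```

The probabilities over k = 0, ..., n always sum to 1, which makes a handy sanity check.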
Binomial Formula
p=.5 n=10
[Figure: Sampling Distribution for p = .5, n = 10; bar chart of probability (y-axis) against # of successes (x-axis, 0 to 10)]
$P(X = 3) = \binom{10}{3} (.5)^{3} (.5)^{7} = \frac{120}{1024} \approx .117$
Hmm... what if we had gotten X=0?... pretty unlikely outcome... fair coin? Remember this idea....
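As a worked check of the "pretty unlikely" remark, here is the binomial formula applied with n = 10, k = 0, and p = .5 (that is, assuming the coin really is fair):

```latex
P(X = 0) = \binom{10}{0} (.5)^{0} (.5)^{10} = \frac{1}{1024} \approx .001
```

So a fair coin would give zero heads in 10 tosses only about once in a thousand samples.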
Variance: $\sigma_X^2 = np(1 - p)$
Standard Deviation: $\sigma_X = \sqrt{np(1 - p)}$
Variance = 1.25, Std. Dev. = 1.12 (values for n = 5, p = .5)
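A quick sketch of the variance and standard deviation formulas in Python. The values n = 5 and p = .5 are an assumption, chosen because they reproduce the 1.25 and 1.12 shown on the slide; the function name is illustrative.

```python
from math import sqrt


def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial random variable."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)


# Assumed values: n = 5, p = .5 (they match the slide's 1.25 and 1.12).
mean, variance, sd = binomial_summary(5, 0.5)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, std. dev. = {sd:.2f}")
```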
[Figures: five binomial distributions with increasing n (x-axes running to 5, 10, 20, 50 and 99); x-axis: # of successes, y-axis: probability]
Whoa.
Anyone else notice those DISCRETE distributions starting to look smoother as sample size (n) increased?
Let's look at a few more binomial distributions, this time with a different probability of success...
You'd like to know about how many BAD robots you're likely to get before placing an order...
p = .10 (... success)
[Figures: five binomial distributions for p = .10 with increasing n (x-axes running to 5, 10, 20, 50 and 99); x-axis: # of successes, y-axis: probability]
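The change in shape can also be seen numerically. The sketch below assumes sample sizes n = 5, 10, 20, 50, 100 (read off the plot axes, so treat them as a guess) with p = .10. For each n it reports the most likely # of successes, P(X = 0), and the skewness of the binomial, (1 - 2p) / sqrt(np(1 - p)), which is a standard result not stated on the slides.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_skewness(n, p):
    """Skewness of a binomial distribution; 0 would mean perfectly symmetric."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))


p = 0.10
for n in (5, 10, 20, 50, 100):  # assumed sample sizes
    pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mode = max(range(n + 1), key=pmf.__getitem__)  # most likely # of successes
    print(
        f"n = {n:3d}: most likely X = {mode:2d}, "
        f"P(X = 0) = {pmf[0]:.3f}, skewness = {binomial_skewness(n, p):.2f}"
    )
```

For small n the distribution is piled up against 0 and strongly skewed; as n grows the skewness shrinks toward 0 and the shape becomes more symmetric.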
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{np(1 - p)}$
[Figure: distribution plot; y-axis: probability]
Normal Distributions
(aka Bell Curve)
Probability Distributions of a Continuous Random Variable (smooth curve!)
Class of distributions, all with the same overall shape
Any specific Normal Distribution is characterized by two parameters:
mean: $\mu$
standard deviation: $\sigma$
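Since a Normal Distribution is fully specified by its mean and standard deviation, Python's `statistics.NormalDist` takes exactly those two arguments. As a hedged illustration, the sketch below builds the normal curve with the same mean and standard deviation as a binomial with n = 100, p = .5 (an assumed example, not taken from the slides) and compares the two at a few points.

```python
from math import comb, sqrt
from statistics import NormalDist

# Assumed example: binomial with n = 100, p = .5.
n, p = 100, 0.5
mu = n * p                     # mean of the binomial
sigma = sqrt(n * p * (1 - p))  # standard deviation of the binomial

# A specific normal distribution is pinned down by mu and sigma alone.
normal = NormalDist(mu, sigma)

for k in (40, 45, 50, 55, 60):
    binom_prob = comb(n, k) * p**k * (1 - p) ** (n - k)
    print(f"k = {k}: binomial = {binom_prob:.4f}, normal curve height = {normal.pdf(k):.4f}")
```

The two columns come out close, which matches the visual impression that the binomial bars trace out a smooth bell shape for large n.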
Standardizing
Standardizing a distribution of values results in re-labeling & stretching/squishing the x-axis
useful: gets rid of units, puts all distributions on same scale for comparison
HOWTO: simply convert every value to a Z SCORE:
$$z = \frac{x - \mu}{\sigma}$$
Standardizing
Z score:
$$z = \frac{x - \mu}{\sigma}$$
Conceptual meaning:
how many standard deviations from the mean a given score is (in a given distribution)
After standardizing: $\mu = 0$, $\sigma = 1$
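Standardizing is easy to try out in Python. In this sketch the scores are made-up data and `z_scores` is an illustrative helper. It uses the population standard deviation (`statistics.pstdev`), which is an assumption about which standard deviation is meant.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Convert each value to a z score: (x - mean) / standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    return [(x - mu) / sigma for x in values]


scores = [52, 61, 70, 79, 88]  # hypothetical raw scores
z = z_scores(scores)

print("z scores:", [round(v, 2) for v in z])
print("mean of z scores:", round(fmean(z), 6))
print("std. dev. of z scores:", round(pstdev(z), 6))
```

Whatever units the raw scores were in, the standardized values have mean 0 and standard deviation 1, which is what puts different distributions on the same scale.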
Next Time
More different types of distributions
Binomial, Normal
t, Chi-square, F
And then... how will we use these to do inference?
Remember: biggest new idea today was:
SAMPLING DISTRIBUTION