
Psyc 235: Introduction to Statistics

http://www.psych.uiuc.edu/~jrfinley/p235/

DON'T FORGET TO SIGN IN FOR CREDIT!

Independent vs. Dependent Events


Independent Events: unrelated events that intersect at chance levels, given the relative probabilities of each event
Dependent Events: events that are related in some way
So... how to tell if two events are independent or dependent?
Look at the INTERSECTION: P(A∩B)

if P(A∩B) = P(A)*P(B) --> independent
if P(A∩B) ≠ P(A)*P(B) --> dependent
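
The intersection rule can be checked by brute force on a small sample space. The sketch below is an illustration only: the three-coin-flip sample space, the example events, and the helper names `prob` and `independent` are assumptions made for this example, not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 8 equally likely outcomes of three fair coin flips.
outcomes = list(product("HT", repeat=3))

def prob(event):
    """Probability of an event (a true/false test on an outcome)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    p_ab = prob(lambda o: a(o) and b(o))
    return p_ab == prob(a) * prob(b)

def first_is_heads(o):
    return o[0] == "H"

def second_is_heads(o):
    return o[1] == "H"

def at_least_two_heads(o):
    return o.count("H") >= 2

# Two separate flips: the intersection occurs at chance level.
print(independent(first_is_heads, second_is_heads))
# A head on the first flip makes "at least two heads" more likely.
print(independent(first_is_heads, at_least_two_heads))
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.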

Random Variables
Random Variable:
variable that takes on a particular numerical value based on outcome of a random experiment

Random Experiment (aka Random Phenomenon):
trial that will result in one of several possible outcomes
can't predict outcome of any specific trial
can predict pattern in the LONG RUN

Random Variables
Example: Random Experiment:
flip a coin 3 times

Random Variable:
# of heads
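
As a rough sketch of this example, the experiment's outcomes can be listed and the random variable written as a function that turns each outcome into a number. The names `outcomes`, `num_heads`, and `counts` are chosen for this illustration only.

```python
from collections import Counter
from itertools import product

# Random experiment: flip a coin 3 times (8 equally likely outcomes).
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

def num_heads(outcome):
    """Random variable: maps an outcome such as 'HTH' to its # of heads."""
    return outcome.count("H")

for outcome in outcomes:
    print(outcome, "->", num_heads(outcome))

# How many outcomes lead to each value of the random variable.
counts = Counter(num_heads(o) for o in outcomes)
print(sorted(counts.items()))
```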

Random Variables
Discrete vs Continuous
finite vs infinite # possible outcomes

Scales of Measurement
Categorical/Nominal, Ordinal, Interval, Ratio

Data World vs. Theory World


Theory World: idealization of reality (idealization of what you might expect from a simple experiment)
Theoretical probability distribution
POPULATION
parameter: a number that describes the population. Fixed but usually unknown

Data World: data that results from an actual simple experiment
Frequency distribution
SAMPLE
statistic: a number that describes the sample (ex: mean, standard deviation, sum, ...)

So far...
Graphing & summarizing sample distributions (DESCRIPTIVE)
Counting Rules
Probability
Random Variables
One more key concept is needed to start doing INFERENTIAL statistics:

SAMPLING DISTRIBUTION

Binomial Situation
Bernoulli Trial
a random experiment having exactly two possible outcomes, generically called "Success" and "Failure"
probability of Success = p
probability of Failure = q = (1-p)
Examples:
Coin toss: Success = Heads, Failure = Tails, p = .5
Robot Factory: Success = Good Robot, Failure = Bad Robot, p = .75

Binomial Situation
Binomial Situation:
n: # of Bernoulli trials
trials are independent
p (probability of success) remains constant across trials

Binomial Random Variable:


X = # of the n trials that are successes

Binomial Situation: collect data!


Population:
Outcomes of all possible coin tosses (for a fair coin)
Bernoulli Trial: one coin toss
Success = Heads, p = .5
Let's do 10 tosses: n = 10 (sample size)
Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic)

Sample:

X=

....

Binomial Distribution p=.5, n=10
[Figure: bar chart; x-axis: # of successes (0 to 10); y-axis: probability]

This is the SAMPLING DISTRIBUTION of X!

Sampling Distribution
Sampling Distribution: Distribution of values that your sample statistic would take on, if you kept taking samples of the same size, from the same population, FOREVER (infinitely many times). Note: this is a THEORETICAL PROBABILITY DISTRIBUTION
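
Nobody can sample forever, but a simulation can get close enough to show the idea. The sketch below assumes the coin-toss setup from the lecture (n=10, p=.5); the seed, the number of repetitions, and the function name `sample_statistic` are arbitrary choices made for this illustration.

```python
import random
from collections import Counter

random.seed(235)  # arbitrary seed so the run is repeatable

def sample_statistic(n=10, p=0.5):
    """Take one sample of n Bernoulli trials; return X = # of successes."""
    return sum(1 for _ in range(n) if random.random() < p)

num_samples = 20_000  # finite stand-in for "FOREVER"
counts = Counter(sample_statistic() for _ in range(num_samples))

# Relative frequency of each value of X across all the samples.
rel_freq = {k: counts[k] / num_samples for k in range(11)}
for k, freq in rel_freq.items():
    print(f"{k:2d}  {freq:.3f}  {'#' * round(freq * 100)}")
```

With more and more samples, the relative frequencies should settle toward the theoretical probabilities of the sampling distribution.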

Binomial Situation: collect data!

Population:
Outcomes of all possible coin tosses (for a fair coin)

Bernoulli Trial: one coin toss
Success = Heads, p = .5
Let's do 10 tosses: n = 10 (sample size)
Binomial Random Variable: X = # of the 10 tosses that come up heads (aka Sample Statistic)

Sampling Distribution:
[Figure: bar chart; x-axis: # of successes (0 to 10); y-axis: probability]

Sample:

X=

....

Binomial Formula
P(X = k) = P(exactly k many successes)

$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$

X: Binomial Random Variable
k: specific # of successes you could get
n - k: specific # of failures
p: probability of success
1 - p: probability of failure
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$: combination called the Binomial Coefficient

Binomial Formula
p=.5 n=10
[Figure: Sampling Distribution; x-axis: # of successes (0 to 10); y-axis: probability]

p(X=3) =

Hmm... what if we had gotten X=0?... pretty unlikely outcome... fair coin? Remember this idea....
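
Both probabilities on this slide can be worked out directly from the binomial formula. The following is a minimal sketch that assumes the fair-coin setup above (n=10, p=.5); `binomial_pmf` is a name made up for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): (n choose k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_three_heads = binomial_pmf(3, n=10, p=0.5)
p_zero_heads = binomial_pmf(0, n=10, p=0.5)

print(f"P(X=3) = {p_three_heads:.4f}")
print(f"P(X=0) = {p_zero_heads:.4f}")
```

Comparing the two printed values shows how much less likely 0 heads is than 3 heads when the coin really is fair.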

More on the Binomial Distribution


X ~ B(n,p)
Expected Value and Variance for X ~ B(n,p)
(these are the parameters for the sampling distribution of X)

$\mu_X = np$
$\sigma_X^2 = np(1-p)$
Standard Deviation: $\sigma_X = \sqrt{np(1-p)}$

Ex: # heads in 5 tosses of a coin: X ~ B(5, 1/2)
Expectation: 2.5
Variance: 1.25
Std. Dev.: 1.12
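
The numbers in the 5-toss example can be reproduced with a few lines of arithmetic. This is only a sketch; the function name `binomial_summary` is an assumption made for illustration.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (expected value, variance, standard deviation) of X ~ B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)

# Number of heads in 5 tosses of a fair coin: X ~ B(5, 1/2)
mean, variance, std_dev = binomial_summary(5, 0.5)
print(mean, variance, round(std_dev, 2))
```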

Let's see some more Binomial Distributions


What happens if we try doing a different # of trials (n) ? That is, try a different sample size...

[Figures: Binomial Distribution, p=.5, for n = 5, 10, 20, 50, and 100; x-axis: # of successes; y-axis: probability]

Whoa.
Anyone else notice those DISCRETE distributions starting to look smoother as sample size (n) increased?
Let's look at a few more binomial distributions, this time with a different probability of success...

Binomial Robot Factory


2 possible outcomes:
Good Robot: 90%
Bad Robot: 10%

You'd like to know about how many BAD robots you're likely to get before placing an order...
p = .10 (... success)

n = 5, 10, 20, 50, 100
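
Before looking at the plots, the same question can be explored numerically. The sketch below treats a bad robot as the "success" with p = .10, as on the slide; the order sizes are the n values listed above, and the names `binomial_pmf` and `summary` are made up for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_bad = 0.10  # "success" here means getting a BAD robot
summary = {}
for n in (5, 10, 20, 50, 100):
    pmf = [binomial_pmf(k, n, p_bad) for k in range(n + 1)]
    # Most likely number of bad robots in an order of size n.
    most_likely = max(range(n + 1), key=lambda k: pmf[k])
    summary[n] = (most_likely, pmf[0])
    print(f"n={n:3d}  expected bad={n * p_bad:4.1f}  "
          f"most likely={most_likely:2d}  P(no bad)={pmf[0]:.3f}")
```

For small orders the most likely count sits at or near 0, which is why those distributions look lopsided; for larger orders the peak moves away from 0.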

[Figures: Binomial Distribution, p=.1, for n = 5, 10, 20, 50, and 100; x-axis: # of successes; y-axis: probability]

Normal Approximation of the Binomial


If n is large, then X ~ B(n,p) {Binomial Distribution}
can be approximated by a NORMAL DISTRIBUTION with parameters:

$\mu = np$
$\sigma = \sqrt{np(1-p)}$

[Figure: probability distribution plot; y-axis: probability]
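
One way to see how close the approximation gets is to compare exact binomial probabilities with the height of the matching normal curve. This is a sketch under assumed values (n=100, p=.5); the helper names are invented for the illustration, and the normal density formula is the standard one rather than anything shown on the slides.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and std. dev. sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

n, p = 100, 0.5
mu = n * p                     # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))  # its standard deviation

for k in (40, 45, 50, 55, 60):
    print(f"k={k}  binomial={binomial_pmf(k, n, p):.4f}  "
          f"normal={normal_pdf(k, mu, sigma):.4f}")

# Largest difference between the two over every possible k.
max_gap = max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
              for k in range(n + 1))
print(f"largest gap: {max_gap:.5f}")
```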

Normal Distributions
(aka Bell Curve)
Probability Distributions of a Continuous Random Variable (smooth curve!)

Class of distributions, all with the same overall shape
Any specific Normal Distribution is characterized by two parameters:
mean: $\mu$
standard deviation: $\sigma$

[Figure: normal curves with different means and different standard deviations]

Standardizing
Standardizing a distribution of values results in re-labeling & stretching/squishing the x-axis
useful: gets rid of units, puts all distributions on same scale for comparison
HOWTO: simply convert every value to a Z SCORE:

$z = \frac{x - \mu}{\sigma}$

Standardizing
Z score:

$z = \frac{x - \mu}{\sigma}$

Conceptual meaning:
how many standard deviations from the mean a given score is (in a given distribution)

Any distribution can be standardized
Especially useful for Normal Distributions...
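
A short sketch of the conversion follows. The mean of 50 and standard deviation of 5 are assumed values (they match a B(100, .5) distribution), and the function name `z_score` is made up for this example.

```python
def z_score(x, mean, std_dev):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / std_dev

# Assumed distribution for illustration: mean 50, standard deviation 5.
z_high = z_score(60, mean=50, std_dev=5)
z_low = z_score(42.5, mean=50, std_dev=5)
print(z_high, z_low)
```

A positive z score means the value is above the mean, a negative one means it is below.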

Standard Normal Distribution


has mean: $\mu = 0$
has standard deviation: $\sigma = 1$

ANY Normal Distribution can be converted to the Standard Normal Distribution...

Standard Normal Distribution

Normal Distributions & Probability


Probability = area under the curve
intervals, cumulative probability [draw on board]

For the Standard Normal Distribution:


These areas have already been calculated for us (by someone else)
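
The same areas can also be computed rather than looked up. The sketch below uses the error function from Python's standard `math` module, which is one common way to get the cumulative area; the function names are assumptions made for this example.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """Cumulative probability: area under the curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def interval_probability(z_low, z_high):
    """Area under the standard normal curve between two z scores."""
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

print(f"area left of z=0:         {standard_normal_cdf(0):.4f}")
print(f"area between z=-1 and 1:  {interval_probability(-1, 1):.4f}")
print(f"area between z=-2 and 2:  {interval_probability(-2, 2):.4f}")
```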

Standard Normal Distribution


So, if this were a Sampling Distribution, ...

Next Time
More different types of distributions:
Binomial, Normal, t, Chi-square, F

And then... how will we use these to do inference?
Remember: biggest new idea today was:
SAMPLING DISTRIBUTION
