
Probability

Agenda
Probability

What is probability? There are three interpretations:

1. Classical Interpretation of Probability
2. Relative Frequency Concept of Probability (Empirical Approach)
3. Subjective Probability
Classical Interpretation of Probability

• An event is a collection of outcomes, denoted by a capital letter
such as A, B, C or E. The probability that event E occurs
is denoted by P(E). When all outcomes are equally
likely, then:

P(E) = (number of outcomes in E) / (total number of possible outcomes)

For example, flip a coin. What is the chance of getting a head (H)?

Answer:
• Is the coin fair?
• If the coin is fair, then P(H)=1/2
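
The classical rule can be sketched in a few lines of Python. This is only an illustration under the equally-likely assumption; the function name `classical_probability` and the six-sided die are assumptions added for the example and are not part of the slides.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = |E| / |S|; only valid when all outcomes are equally likely."""
    sample_space = set(sample_space)
    event = set(event) & sample_space  # ignore anything outside the sample space
    return Fraction(len(event), len(sample_space))

# Fair coin from the slide: one favourable outcome out of two.
coin = {"H", "T"}
p_head = classical_probability({"H"}, coin)
print(p_head)  # 1/2

# Hypothetical second example: an even number on a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}
p_even = classical_probability({2, 4, 6}, die)
print(p_even)  # 1/2
```

If the coin is not fair, the outcomes are not equally likely and this counting rule does not apply.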
Relative Frequency Concept of Probability
(Empirical Approach)
• Flip the coin (since we don’t know whether it is fair
or not) a very large number of times and count the
number of H out of the total number of flips.

• For example, if we flip the given coin 10,000 times and observe 4555
heads and 5445 tails, then for that coin:

P(H) ≈ 4555/10,000 = 0.4555
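
The empirical approach can be imitated with a small simulation. This is a rough sketch, not a statement about any real coin: the function name, the seed and the "true" probability of 0.4555 fed to the simulator are all assumptions chosen to mirror the numbers above.

```python
import random

def relative_frequency_of_heads(p_heads, n_flips, seed=0):
    """Simulate n_flips of a coin with P(H) = p_heads; return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips

# The estimate from the slide: 4555 heads in 10,000 flips.
observed = 4555 / 10_000
print(observed)  # 0.4555

# The relative frequency tends to settle near the underlying probability
# as the number of flips grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(0.4555, n))
```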
Subjective Probability

• Subjective probability reflects personal belief which


involves personal judgment, information, intuition, etc.

• For example, what is P (you will get an A in a certain course)? Each


student may have a different answer to the question.
Probability Law/ Set Operations

1. Union
A or B, also written as A ∪ B = outcomes in A or B or both (shaded part)

This type of diagram, which represents set operations, is called a Venn
diagram.
Probability Law/ Set Operations Contd..

2. Intersection
A and B, also written as A ∩ B = outcomes in both A and B (shaded part)
Probability Law/ Set Operations Contd..

3. Complement
Not A, also written as A′ = outcomes not in A (shaded part)

A and B are called mutually exclusive (disjoint) if the occurrence of
outcomes in A excludes the occurrence of outcomes in B. In other words,
there are no elements in A ∩ B and thus P(A ∩ B) = 0.

One example of two mutually exclusive events: A and A′ are
mutually exclusive.
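
The three set operations map directly onto Python's built-in `set` type. The sketch below is only an illustration: the die roll and the events A and B are assumptions added for the example, not taken from the slides.

```python
# Assumed sample space: one roll of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # assumed event: the roll is even
B = {4, 5, 6}  # assumed event: the roll is greater than 3

union = A | B                    # outcomes in A or B or both
intersection = A & B             # outcomes in both A and B
complement_A = sample_space - A  # outcomes not in A

def are_mutually_exclusive(X, Y):
    """Two events are mutually exclusive when they share no outcomes."""
    return X.isdisjoint(Y)

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(sorted(complement_A))  # [1, 3, 5]
print(are_mutually_exclusive(A, B))             # False
print(are_mutually_exclusive(A, complement_A))  # True
```

An event and its complement are always mutually exclusive, which is what the last line checks.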
Conditional Probability
Conditional Probability and Independence

• The concepts of conditional events and independent events are used to
determine whether or not one event has an effect on the
probability of another event. Many everyday
conversations invoke these concepts.

• For instance, a team might have a probability of 0.6 of winning the
Super Bowl, or a country a probability of 0.3 of winning the World
Cup. However, if it rains, the team's chances may change (for the
better or possibly for the worse): the probability of winning is affected
by the weather, so it is conditional. But what if the game is being played in
an enclosed stadium? In such a case the weather may have no
effect on the team's chances, so the two events are independent.
CONDITIONAL PROBABILITY NOTATION

The expression P(A|B) is read as "the probability that
event A happens given that event B has happened". The '|'
notation does NOT mean 'divide by'. Instead, it indicates
that every event listed after it has already taken place.
Marginal & Conditional Probability

Example: There are three balls in the jar:


One ball is drawn from the jar. Find the probability that the
number on that ball is 1:

P (the number is 1) = 2/3 [marginal probability, i.e. the
probability of an event without reference to any other event
or events occurring.]
Marginal & Conditional Probability Contd..

But if someone tells you that the ball is blue, then the
probability that the number on the ball is 1 becomes:
P (the number is 1 | ball is blue) = 1 [conditional probability,
because of the underlying condition of being told that the ball
chosen was blue.]

Note: In this example, the marginal probability is not the same
as the conditional probability. This is characteristic of events that
are not independent, i.e. dependent events.
Another Example
The conditional and marginal probabilities can be different,
but in some situations the two probabilities are equal.
For instance, consider a jar that contains four balls:
one red ball marked 1, one red ball marked 2,
one blue ball marked 1 and one blue ball marked 2.
Another Example Contd..
One ball is drawn from the jar. Find the probability that the
number on that ball is 1:
P (the number is 1) = 2/4 = 1/2 [marginal probability]
And if someone tells you that the ball is blue, find the
probability that the number on the ball is 1:
P (the number is 1 | ball is blue) = 1/2 [conditional
probability]
Note: In this example, the marginal probability is the same as the conditional
probability. This is characteristic of events that are independent.
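
The four-ball jar is small enough to check by enumeration. The following is a minimal sketch that assumes each ball is equally likely to be drawn; the helper names `probability` and `conditional_probability` are hypothetical and introduced only for this example.

```python
from fractions import Fraction

# The four-ball jar from the example: (colour, number) pairs.
jar = [("red", 1), ("red", 2), ("blue", 1), ("blue", 2)]

def probability(event, outcomes):
    """Share of equally likely outcomes for which event(outcome) is true."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def conditional_probability(event, given, outcomes):
    """P(event | given): restrict to outcomes where 'given' holds, then count.

    Raises ZeroDivisionError if no outcome satisfies 'given'.
    """
    reduced = [o for o in outcomes if given(o)]
    return probability(event, reduced)

def is_one(ball):
    return ball[1] == 1

def is_blue(ball):
    return ball[0] == "blue"

p_one = probability(is_one, jar)
p_one_given_blue = conditional_probability(is_one, is_blue, jar)

print(p_one)             # 1/2
print(p_one_given_blue)  # 1/2
print(p_one == p_one_given_blue)  # True: knowing the colour changes nothing
```

Because the marginal and conditional probabilities agree, the number and the colour behave as independent events in this jar.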
Probability Distributions
Random Variables

• A random variable is a variable that takes on
different values determined by chance.

• Types of Random Variables:
• Qualitative Random Variable
• Quantitative Random Variable
Types of Random Variables

• Qualitative Random Variable - The possible
values vary in kind but not in numerical degree.
These are also called categorical variables.
• Quantitative Random Variable - Can be of 2 types:
• Discrete Random Variable
• Continuous Random Variable
Quantitative Random Variables
• Discrete Random Variable - When the random
variable can assume only a countable (possibly
countably infinite) number of values.
• For example: Number of children born in State College this year, or the
number of tosses to get the first Head when flipping a fair coin.

• Continuous Random Variable - When the random
variable can assume an uncountable number of values
in an interval of the real line.
• For example: Height of a STAT 500 student, or the weight of a
chocolate chip cookie.
Probability Distributions for Discrete Random
Variables
• Another way to think about a random variable is that a random
variable is a numerical characteristic of each event in a sample
space or population.

• Let's take a look at some of the characteristics of discrete
random variables. Example:
Consider the dataset with the values: 0, 1, 2, 3, 4, from which one value is selected at random.
1. What is the probability you select 2? It would be 1/5.
2. What is the expected value (i.e., the mean)? This would be 2.
3. How do you get the mean? Sum all the data points and divide by the number of observations.

Computing the mean as we just did is no different from saying this:

1/5 (0) + 1/5 (1) + 1/5 (2) + 1/5 (3) + 1/5 (4) = 2
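
The equivalence between the ordinary mean and the probability-weighted sum can be checked directly. This is a minimal sketch that assumes each of the five values is equally likely to be selected; the variable names are illustrative only.

```python
from fractions import Fraction

values = [0, 1, 2, 3, 4]

# Ordinary mean: sum the data points and divide by the number of observations.
simple_mean = Fraction(sum(values), len(values))

# Same thing written as a weighted sum, each value weighted by P(X = x) = 1/5.
p = Fraction(1, len(values))
weighted_mean = sum(p * x for x in values)

print(simple_mean)    # 2
print(weighted_mean)  # 2
```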
Probability & Cumulative Distribution
Example: Number of Prior Convictions
• Let X = number of prior convictions for prisoners at a state
prison at which there are 500 prisoners.

X = x                  0        1        2        3       4
Number of prisoners    80       265      100      40      15
P(X = x)               80/500   265/500  100/500  40/500  15/500
P(X = x), decimal      0.16     0.53     0.20     0.08    0.03

X = the number of prior convictions
x = 0, 1, 2, 3, 4
Example - Number of Prior Convictions

• Now, let's say you went out and surveyed convicts. What is the
expected value, or expected number of priors, that you would
report? Which of these three values makes the most sense?
a. 1,
b. 1.29, or
c. 2?

Ans: For this we need a weighted average, since not all the outcomes have an equal
chance of happening (i.e. they are not equally weighted). So we weight each value x
by P(X=x). The expected value of X, or mean of X, or E(X), is:

E(X) =(0.16)(0)+(0.53)(1)+(0.2)(2)+(0.08)(3)+(0.03)(4)=1.29
Example Contd…

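• What is the variance of X, or Var(X) or V(X)? Here is
the formula we will use:

Var(X) = Σ (x - E(X))² P(X = x) = E(X²) - [E(X)]²

Now, if we use the data in our example we find that:

E(X²) = (0.16)(0²)+(0.53)(1²)+(0.2)(2²)+(0.08)(3²)+(0.03)(4²) = 2.53
Var(X) = 2.53 - (1.29)² = 2.53 - 1.6641 = 0.8659

• Now, what is the standard deviation of X, or SD(X) or
S(X)?

SD(X) = √Var(X) = √0.8659 ≈ 0.9305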
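
The mean, variance and standard deviation for the prior-convictions example can be reproduced with a short script. This is a sketch using the standard definition Var(X) = Σ (x - E(X))² P(X = x), with the prisoner counts from the table; the variable names are illustrative.

```python
import math

# Number of prisoners with x prior convictions (500 prisoners in total).
counts = {0: 80, 1: 265, 2: 100, 3: 40, 4: 15}
n = sum(counts.values())

# Probability distribution P(X = x).
pmf = {x: c / n for x, c in counts.items()}

# Expected value: each value weighted by its probability.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(sd, 4))  # 1.29 0.8659 0.9305
```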
Probability Distribution Table

• From our previous example:

X = x       0     1     2     3     4
P(X = x)    0.16  0.53  0.20  0.08  0.03

The above is called a probability distribution table. This table
provides the probability of each individual outcome.
Cumulative Distribution Table

X = x       0     1     2     3     4
P(X ≤ x)    0.16  0.69  0.89  0.97  1.00

The above is called a cumulative distribution table. For each value x, this
table provides the probability of that outcome or any outcome prior
to it. Thus the last probability in the cumulative
table is 1, since that outcome or one of the previous outcomes must
occur.
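
A cumulative distribution table is just a running total of the probability distribution table. The sketch below rebuilds it from the prisoner counts in the prior-convictions example; exact fractions are used here only to avoid floating-point rounding in the running sum.

```python
from fractions import Fraction
from itertools import accumulate

# Number of prisoners with 0, 1, 2, 3 and 4 prior convictions.
counts = [80, 265, 100, 40, 15]
total = sum(counts)

# P(X = x) for x = 0..4.
pmf = [Fraction(c, total) for c in counts]

# P(X <= x): running sum of the probabilities.
cdf = list(accumulate(pmf))

print([float(p) for p in pmf])  # [0.16, 0.53, 0.2, 0.08, 0.03]
print([float(p) for p in cdf])  # [0.16, 0.69, 0.89, 0.97, 1.0]
```

The final entry is exactly 1 because the five outcomes cover every prisoner.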
Levels of measurement
Types of measurement
• Nominal Scale: 1st Level of Measurement.
It is a scale used for labeling variables into distinct
classifications; it doesn’t involve a quantitative value or
order. This scale is the simplest of the four variable
measurement scales.
E.g. What is the place where you live?
Ans: Delhi/Mumbai/Chennai/Bangalore.
Ordinal Scale
• Ordinal Scale: 2nd Level of Measurement.
• It is used to depict only the order of variables, not
the size of the difference between them.
• E.g. tournament team rankings, order of product
quality, order of agreement or satisfaction.
• It is used in market research to gather and evaluate
relative feedback about product satisfaction, changing
perceptions with product upgrades, etc.
Interval Scale
• Interval Scale: 3rd Level of Measurement.
• It is a numerical scale where the order of the variables
is known as well as the difference between these
variables. E.g. temperature, time, etc.
• On this scale we can calculate statistical values. The only
drawback is that there is no absolute 0.
• E.g. on a temperature scale such as Celsius or Fahrenheit, the
difference between 100 and 150 degrees is the same as the difference
between 50 and 100 degrees, but 0 degrees does not mean "no temperature".
Ratio Scale
• Ratio Scale: 4th Level of Measurement.
• Apart from the properties of order and difference
between variables, we also know the absolute 0.
• E.g. height and weight.
• The ratio scale accommodates the characteristics of the three
other variable measurement scales, i.e. labeling the
variables, the significance of the order of variables
and a calculable difference between variables (which
are usually equidistant).
Some Questions

• What % of students at Simplilearn like Python?
• What % of a randomly selected group of students at
Simplilearn like Python?
• How many credit cards do people in India have?
• Do girls fare better in studies than boys?
Definitions

• Sample space: The collection of all possible
outcomes of a random study.
• E.g. for the toss of a coin, the sample space is
{Heads, Tails}
Types of data

At the highest level, two kinds of data
exist: quantitative and qualitative.
Quantitative data deals with numbers and things you
can measure objectively. All mathematical operations can be
performed on them.
E.g. dimensions such as height, width, and length;
temperature and humidity; prices; area and volume.
Quantitative Data –Discrete and Continuous

Discrete data is a count that can't be made more
precise. Typically it involves integers.

For instance, the number of children in your family is
discrete data, because you are counting whole,
indivisible entities: you can't have 2.5 kids or 1.3
pets.
Quantitative Data –Discrete and Continuous

Continuous data can be divided and reduced to
finer and finer levels.

For example, you can measure height on ever more
precise scales (meters, centimeters, millimeters, and
beyond), so height is continuous data.
Qualitative data

Qualitative data deals with characteristics and
descriptors that can't be easily measured, but can be
observed subjectively, e.g. color, name, etc.
We cannot add, subtract, multiply or divide this type of
data.
Qualitative data-Binary, Nominal &
Ordinal data
Binary Data: Binary data places things in one of two
mutually exclusive categories: right/wrong, true/false, or
accept/reject.
Nominal Data: When you have several categories that
can't be ordered. E.g. names of cars, addresses, etc.
Ordinal Data: This data can be put in order, e.g. ranks in a
class, or risk as high/medium/low.
Thank You!
