
Chapter 2

Saturday, September 13, 2014

10:41 AM

2.1
Most important rule of data analysis: Make a picture. A display of your data will reveal things that you won't see in a table of numbers, a well-designed display will show the important features and patterns in your data, and the best way to tell others about your data is with a picture.
Check the categorical data condition: the data are counts or percentages of the individuals in the categories.
2.2
A table that shows how the individuals are distributed along each variable, contingent on the value of the other variable, is called a contingency table (also called a two-way table).
Marginal distribution is the frequency distribution of one of the variables. It can be expressed as either
counts or percentages.
To find the marginal distribution as percentages, divide each row (or column) total by the grand total of the table.
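The division described above can be sketched in a few lines of Python. This is only an illustration: the table, its category names, and the helper `marginal_distributions` are made up for the example and do not come from the course materials.

```python
# Hypothetical two-way table of counts (made-up numbers, for illustration only).
# Rows are the categories of one variable, columns the categories of the other.
table = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}

def marginal_distributions(table):
    """Return (row_marginals, col_marginals) as proportions of the grand total."""
    # Total for each row (one variable, ignoring the other).
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    # Total for each column (the other variable, ignoring the first).
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    # Divide each total by the grand total to get the marginal distribution.
    row_marginals = {row: total / grand_total for row, total in row_totals.items()}
    col_marginals = {col: total / grand_total for col, total in col_totals.items()}
    return row_marginals, col_marginals

row_marginals, col_marginals = marginal_distributions(table)
print(row_marginals)
print(col_marginals)
```

Each marginal distribution should add up to 1 (100%), which is a quick way to check the arithmetic.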
In a contingency table, when the distribution of one variable is the same for all categories of another, we
say that the variables are independent and there is no association between these variables.
A segmented bar chart treats each bar as the whole and divides it up proportionally into segments corresponding to the percentage in each group.
2.3
Summarize a single categorical variable - number, proportion, or percentage of who's in each category
Summarize with:
- Frequency table (or relative frequency table)
- Bar chart
- Pie chart
Frequency table - lists categories and the number or proportion of who's in each category
- Compare numbers and/or proportions
Bar chart - displays either number or percentage for each category
- Compare heights of bars
- Does not need to have all categories in the display
Pie chart - displays percentage of the whole for each category
- Compare sizes of pie slices
- Must have all categories in the display
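As a rough sketch of how a frequency table and a relative frequency table can be built from raw data, the snippet below counts categories with Python's standard `collections.Counter`. The list `responses` and the helper `frequency_table` are hypothetical, invented only for this illustration.

```python
from collections import Counter

# Hypothetical raw data: one categorical value per individual (made-up values).
responses = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

def frequency_table(values):
    """Return {category: (count, proportion)} for a single categorical variable."""
    counts = Counter(values)          # number of individuals in each category
    total = sum(counts.values())      # total number of individuals
    return {category: (count, count / total) for category, count in counts.items()}

freq = frequency_table(responses)
# Print one line per category: name, count, and percentage of the whole.
for category, (count, proportion) in sorted(freq.items()):
    print(f"{category:<6} {count:>3} {proportion:6.1%}")
```

The counts are what a bar chart of numbers would display, and the proportions are what a pie chart or a bar chart of percentages would display.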
Is there an association between the two categorical variables?
Two variables:
Variable of interest = response variable
Other variable = explanatory variable (the explanatory variable is being used to explain the differences in the response variable)
Stat 101 Page 1

Is the distribution of the response variable different for each category of the explanatory variable?
- There is an association between the two variables
Is the distribution of the response variable approximately the same for the different categories of the
other variable?
- There is NOT an association between the two variables

Relationship Between Two Categorical Variables
Data = Two-Way Table (contingency table)
Rows = Categories of the explanatory variable
Columns = Categories of the response variable
Table Entries = number of observations belonging to a particular category of explanatory variable and
particular category of response variable
Marginal Distributions - Looks at percentages for each variable separately (ignoring the other variable)
Margins of the contingency Table
Same as looking at two variables separately
Conditional Distributions - Looks at percentages for one variable conditioned on a particular category for
the other variable
(conditioning variable = explanatory variable)
Compare conditional distributions to marginal distributions for same variable
Differences indicate a potential dependence (association) between the two variables
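The comparison of conditional and marginal distributions can be sketched as follows. The counts are made up, and the helper names (`conditional_distributions`, `response_marginal`, `max_difference`) are hypothetical. The notes do not say how large a difference has to be before it counts as an association, so the sketch only reports the largest gap and leaves that judgment to the reader.

```python
# Hypothetical two-way table of counts (made-up numbers, for illustration only).
# Rows = categories of the explanatory variable,
# columns = categories of the response variable.
table = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}

def conditional_distributions(table):
    """Distribution of the response within each explanatory category."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())  # assumes every row has at least one observation
        result[row] = {col: count / row_total for col, count in cells.items()}
    return result

def response_marginal(table):
    """Distribution of the response, ignoring the explanatory variable."""
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(col_totals.values())
    return {col: total / grand_total for col, total in col_totals.items()}

def max_difference(table):
    """Largest absolute gap between a conditional proportion and the matching marginal one."""
    marginal = response_marginal(table)
    conditional = conditional_distributions(table)
    return max(
        abs(dist[col] - marginal[col])
        for dist in conditional.values()
        for col in dist
    )

conditional = conditional_distributions(table)
marginal = response_marginal(table)
gap = max_difference(table)
print(conditional)
print(marginal)
print(gap)
```

If the conditional distributions all matched the marginal distribution, the reported gap would be 0, which corresponds to no association in the sense used above.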
Mosaic Plot - graphical summary of conditional distributions in contingency table
Similar to segmented bar charts
includes summary of marginal distributions
Association - The lines (segments) in the mosaic plot do not line up (means conditional distributions are different)
No association - The lines (segments) in the mosaic plot line up (means conditional distributions are the
same)
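To make the "line up" idea concrete, here is a minimal sketch that computes the rectangles of a mosaic plot without drawing anything. It assumes one common layout: each explanatory category gets a vertical bar whose width is its marginal proportion, and each bar is split into segments whose heights are the conditional proportions of the response. The table and the function `mosaic_rectangles` are hypothetical.

```python
# Hypothetical two-way table of counts (made-up numbers, for illustration only).
# Rows = categories of the explanatory variable,
# columns = categories of the response variable.
table = {
    "Group A": {"Yes": 30, "No": 20},
    "Group B": {"Yes": 10, "No": 40},
}

def mosaic_rectangles(table):
    """Return a list of (row, col, x, y, width, height) inside a unit square."""
    grand_total = sum(sum(cells.values()) for cells in table.values())
    rects = []
    x = 0.0
    for row, cells in table.items():
        row_total = sum(cells.values())  # assumes every row has at least one observation
        width = row_total / grand_total  # marginal proportion of this explanatory category
        y = 0.0
        for col, count in cells.items():
            height = count / row_total   # conditional proportion of this response category
            rects.append((row, col, x, y, width, height))
            y += height                  # next segment stacks on top of this one
        x += width                       # next bar sits to the right of this one
    return rects

rects = mosaic_rectangles(table)
for rect in rects:
    print(rect)
```

In this made-up table the boundary between "Yes" and "No" sits at height 0.6 in the first bar and 0.2 in the second, so the segments do not line up. That is the picture the notes describe for an association.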

Chapter 2
Powerpoint

Chapter 2
Homework
