STAT 2 Lecture 4: The histogram

Review

If you can't do an experiment, you may do an observational study
Control for as many potential confounders as you can
Prospective studies are generally preferable to retrospective studies

Review

With observational studies:
Quality of the study depends heavily on quality of controls
Causation is very difficult to establish (or define)
Strong associations can be shown

Today

Bar graphs
Stem-and-leaf plots
Histograms:
- what are they
- how do you draw them
- how do you interpret them

Graphing a variable

Types of variables

Categorical: favourite colour (red, blue, green)
Numerical discrete: die roll (1, 2, 3, 4, 5, 6)
Numerical continuous: weight (123.34 pounds, 211.56 pounds)

Example: basketball scores

83 65 52 74 56 75 77 67 68 51 74 66 99 57 73 61 61 64 66 52 67 60 56 53 70 66 58 61 66 67 65 35 77 53

Technically discrete, but because there are so many possibilities, doesn't matter much if we treat it as discrete or continuous

Stem-and-leaf
3 | 5
4 |
5 | 261726383
6 | 578611467061675
7 | 4564307
8 | 3
9 | 9

Histogram

Histogram/bar graph

When to draw a...

Histogram: continuous data or grouped numerical data
Bar graph: categorical data or ungrouped discrete data
Stem-and-leaf: when you're in a hurry
Pie chart: controversial

II

Drawing a histogram

Score   Count
30-39   1
40-49   0
50-59   9
60-69   15
70-79   7
80-89   1
90-99   1
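The count table can be reproduced by sorting each score into its ten-point interval. The following is a minimal sketch, assuming the 34 scores from the basketball example earlier in the lecture; the function name `count_by_interval` and its parameters are illustrative, not part of the slides.

```python
from collections import Counter

# The 34 scores from the basketball example (assumed to be the data behind the table).
scores = [83, 65, 52, 74, 56, 75, 77, 67, 68, 51, 74, 66, 99, 57, 73, 61, 61,
          64, 66, 52, 67, 60, 56, 53, 70, 66, 58, 61, 66, 67, 65, 35, 77, 53]


def count_by_interval(values, low, high, width):
    """Count the values in each interval [start, start + width) from low up to high."""
    # Integer division maps a score to the index of its interval.
    tally = Counter((v - low) // width for v in values if low <= v < high)
    n_intervals = (high - low) // width
    return [
        (low + i * width, low + (i + 1) * width - 1, tally[i])
        for i in range(n_intervals)
    ]


table = count_by_interval(scores, low=30, high=100, width=10)
print("Score  Count")
for start, end, count in table:
    print(f"{start}-{end}  {count}")
```

Each count then becomes the height of one block when the y-axis shows frequency.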

Your turn

2007-2008 Cal men's basketball scores: 67 74 77 74 86 117 75 74 65 102 89 92 58 70 69 90 75 77 69 79 81 70 73 76 69 49 84 89 80 84 66 68 56

III

Histogram variations

Frequency versus percentage


Score     Count   Percent of total   Height of block
40-49     1       3                  0.3
50-59     2       6                  0.6
60-69     7       21                 2.1
70-79     12      36                 3.6
80-89     7       21                 2.1
90-99     2       6                  0.6
100-109   1       3                  0.3
110-119   1       3                  0.3
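The percent and height columns follow from the counts alone. Here is a small sketch of that arithmetic, assuming every block is 10 points wide and that heights are reported as percent per point, rounded to one decimal; the variable names are illustrative.

```python
# Counts per ten-point interval, copied from the table.
counts = {
    "40-49": 1, "50-59": 2, "60-69": 7, "70-79": 12,
    "80-89": 7, "90-99": 2, "100-109": 1, "110-119": 1,
}
width = 10                    # every block covers 10 points
total = sum(counts.values())  # number of games in the sample

rows = []
for label, count in counts.items():
    percent = 100 * count / total  # share of all games in this block
    height = percent / width       # percent per point on the x-axis
    rows.append((label, count, round(percent), round(height, 1)))

for label, count, percent, height in rows:
    print(f"{label:8} {count:3} {percent:4} {height:5}")
```

Switching the y-axis from frequency to percentage rescales every block by the same factor, so the shape of the histogram does not change.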

How many blocks?

Rule of thumb (don't follow this exactly):


Number in sample   Number of blocks
9-16               5
17-32              6
33-64              7
65-128             8
129-256            9
257-512            10
513-1024           11
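The table adds one block each time the sample size doubles, which matches the pattern ceil(log2 n) + 1, commonly known as Sturges' rule. The slides do not name a formula, so reading the table this way is an assumption, and the function name `rule_of_thumb_blocks` is illustrative.

```python
def rule_of_thumb_blocks(n):
    """Suggested number of blocks for a sample of size n: ceil(log2(n)) + 1."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)) without any
    # floating-point rounding.
    return (n - 1).bit_length() + 1


# Check the formula against both ends of every row in the table.
for low, high, blocks in [(9, 16, 5), (17, 32, 6), (33, 64, 7), (65, 128, 8),
                          (129, 256, 9), (257, 512, 10), (513, 1024, 11)]:
    print(low, high, blocks, rule_of_thumb_blocks(low), rule_of_thumb_blocks(high))
```

For the 33 scores in the "Your turn" data this suggests 7 blocks, close to the 8 used in the frequency table, which fits the advice not to follow the rule exactly.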

Blocks of unequal width

In most cases, avoid these
The HEIGHT of the block gives the percent per x-unit
The AREA of the block gives the percent of the total that falls within the limits of the block

Blocks of unequal width

Score     Count   Percent of total   Height of block
40-59     3       9                  0.5
60-69     7       21                 2.1
70-79     12      36                 3.6
80-89     7       21                 2.1
90-119    4       12                 0.4
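With unequal widths the height is no longer proportional to the count, because each block's percent has to be spread over its own width. A minimal sketch of the calculation behind the table, assuming the interval 40-59 covers 20 points and 90-119 covers 30 points; the variable names are illustrative.

```python
# Block boundaries: 40-59, 60-69, 70-79, 80-89, 90-119.
edges = [40, 60, 70, 80, 90, 120]
counts = [3, 7, 12, 7, 4]
total = sum(counts)

widths = [right - left for left, right in zip(edges, edges[1:])]
percents = [100 * c / total for c in counts]
heights = [p / w for p, w in zip(percents, widths)]  # percent per point
areas = [h * w for h, w in zip(heights, widths)]     # percent of the total

for left, width, percent, height in zip(edges, widths, percents, heights):
    print(f"{left}-{left + width - 1}: {percent:.0f}% of total, height {height:.1f}")

print(f"total area: {sum(areas):.0f}%")
```

The widest block (90-119) holds 12% of the games but is the shortest, since that 12% is spread over 30 points. The areas still add up to 100%.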

IV

Comparing histograms

We may have additional information that lets us divide the data into groups
Draw a histogram for each group and compare them
This is a kind of control

Describing distributions

Distribution: the set of values a variable takes, along with how often the variable takes each value
Home distribution has a higher centre than away distribution
Here, the two distributions have similar spreads

Recap

When to draw a...

Histogram: continuous data or grouped numerical data
Bar graph: categorical data or ungrouped discrete data
Stem-and-leaf: when you're in a hurry
Pie chart: controversial

Recap

y-axis: can be frequency or percentage
Height of the block gives the percent per x-unit
Area of the block gives the percent of the total that falls within the limits of the block

Tomorrow: Fun with graphs
