
How to Read (and Use) a Box-and-Whisker Plot

Definition: A bar or diagram using a number line to show the distribution of data.

FEBRUARY 15, 2008 TO STATISTICAL VISUALIZATION BY NATHAN YAU

The box-and-whisker plot is an exploratory graphic, created by John W. Tukey, used to show the distribution of a dataset (at a glance). Think of the type of data you might use a histogram with, and the box-and-whisker (or box plot, for short) could probably be useful. The box plot, although very useful, seems to get lost in areas outside of Statistics, but I'm not sure why. It could be that people don't know about it or maybe are clueless on how to interpret it. In any case, here's how you read a box plot.

Reading a Box-and-Whisker Plot

Let's say we ask 2,852 people (and they miraculously all respond) how many hamburgers they've consumed in the past week. We'll sort those responses from least to greatest and then graph them with our box-and-whisker. Take the top 50% of the group (1,426) who ate more hamburgers; they are represented by everything above the median (the white line). Those in the top 25% of hamburger eating (713) are shown by the top "whisker" and dots. Dots represent those who ate a lot more than normal or a lot less than normal (outliers). If more than one outlier ate the same number of hamburgers, dots are placed side by side.

Draw the box-and-whisker plot for the following data set:

77, 79, 80, 86, 87, 87, 94, 99


My first step is to find the median. Since there are eight data points, the median will be the average of the two middle values: (86 + 87) ÷ 2 = 86.5 = Q2

This splits the list into two halves: 77, 79, 80, 86 and 87, 87, 94, 99. Since the halves of the data set each contain an even number of values, the sub-medians will be the average of the middle two values.

Copyright Elizabeth Stapel 2004-2011 All Rights Reserved

Q1 = (79 + 80) ÷ 2 = 79.5

Q3 = (87 + 94) ÷ 2 = 90.5


The minimum value is 77 and the maximum value is 99, so I have:

min: 77, Q1: 79.5, Q2: 86.5, Q3: 90.5, max: 99

Then my plot looks like this:

As you can see, you only need the five values listed above (min, Q1, Q2, Q3, and max) in order to draw your box-and-whisker plot. This set of five values has been given the name "the five-number summary".

Give the five-number summary of the following data set:

79, 53, 82, 91, 87, 98, 80, 93


The five-number summary consists of the numbers I need for the box-and-whisker plot: the minimum value, Q1 (the bottom of the box), Q2 (the median of the set), Q3 (the top of the box), and the maximum value (which is also Q4). So I need to order the set, find the median and the sub-medians, and then list the required values in order.

ordering the list: 53, 79, 80, 82, 87, 91, 93, 98, so the minimum is 53 and the maximum is 98

finding the median: (82 + 87) ÷ 2 = 84.5 = Q2

lower half of the list: 53, 79, 80, 82, so Q1 = (79 + 80) ÷ 2 = 79.5

upper half of the list: 87, 91, 93, 98, so Q3 = (91 + 93) ÷ 2 = 92

five-number summary: 53, 79.5, 84.5, 92, 98
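The procedure used in the two worked examples above (sort, take the median, then take the median of each half) can be sketched in a few lines of Python. This is only an illustration, not part of the original tutorial: the function name `five_number_summary` is made up here, and the sketch assumes at least two values and leaves the middle value out of both halves when the count is odd.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values before the median position
    upper = data[len(data) - half:]   # values after the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# The two data sets from the worked examples above.
print(five_number_summary([77, 79, 80, 86, 87, 87, 94, 99]))
print(five_number_summary([79, 53, 82, 91, 87, 98, 80, 93]))
```

The printed tuples should match the summaries worked out by hand: 77, 79.5, 86.5, 90.5, 99 and 53, 79.5, 84.5, 92, 98.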

Box-and-Whisker Plot
Definition: A box-and-whisker plot or boxplot is a diagram based on the five-number summary of a data set.

To construct this diagram, we first draw an equal interval scale on which to make our box plot. Do not just draw a boxplot shape and label points with the numbers from the 5-number summary. The boxplot is a visual representation of the distribution of the data. Greater distances in the diagram should correspond to greater distances between numeric values.

Using the equal interval scale, we draw a rectangular box with one end at Q1 and the other end at Q3. Then we draw a vertical segment at the median value. Finally, we draw two horizontal segments, one on each side of the box: one down to the minimum value and one up to the maximum value. (These segments are called the "whiskers".)

Example 1: Draw a box-and-whisker plot for the data set {3, 7, 8, 5, 12, 14, 21, 13, 18}. From our Example 1 on the previous page, we had the five-number summary: Minimum: 3, Q1: 6, Median: 12, Q3: 16, and Maximum: 21.
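To make the "equal interval scale" point concrete, here is a rough text-only sketch that places each of the five numbers from Example 1 in a column proportional to its value. The function name `text_boxplot`, the marker characters, and the `width` parameter are all invented for this illustration, and the sketch assumes the maximum is larger than the minimum.

```python
def text_boxplot(minimum, q1, med, q3, maximum, width=37):
    """Draw a one-line box plot whose columns are proportional to the values."""
    span = maximum - minimum  # assumed to be greater than zero

    def col(value):
        # Map a data value to a column on an equal-interval scale.
        return round((value - minimum) * (width - 1) / span)

    row = [" "] * width
    for c in range(col(minimum), col(maximum) + 1):
        row[c] = "-"                  # whiskers
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="                  # box from Q1 to Q3
    row[col(minimum)] = "|"           # end of the lower whisker
    row[col(maximum)] = "|"           # end of the upper whisker
    row[col(q1)] = "["                # Q1 edge of the box
    row[col(q3)] = "]"                # Q3 edge of the box
    row[col(med)] = "M"               # median marker
    return "".join(row)

# Five-number summary quoted for Example 1.
plot = text_boxplot(3, 6, 12, 16, 21)
print(plot)
```

With a width of 37 columns, each unit on the number line takes two columns, so the gap from Q1 to the median (6 units) is drawn wider than the gap from the median to Q3 (4 units), as the construction rule requires.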

Notice that in any box-and-whisker plot, the left-side whisker represents where we find approximately the lowest 25% of the data and the right-side whisker represents where we find approximately the highest 25% of the data. The box part represents the interquartile range and represents approximately the middle 50% of all the data. The data is divided into four regions, which each represent approximately 25% of the data. This gives us a nice visual representation of how the data is spread out across the range.

Example 2: Draw a box-and-whisker plot for the data set {3, 7, 8, 5, 12, 14, 21, 15, 18, 14}. From our Example 2 on the previous page, we had the five-number summary: Minimum: 3, Q1: 7, Median: 13, Q3: 15, and Maximum: 21.
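The word "approximately" matters for small data sets. The short check below uses the Example 2 data and the quartiles quoted for it to compute the interquartile range and count how many values fall in the box and in each whisker. The variable names are chosen only for this sketch, and counting values equal to Q1 or Q3 as inside the box is an assumption made here.

```python
data = [3, 7, 8, 5, 12, 14, 21, 15, 18, 14]
q1, med, q3 = 7, 13, 15          # quartiles quoted in Example 2

iqr = q3 - q1                    # interquartile range: the length of the box

# Values on the box edges are counted as inside the box (an assumption).
in_box = [x for x in data if q1 <= x <= q3]
in_lower_whisker = [x for x in data if x < q1]
in_upper_whisker = [x for x in data if x > q3]

print("IQR:", iqr)
print("share in box:", len(in_box) / len(data))
print("share in lower whisker:", len(in_lower_whisker) / len(data))
print("share in upper whisker:", len(in_upper_whisker) / len(data))
```

With only ten values, the split comes out as 20% / 60% / 20% rather than an exact 25% / 50% / 25%, which is why the text hedges with "approximately".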

When we relate two data sets based on the same scale, we may examine box-and-whisker plots to get an idea of how the two data sets compare.

Example 3: Suppose that the box-and-whisker plots below represent quiz scores out of 25 points for Quiz 1 and Quiz 2 for the same class. What do these box-and-whisker plots show about how the class did on Quiz 2 compared to Quiz 1?

These box-and-whisker plots show that the lowest score, highest score, and Q3 are all the same for both quizzes, so performance on the two quizzes was quite similar. However, the movement of Q1 up from a score of 6 to a score of 9 indicates that there was an overall improvement. On the first quiz, approximately 75% of the students scored at or above a score of 6. On the second quiz, the same proportion of students (approximately 75%) scored at or above a score of 9.

Drawing a box and whisker plot


Example: Construct a box plot for the following data: 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25

Solution:

Step 1: Arrange the data in ascending order: 5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 53

Step 2: Find the median, lower quartile and upper quartile.

Median (middle value) = 22
Lower quartile (middle value of the lower half) = 12
Upper quartile (middle value of the upper half) = 36

(If there is an even number of data items, then we need to get the average of the middle numbers.)

Step 3: Draw a number line that will include the smallest and the largest data.

Step 4: Draw three vertical lines at the lower quartile (12), median (22) and the upper quartile (36), just above the number line.

Step 5: Join the lines for the lower quartile and the upper quartile to form a box.

Step 6: Draw a line from the smallest value (5) to the left side of the box and draw a line from the right side of the box to the biggest value (53).
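The numbers used in Steps 1 and 2 can be cross-checked with Python's standard library. The sketch below uses `statistics.quantiles` (available from Python 3.8) with its default "exclusive" method, which happens to agree with the middle-of-each-half rule for this 11-value data set. Quartile conventions differ between tools, so the two approaches will not match on every data set; the variable names are chosen only for this example.

```python
from statistics import quantiles

data = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]

ordered = sorted(data)                   # Step 1: ascending order
q1, med, q3 = quantiles(ordered, n=4)    # Step 2: the three quartile cut points

# Smallest value, lower quartile, median, upper quartile, largest value.
summary = (ordered[0], q1, med, q3, ordered[-1])
print(summary)
```

The five values in `summary` are the ones needed for Steps 3 to 6: the ends of the number line and whiskers (5 and 53) and the three vertical lines of the box (12, 22 and 36).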
