Definition
A box and whisker plot (sometimes called a box plot) is a graph that presents
information from a five-number summary. It does not show a distribution in as
much detail as a stem and leaf plot or histogram does, but is especially useful
for indicating whether a distribution is skewed and whether there are potential
unusual observations (outliers) in the data set. Box and whisker plots are also
very useful when large numbers of observations are involved and when two or
more data sets are being compared.
Box and whisker plots are very effective and easy to read. They
summarize data from multiple sources and display the results in a single
graph. Box and whisker plots allow for comparison of data from different
categories for easier, more effective decision-making.
Use box and whisker plots when you have multiple data sets from
independent sources that are related to each other in some way.
Examples include test scores across schools or classrooms, data from
before and after a process change, similar features on one part such as
camshaft lobes, or data from duplicate machines manufacturing the same
products.
In a box and whisker plot, the ends of the box are the upper and lower
quartiles, so the box spans the interquartile range; the median is marked
by a line inside the box; and the whiskers are the two lines outside the
box that extend to the highest and lowest observations.
Steps In Making a Box Plot
1. Find the five-number summary of the data set: the minimum, the 25th
percentile (known as Q1), the 50th percentile (the median), the 75th
percentile (Q3), and the maximum. The minimum is the smallest value in
the data set, and the maximum is the largest value. Use the following
steps to find the kth percentile for k = 25, 50, and 75.
1.1 Order all the values in the data set from smallest to largest.
1.2 Multiply k percent by the total number of values in the data set, n.
The result is the index.
1.3 If the index from Step 1.2 is not a whole number, round it up to the
nearest whole number and go to Step a. If the index is a whole number, go
to Step b.
a. Count the values in your data set from left to right (from the smallest
to the largest value) until you reach the number indicated by Step 1.3. The
corresponding value in your data set is the kth percentile.
b. Count the values in your data set from left to right (smallest to largest)
until you reach the number indicated by Step 1.2. The kth percentile is the
average of that corresponding value in your data set and the value that
directly follows it.
2. Create a vertical (or horizontal) number line whose scale includes the
values in the five-number summary and is marked at equal intervals in
appropriate units.
3. Mark the location of each value in the five-number summary just above
the number line (for a horizontal boxplot) or just to the right of the number
line (for a vertical boxplot).
4. Draw a box around the marks for the 25th percentile and the 75th
percentile.
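5. Draw a line through the box at the median.
6. Check for outliers. Calculate the interquartile range (Q3 minus Q1) and
multiply it by 1.5. Any value more than that distance below Q1 or above Q3
is an outlier.
7. If there are no outliers, draw a line from each edge of the box to the
minimum and to the maximum.
8. If there are outliers (according to your results of Step 6), indicate their
location on the boxplot with * signs. Instead of drawing a line from the
edge of the box all the way to the most extreme outlier, stop the line at
the last data value that isn't an outlier.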
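The percentile-counting steps above can be sketched in Python. This is only an illustration under stated assumptions: the function names `kth_percentile` and `five_number_summary` and the sample values are hypothetical, k is assumed to be a whole number between 1 and 100, and statistical libraries often use other percentile conventions that give slightly different quartiles.

```python
def kth_percentile(values, k):
    """k-th percentile by the counting rule above (k is an integer, 1..100)."""
    if not values or not 0 < k <= 100:
        raise ValueError("need a non-empty data set and 0 < k <= 100")
    ordered = sorted(values)                # order from smallest to largest
    n = len(ordered)
    # index = k percent of n, kept in integers to avoid rounding error
    whole, remainder = divmod(k * n, 100)
    if remainder:
        # index is not a whole number: round up to whole + 1 and take that
        # value (lists are 0-based, so position whole + 1 is ordered[whole])
        return ordered[whole]
    if whole == n:
        # k = 100: there is no following value, so return the maximum
        return ordered[-1]
    # index is a whole number: average that value and the one after it
    return (ordered[whole - 1] + ordered[whole]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    return (
        min(values),
        kth_percentile(values, 25),
        kth_percentile(values, 50),
        kth_percentile(values, 75),
        max(values),
    )


# Hypothetical sample data, not taken from the original text
scores = [7, 3, 12, 9, 5, 15, 8, 10]
summary = five_number_summary(scores)
print(summary)  # (3, 6.0, 8.5, 11.0, 15)
```

With eight values, every index is a whole number, so each quartile is the average of two neighboring values. With ten values, 25 percent of 10 is 2.5, which rounds up to 3, so Q1 is simply the third value.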
1. Handling large data
Because it is built on the five-number summary, a box plot can present a
summary of a large amount of data. A box plot consists of the median,
which is the middle value of the ordered data; the lower and upper
quartiles, which are the values below which one quarter and three
quarters of the data fall; and the minimum and maximum data values.
Organizing data around these five values is an efficient way of dealing
with data sets that are too large for other graphs, such as line plots or
stem and leaf plots.
2. Summarizing
3. Outliers
A box plot is one of the few statistical graphs that show outliers
explicitly. A data set may contain one outlier or several, and they can
lie below the lower quartile or above the upper quartile. An outlier is
an unusual result that can be detected with the interquartile range: any
value more than 1.5 times the interquartile range below the lower
quartile or above the upper quartile is considered an outlier. Because
the whiskers stop at the most extreme values that are not outliers, the
outliers appear as separate points and are easy to spot on a box plot.
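As a rough sketch of the 1.5 times interquartile range rule, the following Python function separates outliers from the rest of the data and reports where the whiskers would end. The function name `split_outliers`, the sample values, and the quartiles passed in are all hypothetical. The quartiles are assumed to have been computed beforehand by whatever percentile method is in use.

```python
def split_outliers(values, q1, q3, factor=1.5):
    """Return (outliers, whisker_ends) for the given quartiles."""
    iqr = q3 - q1                      # interquartile range
    low_fence = q1 - factor * iqr      # values below this are outliers
    high_fence = q3 + factor * iqr     # values above this are outliers
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    # Whiskers stop at the last data values that are not outliers
    # (assumes at least one value lies inside the fences)
    whisker_ends = (min(inside), max(inside))
    return outliers, whisker_ends


# Hypothetical data and quartiles, for illustration only
values = [3, 5, 7, 8, 9, 10, 12, 15, 40]
outliers, whisker_ends = split_outliers(values, q1=6, q3=12)
print(outliers)      # [40]
print(whisker_ends)  # (3, 15)
```

Here the interquartile range is 6, so the fences sit at -3 and 21. The value 40 falls outside the upper fence and would be drawn as a separate mark, while the upper whisker stops at 15.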
The drawback of summarizing such large amounts of data in a box plot is
that the exact values and details of the distribution are not retained.
A box plot shows only a simple summary of the distribution, so that it
can be quickly viewed and compared with other data. For a more detailed
analysis, a box plot should be used in combination with another
statistical graph, such as a histogram.
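To illustrate how detail is lost, the sketch below builds two different data sets that share the same five-number summary and would therefore produce identical box plots. The data sets and the helper `five_number_summary` are hypothetical. The helper uses a compact version of the round-up-or-average percentile rule, which is one convention among several.

```python
def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) by a counting rule."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(k):
        whole, remainder = divmod(k * n, 100)
        if remainder:                  # index not whole: round up
            return ordered[whole]
        # index whole: average that value and the next one
        return (ordered[whole - 1] + ordered[whole]) / 2

    return (ordered[0], percentile(25), percentile(50),
            percentile(75), ordered[-1])


# Two hypothetical data sets with different shapes
spread_out = [1, 3, 4, 4, 5, 6, 6, 7, 9]
clustered = [1, 1, 4, 5, 5, 5, 6, 9, 9]

print(five_number_summary(spread_out))  # (1, 4, 5, 6, 9)
print(five_number_summary(clustered))   # (1, 4, 5, 6, 9)
```

The second data set piles values at the extremes and at the median, yet its summary matches the first. A histogram of each would show the difference that the box plot hides.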