
SMARTLY

ONE-VARIABLE STATISTICS

MEASURING THE CENTER

Averages, Quartiles, and the Five Number Summary

Descriptive statistics: A way to quickly summarize data within a
set using just a few numbers.
Mean: The average of a set calculated by adding all the values in
the set and dividing by the number of values in the set.
Outlier: A value or values significantly higher or lower than the rest
of the set that can skew the mean of a set.
The mean is very sensitive to outliers, while the median is not.
Median: The middle value in a data set.
Mode: The value that appears most often in the set.

When a set has two modes, it is called bimodal. When it has
more than two modes, it is multimodal.
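The definitions above can be tried out with Python's standard `statistics` module. This is a minimal sketch: the numbers are hypothetical and chosen only for illustration, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Hypothetical data set, chosen only for illustration.
values = [2, 3, 3, 5, 7]

center = {
    "mean": mean(values),        # (2 + 3 + 3 + 5 + 7) / 5
    "median": median(values),    # middle value of the sorted data
    "modes": multimode(values),  # every value tied for most frequent
}

# Adding a single outlier moves the mean a lot and the median only a little.
with_outlier = values + [100]
shifted = {
    "mean": mean(with_outlier),
    "median": median(with_outlier),
}

# Two values tied for most frequent make this set bimodal.
bimodal = multimode([1, 1, 2, 4, 4])

print(center)
print(shifted)
print(bimodal)
```

In this sketch the outlier raises the mean from 4 to 20, while the median only moves from 3 to 4.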

Standard deviation: A measurement of the amount of variation
from the mean in a data set.

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

σ = standard deviation
x_i = the value of the i-th observation
N = number of data points
µ = mean of data values

For example, if a data set has a mean of 50 units and a standard
deviation of 20 units, we can expect most of the data to fall
between 30 and 70 units, that is, within one standard deviation
of the mean.
Five number summary: The minimum, first quartile, median, third
quartile, and maximum of a data set.
The quartiles split the data into four parts, each holding 25% of
the values in the set.
The first and third quartiles can be found by identifying the
medians of the lower and upper halves of the data.
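As a rough illustration of the standard deviation formula given earlier, the sketch below computes it by hand and compares the result with `statistics.pstdev` from the standard library. The data set and the helper name `population_std` are hypothetical, introduced only for this example.

```python
import math
import statistics

def population_std(values):
    """Population standard deviation: sqrt((1/N) * sum((x_i - mu)^2))."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = statistics.fmean(data)
sigma = population_std(data)

# Count how many values lie within one standard deviation of the mean.
within_one_sigma = sum(1 for x in data if abs(x - mu) <= sigma)

print(mu, sigma, within_one_sigma)
print(math.isclose(sigma, statistics.pstdev(data)))
```

Here the mean is 5 and the standard deviation is 2, and 6 of the 8 values fall between 3 and 7.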

Range: The distance between the maximum and minimum.


Interquartile range (IQR): The distance between the third and first
quartiles.
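The five number summary, range, and IQR can be sketched as follows. This assumes the "median of each half" method described above and, for an odd number of values, leaves the middle value out of both halves. That is one common convention, and other quartile conventions give slightly different results. The function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (Q0, Q1, Q2, Q3, Q4) using the median-of-halves method.

    Assumes at least two values. When the count is odd, the middle
    value is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values before the median position
    upper = data[half + n % 2:]  # values after it
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data set, chosen only for illustration.
data = [1, 3, 5, 7, 9, 11, 13]

q0, q1, q2, q3, q4 = five_number_summary(data)
value_range = q4 - q0  # distance between the maximum and minimum
iqr = q3 - q1          # distance between the third and first quartiles

print(q0, q1, q2, q3, q4, value_range, iqr)
```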

[Figure: Number line marking Q0 (min), Q1, Q2 (median), Q3, and Q4 (max)]

©2017 Pedago, LLC. All rights reserved.



Graphical Organization

Boxplot: A graph representing the five number summary.
The boxed area represents the IQR, with the median marked by a
line inside the box.
If Q1 and the minimum are the same value, you won't see a tail
on the left side. If Q3 and the maximum are the same, there
will be no tail on the right.
[Figure: Boxplot over an axis from 20 to 60, marking Q0 (min), Q1, Q2 (median), Q3, and Q4 (max)]
Frequency distribution: A table that sorts data into equally-sized
classes.

Ages of Mobile Phone Customers

Class        Frequency  Cumulative Frequency  Relative Frequency  Cumulative Relative Frequency
20 ≤ X < 30         17                    17              34.00%                         34.00%
30 ≤ X < 40         16                    33              32.00%                         66.00%
40 ≤ X < 50         12                    45              24.00%                         90.00%
50 ≤ X < 60          4                    49               8.00%                         98.00%
60 ≤ X < 70          1                    50               2.00%                        100.00%
Total               50                                   100.00%

Frequency: The number of data points that fall into each class.
Cumulative frequency: The running total of the frequencies.
Relative frequency: The frequency divided by the total number of
data points.
Cumulative relative frequency: The running total of the relative
frequencies.
Histogram: A frequency distribution shown in graph form.
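The four frequency columns can be derived from the class counts alone. The sketch below uses the counts from the "Ages of Mobile Phone Customers" table and ends with a crude text histogram. The class labels and output layout are illustrative choices, not part of the original sheet.

```python
from itertools import accumulate

# Class counts taken from the "Ages of Mobile Phone Customers" table.
classes = ["20-30", "30-40", "40-50", "50-60", "60-70"]
frequency = [17, 16, 12, 4, 1]

total = sum(frequency)
cumulative = list(accumulate(frequency))               # running total
relative = [f / total for f in frequency]              # share of all points
cumulative_relative = [c / total for c in cumulative]  # running share

for label, f, cf, rf, crf in zip(
    classes, frequency, cumulative, relative, cumulative_relative
):
    print(f"{label:>7} {f:>4} {cf:>4} {rf:>8.2%} {crf:>8.2%}")

# A histogram shows the same frequencies as bars; here, one '#' per point.
for label, f in zip(classes, frequency):
    print(f"{label:>7} | {'#' * f}")
```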

Positive skew (right skew): When a few high values pull the tail of
a chart to the right. In a histogram with a positive skew, the mean
is greater than the median.
Negative skew (left skew): When a few low values pull the tail of
a chart to the left. In a histogram with a negative skew, the median
is greater than the mean.
[Figure: Two histograms, one with positive skew and one with negative skew; horizontal axis from $35 to $125 in $10 steps, vertical axis from 5 to 35]
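The relationship between skew, mean, and median can be checked on small examples. The two price lists below are hypothetical and are not the data behind the histograms on the sheet. They are only meant to show a long right tail and a long left tail.

```python
from statistics import mean, median

# Hypothetical prices with one large value stretching the right tail.
right_skewed = [35, 35, 45, 45, 45, 55, 55, 65, 125]

# Hypothetical prices with one small value stretching the left tail.
left_skewed = [35, 95, 105, 105, 115, 115, 115, 125, 125]

right_mean, right_median = mean(right_skewed), median(right_skewed)
left_mean, left_median = mean(left_skewed), median(left_skewed)

print(right_mean > right_median)  # positive skew: mean above median
print(left_median > left_mean)    # negative skew: median above mean
```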

