
INFERENTIAL STATISTICS

Descriptive Statistics Concepts & Exercises

LECTURE 2

MS Project Management @ CIIT Islamabad

Descriptive Statistics

Descriptive statistics is the use of numerical information to summarize, simplify, and present masses of data. Data may come from studies of populations (often called a census study) or from samples. Summary measures are used, typically measures of central tendency and spread. The methods apply to both ungrouped and grouped data.

Data properties

The shape, the mean, and the variability (standard deviation)

Shapes

Example - Ungrouped and Grouped Data

A company records the excessive length (in mm) of a part that its workers make for customers. The variable is excessive length. Are the data interval or ratio level? Are the data gathered in discrete or continuous form? How would you make sense of the data and present them to a manager? All observations are dimensions of objects in mm:

5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25, 33, 34, 36, 37, 42, 43

Stem and Leaf Plot - Ungrouped Data

Organizing data using numbers

Stem | Leaves
0    | 55
1    | 2344
2    | 01113345
3    | 3467
4    | 23
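The plot above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `stem_and_leaf` is hypothetical, and it assumes non-negative whole-number data, with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

# Excessive-length observations (mm) from the example
data = [5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25,
        33, 34, 36, 37, 42, 43]

def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its leaves (units digits)."""
    plot = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 37 -> stem 3, leaf 7
        plot[stem] += str(leaf)
    return dict(plot)

plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(stem, "|", leaves)
```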

Box and Whisker Plot - Both

Suggests a symmetric (approximately normal) distribution if the quartiles and whiskers are equally spaced.
Positive skew if Q2 is closer to Q1 and the right whisker is longer than the left one.
Negative skew if Q2 is closer to Q3 and the right whisker is shorter than the left one.

[Figure: box-and-whisker plot marking Min. value, Q1, Q2, Q3, and Max. value, with the IQR spanning Q1 to Q3]

Histogram - Grouped Data

A histogram is simply a bar-chart depicting a frequency distribution. The bigger the bar, the more frequent the level.

Groups     Freq
0 - 10     2
10 - 20    4
20 - 30    8
30 - 40    4
40 - 50    2

[Figure: histogram of the frequencies above; horizontal axis Groups (0 to 50), vertical axis frequency (0 to 8)]
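The frequency distribution behind the histogram can be tallied from the raw observations. The sketch below assumes a class width of 10 mm and that a value falling exactly on a class boundary (such as 20) is counted in the upper class, which matches the frequencies in the table; the text bars are only a stand-in for a real chart.

```python
# Excessive-length observations (mm) from the example
data = [5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25,
        33, 34, 36, 37, 42, 43]

width = 10  # class width in mm (assumed from the table)

# Count the values in each class [lo, lo + width); the lower boundary is
# inclusive, so a boundary value goes to the upper class.
freq = {(lo, lo + width): sum(lo <= x < lo + width for x in data)
        for lo in range(0, 50, width)}

# Crude text histogram: one '#' per observation
for (lo, hi), f in freq.items():
    print(f"{lo:>2} - {hi:<2} {'#' * f} ({f})")
```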

Central Tendency

MEAN -- average
MEDIAN -- middle value
MODE -- most frequently observed value(s)
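As a quick illustration, the three measures can be computed for the 20 ungrouped excessive-length observations with Python's standard `statistics` module. This is a sketch, not part of the lecture; `multimode` needs Python 3.8 or later.

```python
import statistics

# Excessive-length observations (mm) from the example
data = [5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25,
        33, 34, 36, 37, 42, 43]

mean = statistics.mean(data)        # sum of values / number of values
median = statistics.median(data)    # middle value of the sorted data
modes = statistics.multimode(data)  # all most frequently observed values

print(mean, median, modes)
```

Note that the mean of the raw values differs slightly from the mean estimated from grouped data, because grouping replaces each observation with its class midpoint.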

Population Mean

Definition: For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values. To compute the population mean, use the following formula.
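$$\mu = \frac{\sum X}{N}$$

where μ (mu) is the population mean, Σ (sigma) denotes summation, X is an individual value, and N is the population size.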

The Sample Mean

Definition: For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values. To compute the sample mean, use the following formula.
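$$\bar{X} = \frac{\sum X}{n}$$

where X̄ (X-bar) is the sample mean, Σ (sigma) denotes summation, X is an individual value, and n is the sample size.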

The Mean Of Grouped Data

The mean of a sample of data organized in a frequency distribution is computed by the following formula:
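$$\bar{X} = \frac{\sum Xf}{n} = \frac{\sum Xf}{\sum f}$$

where X̄ (X-bar) is the sample mean, X is the value (class midpoint) of each class, f is the class frequency, and n = Σf is the sum of frequencies, i.e. the sample size.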

Mean - An Example

Groups     f     X     fX
0 - 10     2     5     10
10 - 20    4     15    60
20 - 30    8     25    200
30 - 40    4     35    140
40 - 50    2     45    90
Total      20          500

X̄ = ΣfX / n = ΣfX / Σf = 500/20 = 25 mm length
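The grouped-mean calculation can be checked with a short script. This sketch assumes, as in the table, that each class is represented by its midpoint X; the variable names are illustrative.

```python
# Frequency table from the example: class limits (mm) and frequencies
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [2, 4, 8, 4, 2]

# Class midpoints X: 5, 15, 25, 35, 45
midpoints = [(lo + hi) / 2 for lo, hi in classes]

total_fx = sum(f * x for f, x in zip(freqs, midpoints))  # sum of fX
n = sum(freqs)                                           # sum of f

grouped_mean = total_fx / n
print(total_fx, n, grouped_mean)
```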

Median

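The most central value in arranged data. Essentially, 50% of the observations lie above the median and 50% below it. In ungrouped data:

For an even number of values n, the median is the average of the values at positions n/2 and n/2 + 1. For an odd number of values, the median is the value at position (n + 1)/2.

In grouped data, the median can be computed using the formula

$$\text{Median} = L + \frac{h}{f}\left(\frac{\sum f}{2} - C\right)$$

where L is the lower boundary of the median class, h is the class width, f is the frequency of the median class, Σf is the total frequency, and C is the cumulative frequency of the class before the median class.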

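Median - An Example

Groups     f     C
0 - 10     2     2
10 - 20    4     6
20 - 30    8     14
30 - 40    4     18
40 - 50    2     20
Total      20

Σf/2 = 20/2 = 10, so the median class is 20 - 30, the first class whose cumulative frequency C reaches 10.

Median = L + (h/f)(Σf/2 − C)
Median = 20 + (10/8)(10 − 6)
Median = 25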
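The steps of the grouped-median calculation (cumulate the frequencies, locate the median class, interpolate within it) can be sketched in Python. The variable names are illustrative, and the sketch assumes equal class widths.

```python
from itertools import accumulate

lower_bounds = [0, 10, 20, 30, 40]   # lower boundary L of each class
freqs = [2, 4, 8, 4, 2]              # class frequencies f
h = 10                               # class width

cum = list(accumulate(freqs))        # cumulative frequencies: 2, 6, 14, 18, 20
half = cum[-1] / 2                   # sum of f divided by 2

# Median class: the first class whose cumulative frequency reaches half
i = next(k for k, c in enumerate(cum) if c >= half)

C = cum[i - 1] if i > 0 else 0       # cumulative frequency before that class
median = lower_bounds[i] + h / freqs[i] * (half - C)
print(median)
```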

Quartiles - The Partition Values

$$Q_1 = L + \frac{h}{f}\left(\frac{\sum f}{4} - C\right), \qquad Q_2 = \text{Median} = 25, \qquad Q_3 = L + \frac{h}{f}\left(\frac{3\sum f}{4} - C\right)$$

Groups     f     C
0 - 10     2     2
10 - 20    4     6
20 - 30    8     14
30 - 40    4     18
40 - 50    2     20
Total      20

Σf/4 = 5, so Q1 = 10 + (10/4)(5 − 2) = 17.5
3Σf/4 = 15, so Q3 = 30 + (10/4)(15 − 14) = 32.5
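Since Q1, Q2, and Q3 share one interpolation formula and differ only in the fraction of Σf they target, a single function can serve all three. The function name `grouped_quantile` is hypothetical, and the sketch assumes equal class widths and 0 < p ≤ 1.

```python
from itertools import accumulate

def grouped_quantile(p, lower_bounds, freqs, h):
    """Interpolated partition value L + (h/f)(p * sum(f) - C) for grouped data."""
    cum = list(accumulate(freqs))
    target = p * cum[-1]                       # e.g. sum(f)/4 for Q1
    # Class containing the target position
    i = next(k for k, c in enumerate(cum) if c >= target)
    C = cum[i - 1] if i > 0 else 0             # cumulative frequency before it
    return lower_bounds[i] + h / freqs[i] * (target - C)

lower_bounds = [0, 10, 20, 30, 40]
freqs = [2, 4, 8, 4, 2]

q1 = grouped_quantile(0.25, lower_bounds, freqs, 10)
q2 = grouped_quantile(0.50, lower_bounds, freqs, 10)
q3 = grouped_quantile(0.75, lower_bounds, freqs, 10)
iqr = q3 - q1                                  # interquartile range
print(q1, q2, q3, iqr)
```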

Dispersion

RANGE -- highest to lowest values
STANDARD DEVIATION -- how closely values cluster around the mean value
SKEWNESS -- refers to symmetry of curve
KURTOSIS -- refers to the peak of the curve

Range

Range: For ungrouped data, the range is the difference between the highest and lowest values in a set of data. To compute the range, use the following formula.

RANGE = HIGHEST VALUE - LOWEST VALUE

Example: For the given data on excessive length, the lowest value is 5 and the highest value is 43. Hence, Range = 43 - 5 = 38.

Variance

Population Variance: The population variance for ungrouped data is the arithmetic mean of the squared deviations from the population mean. It is computed from the formula below:
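$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where σ² (sigma squared) is the population variance, X is an individual value, μ is the population mean, and N is the population size.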
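To see the definition at work on ungrouped data, the sketch below applies it directly to the 20 raw excessive-length observations, treating them as a complete population (an assumption made only for illustration), and compares the result with `statistics.pvariance` from the standard library.

```python
import statistics

# Excessive-length observations (mm), treated here as a whole population
data = [5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25,
        33, 34, 36, 37, 42, 43]

N = len(data)
mu = sum(data) / N                               # population mean

# Arithmetic mean of the squared deviations from the population mean
variance = sum((x - mu) ** 2 for x in data) / N

print(mu, variance, statistics.pvariance(data))
```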

Variance - An Example

Groups     f     X     fX     (X − μ)    (X − μ)²    f(X − μ)²
0 - 10     2     5     10     -20        400         800
10 - 20    4     15    60     -10        100         400
20 - 30    8     25    200    0          0           0
30 - 40    4     35    140    10         100         400
40 - 50    2     45    90     20         400         800
Total      20          500                           2400

μ = ΣfX/N = 500/20 = 25.

Because the data are grouped, each squared deviation is weighted by its class frequency:

σ² = Σf(X − μ)²/N = 2400/20 = 120.

Standard Deviation

Population Standard Deviation: The population standard deviation (σ) is the square root of the population variance. For the previous example, the population standard deviation is σ = 10.95 (square root of 120).

Note: If you are given the population standard deviation, just square that number to get the population variance. Standard deviation is never negative; it equals 0 only when all values are identical.
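For grouped data, each squared deviation of a class midpoint has to be weighted by that class's frequency before dividing by N. The sketch below, which assumes the frequency table of the excessive-length example and uses illustrative variable names, computes the variance σ² = Σf(X − μ)²/N and its square root.

```python
import math

midpoints = [5, 15, 25, 35, 45]   # class midpoints X
freqs = [2, 4, 8, 4, 2]           # class frequencies f

N = sum(freqs)
mu = sum(f * x for f, x in zip(freqs, midpoints)) / N        # grouped mean

# Frequency-weighted sum of squared deviations from the mean
weighted_ss = sum(f * (x - mu) ** 2 for f, x in zip(freqs, midpoints))

variance = weighted_ss / N
std_dev = math.sqrt(variance)
print(mu, weighted_ss, variance, round(std_dev, 2))
```

The result is close to the variance of the raw, ungrouped observations, which is a useful sanity check on the grouped figure.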

Empirical Rule for Symmetry


For any symmetrical, bell-shaped distribution, approximately 68% of the observations will lie within ±1σ of the mean (μ); approximately 95% within ±2σ of the mean (μ); and approximately 99.7% within ±3σ of the mean (μ).

Between:
1. μ − 1σ and μ + 1σ: 68.26%
2. μ − 2σ and μ + 2σ: 95.44%
3. μ − 3σ and μ + 3σ: 99.73%

[Figure: bell curve with the horizontal axis marked at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ]
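The percentages in the empirical rule come from the normal distribution and can be checked with `statistics.NormalDist` (Python 3.8 or later). This is a sketch for a standard normal curve; the printed values may differ from textbook figures such as 68.26% and 95.44% in the last digit, because those are typically read from rounded z-tables.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.2%}")
```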

Coefficient of Variation

Coefficient of Variation: The ratio of the standard deviation to the arithmetic mean, expressed as a percentage. It is a measure of relative dispersion.

$$CV = \frac{s}{\bar{X}} \times 100\%$$

where s is the sample standard deviation and X̄ is the sample mean.
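As an illustration, the coefficient of variation can be computed for the 20 excessive-length observations, treated here as a sample (an assumption), so that s is the sample standard deviation with an n − 1 divisor as returned by `statistics.stdev`.

```python
import statistics

# Excessive-length observations (mm), treated here as a sample
data = [5, 5, 12, 13, 14, 14, 20, 21, 21, 21, 23, 23, 24, 25,
        33, 34, 36, 37, 42, 43]

x_bar = statistics.mean(data)   # sample mean
s = statistics.stdev(data)      # sample standard deviation (n - 1 divisor)

cv = s / x_bar * 100            # coefficient of variation, in percent
print(f"CV = {cv:.1f}%")
```

Because the CV is a ratio, it has no units, which is what makes it suitable for comparing the relative dispersion of data sets measured on different scales.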

Practical Exercise
