
MBA Quantitative Analysis

Topic 1 (Statistics): Presentation and Analysis of Data

Issues to Learn and Practice


The need for studying statistics

to present and describe information properly
to draw conclusions about populations based on sample information
to improve processes
to obtain reliable forecasts

Sources of data

secondary or published sources


primary sources
surveys (sample or census surveys)
observations

Types of data

Categorical
Numerical
discrete

continuous

Sampling methods

Non-probability sampling
judgment

quota

chunk

Probability sampling
simple random

systematic

stratified

cluster

Presentation of data

The ordered array


The stem-leaf display

Tables and charts

Frequency distribution
Relative or percentage frequency distribution
Bar diagram
Histogram
Polygon
Cumulative frequency polygon

Presentation of Data
The Ordered Array: The ordered array consists of an ordered sequence of raw
data, in rank order from the smallest observation to the largest.
The Stem-and-Leaf Diagram: A stem-and-leaf display separates data entries
into leading digits, or stems, and trailing digits, or leaves. The stem-and-leaf
display is a valuable tool for organizing a set of data and understanding how the
values distribute and cluster over the range of the data set.
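As a minimal sketch of the two displays just described, the following Python snippet sorts a small data set into an ordered array and then splits each value into a stem (tens digit) and a leaf (units digit). The data values and the function name stem_and_leaf are hypothetical and chosen only for illustration; the sketch assumes non-negative integer observations.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted list of leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):  # sorting first yields the ordered array
        stems[v // 10].append(v % 10)
    return dict(stems)


# Hypothetical raw data, not taken from the notes.
raw = [47, 12, 35, 41, 18, 33, 29, 45, 38, 12]

ordered = sorted(raw)          # the ordered array, smallest to largest
display = stem_and_leaf(raw)   # the stem-and-leaf grouping

print(ordered)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Each printed row shows one stem followed by its leaves, so the length of a row indicates how many observations fall in that range of ten.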
Frequency: The number of times a particular value or observation occurs.
Frequency Distribution: A frequency distribution is a summary table in which
the data are arranged into conveniently established, numerically ordered class
groupings or categories.
Relative Frequency Distribution: A relative frequency distribution is formed
by dividing the frequencies in each class of the frequency distribution by the
total number of observations.
Percentage Distribution: A percentage distribution is formed by multiplying
each relative frequency or proportion by 100.
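The three definitions above build on one another, which a short Python sketch can make concrete: count each value, divide by the total number of observations, then multiply by 100. The grade data and the function name frequency_tables are hypothetical, introduced here only as an example.

```python
from collections import Counter


def frequency_tables(observations):
    """Return {value: (frequency, relative frequency, percentage)}."""
    counts = Counter(observations)
    total = sum(counts.values())
    table = {}
    for value in sorted(counts):
        freq = counts[value]
        rel = freq / total               # relative frequency (proportion)
        table[value] = (freq, rel, rel * 100)  # percentage = proportion * 100
    return table


# Hypothetical categorical data, not taken from the notes.
grades = ["B", "A", "C", "B", "B", "A", "C", "B"]
table = frequency_tables(grades)

for value, (freq, rel, pct) in table.items():
    print(f"{value}: frequency={freq}, relative={rel:.3f}, percent={pct:.1f}")
```

The relative frequencies always sum to 1 and the percentages to 100, which is a quick check on any hand-built table.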
The Histogram
The histogram is a summary graph showing a count of the data points falling in
various ranges. The effect is a rough approximation of the frequency distribution
of the data.
The groups of data are called classes, and in the context of a histogram they are
known as bins, because one can think of them as containers that accumulate
data and "fill up" at a rate equal to the frequency of that data class.
Consider the exam scores of a group of students. By defining data classes each
spanning an interval of 10 points and counting the number of scores in each
data class, a frequency table can be constructed as in the following example:
Frequency Table

Group      Count
0 - 9
10 - 19
20 - 29
30 - 39
40 - 49
50 - 59
60 - 69
70 - 79
80 - 89
90 - 99

To construct the histogram, groups are plotted on the x axis and their
frequencies on the y axis. The following is a histogram of the data in the above
frequency table.
Histogram
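The construction described above (classes of 10 points, one count per class) can be sketched in Python as follows. The exam scores are hypothetical, since the notes do not list the raw data, and the function name class_counts is an assumption of this sketch. A row of asterisks stands in for each histogram bar.

```python
def class_counts(scores, width=10, low=0, high=99):
    """Count integer scores in consecutive classes of the given width."""
    n_classes = (high - low) // width + 1
    counts = [0] * n_classes
    for s in scores:
        if not low <= s <= high:
            raise ValueError(f"score {s} is outside {low}-{high}")
        counts[(s - low) // width] += 1
    labels = [
        f"{low + i * width}-{low + i * width + width - 1}"
        for i in range(n_classes)
    ]
    return list(zip(labels, counts))


# Hypothetical exam scores, not taken from the notes.
scores = [55, 62, 67, 71, 74, 78, 78, 83, 85, 91, 48, 66]
table = class_counts(scores)

# Text histogram: one asterisk per observation in the class.
for label, count in table:
    print(f"{label:>5} | {'*' * count}")
```

Turned on its side, the printout has the same shape as the histogram: classes along one axis and frequencies along the other.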

Information Conveyed by Histograms


Histograms are useful data summaries that convey the following information:

The general shape of the frequency distribution (normal, chi-square, etc.)


Symmetry of the distribution and whether it is skewed
Modality - unimodal, bimodal, or multimodal

The histogram of the frequency distribution can be converted to a probability
distribution by dividing the tally in each group by the total number of data points
to give the relative frequency.
The shape of the distribution conveys important information such as the
probability distribution of the data. In cases in which the distribution is known, a
histogram that does not fit the distribution may provide clues about a process
or measurement problem. For example, a histogram that shows a higher than
normal frequency in bins near one end and then a sharp drop-off may indicate
that the observer is "helping" the results by classifying extreme data in the less
extreme group.
Bin Width
The shape of the histogram sometimes is particularly sensitive to the number of
bins. If the bins are too wide, important information might get omitted. For
example, the data may be bimodal but this characteristic may not be evident if
the bins are too wide. On the other hand, if the bins are too narrow, what may
appear to be meaningful information really may be due to random variations
that show up because of the small number of data points in a bin. To determine
whether the bin width is set to an appropriate size, different bin widths should be
used and the results compared to determine the sensitivity of the histogram
shape with respect to bin size. Bin widths typically are selected so that there are
between 5 and 20 groups of data, but the appropriate number depends on the
situation.
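The sensitivity check recommended above can be sketched by binning the same data with several widths and comparing the counts. The data set below is hypothetical and deliberately bimodal; the function name histogram_counts and the choice of widths are assumptions made for illustration.

```python
def histogram_counts(data, width, start, stop):
    """Count values in bins [start, start + width), ... covering [start, stop)."""
    n_bins = (stop - start + width - 1) // width  # ceiling division
    counts = [0] * n_bins
    for x in data:
        if not start <= x < stop:
            raise ValueError(f"value {x} is outside [{start}, {stop})")
        counts[(x - start) // width] += 1
    return counts


# Hypothetical bimodal data: one cluster in the 20s, another in the 40s.
data = [21, 22, 23, 24, 25, 26, 41, 42, 43, 44, 45, 46]

too_wide = histogram_counts(data, 30, 20, 50)    # one bin hides both modes
moderate = histogram_counts(data, 10, 20, 50)    # two peaks with a gap between
too_narrow = histogram_counts(data, 2, 20, 50)   # many bins with only 0-2 points

for name, counts in [("width 30", too_wide),
                     ("width 10", moderate),
                     ("width 2", too_narrow)]:
    print(name, counts)
```

With the widest bins the two clusters merge into a single bar, while the narrowest bins spread twelve points over fifteen bars, so small bumps in that version carry little meaning.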
Bar Diagram
A bar diagram, also called a bar chart, is a graph consisting of parallel,
usually vertical, bars or rectangles whose lengths are proportional to the
frequency with which specified quantities occur in a set of data.

Frequency Polygon
A frequency polygon is a graphical display of a frequency table. The intervals are
shown on the X-axis and the number of scores in each interval is represented by
the height of a point located above the middle of the interval. The points are
connected so that together with the X-axis they form a polygon.
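A minimal sketch of how the plotted points could be computed, using the interval limits and counts from the response-time table in this section. Each point sits above the midpoint of its interval. Adding one zero-height point a full interval before the first midpoint and after the last, so that the line meets the X-axis, is a common convention assumed here rather than something the notes specify. The function name polygon_points is hypothetical.

```python
def polygon_points(lower_limits, width, counts):
    """Return (x, y) points of a frequency polygon, closed on the X-axis."""
    mids = [lo + width / 2 for lo in lower_limits]  # interval midpoints
    points = list(zip(mids, counts))
    # Assumed convention: anchor both ends at zero height, one interval out.
    return [(mids[0] - width, 0)] + points + [(mids[-1] + width, 0)]


# Interval lower limits and counts from the response-time table.
lower_limits = [25, 30, 35, 40, 45, 50]
counts = [1, 4, 8, 15, 3, 1]

points = polygon_points(lower_limits, 5, counts)
for x, y in points:
    print(f"x = {x:5.1f}, count = {y}")
```

Joining these points in order with straight lines gives the polygon; dividing each count by the total would give the relative frequency version instead.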
A frequency table and a relative frequency polygon for response times in a study
on weapons and aggression are shown below. The times are in hundredths of a
second.
Lower   Upper            Cumulative   Per      Cumulative
Limit   Limit   Count    Count        Cent     Per Cent
25      30      1        1            3.12     3.12
30      35      4        5            12.48    15.62
35      40      8        13           24.96    40.62
40      45      15       28           46.80    87.50
45      50      3        31           9.36     96.88
50      55      1        32           3.12     100.00

Note: Values in each category are > the lower limit and ≤ the upper limit.
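The cumulative and percentage columns of the table follow directly from the Count column, as this short Python sketch shows. Only the six counts are taken from the table; the variable names are assumptions. Computed exactly, the per cent values differ slightly in the second decimal from the printed Per Cent column (for example 12.50 rather than 12.48), which appears to have been built from multiples of the truncated value 3.12.

```python
from itertools import accumulate

# Counts per interval, taken from the response-time table.
counts = [1, 4, 8, 15, 3, 1]
total = sum(counts)

cumulative = list(accumulate(counts))                # running total of counts
percent = [100 * c / total for c in counts]          # share of each interval
cumulative_percent = [100 * c / total for c in cumulative]

for row in zip(counts, cumulative, percent, cumulative_percent):
    print("{:>3} {:>3} {:>7.3f} {:>8.3f}".format(*row))
```

The last cumulative count always equals the total number of observations, and the last cumulative per cent is always 100.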
Frequency polygons are useful for comparing distributions. This is achieved by
overlaying the frequency polygons drawn for different data sets. The figure
below provides an example. The data come from a task in which the goal is to
move a computer mouse to a target on the screen as fast as possible. On 20 of
the trials, the target was a small rectangle; on the other 20, the target was a large
rectangle. Time to reach the target was recorded on each trial. The two
distributions (one for each target) are plotted together. The figure shows that
although there is some overlap in times, it generally took longer to move the
mouse to the small target than to the large one.
