
Welcome to Business Statistics Lecture 1 & 2

Contents: Basic Statistical Concepts; Summarisation of Data; Frequency Distribution; Measures of Central Tendency; Measures of Dispersion; Relative Dispersion; Skewness.

Using Statistics.

Malcolm Forbes, a businessman and keen hot-air balloon enthusiast, lost his way and landed in the middle of a cornfield. He saw a man running towards him and had the following conversation:

Forbes: Sir, can you tell me where I am?
Man: Certainly, you are in a basket in a field of corn.
Forbes: Sir, you must be a statistician.
Man: That's amazing. How did you know?
Forbes: Easy. Your information is concise, precise and absolutely useless!

A GOOD STUDENT of Statistics should ensure that the information resulting from a good statistical analysis is always CONCISE, often PRECISE and never USELESS.

Types of Variables
A. Qualitative or attribute variable: the characteristic being studied is generally nonnumeric.
EXAMPLES: Gender, religious affiliation, type of automobile owned, state of birth, eye color.

B. Qualitative variables can also be described by numbers, although the description might be arbitrary.
EXAMPLES: Car registration number, state of birth coded as 1, 2, 3, 4, etc.

C. Quantitative variable: can be described by a number for which arithmetic operations such as averaging make sense.
EXAMPLES: Balance in your mobile account, minutes remaining in class, or number of children in a family.

D. A quantitative variable can be either discrete or continuous.

Summary of Types of Variables

Four Scales of Measurement (weakest 1, strongest 4)

1 Nominal scale - data that are classified into categories and cannot be arranged in any particular order. Numbers are just labels for groups or classes. Nominal stands for NAME.
EXAMPLES: Eye color, gender, religious affiliation, platform number.

2 Ordinal scale - data arranged in some order according to their relative size or quality. The differences between data values cannot be determined or are meaningless. We know one is better than the other, but how much better is not known.
EXAMPLE: During a taste test of 4 soft drinks, Coca Cola was ranked number 1, Sprite number 2, Seven Up number 3, and Orange Mirinda number 4.

3 Interval scale - similar to the ordinal scale, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.
EXAMPLE: Time of day. 10:00 a.m. is not twice 5:00 a.m., but the interval between 00:00 and 10:00 a.m. is twice the interval between 00:00 and 5:00 a.m.

4 Ratio scale - the interval scale with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
EXAMPLES: Monthly income of surgeons, or distance traveled by manufacturers' representatives per month.

Population versus Sample


A population is a collection of all possible individuals, objects, or measurements of interest. The population is also called the UNIVERSE. Greek letters, such as μ or σ, are used for population measures, which are termed population parameters.

A sample is a portion, or part, or subset of measurements selected from the population of interest. Roman letters, such as x̄ and s, are used for sample measures, which are termed sample statistics.

Types of Statistics: Descriptive Statistics

Data and Data Collection: A set of measurements obtained on some variable is called a data set.

Descriptive Statistics: methods of organizing, summarizing, and presenting data in an informative way. When the entire population is considered, tabulating and presenting the data is generally a challenge.

Inferential Statistics: A decision, estimate, prediction, or generalization about a population, based on a sample.

Problems to be solved.

Percentiles & Quartiles.

Measures of Central Tendency: Mean (Arithmetic, Geometric, Harmonic). Mean for individual, discrete, continuous distribution. Mean from assumed mean. Median for individual, discrete, continuous distribution. Mode for individual, discrete, continuous distribution.

Measures of Dispersion: Range. Mean Deviation. Standard Deviation. Coefficient of Variation. Combined Standard Deviation.

Skewness: Test for Skewness.

Requisites of a Good Measure of Central Tendency

It should be rigidly defined, which means that it should be calculated and interpreted in the same way by everyone.
It should be based on all values of the data.
It should not be unduly affected by extreme values.
It should be amenable to further algebraic treatment.
It should be amenable to sampling, by which we mean that the results obtained from various samples should be similar.
It should be simple to compute.

Some measures of Central Tendency.


Arithmetic Mean: It is a mathematical average, obtained by dividing the sum of the observations by the number of observations.
Median: It is the VALUE of the middle observation of the ordered array and is a positional average.
Quartiles, Deciles, Percentiles: These are also positional averages; they divide the series into four parts, ten parts and 100 parts respectively.
Mode: The mode is the value of the data that occurs most frequently.
Geometric Mean: It is a specialised average, applicable when the quantities to be averaged follow an exponential law of growth or decline.
Harmonic Mean: The harmonic mean is used to average rates.
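The averages listed above can be tried out with Python's standard `statistics` module. This is only a minimal sketch: the data values, growth factors and speeds are hypothetical and not taken from the lecture, and the quartiles use the module's default interpolation rule, which may differ from the hand method taught in class.

```python
import statistics

# Hypothetical observations, not taken from the lecture.
data = [2, 4, 4, 5, 7, 8, 12]

mean = statistics.mean(data)      # sum of observations / number of observations
median = statistics.median(data)  # value of the middle observation of the sorted array
mode = statistics.mode(data)      # value that occurs most frequently

# Three cut points that divide the sorted data into four parts (quartiles).
quartiles = statistics.quantiles(data, n=4)

# Geometric mean: average of growth factors (hypothetical yearly growth of 10%, 20%, 5%).
gm = statistics.geometric_mean([1.10, 1.20, 1.05])

# Harmonic mean: average of rates (hypothetical speeds of 40 and 60 km/h over equal distances).
hm = statistics.harmonic_mean([40, 60])

print(mean, median, mode, quartiles, round(gm, 4), hm)
```

The harmonic mean of the two speeds is lower than their arithmetic mean of 50, because more time is spent travelling at the slower speed.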


Arithmetic Mean
Merits
Easy to understand and simple to calculate.
It is based on all items of the series.
Rigidly defined by a mathematical formula.
It is capable of further algebraic treatment.
It has sampling stability and is least affected by sampling fluctuations.
Arrangement of items is not required.

Demerits
It is affected by extreme values; for distributions concentrated on small or large values, the mean is not an ideal representative.
For open-ended distributions the mean cannot be calculated with accuracy.
The mean is not useful for studying qualitative phenomena like beauty, intelligence, honesty, etc.
The mean need not correspond to any actual value. A statement such as "the average number of children is 3.6 in India" describes no real family.
The mean averages out positive and negative deviations, which can hide how widely the values are spread.

Median
Merits
Useful in open-ended series, as it is based on position and not on the values.
Easier to compute than the mean in case of unequal class intervals.
It is not affected by extreme values.
Suitable in case of qualitative data.
It minimises total absolute deviations.

Demerits
Requires arrangement of data.
It is not based on all the items of the series.
Incapable of any algebraic treatment; combined medians cannot be obtained.
The assumption that values are uniformly distributed within the median class is not always true.
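Two of the claims above, that the median is not affected by extreme values and that it minimises total absolute deviations, can be checked numerically. The sketch below uses hypothetical income figures; `total_abs_dev` is a helper name introduced here for illustration.

```python
import statistics

# Hypothetical incomes (in thousands); 500 is a deliberately extreme value.
incomes = [20, 22, 25, 27, 30]
with_outlier = incomes + [500]

def total_abs_dev(values, centre):
    """Sum of absolute deviations of the values from a chosen centre."""
    return sum(abs(v - centre) for v in values)

mean_before = statistics.mean(incomes)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(incomes)
median_after = statistics.median(with_outlier)

# The mean jumps once the extreme value is added; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)

# Total absolute deviation is smaller about the median than about the mean.
tad_median = total_abs_dev(with_outlier, median_after)
tad_mean = total_abs_dev(with_outlier, mean_after)
print("total absolute deviation about median:", tad_median)
print("total absolute deviation about mean:  ", tad_mean)
```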


MODE
Merits
In certain situations the mode is the only suitable average, e.g. size of shoes, garments, wages, etc.
It is not affected by extreme values.
It can be used for qualitative phenomena.
It indicates the point of maximum concentration in case of highly skewed distributions.

Limitations
In case of bimodal or multimodal series, the mode cannot be uniquely defined.
It is incapable of further algebraic treatment.
It is not based on all the items of the series.
It is not rigidly defined, because different formulae will give different answers.
Its value is affected by the size of the class interval.


Case Study Descriptive Statistics


Ms. Kathryn Ball of AutoUSA wants to develop tables, charts, and graphs to show the typical selling price on various dealer lots. The table on the right reports only the price of the 80 vehicles sold last month at Whitner Autoplex.


Constructing a Frequency Table Example


Step 1: Decide on the number of classes. A useful recipe to determine the number of classes (k) is the "2 to the k" rule: select the smallest k such that 2^k > n. There were 80 vehicles sold, so n = 80. If we try k = 6, which means we would use 6 classes, then 2^6 = 64, somewhat less than 80. Hence, 6 is not enough classes. If we let k = 7, then 2^7 = 128, which is greater than 80. So the recommended number of classes is 7.

Step 2: Determine the class interval or width. The formula is i ≥ (H - L)/k, where i is the class interval, H is the highest observed value, L is the lowest observed value, and k is the number of classes. ($35,925 - $15,546)/7 = $2,911. Round up to some convenient number, such as a multiple of 10 or 100. Use a class width of $3,000.
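Steps 1 and 2 can be sketched in a few lines of Python. The function names and the `round_to` parameter are illustrative choices made here, not part of the lecture; the figures (n = 80, H = 35,925, L = 15,546) are the ones quoted above.

```python
import math

def number_of_classes(n):
    """Smallest k such that 2**k > n (the '2 to the k' rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

def class_width(high, low, k, round_to):
    """Return the raw interval (H - L) / k and the same value
    rounded up to the next multiple of round_to."""
    raw = (high - low) / k
    return raw, math.ceil(raw / round_to) * round_to

k = number_of_classes(80)
# round_to=100 is an assumption; the lecture only says "a multiple of 10 or 100".
raw, width = class_width(35925, 15546, k, round_to=100)

print(k)           # recommended number of classes
print(round(raw))  # raw class interval in dollars
print(width)       # convenient class width in dollars
```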


Step 3: Set the individual class limits


Step 4: Tally the vehicle selling prices into the classes.

Step 5: Count the number of items in each class.
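Steps 4 and 5 amount to assigning each price to a class and counting. The sketch below is hedged in two ways: the prices are hypothetical (the lecture's table of 80 selling prices is not reproduced here), and the lower limit of the first class, $15,000, is an assumption, since the class limits from Step 3 are not shown in the text.

```python
from collections import Counter

# Hypothetical selling prices; NOT the 80 prices from the lecture's table.
prices = [15546, 17250, 19873, 20454, 21333, 22817,
          23591, 24220, 26613, 27453, 30655, 35925]

LOWER = 15000  # assumed lower limit of the first class
WIDTH = 3000   # class width from Step 2
K = 7          # number of classes from Step 1

def class_index(price):
    """Index of the class 'LOWER + i*WIDTH up to LOWER + (i+1)*WIDTH'."""
    return (price - LOWER) // WIDTH

# Step 4: tally each price into its class.
tally = Counter(class_index(p) for p in prices)

# Step 5: count the items in each class, including empty classes.
frequency_table = [
    (LOWER + i * WIDTH, LOWER + (i + 1) * WIDTH, tally.get(i, 0))
    for i in range(K)
]

for lo, hi, freq in frequency_table:
    print(f"${lo:,} up to ${hi:,}: {freq}")
```

Each class includes its lower limit and excludes its upper limit, so a price of exactly $18,000 would fall in the second class.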

Relative Frequency Distribution


To convert a frequency distribution to a relative frequency distribution, each of the class frequencies is divided by the total number of observations.
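As a small illustration of this conversion, the sketch below divides each class frequency by the total. The seven class frequencies are illustrative values chosen to sum to 80; the lecture's own tally is not reproduced in the text.

```python
# Illustrative class frequencies summing to 80 (an assumption, not the lecture's tally).
frequencies = [8, 23, 17, 18, 8, 4, 2]

total = sum(frequencies)

# Relative frequency = class frequency / total number of observations.
relative = [f / total for f in frequencies]

for f, r in zip(frequencies, relative):
    print(f"{f:>3}  {r:.4f}")
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.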


Graphic Presentation of a Frequency Distribution


The three commonly used graphic forms are:
Histograms
Frequency polygons
Cumulative frequency distributions


Histogram
A histogram for a frequency distribution based on quantitative data is very similar to the bar chart showing the distribution of qualitative data. The classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars.


Histogram Using Excel


Frequency Polygon
A frequency polygon also shows the shape of a distribution and is similar to a histogram. It consists of line segments connecting the points formed by the intersections of the class midpoints and the class frequencies.
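The points of a frequency polygon can be computed before any plotting is done. In this sketch the class limits (first class starting at 15,000, width 3,000) and the class frequencies are assumptions made for illustration. Closing the polygon with zero-frequency points one class width beyond each end is a common convention, not something stated in the lecture.

```python
WIDTH = 3000
# Assumed lower class limits and illustrative frequencies (not from the lecture).
lower_limits = [15000 + WIDTH * i for i in range(7)]
frequencies = [8, 23, 17, 18, 8, 4, 2]

# Class midpoint = lower limit + half the class width.
midpoints = [lo + WIDTH / 2 for lo in lower_limits]

# Each polygon vertex pairs a class midpoint with its class frequency.
polygon_points = list(zip(midpoints, frequencies))

# Optional: anchor the polygon to the horizontal axis at both ends.
closed_polygon = (
    [(midpoints[0] - WIDTH, 0)] + polygon_points + [(midpoints[-1] + WIDTH, 0)]
)

for x, y in closed_polygon:
    print(x, y)
```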


Cumulative Frequency Distribution


Standard Deviation
Merits
It is based on all items of the distribution.
It is amenable to algebraic treatment.
It is least affected by fluctuations in sampling.
It facilitates the calculation of the combined standard deviation of two or more groups.
It provides a unit of measurement for the normal distribution.

Demerits
It cannot be used for comparing the variability of two or more series of observations given in different units.
It is difficult to compute as compared with other measures of dispersion.
It is strongly affected by extreme values, because values far from the mean are given more weight than values near it.
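Two of the points above can be illustrated with a short sketch: the combined standard deviation of two groups, and the coefficient of variation, which expresses the standard deviation as a percentage of the mean so that series in different units can be compared. The data are hypothetical, the function names are introduced here for illustration, and the sketch assumes the population form of the standard deviation (divisor n).

```python
import math
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (unit-free)."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined standard deviation of two groups from their sizes, means and SDs."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean  # deviation of group 1 mean from combined mean
    d2 = mean2 - combined_mean  # deviation of group 2 mean from combined mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return math.sqrt(variance)

# Hypothetical groups, not taken from the lecture.
group_a = [10, 12, 14, 16, 18]
group_b = [20, 25, 30]

sd_combined = combined_sd(
    len(group_a), statistics.mean(group_a), statistics.pstdev(group_a),
    len(group_b), statistics.mean(group_b), statistics.pstdev(group_b),
)
# Cross-check: the SD computed directly on the pooled observations.
sd_pooled = statistics.pstdev(group_a + group_b)

# Changing the unit (metres to centimetres) changes the SD but not the CV.
heights_m = [1.5, 1.6, 1.7, 1.8]
heights_cm = [150, 160, 170, 180]
cv_m = coefficient_of_variation(heights_m)
cv_cm = coefficient_of_variation(heights_cm)

print(sd_combined, sd_pooled)
print(cv_m, cv_cm)
```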