
Introduction To Statistics

What is Statistics?
Statistics has become an important subject with useful applications in many walks of life. Statistics is a discipline concerned with:

1. Designing experiments and collecting data.
2. Summarizing information to aid understanding.
3. Drawing conclusions from data.
4. Estimating the present and predicting the future.

What is Statistics?
The word Statistics is derived from the Italian word Statista or the Latin word Status, both meaning political state.
The purpose of Statistics is to develop and apply methodology for extracting useful knowledge from data.

Meaning of Statistics
Statistics is described in two senses:
Plural noun: a collection of numerical data.
Singular noun: a branch of applied or mathematical science pertaining to the collection, analysis, interpretation, explanation and presentation of data. It also provides tools for prediction and forecasting based on data.

Meaning of Statistics
Statistics is a way to get information from data:

Data -> Statistics -> Information

Data: facts, especially numerical facts, collected together for reference or information.
Statistics: a tool for creating new understanding from a set of numbers.
Information: knowledge communicated concerning some particular fact.

Definition of Statistics
Statistics refers to the body of techniques which has been developed for the collection, presentation and analysis of quantitative data and for the use of such data in decision making.
- Neter & Wasserman

Statistics is a body of methods for making wise decisions in the face of uncertainty.
- W. A. Wallis & H. V. Roberts

Definition of Statistics
The major activities in Statistics involve:

Collection of data
Organization of data
Presentation of data
Analysis of data
Interpretation of data

Branches of Statistics
Statistics
  Descriptive
  Inferential
    Estimation
    Hypothesis testing

Branches of Statistics
Descriptive Statistics: concerned with exploring, visualizing and summarizing data without fitting the data to any model. Examples: frequency counts, range, mean, mode, median, variance and standard deviation.
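The descriptive measures listed above can be computed with Python's standard statistics module. This is a minimal sketch; the weights list is a hypothetical sample invented for illustration and does not come from the slides.

```python
import statistics

# Hypothetical sample of eight weights in kg (invented for illustration).
weights = [70, 72, 74, 74, 76, 78, 80, 68]

mean = statistics.mean(weights)            # arithmetic mean
median = statistics.median(weights)        # middle value of the sorted data
mode = statistics.mode(weights)            # most frequent value
data_range = max(weights) - min(weights)   # largest minus smallest value
variance = statistics.variance(weights)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(weights)        # square root of the sample variance

print(mean, median, mode, data_range, variance, std_dev)
```

Note that statistics.variance and statistics.stdev treat the data as a sample; statistics.pvariance and statistics.pstdev would treat it as a whole population.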

Branches of Statistics
Inferential Statistics: used to draw inferences about a population from a sample. There are two methods:
1. Estimation
2. Hypothesis testing
Estimation: estimate the population mean weight using the sample mean weight.
Hypothesis testing: test the claim that the population mean weight is 74 kg.
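Both methods can be sketched on one small sample. The ten weights below are hypothetical, and the one-sample t test at the 5% level is an assumed choice of test; the slides do not say which test is intended. The critical value 2.262 is the two-sided 5% value of the t distribution with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of ten weights in kg (invented for illustration).
sample = [71, 73, 76, 74, 78, 75, 72, 77, 79, 75]
claimed_mean = 74      # the claim under test, from the text above
critical_t = 2.262     # two-sided 5% critical value of t with 9 degrees of freedom

n = len(sample)
sample_mean = statistics.mean(sample)    # estimation: point estimate of the population mean
std_error = statistics.stdev(sample) / math.sqrt(n)
t_stat = (sample_mean - claimed_mean) / std_error

# Hypothesis testing: reject the claim only if the sample mean is far from 74 kg.
reject_claim = abs(t_stat) > critical_t
print(f"estimate = {sample_mean} kg, t = {t_stat:.3f}, reject claim: {reject_claim}")
```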

What is Business Statistics?


Business Managers & Business Statistics
A basic role of a business manager is to take business decisions. Business decision making is the process of selecting the best of the alternative opportunities open to the firm. Many business decisions are taken under conditions of uncertainty and risk.

The use of statistics helps to identify the uncertainty and to reduce the risk to some extent.

What is Business Statistics?


Business Statistics:
Business Statistics is the application of statistical tools and techniques to business decision making. Business Statistics is a science that assists you in making business decisions under uncertainty, based on numerical and measurable scales.

Decision making must be based on data, not on personal opinion or belief.

Limitations of Statistics
1. Statistics does not deal with individual measurements.
2. Statistics cannot be used to study qualitative phenomena.
3. Statistical results are true only on average.
4. Statistical data, being approximations, are not mathematically exact.
5. Statistical laws are not exact.
6. Statistical tables may be misused.
7. Statistics is only one of the methods of studying a problem.

Basic Statistical Concepts

1. Population
2. Census
3. Sampling & sampling principles
4. Parameter
5. Statistic

Population
The group of individuals or units under study is called the population or universe.

The population may be finite or infinite.

Finite population: consists of a finite number of units. For example, the number of workers in a factory.

Infinite population: has an infinite number of units. For example, the number of people watching a television programme.

Census/Sampling
Information on a population can be collected in two ways:
1. Census method
2. Sample method
Census method: each and every element of the population is included in the investigation.
Sample method: a few representative items of the universe under study are selected; these items are called a sample.

Census/Sampling
Merits of census:
1. Data are collected from each and every item of the population.
2. Accurate and reliable results.
3. Intensive study is possible.
4. The information may be used for further surveys.
Limitations of census:
1. Costly method.
2. It requires more money, time, labour and energy.
3. It is not possible where the population is infinite.
4. Sometimes only the sample method can be used.

Sampling
Sample: a finite subgroup of the population, that is, the representative items of the universe under study.

Sampling: the process of selecting a sample from a population. In sampling, a representative sample of elements of a population is selected and then analyzed.

Principles of Sampling
The theory of sampling is based on the principle of statistical regularity. It states that a moderately large number of items chosen at random from a large group are almost sure, on average, to possess the characteristics of the large group.

Principles of Sampling
The principles of sampling are:
1. Principle of statistical regularity
2. Principle of large numbers
3. Principle of validity
4. Principle of optimisation

Principles of Sampling
1. Principle of statistical regularity: a moderately large number of items chosen at random from a large group are almost sure, on average, to possess the characteristics of the large group.
2. Principle of large numbers: other things being equal, as the sample size increases, the accuracy of the results increases.
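The principle of large numbers can be illustrated with a small simulation. This is only a sketch under assumptions: the population of 100,000 values, the seed and the three sample sizes are all invented for illustration. Because the draws are random, the error usually shrinks as the sample grows but is not guaranteed to do so at every step.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values between 1 and 100.
population = [rng.randint(1, 100) for _ in range(100_000)]
population_mean = statistics.mean(population)

# Absolute error of the sample mean for increasing sample sizes.
errors = {}
for size in (10, 100, 10_000):
    sample = rng.sample(population, size)
    errors[size] = abs(statistics.mean(sample) - population_mean)

for size, error in errors.items():
    print(f"n = {size:>6}: error of sample mean = {error:.3f}")
```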

Principles of Sampling
3. Principle of validity: the sampling method should provide a valid estimate of the population parameter.
4. Principle of optimisation: this principle states the desirability of obtaining an optimum sample design.

Parameter/Statistic
A parameter is a characteristic of a population.

The statistical constants of a population, such as the mean (μ), variance (σ²), skewness (β₁), kurtosis (β₂) and correlation coefficient (ρ), are called parameters. Generally, the population parameters are unknown.

Statistic
A parameter is a characteristic of a population, whereas a statistic is a characteristic of a sample. Measures worked out from a sample, such as the mean, median, mode, variance or standard deviation, are called statistics. Parameters are functions of population values, while statistics are functions of sample observations. Generally, population parameters are unknown and are estimated by appropriate sample statistics.

Parameter/Statistic

Inferential statistics helps us to guess about population parameters based on random sampling.

Sampling:

Types of sample design

The way of selecting the sample is known as the sample design.
(i) Probability sampling
(ii) Non-probability sampling

Types of Sampling

Probability Sampling
  Simple Random Sampling
  Stratified Random Sampling
  Systematic Sampling
  Cluster Sampling
  Multi-stage Sampling
  Sequential Sampling
Non-Probability Sampling
  Judgmental Sampling
  Convenience Sampling
  Quota Sampling
  Snowball Sampling

Types of sample design

(i) Probability Sampling
Probability sampling is a method of sampling that ensures that every unit in the population has a known, non-zero chance of being included in the sample.

Simple Random Sampling:

It is the foundation of probability sampling. In simple random sampling every unit has an equal chance of being included in the sample.

Sampling with replacement: each unit drawn is placed back in the population before the next selection is made.

Sampling without replacement: each unit drawn is not placed back in the population before the next selection is made.

The lottery method is the most commonly used method of sample selection.
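The difference between the two ways of drawing can be sketched with Python's random module, where random.sample draws without replacement and random.choices draws with replacement. The population of 50 numbered units and the seed are assumptions made for illustration.

```python
import random

rng = random.Random(7)  # fixed seed so the draw is repeatable

# Hypothetical population: units numbered 1 to 50, like tickets in a lottery.
population = list(range(1, 51))

# Without replacement: a unit can appear at most once in the sample.
without_replacement = rng.sample(population, k=5)

# With replacement: each draw is made from the full population, so repeats are possible.
with_replacement = rng.choices(population, k=5)

print(without_replacement)
print(with_replacement)
```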

Stratified Random Sampling:

If the population is heterogeneous, the stratified sampling technique is applied so as to obtain a representative sample.

Under it, the population is divided into a number of groups called strata, in such a manner that the units within a stratum are homogeneous and the units in different strata are heterogeneous. The next step is to select a simple random sample of appropriate size from each stratum.
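The two steps, dividing into strata and then sampling within each stratum, might look like the sketch below. The three strata, the helper name stratified_sample and the 10% sampling fraction are hypothetical; taking the same fraction from every stratum (proportional allocation) is one common choice of appropriate size, not the only one.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        size = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, size)
    return sample

# Hypothetical factory population split into three strata by job type.
strata = {
    "workers": [f"W{i}" for i in range(60)],
    "clerks": [f"C{i}" for i in range(30)],
    "managers": [f"M{i}" for i in range(10)],
}

chosen = stratified_sample(strata, fraction=0.10, rng=random.Random(1))
for name, units in chosen.items():
    print(name, len(units), units)
```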

Systematic Sampling:
In systematic sampling, units are selected from the population at a uniform interval: for example, every 15th name on a list, or every 10th house on one side of a street. To facilitate this, the items must first be arranged in numerical, alphabetical or some other order.

This method can be used only when a complete list of the population is available.
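Selecting every 15th name can be sketched as follows. The list of 150 names and the helper systematic_sample are hypothetical; starting from a random position within the first interval is a common convention that the slides do not spell out.

```python
import random

def systematic_sample(ordered_units, interval, rng):
    """Pick a random start within the first interval, then every interval-th unit."""
    start = rng.randrange(interval)
    return ordered_units[start::interval]

# Hypothetical ordered list of 150 names.
names = [f"name_{i:03d}" for i in range(1, 151)]

selected = systematic_sample(names, interval=15, rng=random.Random(3))
print(len(selected), selected[:3])
```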

Cluster Sampling:
Cluster sampling involves grouping the population and then selecting the groups, or clusters, rather than individual elements for inclusion in the sample.

For example:
1. Suppose a departmental store wishes to sample its credit card holders.
2. It has issued its cards to 15,000 customers. The sample size is to be kept at, say, 450.
3. For cluster sampling, this list of 15,000 card holders could be formed into 100 clusters of 150 card holders each. Three clusters might then be selected at random for the sample.

Area sampling: if clusters are formed on the basis of geographic area, it is called area sampling.
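The card holder example can be traced in a few lines. The customer IDs and the seed are hypothetical, and forming clusters from consecutive IDs is an assumption; the slides do not say how the clusters are formed.

```python
import random

rng = random.Random(11)  # fixed seed so the draw is repeatable

card_holders = list(range(1, 15_001))   # hypothetical customer IDs 1..15000
cluster_size = 150

# Form 100 clusters of 150 consecutive card holders each.
clusters = [
    card_holders[i:i + cluster_size]
    for i in range(0, len(card_holders), cluster_size)
]

# Select three whole clusters at random; every member of a chosen cluster is sampled.
chosen_clusters = rng.sample(clusters, k=3)
sample = [holder for cluster in chosen_clusters for holder in cluster]

print(len(clusters), len(sample))
```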


Multi-stage sampling:
This is a further development of the idea of cluster sampling. Under multi-stage sampling, the first stage may be to select large primary sampling units such as states, followed by districts, then towns, and finally certain families within towns.

Sequential sampling:
This is a somewhat complex sample design. The size of the sample is not fixed in advance but is determined according to mathematical decision rules on the basis of the information yielded as the survey progresses. This design is usually adopted in the context of statistical quality control.
In practice, several of the methods of sampling described above may well be used in the same study.

Types of sample design

(ii) Non-Probability Sampling

Also known as purposive or deliberate sampling.

This sampling method involves the purposive or deliberate selection of particular units from the universe for the study. It includes the following types of sampling:

Convenience sampling
Judgment sampling
Quota sampling

Non Probability Sampling


Convenience sampling: population elements are selected for inclusion in the sample based on ease of access.
Judgment sampling: the researcher's judgment is used to select items that he or she considers representative of the population.
Quota sampling: interviewers are simply given quotas to be filled from different strata. The actual selection of items is left to the interviewer's judgment.

Data Measurement

Measurement & scaling techniques


Measurement: By measurement we mean the process of assigning numbers to objects or observations. Properties like weight, height and length can be measured directly with a standard unit of measurement. However, it is difficult to measure properties like motivation to succeed or the ability to withstand stress. The researcher has to devise scaling techniques so that every variable under study can be measured accurately.

Measurement Scales
Four basic scales of measurement:
1. Nominal scale
2. Ordinal scale
3. Interval scale
4. Ratio scale

Nominal scale:
A nominal scale is simply a system of assigning number symbols to events in order to label them.

For example, the numbers on cricket players' jerseys.

These numbers do not have any order and are not useful for further statistical calculations. In spite of this, nominal scales are still very useful for classifying major sub-groups of the population.

Nominal scale:
If one describes respondents in a survey according to their occupation, such as banker, doctor or computer programmer, one has used a nominal scale. Another example is a question such as: Check all the brands you would consider purchasing.

I. Sony
II. Videocon
III. Samsung
IV. LG

Ordinal scale
The ordinal scale places events in order. Rank orders represent ordinal scales and are frequently used in research relating to qualitative phenomena. A student's rank in his or her graduating class involves the use of an ordinal scale. The appropriate measure of central tendency is the median.

Ordinal scale
For example, a question such as: Please rank each brand in terms of your preference, where 1 represents your first choice, 2 represents your second choice, and so on.

Sony
Videocon
Samsung
LG
BPL
Philips

Interval scale
Interval scales are those in which the distance between adjacent scale points is known and equal. The scale shows the absolute differences between scale points.

The distance is normally defined as one scale unit.

The location of the zero point is not fixed, since zero does not denote the absence of the attribute.

Both the zero point and the unit of measurement are arbitrary.

Interval scale
Interval scale examples:
Customers are asked to evaluate a salesman's performance by choosing from a list such as:
A. Extremely friendly
B. Very friendly
C. Somewhat friendly
D. Somewhat unfriendly
E. Very unfriendly
F. Extremely unfriendly

Interval scale
Interval scale examples: Please rate each brand in terms of its overall performance, on a scale from 1 (Very Poor) to 5 (Very Good).

Brand
Sony
Videocon
LG
BPL
Philips

Ratio scale:
Ratio scales are those in which a true zero origin exists, such as the actual number of purchases in a certain time period, rupees spent or miles travelled. A ratio scale allows the researcher not only to identify the absolute differences between scale points but also to make absolute comparisons between responses.

Ratio scale:
Ratio scale examples:

Please indicate your age in years: ________

Approximately how many times in the last month have you purchased anything over Rs. 1000 in value at Big Bazaar?

0    1    (More, specify: ____ )

Primary Scales of Measurement

Scale      Illustration                             Third place   Second place   First place
Nominal    Numbers assigned to runners
Ordinal    Rank order of winners                    Third place   Second place   First place
Interval   Performance rating on a 0 to 10 scale    8.2           9.1            9.6
Ratio      Time to finish, in seconds               15.2          14.1           13.4

Frequency Tables, Charts, & Graphs.

Introduction
A basic role of a business manager at every step is to take decisions. For this, he or she has to collect a large amount of data. However, the collected data are in a raw, ungrouped format. To facilitate decision making, the data need to be processed by:
1. Classification & tabulation
2. Diagrammatic presentation
3. Graphical presentation

Classification
Classification is a way of presenting raw data in an orderly and systematic manner that helps further analysis and interpretation of the data. Classification is the process of arranging things in groups according to their similarity or identity. For example, students in a class can be arranged according to gender, basic graduation, rural or urban background, experience, etc.

Types of Classification
1. Chronological classification
2. Geographical classification
3. Qualitative classification
4. Quantitative classification

Types of Classification
In chronological classification, the collected data are arranged according to time, expressed in years, months, weeks, etc.

Year    Birth Rate
2000    36.6
2001    35.7
2002    36.0
2003    34.6
2004    35.0
2005    36.2

Types of Classification
In geographical classification, the collected data are arranged according to geographical region. For example:

Country      Birth Rate
America      36.6
China        35.7
Denmark      36.0
France       34.6
India        35.0
Sri Lanka    36.2

Types of Classification
Quantitative classification refers to the classification of data according to some quantitative phenomenon, such as height, weight or length. In this type of classification there are two important elements:
1. The variable
2. The frequency

Frequency Distribution
The frequency of an observation is the number of times that observation occurs. A frequency distribution is a series in which observations with similar or closely related values are put in separate bunches or groups. There are three main reasons for preparing a frequency distribution:
1. To estimate the frequencies of the population from the sample data.
2. To facilitate the computation of various statistical measures.
3. To facilitate the analysis of data.

Frequency Distribution
In a survey of 40 families in a village, the number of children per family was recorded and the following data obtained:
1, 0, 3, 2, 1, 5, 6, 2, 2, 1, 0, 3, 4, 2, 1, 6, 3, 2, 1, 5, 3, 3, 2, 4, 2, 2, 3, 0, 2, 1, 4, 5, 3, 3, 4, 4, 1, 2, 4, 5.

Observation    Tally marks    Frequency
0              III            3
1              IIIII II       7
2              IIIII IIIII    10
3              IIIII III      8
4              IIIII I        6
5              IIII           4
6              II             2

This is called a discrete or ungrouped frequency distribution.
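The tallying step can be reproduced with collections.Counter from Python's standard library. The sketch below uses the 40 observations from the survey above; only the variable names are invented.

```python
from collections import Counter

children = [
    1, 0, 3, 2, 1, 5, 6, 2, 2, 1,
    0, 3, 4, 2, 1, 6, 3, 2, 1, 5,
    3, 3, 2, 4, 2, 2, 3, 0, 2, 1,
    4, 5, 3, 3, 4, 4, 1, 2, 4, 5,
]

frequency = Counter(children)  # maps each observation to its count

print("Observation  Frequency")
for value in sorted(frequency):
    print(f"{value:>11}  {frequency[value]:>9}")
print(f"{'Total':>11}  {sum(frequency.values()):>9}")
```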

Frequency Distribution
Grouped frequency distribution:

Marks    No. of Students (f)
15-19    11
20-24    9
25-29    12
30-34    26
35-39    32
40-44    35
45-49    25
50-54    24
55-59    10
60-64    8
65-69    4
70-74    2
75-79    2

Frequency Distribution
Continuous frequency distribution: when the variable is continuous, a continuous frequency distribution is needed. For example, for the age of students, classes such as 4-6, 7-9 are not correct, because an age such as 6.5 years falls between them. Instead use:
Age in years: below 6; 6 or more but less than 9; 9 or more but less than 12; and so on.

Formation of Frequency Distribution: Continuous data


1. Class limits: the lowest and highest values of a class (for example, 25 and 29 in the class 25-29).
2. Class interval: the difference between the upper and lower limits. Here it is 29 - 25 = 4.
3. Class frequency: the number of observations belonging to a particular class.
4. Range: the difference between the largest and smallest values, denoted by R.
5. Class mid-point = (lower limit of the class + upper limit of the class) / 2.
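The formulas above translate directly into code. In this sketch the function names and the short list of raw marks are hypothetical; the class 25-29 is the one used in the definitions.

```python
def class_interval(lower, upper):
    """Difference between the upper and lower class limits."""
    return upper - lower

def class_midpoint(lower, upper):
    """Average of the lower and upper class limits."""
    return (lower + upper) / 2

def data_range(values):
    """Largest value minus smallest value (R)."""
    return max(values) - min(values)

print(class_interval(25, 29))   # 4
print(class_midpoint(25, 29))   # 27.0

# Hypothetical raw marks, invented for illustration.
marks = [17, 42, 35, 78, 29]
print(data_range(marks))        # 61
```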

Types of Class Intervals


1. Exclusive method
2. Inclusive method
3. Open-end classes

Exclusive Method
Profit earned by Companies

Profit (Rs Lakh)    Number of Companies
10-20               12
20-30               17
30-40               30
40-50               25
50-60               16
Total               100

In this method the upper limit is exclusive, and an item of that value is included in the next class. In the above example, a company earning a profit of Rs 20 lakh is included in the class interval 20-30.
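The exclusive rule amounts to testing lower <= value < upper for each class. The sketch below uses the class limits from the profit table above; the function name classify_exclusive is hypothetical.

```python
# Class limits from the profit table above (Rs lakh).
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]

def classify_exclusive(value, classes):
    """Return the class containing value, treating each upper limit as excluded."""
    for lower, upper in classes:
        if lower <= value < upper:
            return (lower, upper)
    return None  # value lies outside every class

print(classify_exclusive(20, classes))     # (20, 30)
print(classify_exclusive(19.5, classes))   # (10, 20)
print(classify_exclusive(60, classes))     # None
```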

Inclusive Method
Profit earned by Companies

Profit (Rs Lakh)    Number of Companies
10-19               12
20-29               17
30-39               30
40-49               25
50-59               16
Total               100

In this method both the upper and lower limits are included in the class.

Open-End classes
Profit earned by Companies

Profit (Rs Lakh)    Number of Companies
Below 10            -
10-20               12
20-30               17
30-40               30
40 and above        41
Total               100

A class limit is missing either at the lower end of the first class interval or at the upper end of the last class interval.

Charts & Graphical Presentation

Graphical Presentation
Charts and graphs are visual aids which give a bird's-eye view of a given set of numerical data. Important graphs are:
1. Histogram
2. Frequency polygon
3. Frequency curve
4. Pie chart
5. Bar diagram

Histogram
Class interval    Frequency
0-20              13
20-40             18
40-60             25
60-80             15
80-100            9
Total             Σf = 80

Histogram
A histogram is a graph that displays the data by using adjacent vertical bars (unless the frequency of a class is 0) of various heights to represent the frequencies of the classes. The frequency of observations within each class interval is represented by the corresponding bar. The class intervals, and hence the widths of the bars, are of equal size.

Frequency polygon
A frequency polygon is a line graph of a grouped frequency distribution. It is constructed by marking the point at the top of each histogram bar at the midpoint of the class interval and then joining these points with straight lines.

Ogives
An ogive (or ogive curve) is the graphical presentation of a cumulative frequency distribution. Ogives are classified as less than and more than ogives.
Less than: cumulative frequencies are plotted against the upper boundaries of their respective class intervals.
More than: cumulative frequencies are plotted against the lower boundaries of their respective class intervals.
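The points of both ogives can be worked out from a frequency table. The sketch below reuses the classes and frequencies of the histogram table earlier in this section; the variable names are invented, and plotting is left out.

```python
from itertools import accumulate

# Classes and frequencies from the histogram table in this section.
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [13, 18, 25, 15, 9]

# Less than ogive: running total, plotted against each upper boundary.
less_than = list(zip((upper for _, upper in classes), accumulate(frequencies)))

# More than ogive: total from each class onward, plotted against each lower boundary.
remaining = list(accumulate(reversed(frequencies)))[::-1]
more_than = list(zip((lower for lower, _ in classes), remaining))

print(less_than)   # [(20, 13), (40, 31), (60, 56), (80, 71), (100, 80)]
print(more_than)   # [(0, 80), (20, 67), (40, 49), (60, 24), (80, 9)]
```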


Bar charts
A bar chart consists of a series of rectangular bars, where the lengths of the bars represent the magnitudes of the respective quantities.
Types of bar diagrams:
1. Simple bar diagrams
2. Multiple bar diagrams
3. Sub-divided or component bar diagrams
4. Percentage bar diagrams
5. Deviation or bilateral diagrams

Bar charts
1. Simple bar diagrams

A simple bar diagram can be drawn using horizontal or vertical bars. It is a very common diagram in business and economics.

Year    Production (tonnes)
2001    901
2002    1792
2003    2500
2004    4464
2005    5115

[Figure: Vertical bar diagram. x-axis: year (2001 to 2005); y-axis: production in tonnes (0 to 6000).]

Bar charts
1. Simple bar diagrams:
[Figure: Horizontal bar diagram. y-axis: year (2001 to 2005); x-axis: production in tonnes (0 to 6000).]

Bar charts
Multiple bar diagrams: a multiple bar diagram presents more than one phenomenon and is highly useful for direct comparison. The bars are drawn side by side, and different colours or shades can be used to indicate each variable.

Bar charts
Sub-Divided Bar Diagram

In this bar diagram, each bar is divided into parts in proportion to the values given in the data.

Item of Expenditure    Family A    Family B
Food                   1500        1500
Clothing               1250        600
Education              1250        900
Miscellaneous          1900        1000
Saving                 1000        1000
Total (Income)         6900        5000

[Figure: Sub-divided bar diagram with one bar per family, each segmented into Food, Clothing, Education, Miscellaneous and Saving.]

Bar charts
Percentage Sub-Divided Bar Diagram

Here the components are not the actual values but percentages of the whole. Each bar represents 100 percent, so all bars are of equal height.
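Converting actual values to percentages of the whole can be sketched as follows, using the Family A and Family B figures from the sub-divided bar diagram example. The function name as_percentages is hypothetical.

```python
items = ["Food", "Clothing", "Education", "Miscellaneous", "Saving"]
family_a = [1500, 1250, 1250, 1900, 1000]
family_b = [1500, 600, 900, 1000, 1000]

def as_percentages(values):
    """Express each component as a percentage of the total."""
    total = sum(values)
    return [100 * value / total for value in values]

pct_a = as_percentages(family_a)
pct_b = as_percentages(family_b)

for item, a, b in zip(items, pct_a, pct_b):
    print(f"{item:<14} A: {a:5.1f}%   B: {b:5.1f}%")
```

Each list of percentages sums to 100, which is why the two bars come out at the same height even though the incomes differ.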

Pie Diagram
A pie diagram shows the partitioning of a total into its component parts. It is used to show classes or groups of data in proportion to the whole data set. The entire pie represents all the data, while each slice represents a different class or group within the whole.
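Since the entire pie is 360 degrees, the angle of each slice is value / total * 360. The sketch below applies this to Family B's expenditure from the sub-divided bar diagram example, which is reused here only as illustrative input; the function name slice_angles is hypothetical.

```python
def slice_angles(values):
    """Central angle in degrees for each slice: value / total * 360."""
    total = sum(values)
    return [value * 360 / total for value in values]

# Family B's expenditure from the sub-divided bar diagram example,
# reused here only as illustrative input.
labels = ["Food", "Clothing", "Education", "Miscellaneous", "Saving"]
family_b = [1500, 600, 900, 1000, 1000]

angles = slice_angles(family_b)
for label, angle in zip(labels, angles):
    print(f"{label:<14} {angle:6.1f} degrees")
```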

Pie Diagram

Revenue collections for the year 2005-2006 by the government, in Rs. crore, for petroleum products are as follows. Draw the pie diagram.

[Figure: Pie diagram of revenue collections.]
