
Introduction to Statistics

Objectives
At the end of this module, the students will be able to:
1. Describe statistics;
2. Appreciate the significance of the use of statistics; and
3. Apply the concepts of the different types of statistical data.

I. Statistics and Statistic Defined

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data to
assist in making more effective decisions. In its more common usage, the term refers to numerical information, which
can appear in graphic form as well as in sentence form (Lind, Marchal, Wathen, 2006 and Kvanli, Guynes, Pavur, 1986).
• Collection: This refers to the gathering of information using scientific methods (e.g.
experimentation, surveys, interviews).
• Organization: This is the simplification and presentation of the information that has been
collected. Information can be organized in three ways: tabular, graphical, or
numerical.
• Analysis: This is working through the collected information to find values, trends,
differences, associations, or predictions. In doing so, investigators apply various statistical tools and
techniques. There are three types of analysis, depending on the number of
variables used: univariate, bivariate, and multivariate.
• Interpretation: This is seeing beyond what the information has revealed. A sound
interpretation is characterized by scientific rigor and evidence.

A statistic is a characteristic of a sample (a mean, standard deviation, variance, or any other measure computed
from sample data). A collection of more than one such figure is called statistics [plural] (Lind, et al., 2006).
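
The measures named above can be computed directly from a sample. The following is a minimal sketch using Python's standard statistics module; the five scores are made-up values used only for illustration.

```python
import statistics

# Hypothetical sample: quiz scores of five students drawn from a larger class.
sample = [12, 15, 9, 18, 16]

sample_mean = statistics.mean(sample)          # arithmetic mean of the sample
sample_variance = statistics.variance(sample)  # sample variance (n - 1 in the denominator)
sample_sd = statistics.stdev(sample)           # sample standard deviation

print("mean:", sample_mean)
print("variance:", sample_variance)
print("standard deviation:", round(sample_sd, 2))
```

Each printed number is a statistic, because it is computed from the sample rather than from the whole class.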

II. Characteristics of Statistics

• Statistics are aggregates of facts.
• Statistics are numerically expressed.
• Statistics are affected to a marked extent by a multiplicity of causes.
• Statistics are enumerated or estimated according to a reasonable standard of accuracy.
• Statistics are collected for a predetermined purpose.
• Statistics are collected in a systematic manner.
• Statistics must be comparable to each other.

III. Functions or Uses of Statistics

(1) Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
(2) Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
(3) Statistics helps in collecting appropriate quantitative data.
(4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic, or graphic form for
easy and clear comprehension of the data.
(5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through
quantitative observations.
(6) Statistics helps in drawing valid inferences about population parameters from sample data, along with
a measure of their reliability.
(7) Statistics scientifically measures the conditions of any given problem and assesses existing relationship(s).
(8) Statistics shows the laws underlying facts and events that cannot be determined by individual observations.
(9) Statistics reveals cause-and-effect relations that otherwise may remain unknown.
(10) Statistics uncovers ambiguous trends and behavior in related conditions.

IV. Limitations of Statistics


(1) Statistical laws are true only on average. Statistics are aggregates of facts, so a single observation is not
a statistic; statistics deals with groups and aggregates only.
(2) Statistical methods are best applied to quantitative data.
(3) Statistics cannot be applied to heterogeneous data.
(4) If sufficient care is not exercised in collecting, analyzing, and interpreting the data, statistical results
might be misleading.
(5) Only a person with expert knowledge of statistics can handle statistical data efficiently.
(6) Some errors are possible in statistical decisions. Inferential statistics in particular involves certain
errors, and we do not know whether an error has been committed or not.

V. Importance of Statistics in Different Fields


Statistics plays a vital role in every field of human activity. It has an important role in determining the
existing position of per capita income, unemployment, population growth rate, housing, schooling, medical facilities,
etc. in a country. Statistics now holds a central position in almost every field, such as industry, commerce, trade,
physics, chemistry, economics, mathematics, biology, botany, psychology, astronomy, etc.
• Statistical techniques and tools are used extensively in quantitative research.
(1) In Business:
Statistics plays an important role in business. A successful businessman must be very quick and accurate
in decision making. He must know what his customers want and should therefore know what to produce and
sell, and in what quantities. Statistics helps the businessman to plan production according to the tastes of the
customers. The quality of the products can also be checked more efficiently by using statistical methods. Thus
all the activities of the businessman are based on statistical information. He can make correct decisions about the
location of the business, marketing of the products, financial resources, etc.
(2) In Economics:
Statistics plays an important role in economics. Economics largely depends upon statistics. National
income accounts are multipurpose indicators for economists and administrators, and statistical methods are
used in preparing these accounts. In economic research, statistical methods are used for collecting and
analyzing the data and for testing hypotheses. The relationship between supply and demand is studied by
statistical methods. Imports and exports, the inflation rate, and per capita income are problems which
require a good knowledge of statistics.
(3) In Mathematics:
Statistics plays a central role in almost all natural and social sciences. The methods of the natural
sciences are the most reliable, but conclusions drawn from them are only probable, because they are based on
incomplete evidence. Statistics helps in describing these measurements more precisely. Statistics is a branch
of applied mathematics. A large number of statistical methods, such as probability, averages, dispersion, and
estimation, are used in mathematics, and different techniques of pure mathematics, such as integration,
differentiation, and algebra, are used in statistics.
(4) In Banking:
Statistics plays an important role in banking. Banks make use of statistics for a number of
purposes. Banks work on the principle that the people who deposit their money with them do
not all withdraw it at the same time. The bank earns profits out of these deposits by lending to others at
interest. Bankers use statistical approaches based on probability to estimate the number of depositors
and their claims for a certain day.
(5) In State Management (Administration):
Statistics is essential for a country. Different policies of the government are based on statistics.
Statistical data are now widely used in taking all administrative decisions. Suppose the government wants
to revise the pay scales of employees in view of an increase in the cost of living. Statistical methods will be
used to determine the rise in the cost of living. Preparation of federal and provincial government budgets
mainly depends upon statistics, because it helps in estimating the expected expenditures and revenue from
different sources. Statistics are thus the eyes of the administration of the state.
(6) In Accounting and Auditing:
Accounting is impossible without exactness. For decision-making purposes, however, so much precision is
not essential; the decision may be taken on the basis of an approximation, known as a statistic. The correction of
the values of current assets is made on the basis of the purchasing power of money or its current value.
In auditing, sampling techniques are commonly used. An auditor determines the sample size of the
books to be audited on the basis of error.
(7) In Natural and Social Sciences:
Statistics plays a vital role in almost all the natural and social sciences. Statistical methods are
commonly used for analyzing experimental results and testing their significance in Biology, Physics,
Chemistry, Mathematics, Meteorology, Research chambers of commerce, Sociology, Business, Public
Administration, Communication and Information Technology, etc.
(8) In Astronomy:
Astronomy is one of the oldest branches of statistical study. It deals with the measurement of the distances,
sizes, masses, and densities of heavenly bodies by means of observations. During these measurements errors
are unavoidable, so the most probable measurements are found by using statistical methods.
Example: The distance of the moon from the earth is measured. Astronomers have long used
statistical methods, such as the method of least squares, for finding the movements of stars.

VI. Two Areas of Statistics (Lind, et al., 2006)

A. Descriptive Statistics is the method of organizing, summarizing, and describing sample
data in an informative way. It includes presenting data as percentages, ranks, standard units, frequency distributions,
measures of location, and measures of dispersion, among others.
For Example: Industrial statistics, population statistics, trade statistics, etc. Businesses, for instance,
make use of descriptive statistics in presenting their annual reports, final accounts, and bank statements.

Example 1: A poll found that 49% of the people in a survey knew the name of the first book of the Bible. The
statistic 49 describes the percentage of persons surveyed who knew the first book of the Bible.

Example 2: According to consumer reports, Sharp washing machine owners reported 9 problems per 100
machines in 2007. The statistic 9 describes the number of problems reported per 100
machines.

B. Inferential Statistics is used to infer the truth or falsity of a hypothesis. It includes making a decision, estimate,
prediction, or generalization about a population, based on a sample. The ultimate goal is to gain information
about the population rather than about the sample drawn from it. Inferential statistics allows us to
make inferences about the population itself on the basis of the sample data.
For Example: Suppose we want to have an idea about the percentage of illiterates in our country. We take a
sample from the population and find the proportion of illiterates in the sample. This sample proportion, with the
help of probability, enables us to make some inferences about the population proportion. This study belongs to
inferential statistics.
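
The step from a sample proportion to a statement about the population proportion can be sketched as follows. This is only an illustration under stated assumptions: the survey data are invented, the sample is treated as a simple random sample, and the interval uses the common normal-approximation formula, which is one of several possible methods.

```python
import math

# Hypothetical survey: 1 = illiterate, 0 = literate (made-up data, not from the text).
sample = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1] * 20   # 200 respondents
n = len(sample)
p_hat = sum(sample) / n                         # sample proportion

# Normal-approximation interval; 1.96 is the approximate 97.5th percentile
# of the standard normal distribution, giving roughly 95% confidence.
z = 1.96
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"sample proportion = {p_hat:.3f}")
print(f"approx. 95% interval for the population proportion = ({lower:.3f}, {upper:.3f})")
```

The sample proportion is a descriptive statistic; the interval around it is the inferential statement about the population.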

A population is a collection of all possible individuals, objects, or measurements of interest. It is a group of
individuals/subjects that share the same characteristics.

A sample is a portion, or part, of the population of interest. It is a subgroup of the target population which
the researcher plans to study for the purpose of making generalizations about the entire population.

Example 1: TV networks constantly monitor the popularity of their programs by hiring private research firms
and other organizations to sample the preferences of TV viewers.

Example 2: The accounting department of a large firm will select a sample of the invoices to check the
accuracy of all the invoices of the company.

Example 3: Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to
be released for sale.

Two Types of Inferential Statistics (Mann, 2004)


1. Non-parametric statistics are those used when the variables being analyzed are either nominal or ordinal,
and when interval measurement may not be assumed. The name “non-parametric” stems from the fact that these
statistics are not based on assumptions about the parameters of the normal distribution. This does not imply,
however, that no assumptions are necessary.

2. Parametric statistics are used when interval measurement can be assumed and the sample size is
appropriate.

VII. Kinds of Variables (Lind, et al., 2006 & Mann, 2004)


Variable: A quantity which can vary from one individual or object to another. It is usually denoted by
the last letters of the alphabet (for example x, y, z).
For Example: Heights and weights of students, income, temperature, number of children in a family, etc.
There are several types of variables, and their names are derived from their functions:
• Independent variable: This explains the outcome of the investigation. It is often called an
explanatory variable or predictor variable.
• Dependent variable: This is the result or outcome of the investigation. It is otherwise called the
response variable.
• Intervening variable: This affects the outcome of the study and may work independently of
or in tandem with the independent variable. It can undermine the predictive ability of the independent variable
and thus has to be controlled. There are two types of intervening variables: moderators and mediators.
Moderators have a strong contingent effect on the relationship between the independent and dependent
variables. Mediators describe how, rather than when, the effect of the independent variable on
the dependent variable will occur.

1. Qualitative or attribute variable: the characteristic or variable being studied is nonnumeric.

Examples: Gender, religious affiliation, type of automobile owned, place of birth, hair color.

2. Quantitative variable: the variable can be reported numerically. Quantitative variables can be classified
as either discrete or continuous.

a. Discrete variables can only assume certain values and there are usually “gaps” between values.
Typically, discrete variables result from counting.

Examples: number of chairs in a classroom (1, 2, 3, etc.)
number of cars exiting the University main gate over an hour (5, 8, etc.)
number of students in each section of a graduate Statistics class
number of children in a family

b. Continuous variables can assume any value within a specific range, including decimal values.

Examples: heights and weights of students, speed of a bus, age of a shopkeeper, lifetime of a TV, etc.
time it takes to fly from Cebu to Manila
air pressure in a tire
weight of a shipment of grains (15.01 tons, 15.45 tons, etc.)
distance between Cebu and Bohol
balance in your checking account
minutes remaining in class
Constant: A quantity which can assume only one value. It is usually denoted by the first letters of
the alphabet (for example a, b, c).

For Example: Value of and value of

VIII. Different Types of Data (Mann, 2004; Kvanli, et al., 1986)

Data are information, known facts, figures, observation, statistics, records, and reports. They can be
classified in different ways.

Classification of Data: The process of arranging data into homogeneous groups or classes according to some
common characteristics present in the data.
For Example: In sorting letters in a post office, the letters are classified according to cities
and further arranged according to streets.

Bases of Classification:
There are four important bases of classification:
(1) Qualitative Base (2) Quantitative Base (3) Geographical Base (4) Chronological or Temporal Base
(1) Qualitative Base:
The data are classified according to some quality or attribute, such as sex, religion, literacy, intelligence,
etc.
(2) Quantitative Base:
The data are classified by quantitative characteristics, such as heights, weights, ages, income, etc.
(3) Geographical Base:
The data are classified by geographical region or location, such as states, provinces, cities, countries, etc.
(4) Chronological or Temporal Base:
The data are classified or arranged by their time of occurrence, such as years, months, weeks, days,
etc. For Example: Time series data.

Types of Classification:
(1) One-way Classification:
If we classify observed data by a single characteristic, this type of classification is known as one-way
classification.
For Example: The population of the world may be classified by religion as Muslim, Christian, etc.

(2) Two-way Classification:
If we consider two characteristics at a time in order to classify the observed data, we are doing two-way
classification.
For Example: The population of the world may be classified by religion and sex.

(3) Multi-way Classification:
If we consider more than two characteristics at a time to classify the observed data, we are doing
multi-way classification.
For Example: The population of the world may be classified by religion, sex, and literacy.
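
The three types of classification can be sketched by counting the same records on one, two, or three characteristics. The five records below are hypothetical and serve only to show the mechanics; collections.Counter is part of Python's standard library.

```python
from collections import Counter

# Hypothetical records of (religion, sex, literacy); values are illustrative only.
people = [
    ("Muslim", "F", "literate"),
    ("Christian", "M", "literate"),
    ("Muslim", "M", "illiterate"),
    ("Christian", "F", "literate"),
    ("Muslim", "F", "literate"),
]

# One-way: classify by a single characteristic (religion).
one_way = Counter(religion for religion, _, _ in people)

# Two-way: classify by two characteristics at a time (religion and sex).
two_way = Counter((religion, sex) for religion, sex, _ in people)

# Multi-way: classify by all three characteristics.
multi_way = Counter(people)

print(one_way)
print(two_way)
print(multi_way)
```

Each added characteristic splits the earlier classes into finer ones, while the total count stays the same.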

a. Ungrouped Data are raw, unorganized information.

Example: Scores of 20 students on a 20-item quiz

15, 8, 9, 14, 18, 7, 17, 3, 15, 19, 7, 20, 13, 15, 18, 15, 2, 9, 2, 5

b. Grouped Data are organized or processed data presented in a frequency distribution table.

Example:
Rating      Frequency
Superior    6
Good        28
Average     25
Poor        12
Inferior    3
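
Turning ungrouped data into grouped data can be sketched with the 20 quiz scores listed above. The class width of 5 (classes 1-5, 6-10, 11-15, 16-20) is an assumption made for this illustration, and the helper class_label is a hypothetical name; other class widths are equally valid.

```python
from collections import Counter

# The 20 ungrouped quiz scores from the example above.
scores = [15, 8, 9, 14, 18, 7, 17, 3, 15, 19, 7, 20, 13, 15, 18, 15, 2, 9, 2, 5]

WIDTH = 5  # assumed class width; expects scores from 1 to 20


def class_label(score):
    """Return the (lower, upper) limits of the class containing the score."""
    lower = ((score - 1) // WIDTH) * WIDTH + 1
    return (lower, lower + WIDTH - 1)


# Count how many scores fall into each class.
frequency = Counter(class_label(s) for s in scores)

print("Class   Frequency")
for (lower, upper), count in sorted(frequency.items()):
    print(f"{lower:>2}-{upper:<2}   {count}")
```

The resulting table has the same form as the rating table: one row per class and a frequency for each.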

The schema of data classification is given below.

DATA
    Qualitative or attribute (type of car owned)
        Nominal
        Ordinal
    Quantitative or numerical
        Discrete (number of children)
            Interval
            Ratio
        Continuous (time taken for an exam)
            Interval
            Ratio

IX. Levels of Data Measurement (Lind, et al., 2006)

A. Nominal Data: The “lowest” or most primitive level of measurement. There is no measurement involved, only
counts, and the categories have no natural or logical order. Data categories are mutually exclusive and exhaustive,
so an object belongs to one and only one category.

Examples: hair color, gender, religious affiliation.

Nominal scales may be further subdivided into two groups: Renaming and Categorical.
1. Nominal-Renaming occurs when each object in the set is assigned a different number (i.e. renamed
with a number). Examples of nominal-renaming are social security numbers or the numbers on the backs of players’
jerseys. The former is necessary because different individuals may have the same name, e.g. Mary Smith, and because
computers have an easier time dealing with numbers than with alpha-numeric characters.
2. Nominal-Categorical occurs when objects are grouped into subgroups and each object within a
subgroup is given the same number. The subgroups must be mutually exclusive, that is, an object may not belong to
more than one category or subgroup. An example of nominal-categorical measurement is grouping people into
categories based upon stated political party preference (Nacionalista, LAKAS-KAMPI, or Laban) or upon sex
(Male or Female). In the political party preference system, Nacionalista might be assigned the number "1",
LAKAS-KAMPI "2", and Laban "3", while in the latter, females might be assigned the number "1" and males "2".
In general it is meaningless to find means, standard deviations, correlation coefficients, etc., when the data
are nominal. This does not mean, however, that such systems of measurement are useless, for in combination with
other measures they can provide a great deal of information.
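
The point about nominal codes can be sketched as follows. The codes follow the party example above; the seven survey responses are invented for illustration. Counts and the mode are meaningful summaries, while the mean of the codes can be computed but has no interpretation.

```python
from collections import Counter
import statistics

# Codes follow the example in the text: 1 = Nacionalista, 2 = LAKAS-KAMPI, 3 = Laban.
labels = {1: "Nacionalista", 2: "LAKAS-KAMPI", 3: "Laban"}
responses = [1, 3, 3, 2, 3, 1, 3]   # hypothetical survey responses

# Meaningful for nominal data: counts per category and the most frequent category.
counts = Counter(labels[code] for code in responses)
mode_label = labels[statistics.mode(responses)]

# Computable but not interpretable: the result depends on the arbitrary numbering.
mean_code = statistics.mean(responses)

print(counts)
print("most frequent category:", mode_label)
print("mean of codes (not meaningful):", mean_code)
```

Renumbering the parties would change the mean of the codes but would leave the counts and the most frequent category unchanged.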

Properties
1. Mutually exclusive. An individual or item included in one category must be
excluded from every other category. Example: eye color.

2. Exhaustive. Each person, object, or item must be classified in at least one category.
Example: religious affiliation.

B. Ordinal Level Data: Data may be arranged in some order, but differences between data values cannot be
determined or are meaningless. One category is “higher” or “better” than the next one. Data categories are mutually
exclusive and exhaustive and are ranked according to the particular trait they possess.

Examples: Rating of a student teacher. During a taste test of 4 colas, cola C was ranked number 1, cola
B was ranked number 2, cola A was ranked number 3, and cola D was ranked number 4.

Cola    C     B     A     D
Rank    1st   2nd   3rd   4th

Properties
1. Data classifications are mutually exclusive and exhaustive.

2. Data classifications are ranked or ordered according to the particular trait they possess.
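
Because ordinal categories are ordered, a median category can be found even though distances between categories are undefined. The sketch below reuses the rating frequencies from the grouped-data example; treating Inferior as the lowest and Superior as the highest rating is an assumption about that table.

```python
import statistics

# Ordered categories, lowest to highest (assumed order of the rating table).
order = ["Inferior", "Poor", "Average", "Good", "Superior"]
frequency = {"Superior": 6, "Good": 28, "Average": 25, "Poor": 12, "Inferior": 3}

# Expand to one rank code per rating; the codes encode order only, not distance.
codes = [order.index(label) for label, count in frequency.items() for _ in range(count)]

# median_low always returns an actual data value, so it maps back to a category.
median_rating = order[statistics.median_low(codes)]

print("number of ratings:", len(codes))
print("median rating:", median_rating)
```

The median uses only the ordering of the categories, which is why it suits ordinal data where a mean would not.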

C. Interval Level Data: the next higher level of measurement. It includes all the characteristics of the ordinal
level; in addition, the difference between values is a constant size. There is no natural zero point.

Example: Temperatures on the Celsius scale during a certain winter in Canberra, Australia (12 °C,
0 °C, -5 °C). Note that the zero value is just a point on the scale and does not represent the absence of the
condition.

Properties
1. Data categories are mutually exclusive and exhaustive

2. Data categories are scaled according to the amount of the characteristics they possess.

3. Equal differences in the characteristic are represented by equal differences in the numbers assigned
to the categories.
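
Why differences are meaningful on an interval scale while ratios are not can be sketched by converting two temperatures from Celsius to Fahrenheit. The two readings are hypothetical; the conversion formula is the standard one.

```python
def c_to_f(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


a_c, b_c = 20.0, 10.0   # hypothetical readings in degrees Celsius

# Differences survive the change of scale, up to the constant factor 9/5.
diff_c = a_c - b_c
diff_f = c_to_f(a_c) - c_to_f(b_c)

# Ratios do not survive: "twice as hot" in Celsius is not twice in Fahrenheit.
ratio_c = a_c / b_c
ratio_f = c_to_f(a_c) / c_to_f(b_c)

print("difference:", diff_c, "C ->", diff_f, "F")
print("ratio:", ratio_c, "in C vs", ratio_f, "in F")
```

The ratio changes with the choice of scale because neither scale has a zero that means "no temperature".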

D. Ratio Level Data: the “highest” level of measurement. It has all the characteristics of the interval level; in
addition, the zero (0) point is meaningful and the ratio between two numbers is meaningful.

Examples: money, units of production, weight, income.

Properties
1. Data categories are mutually exclusive and exhaustive

2. Data categories are scaled according to the amount of the characteristics they possess.

3. Equal differences in the characteristic are represented by equal differences in the numbers assigned
to the categories.

4. The point 0 reflects the absence of the characteristic.
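
A meaningful zero is what makes ratios comparable across units. The sketch below converts two hypothetical weights from kilograms to pounds using an approximate conversion factor; the ratio between the weights is the same in both units, and zero stays zero.

```python
KG_TO_LB = 2.20462  # approximate kilograms-to-pounds factor

weights_kg = [30.0, 15.0]                        # hypothetical shipment weights
weights_lb = [w * KG_TO_LB for w in weights_kg]  # same weights in pounds

# The ratio is preserved under a change of unit on a ratio scale.
ratio_kg = weights_kg[0] / weights_kg[1]
ratio_lb = weights_lb[0] / weights_lb[1]

# Zero means "no weight" in either unit.
zero_lb = 0.0 * KG_TO_LB

print("ratio in kg:", ratio_kg)
print("ratio in lb:", round(ratio_lb, 6))
print("zero kg in lb:", zero_lb)
```

Saying that one shipment is twice as heavy as another is therefore a valid statement, whatever unit is used.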
