
BA 302: Chapter-1 Instructions: Introduction & Definitions

By Dr. Kishor Guru-Gharana


Why Statistics? Like mathematics, the tools of Statistics can be universally
applied to any field which involves numbers and values which depend on chance or
have associated probabilities. In particular, Statistics is applied in Business and
Managerial Decision making and Research, Empirical Economics and Finance,
Marketing, Psychology, Medicine, other Social Studies, and Government Decision
making at all levels. This list is only indicative, not exhaustive.
What is Statistics?: Statistics is a science of making sense out of a large set of
numbers. It involves collecting, organizing, analyzing and interpreting collected
quantitative information.
Sometimes the word statistics is also used to mean the data. But the above
definition applies to the body of knowledge and tools for designing samples and
surveys, collecting data, presenting the data in tabular or graphical forms,
summarizing the large data set in terms of key summary measures [1], estimating
relationships among variables, conducting tests of Hypotheses or assumptions
about the Population from which the samples are selected, drawing conclusions
and making forecasts or predictions based on some given or assumed values of the
predictor variables.
Types of Statistics: In fact the better title would be branches of Statistics. There
are two types (or branches) of Statistics. Descriptive Statistics includes all the
tasks listed above up to Summarizing. Inferential Statistics involves estimation of
relationships, Tests of Hypotheses about population characteristics based on
Sample results and making Forecasts.
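
As a minimal sketch of the Summarizing task in Descriptive Statistics, the
following computes three common summary measures with Python's standard
`statistics` module. The data set (`sales`) is hypothetical and is used only
for illustration.

```python
import statistics

# Hypothetical sample: monthly sales (in units) for eight stores.
sales = [12, 15, 11, 19, 15, 22, 14, 12]

sample_mean = statistics.mean(sales)          # central tendency
sample_variance = statistics.variance(sales)  # sample variance (divides by n - 1)
sample_std_dev = statistics.stdev(sales)      # square root of the sample variance

print(f"mean = {sample_mean}")
print(f"variance = {sample_variance:.2f}")
print(f"standard deviation = {sample_std_dev:.2f}")
```

Each of these values is a summary measure computed from sample data. Using such
values to draw conclusions about the population is the task of Inferential
Statistics.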
Population: The totality of all items in a study is called the population. It does not
have to be people only. The word population is used for the entire set of
individuals or objects of interest. It is also used to imply the entire set of values or
measurements of some characteristics for the individuals or objects in the
population.

[1] Such a measure is also called a sample statistic, such as the Mean or the Standard Deviation or the Variance, and to
make matters more difficult the word Statistics is also used to mean a plural of the term Statistic, that is, a
collection of such values estimated from the sample data.

Census: If all elements in a population are examined we call it a Census method.
Sample: If some elements or items are selected for study from the population, we
call it a sample method. So a Sample is a selected subset of items from the
universal set called the population. A value included in the sample is also
sometimes called an observation.
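
The difference between a Census and a Sample can be sketched in a few lines of
Python. The population below (customer ages), its size, and the sample size of
50 are all assumptions made for illustration.

```python
import random
import statistics

random.seed(302)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of all 1,000 customers of a small firm.
population = [random.randint(18, 80) for _ in range(1000)]

# Census: every element of the population is examined.
census_mean = statistics.mean(population)

# Sample: only a selected subset (here 50 observations) is examined.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"census mean = {census_mean:.2f}")
print(f"sample mean = {sample_mean:.2f}")
```

The sample mean will generally be close to, but not exactly equal to, the census
mean.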
Why Sample?: There are many reasons why we study a sample instead of the
whole population. Budget and time constraints are generally the main reasons for
conducting a sample study instead of a Census Study because collecting
information about the entire population may not be feasible under the given budget
and time. Another reason is the destructive nature of the investigation. If a wine
tester were to taste all the wine produced there would be nothing left to sell.
Similarly, tests of strength in production, tasting of cooked food, crash testing
of cars, etc., cannot be done on the whole population. In some cases the
population may be infinite (or practically infinite) and a census is simply not
possible. For example, it is not possible to collect all the mosquitoes in the world to
study their behavior, or collect all the fish in the sea to conduct a study on them.
Sometimes a sample is preferred to a census even if the latter is feasible, because it is
easier to control the measurement error for the smaller subset of items, or to hire
a sufficient number of experts for the study with a sample than with the large
population.
Types of Random Sample: A sample in which the items are selected with some
specified probability is called a random sample. If each item in the population has
an equal probability of being selected, then it is called a Simple Random Sample. If a
systematic pattern is introduced into random sampling, it is referred to as
systematic (random) sampling. If the selected item(s) are put back before the next
drawing, it is called sampling with replacement, otherwise without replacement. If
the population is divided into mutually exclusive (no overlapping), exhaustive (no
items in population left out) and homogeneous (items with similar characteristics
grouped together) subsets before conducting independent random sampling from
each subset (called a stratum), then we have a stratified random sample.

Sometimes, when intrinsic or natural groupings are evident in a statistical
population, the total population is divided into these groups (or clusters) and a
sample of the groups is selected. This leads to cluster sampling. Sometimes,
different sampling methods are combined, such as stratified cluster sampling, or
sampling is done in several stages (multistage stratified sampling). These various
methods are designed to capture known characteristics of the population and come
up with a more representative sample than a simple random sample. In this
course, however, all the formulas are based on simple random samples.
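
The sampling methods described above can be sketched with Python's standard
`random` module. This is an illustration under stated assumptions, not a
survey-grade procedure: the population of 100 labeled items, the two strata
(`low` and `high`), the cluster size of 10, and all sample sizes are
hypothetical.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

population = list(range(1, 101))  # hypothetical items labeled 1..100

# Simple random sample without replacement: no item can be drawn twice.
srs_without = random.sample(population, 10)

# Simple random sample with replacement: each drawn item is put back,
# so the same item may appear more than once.
srs_with = random.choices(population, k=10)

# Systematic sampling: pick a random start, then take every step-th item.
step = len(population) // 10
start = random.randrange(step)
systematic = population[start::step]

# Stratified sampling: split the population into mutually exclusive,
# exhaustive strata, then sample independently within each stratum.
strata = {
    "low": population[:50],   # hypothetical stratum: items 1..50
    "high": population[50:],  # hypothetical stratum: items 51..100
}
stratified = {name: random.sample(items, 5) for name, items in strata.items()}

# Cluster sampling: divide the population into groups, select some groups
# at random, and (in this one-stage version) keep every item in them.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
chosen_clusters = random.sample(clusters, 2)
cluster_sample = [item for cluster in chosen_clusters for item in cluster]

print("without replacement:", srs_without)
print("with replacement:   ", srs_with)
print("systematic:         ", systematic)
print("stratified:         ", stratified)
print("cluster:            ", cluster_sample)
```

Note that stratified sampling draws from every stratum, whereas cluster sampling
draws only from the selected clusters.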
Variable: Any entity which can assume different values is called a variable.
Measurement: A particular value of a variable is a measurement of some
characteristic of an individual or object under study. For example, it could be the
weight, volume or length of an item, or the income, expenditure or age, etc., of an
individual. It could even be an attribute (called a qualitative variable) such as
Gender, Race, or Color.
Four Levels of Measurement:
i. Nominal Scale: When an entity under study is categorized into two or more
possible groups or categories, the resulting measurement is called a Nominal
scale measurement. Examples are a person's Gender, Race, Color, Party
affiliation, Region, the Make of an automobile, etc. With Nominal scale values we
can only count the number of items (called the frequency of observation) falling
in each category, but we cannot perform the mathematical operations of
addition, subtraction, multiplication, division, etc., on the values themselves.
Moreover, we cannot even do a meaningful ranking of the different
categories. For example, there is no meaningful ranking of the Male/Female
categories.

ii. Ordinal Scale: When the categories can be meaningfully ranked in some
way, we call the resulting measurements Ordinal scale measurements.
Examples are the satisfaction levels of consumers (highly satisfied, satisfied,
neutral, not satisfied, highly dissatisfied, etc.), educational categories
(Undergraduate, Graduate, Post-graduate, etc.), letter grades (A, B, C, D,
F), and so on. Although ranking can be done, the above-mentioned
mathematical operations still cannot be performed directly on the Ordinal
Scale values. For example, the sum of highly satisfied and satisfied does
not make sense. Sometimes people attach numerical values to such
Ordinal variables so that some mathematical operations can be done, but this is
still quite arbitrary. For example, you could give highly satisfied the value 5,
satisfied the value 4, neutral the value 3, not satisfied the value 2 and highly
dissatisfied the value 1. But you could also give those five categories the values 2,
1, 0, -1, and -2, respectively. This is all arbitrary. Therefore, rigorously
speaking, Ordinal values can only be ranked but not added, subtracted,
multiplied, or expressed as ratios.

Together the Nominal scale and Ordinal scale variables are called Qualitative
Variables.
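
The two qualitative scales can be sketched as follows. The data (`makes`,
`responses`) are hypothetical; the two numeric codings are the ones mentioned
above (5 down to 1, and 2 down to -2). The sketch shows that Nominal values
support only frequency counts, and that an average of Ordinal codes depends on
the arbitrary coding chosen, while a rank-based summary such as the median
category does not.

```python
from collections import Counter
import statistics

# Nominal scale: we can only count how many items fall in each category.
makes = ["Ford", "Toyota", "Ford", "Honda", "Toyota", "Ford"]
frequency = Counter(makes)
print(frequency["Ford"], frequency["Toyota"], frequency["Honda"])  # 3 2 1

# Ordinal scale: categories listed from lowest to highest rank.
levels = [
    "highly dissatisfied",
    "not satisfied",
    "neutral",
    "satisfied",
    "highly satisfied",
]
responses = ["satisfied", "neutral", "highly satisfied", "satisfied", "not satisfied"]

# Two equally arbitrary numeric codings of the same five categories.
coding_a = {level: value for value, level in enumerate(levels, start=1)}   # 1..5
coding_b = {level: value for value, level in enumerate(levels, start=-2)}  # -2..2

mean_a = statistics.mean(coding_a[r] for r in responses)
mean_b = statistics.mean(coding_b[r] for r in responses)
print(mean_a, mean_b)  # the "average" changes with the coding

# A rank-based summary uses only the ordering, not the numeric codes.
# (With an odd number of responses the middle one is the median.)
ranked = sorted(responses, key=levels.index)
median_category = ranked[len(ranked) // 2]
print(median_category)  # satisfied
```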
iii. Interval Scale: If, in addition to ranking, the differences between values can be
meaningfully interpreted, then we call the resulting measurements Interval
scale values or measurements. Temperature is measured at the interval scale level.
The difference between 40° F and 60° F is the same as that between 60° F and
80° F. Temperatures can be ranked, and differences or intervals make sense.
But ratios of two temperatures still do not make sense. For example, we
cannot say that 100° F is twice as hot as 50° F, because the former is really hot
while the latter is quite cold. The reason we cannot take ratios of interval scale
values is that they lack an inherently defined zero or origin. The zero of Celsius
is 32° Fahrenheit and 100° C is 212° F. This is all arbitrary. The same is true
of SAT and GMAT scores, and credit scores.

iv. Ratio Scale: This is the highest level of measurement. In addition to the
properties of ranking and meaningful differences, these values also allow
ratios. The reason is that such a scale possesses an inherent zero (or origin) and
a well-defined unit of measurement. Most of the variables we deal with are ratio
scale variables such as height, weight, length, time, income, expenditure,
number of units produced, etc.

Together Interval scale and Ratio scale variables (or values or measurements)
are called Quantitative variables.
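
The difference between the Interval and Ratio scales can be checked
numerically. The sketch below assumes temperatures of 100 and 50 degrees
Fahrenheit and hypothetical weights of 80 and 40 kilograms. It uses the standard
Fahrenheit-to-Celsius conversion and an approximate kilogram-to-pound factor.
A ratio is meaningful only if it does not change when the unit of measurement
changes.

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Interval scale: equal differences stay equal in either unit ...
diff_low_c = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(40)
diff_high_c = fahrenheit_to_celsius(80) - fahrenheit_to_celsius(60)
print(abs(diff_low_c - diff_high_c) < 1e-9)  # True

# ... but ratios change with the unit, because the zero point is arbitrary.
hot_f, cold_f = 100.0, 50.0
ratio_f = hot_f / cold_f
ratio_c = fahrenheit_to_celsius(hot_f) / fahrenheit_to_celsius(cold_f)
print(ratio_f)            # 2.0
print(round(ratio_c, 2))  # 3.78, so "twice as hot" is not meaningful

# Ratio scale: weight has an inherent zero, so ratios survive a unit change.
KG_TO_LB = 2.20462  # approximate conversion factor
heavy_kg, light_kg = 80.0, 40.0
ratio_kg = heavy_kg / light_kg
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)
print(abs(ratio_kg - ratio_lb) < 1e-9)  # True: twice as heavy in either unit
```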

Two types of Quantitative Variables: Quantitative variables are further classified
into two categories, namely Discrete and Continuous, as defined below:
Discrete Variables: Discrete variables arise as a result of counting. They have
holes (or gaps or jumps or discontinuities) between values. Examples are the
number of students in a class or a university, the number of patients in a hospital,
the number of cars produced and sold, the number of accidents, etc. It is easy to
understand that there is a gap between 1 and 2, between 2 and 3, and so on.
Continuous Variables: Continuous variables can assume any real value (integer,
fraction, decimal, etc.) within some specified limits. There are no gaps or
discontinuities between values. Theoretically the decimal places could go to any
number of digits. Such variables arise as a result of measurement instead of
counting. Examples are weight, height, temperature, length, volume, distance,
area, time, income, expenditure, prices, taxes, etc.
