CHAPTER 1: INTRODUCTION TO STATISTICS STA404

CHAPTER ONE
INTRODUCTION TO STATISTICS

WHAT IS STATISTICS?

Statistics is the science of collecting, organizing, presenting, analyzing and interpreting data. Based on the analyzed
data, conclusions can be drawn about the characteristics of the population and decisions can be made for future action. The
steps of statistical analysis involve collecting information, evaluating it, and drawing conclusions. The information might
be:

• The number of men and women who have cancer
• The velocity of a burning gas on the sun’s surface

Statisticians provide crucial guidance in determining what information is reliable and which predictions can be trusted.
They often help search for clues to the solution of a scientific mystery, and sometimes keep investigators from being
misled by false impressions. Statisticians work in a variety of fields, including medicine, government, education,
agriculture, business, law and finance.

USES OF STATISTICS

Statistics are invaluable to economists, administrators, analysts, managers and even politicians. They can be used for
business, administrative or research purposes.
Statistics help policy-makers and decision-makers to monitor the performance of the various sectors of the economy and, by
studying their trends, assist the government in planning the economy.
Managers use statistics to study past trends and patterns, to decide on future actions and to develop methods for
overcoming possible future problems.

THE ROLE OF STATISTICS

There are three important reasons to study statistics:


1. Being Informed
2. Understanding and making decisions.
3. Evaluating decisions that affect your life.

TYPES OF STATISTICS

Statistics can be divided into two categories:

1. Descriptive Statistics

This kind of statistics deals with developing and using techniques for the careful collection and effective
presentation of data. The aim is to study the characteristics of the data. The study therefore mainly
involves the collection, organization, presentation and description of numerical information.

Collecting and organizing the data are the basic steps of statistics before any presentation can be
done. Data can be presented using graphs or charts. Interpreting the graphs and charts helps us
evaluate the information presented and describe the characteristics of the collected data.

2. Inferential Statistics

This kind of statistics deals with the tools and techniques that are used to analyze the data and to
make predictions, estimates or decisions by drawing conclusions from the data. Inferential statistics is used to
determine how far a decision about any information is true and acceptable. It is also used to estimate or draw
inferences about the attitudes or characteristics of the whole population based on a sample. It encompasses all
types of decision. In principle, it enables an optimum decision to be made for any problem, especially if it relates
to the following:

i) Determining whether any apparent characteristics of a situation are genuine or merely the result of random
happenings.
ii) Assessing the magnitude of a numerical quantity and determining the reliability of that assessment.
iii) Interpreting past patterns of variation to predict future happenings.

BASIC STATISTICAL TERMS

Like all professions, statistics has its own keywords and phrases to make precise communication easier. However,
one must interpret the results of any analysis in language that is easy for the decision-maker to understand.
Otherwise, he or she will not believe in what you recommend, and the recommendation will not reach the implementation phase.
This lack of communication between statisticians and managers is the major barrier to using statistics.

Term            Meaning
Element         An element is an object on which a measurement is taken.
Population      A population is a collection of elements of interest, or the measurements obtained
                from all individuals or objects of interest.
Sample          A sample is a portion or subset of the total group or population of interest.
Census          A census is a study of the entire population.
Sample Survey   A sample survey is a study of a sample, that is, of only part of the population.
Parameter       A parameter is a numerical descriptive measure of the population. Parameters are
                used to represent certain population characteristics.
                For example, the population mean µ is a parameter that is often used to indicate
                the average value of a quantity.
Statistic       A statistic is a numerical descriptive measure taken from a sample. It is used to give
                information about unknown values in the corresponding population.
                For example, the average of the data in a sample is used to give information about
                the overall average in the population from which that sample was drawn.
Variable        A variable is a characteristic of the population under study that may take
                different values, such as weight or gender, since these differ from individual to
                individual.
Data            Data are observations or information that have been recorded or collected.
Random          Random selection is the choice of a single item from a group in which every item has the
                same chance of being selected as any other item.
Pilot Study     A pilot, or feasibility, study is a small experiment designed to test logistics and gather
                information prior to a larger study, in order to improve its quality and efficiency. The
                pilot study provides vital information on the severity of proposed procedures.

Figure: a set of all items (population) and a set of items selected from the population (sample). Hence, the sample is a
subset of the population.
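
The difference between a parameter and a statistic can be illustrated with a short Python sketch. The data here are hypothetical (randomly generated satisfaction scores for 60 service centres, an assumption made only for illustration): the population mean µ is computed from every element, while the sample mean is computed from a sample of six elements drawn from that population.

```python
import random
import statistics

# Hypothetical population: one satisfaction score (1 to 10) per service centre.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.randint(1, 10) for _ in range(60)]

# Parameter: a numerical measure computed from the whole population.
mu = statistics.mean(population)

# Statistic: the same measure computed from a sample (a subset of the population).
sample = rng.sample(population, 6)
x_bar = statistics.mean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The two values will usually differ a little; the sample mean is used to give information about the unknown population mean.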

Example 1

In the automobile industry, customer service is a crucial factor affecting car sales. The management of a reputable
automobile company is interested in determining the level of customer satisfaction with the service provided by the
company’s service centers. The company has 60 service centers altogether in the Klang Valley. Six service centers were
selected for the study.

a) State the objective of the study.


b) State the population and sample for this study.

Solution:

Example 2

The local cable television company is planning to add one channel to its basic services. There are five channels to
choose from, and the company would like to have some input from 2,000 subscribers. There are about 20,000
subscribers in Malaysia, and the company knows that 35% of the subscribers are college students, 45% are white-collar
workers, 15% are blue-collar workers and 5% are others.

a) State the population of the study.


b) State the sampling unit for this study.

Solution:

TYPES OF VARIABLE

A variable is a measurable factor, characteristic or attribute of an individual or a system; in other words, something that
might be expected to vary over time or from individual to individual. For example, the variable of interest may be
absenteeism among students, the household income of Malaysian citizens or the sales of cars.

Variable

Qualitative
• Measured with non-numerical scale
• Yields categorical response
• Example: gender, type of car, religion
• Example: Are you a Malaysian? The answer is only 'Yes' or 'No'.

Quantitative
• Measured on numerical scale
• Yields numerical response

  Discrete
  • Numerical response which arises from a counting process
  • Example: How many children do you have?

  Continuous
  • Numerical response which arises from a measuring process
  • Example: How tall are you? What is your weight?

LEVEL / SCALE OF MEASUREMENT

The level of measurement of a variable in mathematics and statistics is a classification that is used to describe the
nature of information contained within numbers assigned to objects and therefore within the variable. Level of
measurement of the data is an important factor in determining which procedure to use. Four level of measurement:
nominal, ordinal, interval and ratio.

NOMINAL SCALE

Variables that are measured only nominally are also called categorical variables. In this type of measurement, names
are assigned to objects as labels. The data cannot be arranged in an ordering scheme (from low to high). The nominal
scale is the lowest level of measurement. Variables measured at a nominal level include gender,
marital status, race, religious affiliation, college major and birthplace. Other examples include geographical location
in a country, telephone access code and model of car.

Example:

1. What is your gender?

   Male    Female

2. Where did you get information about the iPhone?

   Newspapers
   Magazines
   Television
   Internet
   Other (please specify): ____________________

ORDINAL SCALE

The numbers are called ordinals when the numbers assigned to objects represent the rank order (1st, 2nd, 3rd, etc.) of
the entities measured. Comparisons of greater and less can be made, in addition to equality and inequality. However,
operations such as conventional addition and subtraction are still meaningless. The ordinal scale is one level higher than
the nominal scale.

Example:

1. How satisfied are you with this book?

   Very satisfied    Satisfied    Neutral    Not satisfied    Very not satisfied

2. Your highest completed level of education:

   1. SPM
   2. Diploma
   3. Bachelor
   4. Master
   5. PhD

INTERVAL SCALE

The interval scale is like the ordinal scale, with the additional property that the difference between two data values is
meaningful. Data at this level do not have a natural zero starting point. An example of an interval scale is temperature:
0°F does not mean “no heat”, and 40°F is not twice as hot as 20°F.
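
The temperature example can be checked numerically. The sketch below is an illustration only: it converts the two Fahrenheit readings to kelvin, a temperature scale that does have an absolute zero, and compares the ratios. The function name `fahrenheit_to_kelvin` is a helper introduced here, not part of the original notes.

```python
def fahrenheit_to_kelvin(f):
    """Convert a Fahrenheit reading to kelvin, a scale with an absolute zero."""
    return (f - 32) * 5 / 9 + 273.15

low_f, high_f = 20.0, 40.0

# Differences are meaningful on an interval scale.
difference_f = high_f - low_f  # 20 degrees Fahrenheit

# Ratios are not: the result depends on where the scale places its zero.
ratio_f = high_f / low_f  # 2.0, but only because of where 0°F happens to sit
ratio_k = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print(f"ratio of Fahrenheit readings: {ratio_f:.2f}")
print(f"ratio of the same temperatures in kelvin: {ratio_k:.2f}")
```

Measured from absolute zero, the warmer temperature is only about 4% higher than the cooler one, which is why "twice as hot" is not a valid reading of 40°F against 20°F.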

RATIO SCALE

The ratio scale is the strongest scale of measurement. It has a meaningful zero (absolute zero point), which
represents the absence of the phenomenon being measured. Examples of ratio measurements are the time spent studying per
day, the monthly amount spent on prepaid top-ups and the number of cigarettes smoked per day.

Example:

1. How many siblings do you have?

2. The monthly salary of a factory worker.
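
The four levels form a hierarchy in which each level permits every comparison allowed at the levels below it. A minimal Python sketch of that idea follows; the names `Level`, `REQUIRED_LEVEL` and `meaningful_operations` are hypothetical and were chosen only for this illustration.

```python
from enum import IntEnum

class Level(IntEnum):
    """Levels of measurement, ordered from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Lowest level at which each kind of comparison becomes meaningful.
REQUIRED_LEVEL = {
    "equal / not equal": Level.NOMINAL,
    "greater / less": Level.ORDINAL,
    "difference": Level.INTERVAL,
    "ratio": Level.RATIO,
}

def meaningful_operations(level):
    """Return the comparisons that make sense for data at the given level."""
    return [op for op, needed in REQUIRED_LEVEL.items() if level >= needed]

print(meaningful_operations(Level.ORDINAL))
# prints ['equal / not equal', 'greater / less']
```

For instance, satisfaction ratings (ordinal) can be ranked but not subtracted, while monthly salary (ratio) supports all four comparisons.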

Example 3

Identify the type of variable and level of measurement for each statement.

i) Identity number
ii) Social class of resident
iii) Laptop price
iv) Amount of time spent shopping every week
v) Favourite shopping spot
vi) Number of UiTM students
vii) Brand of hand phone most preferred

Solution:

SAMPLING METHODS

Sampling is an effort to estimate the characteristics of a population by studying a small portion of the items in the
population. A sampling technique is the process of selecting a sample that represents the population as accurately as
possible. The sampling frame is a list of all items in the population. The frame must be complete, that is, no item of
the population should be left out. It should not be defective because it is out of date or contains inaccurate or
duplicate items, nor inadequate because it fails to cover all the categories required in the investigation. Examples of
sampling frames include telephone directories, town maps and other lists.

Types of Sampling Technique

There are two types of sampling technique:


1. The probability sampling
2. The non-probability sampling

1. The Probability Sampling

Probability sampling techniques are used when a researcher plans to make inferences about the population. The sample
is selected based on known probabilities.

a) A simple random sample is selected from the population in such a way that each item has the same chance of
being selected, by using a chance method or random numbers.

b) In systematic sampling, we divide the population size (N) by the sample size (n) to obtain the sampling interval k
(k = N/n). An element is then randomly selected from the first k elements in the list. Suppose the rth element is
selected; then every kth element in the population is sampled, beginning with the rth element. This means that
the elements chosen are the rth, (r+k)th, (r+2k)th, (r+3k)th, and so on, until the sample size n is obtained.

c) Researchers select stratified samples by dividing the population into groups according to some characteristic
that is important to the study, then sampling from each group.

d) Researchers select cluster samples by using intact groups called clusters. Cluster sampling is used when the
population is large.
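
The first three probability sampling techniques can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive procedure: the population is a hypothetical list of 100 item IDs, the function names are invented for this example, the strata reuse the percentages from Example 2, and the stratified version assumes proportional allocation (the text above only says that a sample is taken from each group).

```python
import random

def simple_random_sample(population, n, rng):
    """Every item has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start among the first k items, then take every k-th item.

    Assumes n is not larger than the population size.
    """
    k = len(population) // n   # sampling interval k = N / n
    r = rng.randrange(k)       # 0-based position of the randomly chosen start
    return [population[r + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    """Sample from each group in proportion to its share of the population.

    Assumption: proportional allocation. Rounding means the group sizes
    may not always add up to exactly n.
    """
    total = sum(len(group) for group in strata.values())
    return {
        name: rng.sample(group, round(n * len(group) / total))
        for name, group in strata.items()
    }

rng = random.Random(0)              # fixed seed for a reproducible sketch
population = list(range(1, 101))    # hypothetical item IDs 1..100

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)

# Hypothetical strata using the 35% / 45% / 15% / 5% split from Example 2.
strata = {
    "college students": population[:35],
    "white-collar": population[35:80],
    "blue-collar": population[80:95],
    "others": population[95:],
}
strat = stratified_sample(strata, 20, rng)
print({name: len(items) for name, items in strat.items()})
```

With N = 100 and n = 10 the sampling interval is k = 10, so consecutive items in the systematic sample are always 10 positions apart.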

2. The Non-probability Sampling

Non-probability Sampling techniques are used when generalization concerning the population is not required or
when sampling frames are difficult to obtain.

a) Convenience sampling is also referred to as accidental sampling. It is not normally representative of the target
population because sample units are selected only if they can be accessed easily and conveniently. Basically,
respondents are selected because they happen to be in the right place at the right time.

b) The judgmental sampling technique is used when a sample is taken based on certain judgments about the overall
population. Judgmental sampling is subject to the researcher’s biases and is perhaps even more biased than
convenience sampling.

c) Snowball sampling is a method in which a researcher identifies one member of some population of interest,
speaks to that person, and then asks him or her to identify others in the population that the researcher might speak to.
Each of these people is then asked to refer the researcher to yet another person, and so on.

d) The quota sampling technique divides the sample into quotas. The quotas indicate the number of people to be
interviewed, but the choice of the actual respondents is left to the interviewers.

TYPES OF DATA

1. PRIMARY DATA

Primary data are data collected expressly for a specific purpose. The data are collected at first hand, in response
to specific questions and to satisfy the specific purpose of a statistical inquiry, and have not yet been analyzed.
The data are collected by the investigators themselves. Once the primary data have been collected, processed,
analyzed and published, the published data set becomes secondary data.
For example, the Ministry of Education is interested in knowing the attitudes of students towards studying. To
get such information, the ministry will form a body to ask students about the matter. Suitable
questions are designed to fulfil the requirements. These questions will be answered by students, and the vocabulary
used should suit the level of the people answering the questions, to avoid misunderstandings.

ADVANTAGES
i) The specific information wanted is obtained, and the investigator is aware of any limitations the data
may contain because he or she knows the conditions under which they were collected.
ii) Secondary data may introduce reproduction errors, so it is better to use primary data.
iii) Data gathered are more accurate and consistent with the objective of the research.

DISADVANTAGES
i) Inconvenient: requires more time, effort, manpower and money.

2. SECONDARY DATA

Secondary data consist of figures and information that were originally collected to satisfy a particular
inquiry but are now being used, at second hand, as the basis for a different inquiry by another person. In other
words, secondary data are data taken from another investigator’s collection of figures. Often such data were
collected for some other purpose. Users of secondary data cannot have as thorough an understanding of
the background as the original investigators, and thus may not be aware of the data’s limitations. Such data must be
used with great care because they may not give the exact kind of information wanted and may not be in the most
suitable form.

ADVANTAGES
i) More convenient (requires less time, effort and money).
ii) Data help you decide what further research needs to be done.

DISADVANTAGES
i) Transcription errors.
ii) May not meet our specific needs and the objectives of the current research.
iii) Not all data are readily available, or they may be expensive.
iv) Accuracy is questionable (lack of accuracy).

DATA COLLECTION METHODS

Data collection is important because the analysis and conclusions rely on it. The validity of the analysis
depends on the content of the data and on how the data are collected. Three important aspects of choosing the right
method of collecting data or doing research are:
i) How to choose a respondent
ii) How to contact the selected respondent
iii) What information is needed from the respondent

There are a few methods of getting information from a sample. The best method for a given set of conditions depends on:

i) The budget available for the research, especially the amount allocated for field work.
ii) The time allocated for the research. This is important so that the research can be finished on time.
iii) The accuracy of the results needed.
iv) The distribution of the sample. A suitable method is needed to handle a sample that is widely distributed,
in order to save expenses.

The common methods of data collection are as follows:

a) Face-to-face interview
b) Telephone interview
c) Direct questionnaire (questionnaires are distributed and collected personally)
d) Mail or postal questionnaire (questionnaires are sent and received back through the post)
e) Direct observation (respondents are observed and data recorded)
f) Other methods (email, recording, colorimeter, spectrophotometer, etc.)

BASIC TYPES OF STATISTICAL STUDIES

In an observational study, the researcher merely observes what is happening or what has happened in the past and
tries to draw conclusions based on these observations.

In an experimental study, the researcher manipulates one of the variables and tries to determine how the manipulation
influences other variables.

Review Exercises 1

1. State whether each of the following is a quantitative or a qualitative variable.

a) Time taken to finish an examination
b) Brands of skirt bought by customers
c) Mass of durians bought by Pak Ali
d) Quantity of petrol sold by petrol stations in Kuala Pilah
e) Number of houses in Seremban
f) Number of students who scored 9As in the SPM examination in 2011
g) Type of cars driven by students in UiTM Negeri Sembilan
h) Gender of students in a class

2. State whether each of the following gives numerical or categorical data.

a) Measuring in litres the petrol tank capacity of lorries
b) Counting the number of buses that pass through a toll-booth
c) Respondents asked if they prefer to have blue car, black car or green car
d) T-shirt classified into large, medium or small
e) Income of lecturers in UiTM
f) Types of shoes used by students in a school

3. State whether each of the following statements is true or false

a) A study of statistics can be divided into two sections: qualitative and quantitative methods
b) Inferential statistics consists of methods dealing with collection, tabulation, summarization and
presentation of data
c) A population is a collection of individuals that the researcher wishes to study.
d) The Cumulative Grade Point Average (CGPA) score of a student is a qualitative variable.
e) When constructing questionnaires, a researcher must ensure that the questions are related to and
satisfy the objectives of the research.

4. Define the following statistical terms:

a) Population
b) Sample
c) Census study
d) Secondary data
e) Inferential Statistics
f) Discrete random variable

5. Classify each variable as discrete or continuous


a) Number of doughnuts sold each day by Big Apple
b) Water temperature of swimming pools in Hotel A on a given day
c) Weights of dogs in a pet shelter
d) Lifetimes (in hours) of batteries XXX
e) Number of beef burgers sold each day by stall B
f) Number of DVDs rented each day by a video store
g) Capacity (in gallons) of milk sold in a certain day at a grocery store
h) The time taken by students to complete their assignments
i) Number of cups of tea served at a restaurant YY
