INTRODUCTION TO STATISTICS
The word statistics derives from the classical Latin word status, which means
state.
As potential users of statistics, we need to master both the science and the
art of using statistical methodology correctly.
Specific definition:
Statistics is a collection of procedures and principles for
gathering data and analyzing information to help people
make decisions when faced with uncertainty.
Nowadays statistics is used in almost all fields of human endeavor, such as:
Education
Agriculture
Business
Health
Examples of applications of statistics
Sports
A statistician may keep records of the number of hits a baseball player gets in
a season.
Finance
A financial advisor uses statistical information to make reliable predictions
about investments.
Public Health
An administrator would be concerned with the number of residents who
contract a new strain of flu virus during a certain year.
Others
Any Idea?..
2. Applied Statistics
Involves the application of those theorems, formulas, rules and laws to
solve real-world problems.
Applied statistics can be divided into two main areas, depending on how the
data are used. The two main areas are:
Descriptive statistics
Inferential statistics
ASPECTS OF STATISTICS
Theoretical/Mathematical Statistics
Deals with the development, derivation and proof of statistical theorems,
formulas, rules and laws.
Applied Statistics
Involves the application of those theorems, formulas, rules and laws to solve
real-world problems.
Descriptive Statistics
Consists of methods for collecting, organizing, displaying and summarizing
data.
Inferential Statistics
Consists of methods that use results obtained from a sample to make decisions
or draw conclusions about a population.
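The difference between the two areas can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 1,000 student heights is simulated, and the names population, sample and sample_mean are made up for the example.

```python
import random
from statistics import mean

# Hypothetical population: heights (in cm) of 1,000 students.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.randint(150, 190) for _ in range(1000)]

# Draw a simple random sample of 25 students.
sample = rng.sample(population, 25)

# Descriptive statistics: summarize the data actually collected.
sample_mean = mean(sample)
print(f"Sample of {len(sample)}: mean height = {sample_mean:.1f} cm")

# Inferential statistics: use the sample result to draw a conclusion
# about the whole population, whose mean is normally unknown.
population_mean = mean(population)  # known here only because it was simulated
print(f"Estimate of the population mean: {sample_mean:.1f} cm")
print(f"Actual population mean:          {population_mean:.1f} cm")
```

In practice the last line is not available; the sample mean is all we have, which is why inferential methods are needed to judge how far it can be trusted.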
Example 1
Determine which of the following statements is descriptive in nature and which
is inferential.
a. Of all U.S. kindergarten teachers, 32% say that knowing the alphabet is an
essential skill.
Chapter 1: Introduction to Statistics
b. Of the 800 U.S. kindergarten teachers polled, 32% say that knowing the
alphabet is an essential skill.
[Figure: population, sample, statistic, parameter and inference]
Population
A collection of all individuals.
Sample
A subset of the population.
Statistic
A numerical value summarizing the sample data.
Letters of the English alphabet are used to symbolize statistics:
x̄ - Average/Mean
s - Standard deviation
e.g. The average height, found by using the set of 25 heights.
Variable
A characteristic of interest about each individual element of a population
or sample.
e.g. : A student's age at entrance into college, the color of a student's hair.
Data value
The value of the variable associated with one element of a population or
sample. This value may be a number, a word, or a symbol.
e.g. : Farah entered college at age 23; her hair is brown.
Data
The set of values collected from the variable from each of the elements
that belong to the sample.
e.g. : The set of 25 heights collected from 25 students.
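The terms variable, data value, data and statistic can be mapped onto a small data set. The sketch below is illustrative only: Farah's age and hair color come from the example above, while the other three students, the field names and the variable names are assumptions made up for the illustration.

```python
from statistics import mean, stdev

# Each dict is one element of the sample; each key is a variable.
# Only Farah's values come from the notes; the other students are hypothetical.
students = [
    {"name": "Farah", "age_at_entrance": 23, "hair_color": "brown"},
    {"name": "Student B", "age_at_entrance": 19, "hair_color": "black"},
    {"name": "Student C", "age_at_entrance": 20, "hair_color": "brown"},
    {"name": "Student D", "age_at_entrance": 18, "hair_color": "blond"},
]

# Data value: the value of one variable for one element.
farah_age = students[0]["age_at_entrance"]

# Data: the set of values of one variable across the whole sample.
ages = [student["age_at_entrance"] for student in students]

# Statistics: numerical values summarizing the sample data.
x_bar = mean(ages)   # sample mean
s = stdev(ages)      # sample standard deviation (divides by n - 1)

print(f"data value = {farah_age}, data = {ages}")
print(f"mean = {x_bar}, standard deviation = {s:.2f}")
```

The names x_bar and s follow the usual convention of writing the sample mean as x̄ and the sample standard deviation as s.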
iii) Variable
iv) Data value : one data value is the ringgit value of a particular car. Ali's
car, for example, is valued at RM 45 000.
v) Data
vi) Parameter :
vii) Statistic
Continuous Variables
EXERCISE 1
1. Of the adult U.S. population, 36% has an allergy. A sample of 1200 randomly
selected adults resulted in 33.2% reporting an allergy.
a.
b.
What is the sample?
c.
d.
e.
2. The faculty members at Universiti Utara Malaysia were surveyed on the question
"How satisfied were you with this semester's schedule?" Their responses were to be
categorized as "very satisfied," "somewhat satisfied," "neither satisfied nor
dissatisfied," "somewhat dissatisfied," or "very dissatisfied."
a.
b.
c.
d.
b.
c.
d.
Data can be collected from a survey or an experiment.
Types of Data
Secondary data
Advantages:
Lower cost.
Saves time and energy.
Disadvantages:
Obsolete information.
Data accuracy is not confirmed.
Primary data
Advantages:
Precise answers.
Appropriate for research that requires huge data collection.
Increases the number of answered questions.
Disadvantages:
Expensive.
Interviewer might influence respondents' responses.
Respondents may refuse to answer sensitive or personal questions.
2. Telephone interview
Advantages:
Quick.
Less costly.
Wider respondent coverage.
Disadvantages:
Limited interview duration.
Demonstrations cannot be performed.
Calls may go unanswered.
3. Postal questionnaire
A set of questions used to obtain information related to the study being
conducted.
Questionnaires are posted to every respondent.
Advantages:
Wider respondent coverage.
Respondents have enough time to answer questions.
Interviewer influence can be avoided.
Lower cost.
Disadvantages:
One-way interaction.
Low response rate.
Any Idea?.......
Another technique to collect primary data
is observation. List the advantages and
disadvantages of this technique.
1.3.2.1 Scales of Measurement
Levels of Measurement
Nominal - categories only
Ordinal - categories with some order
Interval - differences but no natural starting point
Ratio - differences and a natural starting point
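The four levels differ in which operations give meaningful results. The following Python sketch illustrates this with made-up values; the lists, the temperatures and the variable names are all assumptions chosen for the illustration, not data from the notes.

```python
from statistics import mode

# Nominal: categories only, so counting and the mode are meaningful.
hair_colors = ["brown", "black", "brown", "blond"]
most_common_color = mode(hair_colors)

# Ordinal: categories with an order, so a middle (median) category exists.
rating_order = ["poor", "good", "very good", "excellent"]
ratings = ["good", "excellent", "poor", "very good", "good"]
ranks = sorted(rating_order.index(r) for r in ratings)
median_rating = rating_order[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but zero is arbitrary,
# so a ratio changes when the unit changes.
celsius_a, celsius_b = 10.0, 20.0
difference_c = celsius_b - celsius_a                    # 10 degrees either way
ratio_c = celsius_b / celsius_a                         # 2.0 in Celsius...
ratio_k = (celsius_b + 273.15) / (celsius_a + 273.15)   # ...about 1.035 in kelvin

# Ratio: a natural starting point (zero), so ratios are meaningful.
heights_cm = [150.0, 175.0]
height_ratio = heights_cm[1] / heights_cm[0]

print(most_common_color, median_rating, difference_c, ratio_c, round(ratio_k, 3))
```

The temperature lines show why 20 °C cannot be called "twice as hot" as 10 °C: the ratio depends on where the scale puts its zero, whereas the height ratio is the same in any unit of length.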
EXERCISE 2
1)
2)
3)
4)
5)
Some hotels ask their guests to rate the hotel's services as excellent,
very good, good, or poor. This is an example of the
a. ordinal scale.
b. ratio scale.
c. nominal scale.
d. interval scale.
6)
7)
8)
9)
depending
on how the
10)
11)
Statistical inference
a. refers to the process of drawing inferences about the sample based
on the characteristics of the population.
b. is the same as descriptive statistics.
c. is the process of drawing inferences about the population based on
the information taken from the sample.
d. is the same as a census.
EXERCISE 3
1. In each of these statements, tell whether descriptive or inferential statistics
have been used.
a) The average life expectancy in New Zealand is 78.49 years.
b) A diet high in fruits and vegetables will lower blood pressure.
c) The total amount of estimated losses from the tsunami flood was RM4.2
billion.
d) Researchers stated that the shape of a person's ears is related to the
person's aggression.
e) In 2013, the number of high school graduates will be 3.2 million
students.
2. Classify each variable as discrete or continuous.
a) Ages of people working in a large factory
b) Number of cups of coffee served at a restaurant
25
Matrix No: _______________________
Group:______
TUTORIAL CHAPTER 1
In the following multiple-choice questions, please circle the correct answer.
1.
You asked five of your classmates about their height. On the basis of this
information, you stated that the average height of all students in your university
or college is 65 inches. This is an example of:
a. descriptive statistics
b. statistical inference
c. parameter
d. population
2.
A company has developed a new computer sound card, but the average lifetime
is unknown. In order to estimate this average, 200 sound cards are randomly
selected from a large production line and tested and the average lifetime is
found to be 5 years. The 200 sound cards represent the:
a. parameter
b. statistic
c. sample
d. population
3.
4.
5.
When data are collected in a statistical study for only a portion or subset of all
elements of interest, we are using a:
a. sample
b. parameter
c. population
d. statistic
6.
7.
8.
9.
A politician who is running for the office of governor of a state with 4 million
registered voters commissions a survey. In the survey, 54% of the 5,000
10.
A company has developed a new battery, but the average lifetime is unknown.
In order to estimate this average, a sample of 500 batteries is tested and the
average lifetime of this sample is found to be 225 hours. The 225 hours is the
value of a:
a. parameter
b. statistic
c. sample
d. population
11.
The process of using sample statistics to draw conclusions about true population
parameters is called
a. inferential statistics
b. the scientific method
c. sampling method
d. descriptive statistics
12.
13.
Researchers suspect that the average number of credits earned per semester by
college students is rising. A researcher at Michigan State University (MSU)
wished to estimate the number of credits earned by students during the fall
semester of 2003 at MSU. To do so, he randomly selects 500 student transcripts
and records the number of credits each student earned in the fall term 2003. He
found that the average number of semester credits completed was 14.85 credits
per student. The population of interest to the researcher is
a. all MSU students
b. all college students in Michigan
c. all MSU students enrolled in the fall semester of 2003
d. all college students in Michigan enrolled in the fall semester of 2003
14.
The collection and summarization of the graduate degrees and research areas of
interest of the faculty at the University of Michigan is an example of
a. inferential statistics
b. descriptive statistics
c. a parameter
d. a statistic
16.
17.
A study is under way in a national forest to determine the adult height of pine
trees. Specifically, the study is attempting to determine what factors aid a tree
in reaching heights greater than 50 feet tall. It is estimated that the forest
contains 32,000 pine trees. The study involves collecting heights from 500
randomly selected adult pine trees and analyzing the results. The sample in the
study is
a. the 500 randomly selected adult pine trees
b. the 32,000 adult pine trees in the forest
c. all the adult pine trees taller than 50 feet
d. all pine trees, of any age in the forest
18.
19.
20.
For each of the following examples, identify the data type as nominal, ordinal,
or interval.
a. The letter grades received by students in a computer science class
________________
b. The number of students in a statistics course