MODULE 3
Data Collection, Analysis and Interpretation
Reduces
Uncertainty
Three Basic Types of Research
No one ever study
No No No No
[Figure: flowchart of the research process]
Problem definition (statement of research objectives)
Selection of exploratory research technique
• Secondary (historical) data
• Experience survey
• Pilot study
• Case study
Research Design: selection of basic research method
• Survey (interview, questionnaire)
• Experiment (laboratory, field)
• Secondary data study
• Observation
Sampling: probability, nonprobability
Data Gathering: collection of data (fieldwork)
Data Processing and Analysis: editing and coding data, data processing
Conclusions and Report: interpretation of findings, report
Types of Data and Measurement Scales
Data
• Non-metric or Qualitative
• Metric or Quantitative
Types of Variables
Dichotomous: Male/Female, Engineering/non-engineering
Discrete: Engineering background, Educational level
Continuous: Production Units, Costs
Description of HBAT Primary Database Variables
POPULATION
Sampling
Who is to be sampled?
How large a sample?
How will sample units be selected?
Two Major Categories of Sampling
Probability sampling
• Known, nonzero probability for every element
Nonprobability sampling
• Probability of selecting any particular member is unknown
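The contrast between the two categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 labelled units, the sample size of 10, and the function names `simple_random_sample` and `convenience_sample` are all hypothetical and do not come from the slides. Simple random sampling stands in for probability sampling, and "take the first units at hand" stands in for convenience sampling.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Probability sampling: every unit has the same known chance n / N."""
    rng = random.Random(seed)          # seeded so the draw can be repeated
    return rng.sample(population, n)   # sampling without replacement

def convenience_sample(population, n):
    """Nonprobability sampling: take whichever units are easiest to reach.

    Here 'easiest' is assumed to mean the first n units in the list, so the
    chance of selecting any particular member is not defined by a random draw.
    """
    return population[:n]

# Hypothetical sampling frame of 100 units.
population = [f"unit_{i:03d}" for i in range(1, 101)]

prob = simple_random_sample(population, 10, seed=42)
conv = convenience_sample(population, 10)

print("Inclusion probability per unit:", 10 / len(population))
print("Simple random sample:", prob)
print("Convenience sample:  ", conv)
```

With the random draw, the inclusion probability (10 / 100 = 0.1) is known for every element, which is what makes inference from the sample to the population possible.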
Nonprobability Sampling
Convenience
Judgment
Quota
Snowball
Probability Sampling
Plan procedure for selecting sampling units
Conduct fieldwork
Research Design
Master plan
Framework for action
Specifies methods and procedures
Basic Research Design- Data Collection
Surveys
Experiments
Secondary data
Observation
The Major Decisions in Questionnaire Design
Questionnaire relevance
Questionnaire accuracy
Phrasing Questions
Open-ended questions
Fixed-alternative questions
IMPORTANT!
Research Question -> Research Objective -> Research Hypothesis
Univariate: dispersion, fluctuation within the variable.
Bivariate: difference, relationship, association, causal
Multivariate: relationship, association, causal, modeling
Illustration
Univariate. RQ1: Has the O&G industry been stable over the past 5 years? RO? RH?
Bivariate. RQ2: Is there a difference between the energy sector and the finance sector with regard to financial performance? RO? RH?
Multivariate. RQ3: What are the factors that affect the performance of the energy sector? RO? RH?
ANALYSIS
Analysis of Quantitative Data: Dealing with Data
Dealing with Data: Coding, Entering, and Cleaning
Results with One Variable: UNIVARIATE
Results with Two Variables: BIVARIATE
Results with More than Two Variables: MULTIVARIATE
Relevant for Inferential Statistics
Dealing with Data
Value  Frequency  Percent  Valid Percent  Cumulative Percent
5.0    1          1.0      1.0            1.0
5.1    1          1.0      1.0            2.0
5.2    1          1.0      1.0            3.0
5.5    2          2.0      2.0            5.0
5.6    1          1.0      1.0            6.0
5.7    4          4.0      4.0            10.0
5.8    1          1.0      1.0            11.0
5.9    2          2.0      2.0            13.0
6.0    1          1.0      1.0            14.0
6.1    2          2.0      2.0            16.0
6.2    1          1.0      1.0            17.0
6.3    1          1.0      1.0            18.0
6.4    5          5.0      5.0            23.0
6.5    2          2.0      2.0            25.0
6.6    1          1.0      1.0            26.0
6.7    4          4.0      4.0            30.0
6.9    3          3.0      3.0            33.0
7.0    1          1.0      1.0            34.0
7.1    2          2.0      2.0            36.0
7.4    2          2.0      2.0            38.0
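A frequency table of this kind can be built with a short Python sketch. The `scores` list below is hypothetical (it is not the HBAT data), and `frequency_table` is an illustrative helper name; the point is only to show how the Percent and Cumulative Percent columns follow from the counts.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]          # running count up to this value
        rows.append((value,
                     counts[value],
                     100 * counts[value] / total,
                     100 * cumulative / total))
    return rows

# Hypothetical scores, used only to illustrate the calculation.
scores = [5.0, 5.1, 5.5, 5.5, 5.7, 5.7, 5.7, 5.7, 6.0, 6.1]

print("Value  Freq  Percent  Cum. Percent")
for value, freq, pct, cum in frequency_table(scores):
    print(f"{value:5.1f}  {freq:4d}  {pct:7.1f}  {cum:12.1f}")
```

The cumulative column is a running total of the percent column, so its last entry is always 100 when every case is valid.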
Histograms and The Normal Curve
This is the distribution for the HBAT database variable X19 - Satisfaction.
[Figure: histogram of X19 - Satisfaction with a normal curve overlaid]
Results with Two Variables
[Figure: plot with X6 - Product Quality on the horizontal axis]
Bivariate Table
IQ-----PRODUCTION!
Scattergram
Cross-tab
Correlation
Correlation Matrix for Store Image Elements
                          V1    V2    V3    V4    V5    V6    V7    V8    V9
V1 Price Level           1.00
V2 Store Personnel       .427  1.00
V3 Return Policy         .302  .771  1.00
V4 Product Availability  .470  .497  .427  1.00
V5 Product Quality       .765  .406  .307  .472  1.00
V6 Assortment Depth      .281  .445  .423  .713  .325  1.00
V7 Assortment Width      .354  .490  .471  .719  .378  .724  1.00
V8 In-Store Service      .242  .719  .733  .428  .240  .311  .435  1.00
V9 Store Atmosphere      .372  .737  .774  .479  .326  .429  .466  .710  1.00
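Each entry in a correlation matrix is a Pearson correlation between two variables. The sketch below shows one way to compute it and print the lower triangle. The three rating columns are hypothetical values invented for illustration (they are not the store image data), and `pearson_r` and `lower_triangle` are illustrative names.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lower_triangle(columns):
    """Return rows of correlations below and on the diagonal."""
    names = list(columns)
    return [[pearson_r(columns[row], columns[col]) for col in names[:i + 1]]
            for i, row in enumerate(names)]

# Hypothetical ratings from six respondents.
ratings = {
    "Price Level":     [3, 5, 4, 2, 5, 1],
    "Product Quality": [4, 5, 4, 2, 4, 2],
    "Return Policy":   [2, 3, 5, 4, 1, 3],
}

for name, row in zip(ratings, lower_triangle(ratings)):
    print(f"{name:16s}", "  ".join(f"{r:5.3f}" for r in row))
```

Only the lower triangle is printed because the matrix is symmetric: the correlation of A with B equals the correlation of B with A, and every variable correlates 1.00 with itself.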
Correlation Matrix of Variables After Grouping Using Factor Analysis
                          V3    V8    V9    V2    V6    V7    V4    V1    V5
V3 Return Policy         1.00
V8 In-store Service      .733  1.00
V9 Store Atmosphere      .774  .710  1.00
V2 Store Personnel       .741  .719  .787  1.00
V6 Assortment Depth      .423  .311  .429  .445  1.00
V7 Assortment Width      .471  .435  .468  .490  .724  1.00
V4 Product Availability  .427  .428  .479  .497  .713  .719  1.00
V1 Price Level           .302  .242  .372  .427  .281  .354  .470  1.00
V5 Product Quality       .307  .240  .326  .406  .325  .378  .472  .765  1.00
Results with More than Two Variables: The Elaboration Model
ANOVA
            df  SS           MS          F         Significance F
Regression  1   129173.1279  129173.128  15.43583  0.00772299
Residual    6   50210.37209  8368.39535
Total       7   179383.5

           Coefficients  Standard Error  t Stat      P-value     Lower 95%    Upper 95%
Intercept  44.3139535    108.5086985     0.40839079  0.69716178  -221.197461  309.825368
Years      38.755814     9.864427133     3.92884589  0.00772299  14.6184126   62.8932153

Note that:
(1) both t and F have the same p-value, and
(2) t² = F.
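The relationship t² = F can be checked directly from the numbers in the output. The short Python sketch below copies the coefficient, standard error, and mean squares from the table; the variable names are illustrative. It holds here because the model has a single predictor (Years), so the F test of the regression and the t test of the slope test the same hypothesis.

```python
# Values copied from the regression output.
coef_years = 38.755814        # slope for Years
se_years = 9.864427133        # standard error of the slope
ms_regression = 129173.128    # mean square, regression (df = 1)
ms_residual = 8368.39535      # mean square, residual (df = 6)

t_stat = coef_years / se_years          # t = coefficient / standard error
f_stat = ms_regression / ms_residual    # F = MS regression / MS residual

print(f"t      = {t_stat:.5f}")
print(f"t^2    = {t_stat ** 2:.5f}")
print(f"F      = {f_stat:.5f}")
```

Both printed values of t² and F agree with the F of 15.43583 in the table to the displayed precision.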
Descriptive statistics
• Number of people
• Trends in employment
• Data
Inferential statistics
• Make an inference about a population from a sample
Stem & Leaf Diagram – HBAT Variable X6
X6 - Product Quality Stem-and-Leaf Plot

Frequency   Stem & Leaf
  3.00       5 . 012
 10.00       5 . 5567777899
 10.00       6 . 0112344444
 10.00       6 . 5567777999
  5.00       7 . 01144
 11.00       7 . 55666777899
  9.00       8 . 000122234
 14.00       8 . 55556667777778
 18.00       9 . 001111222333333444
  8.00       9 . 56699999
  2.00      10 . 00

Stem width: 1.0
Each leaf: 1 case(s)

Each stem is shown by the numbers to the left of the point, and each number to the right is a leaf. For example, the stem 5 . 5567777899 has 10 leaves.
The length of the stem, indicated by the number of leaves, shows the frequency distribution. For the stem 8 . 55556667777778, the frequency is 14.

This table shows the distribution of X6 with a stem and leaf diagram (Figure 2.2). The first category runs from 5.0 to 5.4, so the stem is 5. There are three observations with values in this range (5.0, 5.1 and 5.2), shown as three leaves of 0, 1 and 2. These are also the three lowest values for X6. In the next row the stem is again 5, and there are ten observations ranging from 5.5 to 5.9, shown as leaves 5 through 9. At the other end of the figure, the stem is 10. It is associated with two leaves (0 and 0), representing two values of 10.0, the two highest values for X6.
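A stem-and-leaf display like the one for X6 can be produced with a few lines of Python. This is a minimal sketch, assuming values recorded to one decimal place and each stem split into two rows (leaves 0 to 4 and leaves 5 to 9), as in the plot. The `sample` list is hypothetical and `stem_and_leaf` is an illustrative name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return rows of (frequency, stem, leaves), two rows per stem."""
    rows = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)             # integer tenths avoid float noise
        stem, leaf = divmod(tenths, 10)    # 5.7 -> stem 5, leaf 7
        rows[(stem, leaf >= 5)].append(str(leaf))
    return [(len(leaves), stem, "".join(leaves))
            for (stem, _), leaves in sorted(rows.items())]

# Hypothetical values, chosen to mirror the lowest and highest rows.
sample = [5.0, 5.1, 5.2, 5.5, 5.5, 5.6, 10.0, 10.0]

print("Frequency   Stem & Leaf")
for freq, stem, leaves in stem_and_leaf(sample):
    print(f"{freq:6.2f}      {stem:2d} . {leaves}")
```

The frequency column is simply the number of leaves in the row, which is why the length of each row shows the shape of the distribution.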
HBAT Diagnostics: Box & Whiskers Plots
[Figure: box-and-whiskers plots grouped by X1 - Customer Type (N = 32, 35, 33), with the median and outlier #13 labeled]
Group 2 has substantially more dispersion than the other groups.
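The quantities drawn in a box-and-whiskers plot can be computed as sketched below. The 1.5 × IQR rule for flagging outliers is a common convention and is an assumption here, since the slide does not state which rule produced outlier #13. The `group` values are hypothetical, and quartile definitions vary between packages, so other software may give slightly different Q1 and Q3.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Quartiles, IQR fences, and outliers for one group."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4)   # default 'exclusive' method
    iqr = q3 - q1                           # box height = dispersion measure
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "iqr": iqr, "outliers": outliers}

# Hypothetical group with one extreme value.
group = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(group)
print(summary)
```

Comparing the `iqr` values across groups gives a numeric counterpart to the visual statement that one group has more dispersion than the others.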
One-Way ANOVA - An Example
Compare calculated values to those in the Excel output:
Anova: Single Factor

SUMMARY
Groups    Count  Sum  Average      Variance
Alone     10     637  63.7         87.56666667
WithPass  12     683  56.91666667  63.53787879

ANOVA
Source of Variation  SS           df  MS           F           P-value      F crit
Between Groups       250.9833333  1   250.9833333  3.37566268  0.081071382  4.351250027
Within Groups        1487.016667  20  74.35083333
Total                1738         21
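The calculated values can be reproduced from the SUMMARY block alone. The Python sketch below uses only the counts, sums, and variances shown in the Excel output; the function name `one_way_anova_from_summary` is illustrative. The raw observations are not given in the slides, so the sketch works from the group summaries rather than the data.

```python
def one_way_anova_from_summary(groups):
    """groups: list of (count, total, variance) tuples, one per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(total for _, total, _ in groups) / n_total

    # Between groups: squared deviations of group means from the grand mean.
    ss_between = sum(n * (total / n - grand_mean) ** 2
                     for n, total, _ in groups)
    # Within groups: each sample variance times its degrees of freedom.
    ss_within = sum((n - 1) * var for n, _, var in groups)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "f": ms_between / ms_within}

# (Count, Sum, Variance) copied from the SUMMARY block.
result = one_way_anova_from_summary([
    (10, 637, 87.56666667),   # Alone
    (12, 683, 63.53787879),   # WithPass
])
for key, value in result.items():
    print(f"{key:11s} {value:.4f}")
```

The computed F of about 3.376 is below the critical value of 4.351 in the output, which agrees with the P-value of 0.081 being above 0.05: the difference between the two group means is not significant at the 5% level.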