Marketing Research
Overview
• We’ll talk about basic statistical tools
– T-tests, crosstabs, and regression are useful tools
• We’ll talk about what they can and can’t do
• More sophisticated tools can give a deeper
view of your customers
– Conjoint analysis, cluster analysis, and factor
analysis can help you understand who your
customers are and what they like
A Quick Note on Data Analysis
• Statistics are just one part of an argument
• People are easily persuaded by numbers and statistics
– The more complicated the analysis, the less likely it is to be
challenged
• The strongest challenge to many statistical arguments
is not in how the data are analyzed, but in how the
data are collected
– Methodological expertise always trumps data analytic
experience
– Data analytic knowledge allows for more careful
consideration of methodology
Really Basic: Comparing Groups
• In marketing we often have a need to understand
differences between groups
– Segmentation
• Are two or more segments really different along some
dimension of behavior or attitude?
– Experiments
• Did the treatment work?
• We need a systematic approach that allows us to
say when two (or more) groups of customers,
companies, markets, etc. really are different
Most Basic: t-tests
• Do web shoppers pay a different price for cars
than dealership shoppers?
• Do a hypothesis test:
Null Hypothesis: μonline = μdealer
Alternative: μonline ≠ μdealer
T-test Results
• “Customers who bought their new vehicles on the Auto
Online website report having paid less for their vehicles
than did customers who purchased their vehicles at the
dealership (Monline = $11,582 vs. Mdealer = $13,594),
t(1398) = -6.14, p < .001.”
– If the p-value of the test is “small” we reject the null
hypothesis
– Here “small” typically means less than 5% (p < .05)
• Now try answering a different question:
– Are customers who purchase a car online more likely to
buy their next car online as well?
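The test statistic reported above can be sketched in a few lines. This is a minimal illustration, assuming a pooled-variance (equal-variance) t-test, which matches the reported degrees of freedom (n1 + n2 - 2). The function name `pooled_t_test` and the prices are hypothetical, not the survey data. The p-value uses a normal approximation, which is reasonable for samples as large as the one on the slide but too optimistic for a toy sample this small.

```python
import math
from statistics import NormalDist, mean, variance


def pooled_t_test(group_a, group_b):
    """Two-sample t-test with pooled variance (equal-variance assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    # Large-sample shortcut: approximate the t distribution with a normal one.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_value


# Hypothetical prices paid, NOT the survey data behind the slide.
online = [11200, 11900, 11400, 12100, 11300, 11700]
dealer = [13300, 13900, 13500, 14100, 13200, 13600]

t_stat, df, p_value = pooled_t_test(online, dealer)
print(f"t({df}) = {t_stat:.2f}, p = {p_value:.4f}")
```

A negative t-statistic here means the first group (online) paid less on average; if the p-value is below .05, we reject the null hypothesis of equal means.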
Understanding Associations
• One of the most common questions in
Marketing Research:
– Are two (or more) variables associated?
[Diagram: customer type and subsequent transaction]
Tools for Analyzing Associations
• Cross tabulation
– Only for two categorical variables
– Easy to understand
• Regression
– Applies to any number of variables
– Not necessarily categorical variables
– Slightly harder to understand
[Table: first purchase (Online 1st, Dealer 1st) by subsequent purchase (Online 2nd, Dealer 2nd)]
[Figure: plot of Sales (0 to 1400) against Price (0.75 to 3.25)]
χ²-test for Association
• We can do a statistical test here
• The null hypothesis is that there is no
association between method of first purchase
and method of subsequent purchase
– This means that the percentage of people buying their
next car online is the same regardless of how they
purchased their previous car
• Again, if the p-value of the test is less than .05,
we reject the null hypothesis
Intuition for χ²-test
• The χ²-test is based on comparing the actual
cell counts to what we would expect them to
be if there was no association
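[Table: 2 x 2 cross tabulation, cells numbered 1 to 4, with marginal shares such as 333/500 and 154/500]
– Under no association, a cell's expected share is the product of its
row share and its column share (e.g., 0.692*0.666)
– The expected count is that share times the sample size (e.g., 0.231*500)
We would expect the table to look like this if there was no association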
[Tables: actual vs. expected cell counts]
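The comparison of actual and expected counts can be sketched as follows. This is an illustration under stated assumptions: the function name `chi_square_2x2` and the cell counts are hypothetical, since the actual counts are not recoverable from the slide. The closed-form p-value is valid only for a 2 x 2 table (1 degree of freedom).

```python
import math


def chi_square_2x2(table):
    """Chi-square test of association for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under no association: row share * column share * n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # With 1 degree of freedom the chi-square tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value


# Hypothetical counts (rows: online 1st, dealer 1st; columns: online 2nd, dealer 2nd).
observed = [[90, 60], [80, 270]]

expected, chi2, p_value = chi_square_2x2(observed)
print("Expected counts:", expected)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4g}")
```

The further the actual counts sit from the expected ones, the larger the statistic and the smaller the p-value.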
St = β0 + β1Pt + εt
[Regression output: t-statistic and p-value]
What does this mean?
Key Points
• Regression:
– Generates a specific equation describing the
relationship between a specific predictor (e.g.,
prices) and a specific outcome variable (e.g., sales)
– The results can offer precise (if imperfect)
prescriptions for managers
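To make "a specific equation" concrete, here is a minimal sketch of fitting S = β0 + β1P by ordinary least squares. The helper `fit_line` and the price and sales figures are hypothetical, chosen only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope: change in sales per unit change in price
    b0 = y_bar - b1 * x_bar   # intercept: predicted sales at a price of zero
    return b0, b1


# Hypothetical weekly observations, not data from the course.
prices = [1.00, 1.50, 2.00, 2.50, 3.00]
sales = [950, 820, 610, 480, 300]

b0, b1 = fit_line(prices, sales)
print(f"Sales = {b0:.0f} + ({b1:.0f} x price)")
print(f"Predicted sales at a price of 2.25: {b0 + b1 * 2.25:.0f}")
```

The fitted slope is the "prescription": it estimates how many units of sales are lost for each unit increase in price, within the range of prices observed.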
Example: Minute Maid Sales
• We previously identified a relationship
between Minute Maid prices and Minute
Maid sales
– Essentially, Sales = 1093 + (-377 x price)
• This model seems a little simplistic
– What about accounting for the behavior of
competitors?
– Regression is good at that too
• St =β0 + β1Pmm + β2Ptp + β3Ptr + β4Psb + ε
• If some customers are very similar to one another but different
from other (groups of) customers, cluster analysis can help you
identify these (multiple) segments.
[Figure: customers plotted by price sensitivity and brand loyalty]
Cluster Analysis
• What is it actually doing?
• The algorithm measures the “distance” between
every pair of points and generates a solution which
minimizes distances within a cluster and
maximizes distances between clusters
– Note that this language is very close to how you were
taught to think about the attributes of good
segmentation
• What, exactly, is “distance”?
– A rare literal example
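One common way to carry out this kind of grouping is k-means with straight-line (Euclidean) distance. The slides do not name a specific algorithm, so treat this as an assumed stand-in. The function `k_means`, the starting centroids, and the customer data are all hypothetical.

```python
import math


def k_means(points, centroids, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then re-center."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers as (price sensitivity, brand loyalty) scores.
points = [(1.0, 8.0), (1.5, 9.0), (2.0, 8.5), (8.0, 2.0), (8.5, 1.0), (9.0, 1.5)]
start = [points[0], points[3]]  # assumed starting guesses, one per cluster

centroids, clusters = k_means(points, start)
for center, members in zip(centroids, clusters):
    print(center, members)
```

k-means directly minimizes distances within clusters; good separation between clusters follows when the data really contain distinct groups. The number of clusters and the starting points are choices made by the analyst.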
Cluster Analysis: Baseball
• Baseball batters attempt to
hit balls to parts of the
field without any defensive
players.
• Baseball coaches have
seven players to distribute
wherever they want on the
field.
• Despite this general
flexibility, fielders are
almost always
positioned in the same
locations.
• Is that where batted balls
tend to land?
Let’s look at clustering of batted balls
for a single player.
Chase Utley
Example: Shopping Attitudes
• V1: Shopping is fun
• V2: Shopping is bad for your budget
• V3: I combine shopping with eating out
• V4: I try to get the best buys while shopping
• V5: I don’t care about shopping
• V6: You can save a lot of money by comparing
prices
Example: Shopping
• Cluster 1: _______________
• Cluster 2: _______________
• Cluster 3: _______________
Key Points
• Cluster Analysis allows us to simplify across
respondents
• When used effectively, it can guide marketing
strategy
• Nevertheless, it is by no means pure
computational science. Identifying and
labeling clusters requires some interpretation
– This is a strength (in flexibility)
– And a weakness
Clusters versus Factors
Factor
Analysis
V1 V2 V3 V4 V5 ….. V20
Cluster
Analysis
Data
Factor Analysis
• Factor Analysis can be used for data reduction
(i.e., to reduce the number of variables).
• Factor analysis: Summarize the information
contained in a larger number of variables into
a smaller number of ‘factors’ without
significant loss of information.
– Data reduction is important when you need to measure
“fuzzy” concepts like “love,” “trust,” or “satisfaction”
– Ask a series of questions that tap into the different
components of the concept
– Too many variables! Factor analysis can help to reduce
this dimensionality problem
Factor Analysis: Intuition
• Factor analysis assumes that the correlation
between a large number of variables is due to
them all being dependent on the same small
number of “factors”
• Example: Choice of movies
– Suppose individuals choose movies based on two
main attributes:
• Plot/story line (A1)
• Production quality (A2)
– Each individual has a preference for A1 and A2
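The assumption behind factor analysis can be illustrated with a small simulation: if survey items are driven by a few hidden factors, items that share a factor correlate strongly and items on different factors barely correlate. Everything here is hypothetical (the 0.8 weight, the noise level, the item names); it sketches the data-generating idea, not a factor analysis routine.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000
# Two hidden factors per respondent: taste for plot (A1) and for production quality (A2).
a1 = [rng.gauss(0, 1) for _ in range(n)]
a2 = [rng.gauss(0, 1) for _ in range(n)]


def item(factor):
    """A hypothetical survey item: a weighted factor plus respondent-level noise."""
    return [0.8 * f + rng.gauss(0, 0.6) for f in factor]


plot_item_1, plot_item_2 = item(a1), item(a1)
quality_item_1 = item(a2)

same_factor = pearson(plot_item_1, plot_item_2)
cross_factor = pearson(plot_item_1, quality_item_1)
print(f"Two plot items: r = {same_factor:.2f}")
print(f"Plot item vs. quality item: r = {cross_factor:.2f}")
```

Factor analysis works in the opposite direction: it starts from the observed correlations and tries to recover the small number of factors and the weight of each item on them.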
Example: Choice of Movies
                                  A1 Weight   A2 Weight
I can relate to the characters    0.81        -0.02
…
• Think of different brands as different combinations of attributes!
[Table columns: Engine Size, HP, Type, #Doors, Brand, Price]
Assumption of Part-Worths
• Total utility = sum of utilities of each attribute
[Figure: part-worths across attribute levels ($100K vs. $150K; location)]
Example: New Job
                           Prospective Employee 1
City     New York          0.0  (w11)
         San Francisco     0.75 (w12)
Salary   $100,000          0.0  (w21)
         $150,000          0.25 (w22)
Now we can rank jobs for this person:
U(NY,$100K)=0
U(NY,$150K)=0.25
U(SF,$100K)=0.75
U(SF,$150K)=0.25+0.75=1.0
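The additive rule can be written out directly. This sketch uses the part-worths for Prospective Employee 1 from the example; the dictionary layout and the function name `total_utility` are illustrative choices, not part of the original material.

```python
from itertools import product

# Part-worths for Prospective Employee 1, as given in the example.
part_worths = {
    "city": {"NY": 0.0, "SF": 0.75},
    "salary": {"$100K": 0.0, "$150K": 0.25},
}


def total_utility(profile, part_worths):
    """Additive model: total utility is the sum of the chosen levels' part-worths."""
    return sum(part_worths[attr][level] for attr, level in profile.items())


# Build every city/salary combination, then rank from most to least preferred.
profiles = [
    {"city": city, "salary": salary}
    for city, salary in product(part_worths["city"], part_worths["salary"])
]
ranked = sorted(profiles, key=lambda p: total_utility(p, part_worths), reverse=True)
for p in ranked:
    print(f"U({p['city']},{p['salary']}) = {total_utility(p, part_worths)}")
```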
Example: New Job
                           Prospective     Prospective
                           Employee 1      Employee 2
City     New York          0.0  (w11)      0.0  (w11)
         San Francisco     0.75 (w12)      0.25 (w12)
Salary   $100,000          0.0  (w21)      0.0  (w21)
         $150,000          0.25 (w22)      0.75 (w22)
Now we can rank jobs for Employee 1 and compare the ranking with Employee 2's:
Employee 1                     Employee 2
U(NY,$100K)=0                  U(NY,$100K)=0
U(NY,$150K)=0.25               U(NY,$150K)=0.75
U(SF,$100K)=0.75               U(SF,$100K)=0.25
U(SF,$150K)=0.25+0.75=1.0      U(SF,$150K)=0.25+0.75=1.0
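One conventional way to summarize how the two candidates differ is attribute importance: the range of an attribute's part-worths as a share of the total range across attributes. This measure is not defined in the slides, so treat it as an added illustration; the function name `attribute_importance` is hypothetical.

```python
def attribute_importance(part_worths):
    """Share of the total part-worth range that each attribute accounts for."""
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}


# Part-worths from the two-employee example.
employee_1 = {"city": {"NY": 0.0, "SF": 0.75}, "salary": {"$100K": 0.0, "$150K": 0.25}}
employee_2 = {"city": {"NY": 0.0, "SF": 0.25}, "salary": {"$100K": 0.0, "$150K": 0.75}}

print("Employee 1:", attribute_importance(employee_1))  # city matters most
print("Employee 2:", attribute_importance(employee_2))  # salary matters most
```

Both candidates prefer San Francisco at $150K, but they disagree on the middle options: Employee 1 would give up salary to live in San Francisco, while Employee 2 would give up the city for the higher salary.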
How do we get the part-worths?
• This is very nice but we don’t know consumers’ valuations of
attributes…
• …and consumers probably don’t know their own valuations
either!
• A solution: Force consumers to rank different bundles of
attributes (i.e., “brands”)
Bundles shown: A, B, C, D, E, F
Resulting ranking: 1. C  2. E  3. A  4. F  5. B  6. D
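A simplified sketch of how rankings can be turned into part-worth estimates: convert ranks to scores, then compare the average score of profiles containing each attribute level. For a balanced design in which every combination is ranked, these differences match the main effects a dummy-variable regression would give. The four job offers and their ranking are hypothetical, not the bundles A to F above.

```python
from statistics import mean

# Hypothetical ranking of four job offers, best first (not taken from the slide).
ranking = [("SF", "$150K"), ("SF", "$100K"), ("NY", "$150K"), ("NY", "$100K")]

# Convert ranks to scores so that a higher score means more preferred.
scores = {profile: len(ranking) - position for position, profile in enumerate(ranking)}


def level_effect(attr_index, level):
    """Average score of all profiles that contain the given attribute level."""
    return mean(score for profile, score in scores.items() if profile[attr_index] == level)


# Estimated part-worth gaps, in rank-score units.
city_gap = level_effect(0, "SF") - level_effect(0, "NY")
salary_gap = level_effect(1, "$150K") - level_effect(1, "$100K")

print(f"SF over NY: {city_gap}")
print(f"$150K over $100K: {salary_gap}")
```

The respondent never states a valuation directly; the larger gap for city is inferred from the fact that every San Francisco offer was ranked above every New York offer.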
Conjoint Analysis: Approaches
• Traditional Conjoint: Have respondents directly rank or rate
a series of product profiles
– Conjoint ≈ “consider jointly”
• Discrete Choice Models (allow for non-choice)
– Also called “Choice Based Conjoint”
• Warnings
– Conjoint analysis will not indicate the absence of an important attribute
– Attributes should be actionable to the firm
– Interpolation between attribute levels is fine, but do not
extrapolate beyond the range selected
Key points
• Conjoint is a very popular and frequently
useful tool for identifying the underlying
utilities of consumers.
• It details the relative value of product
attributes and guides product development
and competitive pricing.
• Nevertheless, its application is deeply
contingent on both the consumer and the
product category.
Summary
• There are a number of useful statistical
techniques that can help you understand your
data
– T-tests, crosstabs, and regression are basic tools
that can make comparisons and show
relationships between marketing variables
– Cluster, factor, and conjoint analysis can help you
understand your customers’ traits and preferences
• These tools are only effective if you have a good
research design to start with