# Industrial Statistics

Session 2

## Summary of main techniques

| Technique | Purpose in Research |
| --- | --- |
| Regression | Used to identify key drivers of performance |
| Correspondence Analysis / Biplots and Mapping | Provides a graphical summary of brands' positioning in relative or absolute terms across a range of perceptions/images |
| Cluster Analysis / Consumer Segmentation | Groups respondents in terms of their similarity and/or dissimilarity to establish previously undiscovered attitudinal and/or behavioral segments |
| Factor Analysis | Used to examine inter-relationships between variables, with the aim of data reduction or of identifying underlying themes |
| Conjoint Analysis | Identifies the relative worth or value of each level of several attributes from rank-ordered preferences of attribute combinations |

Objectives

- To understand a range of business contexts / questions that can best be answered through Multivariate Modeling
- To get an overview of the most widely used Multivariate Analytical Methods
- To be able to decide / advise what technique to use in what context

## A Glimpse into Client Questions

- Over the last two years, I have suffered attrition of 2 percent of my credit card customers. Can you find out why this has happened and suggest a solution?
- Can you help me identify my high-risk customers and help me prioritize my targeting?
- Can you help me arrive at an efficient set of variables that would maximize differences among the five segments in my category?
- I would like to understand patient compliance in four disease markets, and across six types of drugs in each market.
- How do I allocate spends across media, distribution, promotions, price, and direct marketing so that I can optimize my ROI?
- I want to see the benefit-need gaps in the cough drops market so I can think of a line extension.

Multivariate Methods
A set of advanced statistical techniques that analyze more than two variables at the same time

- Prediction
- Explanation
- Reduction
- Grouping
- Mapping

## Multivariate Methods: Taxonomy

Multivariate Methods

Dependence Techniques

- Clearly hypothesized relationships between sets of variables
- At least one variable is dependent on some other variables
- At least one variable is considered independent of the others and predicts the dependent variable

Interdependence Techniques

- No specific hypotheses about relationships among variables
- Therefore, no variable is identified as dependent on other variables, and no variable is considered independent of the others and predicting a dependent variable
- Variables could be inter-dependent

## Major / Popular Techniques

Dependence Techniques

- Multiple Regression
- Logistic Regression
- Discriminant Analysis
- Multivariate Analysis of Variance
- Conjoint Analysis
- Structural Equation Modeling

Interdependence Techniques

- Factor Analysis
- Cluster Analysis
- Multidimensional Scaling
- Correspondence Analysis

Session 1

Regression
- Quantifies the strength of the relationship between a dependent variable and some explanatory (independent) variables
- The analyst specifies the nature of the relationship, i.e. which are the dependent and independent variables

Regression
Simple (bivariate) Regression

- The starting point for multiple regression
- Bivariate regression is essentially the same analysis as finding the correlation between the independent and dependent variable
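
The link between bivariate regression and correlation can be sketched in a few lines. This is a minimal illustration, not the course's own material: the ratings are invented, the choice of which variable is x and which is y is an assumption, and the function names are hypothetical. It shows that the least-squares slope equals r times the ratio of the standard deviations, and that R² equals r squared.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the line of best fit."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(xs, ys, intercept, slope):
    """Proportion of the variation in ys explained by the fitted line."""
    my = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical 7-point-scale ratings, invented for illustration only.
intensity = [1, 2, 3, 4, 5, 6, 7]
satisfaction = [2, 2, 3, 5, 4, 6, 7]

r = pearson_r(intensity, satisfaction)
intercept, slope = least_squares_line(intensity, satisfaction)
slope_from_r = r * stdev(satisfaction) / stdev(intensity)
r2 = r_squared(intensity, satisfaction, intercept, slope)

print(f"slope = {slope:.4f}, r * sy/sx = {slope_from_r:.4f}")
print(f"R2 = {r2:.4f}, r squared = {r * r:.4f}")
```

Both printed pairs agree, which is the sense in which the two analyses are "the same".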

## Multiple Linear Regression

Several independent variables, but still only one dependent variable

## Many other non-linear forms not covered

Logistic regression, Generalised Linear Models, etc. These types of regression are for different types of data.

## Regression between two variables

[Figure: Simple Regression scatter plot relating "Intensity of pursuing Hobby" and "Satisfaction with Hobbies", both on 7-point scales.]

## Simple Linear Regression, Example 1

[Figure: scatter plot with line of best fit. X axis: advertising expenditure; Y axis: Sales Value.]

- Simple linear regression has only one independent variable
- Model fit: R² = 0.975
- R² indicates the proportion of the total variation in the dependent variable explained by the independent variable
- Line of best fit: Y = 1.8 + 2.15*X
- Sales value = constant + multiple of advertising expenditure
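
For reference, the statement that R² is the proportion of variation explained can be written out as follows. This is the standard textbook definition rather than something shown on the slide; the symbols for the observed value, fitted value and mean of Y are notation introduced here.

```latex
R^2 \;=\; 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
    \;=\; 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
\qquad
\hat{y}_i = 1.8 + 2.15\,x_i
```

An R² of 0.975 therefore means that the residual sum of squares is only 2.5% of the total variation in sales value.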

## Simple Linear Regression, Example 2

[Figure: Brand Equity - Brand Share Relationship.]

## Multiple Linear Regression (MLR): Multiple Independent variables (Xs)

- We are interested in the causes of variation in a dependent variable (e.g. what causes an increase or decrease in sales or ratings)
- A survey will contain many variables that can be regarded as possible causes or predictors of the dependent variable (e.g. money spent on advertising, value for money)
- In statistical terms these are called independent or explanatory variables
- Multiple Linear Regression uses correlation as its building block to establish the association between Y and the Xs

## ML Regression: Dependent variable (Y)

- The dependent variable Y in a regression will be a Key Performance Indicator (KPI)
- Typical questions: What are the key drivers of customer satisfaction? What are the biggest influencers of brand equity in the market?
- From a questionnaire we may be interested in one variable in particular, e.g. purchase intention, likelihood to recommend, overall satisfaction, the amount of sales of a product, or an overall rating of service
- When this type of variable represents the key interest within a survey, regression refers to it as the dependent variable

Illustrations

- Can I explain Sales using investments in Advertising and incentives to the sales force?
- What are the drivers of Brand Equity in my category?
- What drives Client Satisfaction in the market research industry: quality, price, timeliness, service levels?

In each case the technique is Multiple Regression.

## Multiple Linear Regression

- Useful for making predictions, giving explanations, or understanding drivers
- A single metric (interval or ratio) dependent (outcome) variable
- Several (two or more) metric (interval or ratio) independent variables (predictors)
- Assumes (or requires) normally distributed data from a large sample
- Results in an equation from which predictions or explanations of the outcome variable can be made
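
How such an equation is obtained and then used for prediction can be sketched as below. This is a minimal standard-library illustration, assuming ordinary least squares solved through the normal equations. The data are invented and generated without noise from a known equation so that the fit can be checked, and the function names are hypothetical. In practice a statistics package such as SPSS does this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_mlr(rows, y):
    """Return [c, b1, b2, ...] minimising squared error (normal equations)."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the constant
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Apply Y = c + b1*x1 + b2*x2 + ... to one observation."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

# Hypothetical data: sales built from a known equation, no noise.
advertising = [10, 20, 30, 40, 50, 60]
incentives = [5, 3, 8, 6, 9, 4]
rows = list(zip(advertising, incentives))
sales = [2.0 + 1.5 * a + 0.5 * i for a, i in rows]

coefs = fit_mlr(rows, sales)
print("c, b1, b2 =", [round(c, 4) for c in coefs])
print("predicted sales at (70, 7):", round(predict(coefs, (70, 7)), 4))
```

Because the invented data contain no noise, the fit recovers the generating equation. Real survey data will leave residuals, which is what R² summarises.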

Example: check how well the regression model fits the data, using R²

- R-square (R²) is an overall measure of how well the model (the regression equation) explains the variance in the data
- R² is always between 0 and 1:
  - An R² value of 0.222 means the model explains 22% of the variance in the data
  - The bigger the R² value, the better
  - An acceptable level for R² depends on the research setting. Low values are often accepted in the market research industry, but preferably R² should be 0.6 or higher
- Use the Adjusted R², which takes account of the sample size and the number of independent variables. Often there is not a large difference between this and the R²

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .471(a) | .222 | .171 | .603 |
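
The adjustment for sample size and number of predictors can be sketched with the usual textbook formula. The sample size is not stated in the source, so n = 50 below is purely an assumption; p = 3 matches the three factor scores used as predictors in the SPSS output. The function name is hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-square for n observations and p independent variables."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# R Square from the Model Summary; n = 50 is an ASSUMED sample size.
adj = adjusted_r_squared(0.222, n=50, p=3)
print(round(adj, 3))
```

Under that assumed sample size the result is close to the reported Adjusted R Square of .171. The penalty grows as n shrinks or p increases.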

## MLR: Example 1, SPSS Output from MLR

Look at the table of standardised coefficients (Beta scores). These are the weights (βi) of the model.

The Beta scores show the extent to which each independent variable fluctuates with the dependent variable:

- The bigger the Beta score, the greater its impact (i.e. the more it fluctuates with satisfaction)
- The implication is that these are the more important attributes, because they are the ones that move when the level of the dependent variable changes

| Model | | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 3.640 | .085 | | 42.664 | .000 |
| | REGR factor score 1 | .235 | .086 | .355 | 2.727 | .009 |
| | REGR factor score 2 | .062 | .086 | .093 | .716 | .478 |
| | REGR factor score 3 | .196 | .086 | .296 | 2.276 | .028 |

a. Dependent Variable: Q58. Overall how satisfied are you with XXX as a life insurance company as a w
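
The coefficient table can be read programmatically, as sketched below using only the figures printed in the SPSS output. Two standard relationships are checked: t is B divided by its standard error, and the predictors are ranked by Beta. The last step rests on an assumption not stated in the source, namely that the three factor scores are mutually uncorrelated, in which case the squared Betas sum to R Square.

```python
# Values copied from the SPSS coefficient table: (B, Std. Error, Beta, t).
coefficients = {
    "(Constant)": (3.640, 0.085, None, 42.664),
    "REGR factor score 1": (0.235, 0.086, 0.355, 2.727),
    "REGR factor score 2": (0.062, 0.086, 0.093, 0.716),
    "REGR factor score 3": (0.196, 0.086, 0.296, 2.276),
}

# t is the unstandardized coefficient divided by its standard error.
t_recomputed = {name: b / se for name, (b, se, _, _) in coefficients.items()}

# Rank the predictors by the absolute size of their standardized Beta.
ranking = sorted(
    (name for name, row in coefficients.items() if row[2] is not None),
    key=lambda name: abs(coefficients[name][2]),
    reverse=True,
)

# ASSUMPTION: the factor scores are mutually uncorrelated. Only then does
# the sum of squared Betas equal R-square.
r2_from_betas = sum(
    row[2] ** 2 for row in coefficients.values() if row[2] is not None
)

for name, t in t_recomputed.items():
    print(f"{name}: recomputed t = {t:.3f}, reported t = {coefficients[name][3]}")
print("importance order:", ranking)
print("sum of squared Betas:", round(r2_from_betas, 3))
```

Small gaps between recomputed and reported t values are expected, because the table shows B and Std. Error rounded to three decimals. The sum of squared Betas comes out close to the reported R Square of .222, which is consistent with the assumption but does not prove it.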

## Multiple Linear Regression Summary

Linear Regression
- e.g. Key Driver analysis, usually based on attitudinal data
- The relationship is linear (i.e. a straight line can describe the relationship) and additive in nature
- Based on correlation
- Use model fit R² (adjusted)
- Provides importance scores
- Not suitable for all data types, e.g. categorical or choice data

## Y = c + b1x1 + b2x2 + b3x3 + ... + e

[Figure: fitted regression line. x axis: independent variable (x); y axis: dependent variable (y).]