Marketing Mix v5

Marketing Mix Models with BayesiaLab
Stefan Conrady, stefan.conrady@bayesia.us

Dr. Lionel Jouffe, jouffe@bayesia.com
May 8, 2013
Table of Contents
Introduction
Example & Dataset
Model Development
Data Import
Supervised Learning
Network Performance
Model Evaluation
Mutual Information
Mapping
Total Effects on Target
Direct Effects on Target
Elasticity
Caveats
Target Mean Analysis by Direct Effects
Marketing Mix Optimization
Resource Allocation Optimization
Summary
Appendix
Framework: The Bayesian Network Paradigm
Acyclic Graphs & Bayes' Rule
Compact Representation of the Joint Probability Distribution
References
Contact Information
Bayesia USA
Bayesia Singapore Pte. Ltd.
Bayesia S.A.S.
Copyright
ii www.bayesia.us | www.bayesia.sg | www.bayesia.com
Introduction
To many business executives, marketing mix models remain shrouded in mystery. Many advertising agencies and market research companies, plus countless online media firms, promote their particular marketing mix model in an effort to support marketing decision makers. However, there is a remarkable lack of universally accepted and standardized methods that marketing practitioners can draw upon. In fact, a search for books on "marketing mix models" on Amazon.com yields only three relevant titles and, as it turns out, they are mostly geared towards an academic audience. In contrast to the sparse array of books, there is no shortage of academic papers on all aspects of marketing mix modeling and optimization.[1] Without doubt, many of these peer-reviewed journal articles have advanced the field of marketing science, but the often abstract nature of the proposed methods keeps them far removed from practical implementation.
Given this lack of textbook references in this field, plus the fairly inaccessible nature of the academic literature, decision makers have to rely almost exclusively on the persuasion skills of research vendors and consultants in determining the validity of any proposed marketing mix model.
Marketing models based on Bayesian networks are not automatically a solution to this quandary, but their
inherently visual nature plus their computational transparency make them much more accessible to a broad
range of stakeholders. To interpret and validate a Bayesian network requires, most importantly, a
common-sense understanding of the domain and not necessarily a degree in statistics.
It is our objective to use the framework of Bayesian networks, plus the features of the BayesiaLab software package, to create sound marketing mix models that can be implemented by many and interpreted by all. More specifically, we will focus on how to optimize marketing mix models with BayesiaLab's algorithms and how to derive policy recommendations for decision makers.
Example & Dataset
To illustrate this approach, we study daily ice cream sales of a European food distributor as a function of environmental variables and marketing efforts.[2] Our sample data set includes the following time-series variables:
• Seasonally-adjusted daily sales in the local currency
• Traditional advertising, such as print advertising (incl. coupons), TV, radio, in-store promotions, etc.
• Online advertising, including banner ads, search engine marketing, online coupons
• Competitive advertising (estimate of all marketing efforts combined)
• Temperature in °C
• Number of retail outlets
• Weekday
[1] A fairly broad selection of marketing science papers is provided in the appendix.
[2] In order to keep the data source confidential, we have obfuscated both the industry and locale, while maintaining the actual market dynamics of the original domain.
Model Development
While the focus of this white paper is to evaluate and interpret a given marketing mix model, we will briefly recap the steps one would take to generate such a model with BayesiaLab.
Data Import
We use BayesiaLab's Data Import Wizard to load all 7 time series[3] into memory from a comma-separated file (CSV). BayesiaLab automatically detects the column headers, which contain the variable names.
The next step identifies the data types contained in the dataset. BayesiaLab will attempt to detect the type of each variable. In this case, BayesiaLab identifies all variables as continuous, which is indicated by the turquoise background color of all columns.
However, in our case Weekday should be treated as discrete so as to avoid binning it in the subsequent discretization step.
As our dataset contains missing values, we need to specify the type of Missing Values Imputation. Given the small size of the dataset, we will choose the Structural EM method.[4]
[3] Although the dataset has a temporal ordering, for expository simplicity we will treat each time interval as an independent observation, without taking into account any temporal dynamics.
[4] For more details on missing values imputation with Bayesian networks, see Conrady and Jouffe (2012).
The following discretization step is very important for all models in BayesiaLab and thus we provide a bit
more detail here.
Our objective for this model is to establish Sales as a function of the marketing instruments and other external factors. Thus, we can take this objective into account in the discretization process. More specifically, we will split the process into two parts. First, we will discretize the target variable, i.e. Sales, on its own. We highlight the Sales column in the data table and then choose Manual as the Discretization Type. This provides us with the probability density function of Sales.
By clicking Generate a Discretization, we are prompted to select the discretization type.
We choose Type: K-Means and Intervals: 4.[5] The chart will now display the results of this discretization.
Now that we have discretized the target variable by itself, we will discretize the remaining continuous variables with the Decision Tree algorithm and use Sales as the target. This allows binning the continuous variables in such a way that we gain a maximum amount of information from these variables with respect to the target.
[5] For a discussion of discretization algorithms and a guide for interval selection, please see the papers referenced in the appendix.
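The K-Means discretization of the target can be sketched outside of BayesiaLab as follows. This is a minimal illustration only, assuming a plain one-dimensional Lloyd's algorithm with interval boundaries placed midway between neighboring centroids. BayesiaLab's own implementation may differ, and the sample values and all function names are made up for the example.

```python
def kmeans_1d(values, k, iterations=100):
    """One-dimensional Lloyd's algorithm; returns the k centroids in sorted order.

    Assumes len(values) >= k. Function name and defaults are illustrative.
    """
    data = sorted(values)
    n = len(data)
    # Deterministic start: spread the initial centroids evenly over the sorted data.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def thresholds(centroids):
    """Interval boundaries halfway between neighboring centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


def discretize(value, cuts):
    """Index of the interval (0 .. len(cuts)) that contains the value."""
    return sum(value > c for c in cuts)


# Made-up sample values, not the confidential sales data.
toy_sales = [1, 2, 10, 11, 20, 21, 30, 31]
cuts = thresholds(kmeans_1d(toy_sales, k=4))
states = [discretize(v, cuts) for v in toy_sales]
```

With four intervals, three thresholds are produced and every observation is mapped to one of the states 0 through 3.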
Upon completion of the discretization, BayesiaLab will present all variables as nodes in an unconnected
network in the graph panel.
The small question mark icons associated with three of the nodes indicate that they contain missing values.
Supervised Learning
Now that we have an initial network, albeit unconnected, we can run our first Supervised Learning algorithm with the objective of characterizing the target node. First, however, we need to specify the target by right-clicking on Sales and selecting Set As Target Node.
Once this is set, the Sales node will appear as a bull's-eye.
We have an array of Supervised Learning algorithms available to apply here. Given the small number of nodes, variable selection is not an issue and hence this should not influence our choice. Furthermore, the relatively small number of observations does not create a challenge in terms of computational effort. With these considerations, and without going into further detail, we select the Augmented Naive Bayes algorithm. The "augmented" part in the name of this algorithm refers to the additional unsupervised search that is performed on the basis of the given naive structure.
Upon learning, the newly generated network is now displayed in the graph panel.
The predefined naive structure is highlighted by the blue arcs, while the additional (augmented) arcs from the unsupervised learning are shown in black.
Network Performance
Once again, this is not meant to be a complete treatment of how to build a marketing mix model. Thus, we will not fine-tune this model by evaluating a range of specifications[6] or algorithms, or by experimenting with different types of variable discretization. However, we do wish to cover a few performance measures to assure the reader that the model presented here is a reasonable characterization of the underlying domain.
A quick and straightforward way to test the out-of-sample network performance is to carry out Cross Validation by selecting (from within the Validation Mode) Tools | Cross Validation | Targeted | K-Folds:
[6] Given the inherently dynamic nature of marketing effects, it would be very appropriate to model this as a temporal Bayesian network. For instance, this would enable us to capture potential lags in the effects of marketing activities on the target variable. The BayesiaLab framework can easily accommodate such a temporal specification, but for expositional clarity, we chose to model the contemporaneous interactions only.
In terms of parameters for the Cross-Validation, we select the same learning algorithm as before, i.e. Augmented Naive Bayes. Also, using a 10-fold validation is a typical choice in this context.
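For readers unfamiliar with K-Folds validation, the following sketch shows how the observations could be partitioned. It only illustrates the sampling scheme, not BayesiaLab's internals. The row count of 426 is taken from the totals of the occurrence table in the Global Report; the seed and the function names are assumptions.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices and deal them into k folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_rows, k, seed=0):
    """Yield (train, test) index lists; each fold is held out exactly once."""
    folds = k_fold_indices(n_rows, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


# 10 folds over 426 daily observations.
splits = list(train_test_splits(426, 10))
```

In each of the ten rounds, the model would be learned on the training rows and evaluated on the held-out rows, so that every observation is predicted exactly once by a model that has not seen it.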
The resulting Global Report provides a variety of metrics, including precision and R².

Sampling Method: K-Folds
Learning Algorithm: Augmented Naive Bayes
Target: Sales

Value                <=207556.406  <=233877.375  <=259145.594  >259145.594
Gini Index           66%           41.75%        38.03%        69.52%
Relative Gini Index  75.25%        62.92%        63.76%        80.63%
Mean Lift            2.49          1.64          1.52          2.49
Relative Lift Index  81.50%        78.29%        80.11%        84.09%

Relative Gini Global Mean: 70.64%
Relative Lift Global Mean: 81%
Total Precision: 67.37%
R: 0.76104342242
R2: 0.57918709081

Occurrences
Value               <=207556.406 (53)  <=233877.375 (142)  <=259145.594 (172)  >259145.594 (59)
<=207556.406 (56)   37                 18                  1                   0
<=233877.375 (124)  15                 86                  22                  1
<=259145.594 (213)  1                  38                  140                 34
>259145.594 (33)    0                  0                   9                   24

Reliability
Value               <=207556.406 (53)  <=233877.375 (142)  <=259145.594 (172)  >259145.594 (59)
<=207556.406 (56)   66.07%             32.14%              1.79%               0%
<=233877.375 (124)  12.10%             69.35%              17.74%              0.81%
<=259145.594 (213)  0.47%              17.84%              65.73%              15.96%
>259145.594 (33)    0%                 0%                  27.27%              72.73%

Precision
Value               <=207556.406 (53)  <=233877.375 (142)  <=259145.594 (172)  >259145.594 (59)
<=207556.406 (56)   69.81%             12.68%              0.58%               0%
<=233877.375 (124)  28.30%             60.56%              12.79%              1.69%
<=259145.594 (213)  1.89%              26.76%              81.40%              57.63%
>259145.594 (33)    0%                 0%                  5.23%               40.68%
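The three matrices in the report are related by simple normalizations, which the following sketch reproduces from the occurrence counts. It is an illustrative recomputation, not BayesiaLab output: Reliability corresponds to dividing each count by its row total, Precision to dividing by its column total, and Total Precision to the share of counts on the diagonal. The function names are our own.

```python
# Occurrence counts as reported (row totals 56, 124, 213, 33;
# column totals 53, 142, 172, 59).
occurrences = [
    [37, 18, 1, 0],
    [15, 86, 22, 1],
    [1, 38, 140, 34],
    [0, 0, 9, 24],
]


def row_normalize(matrix):
    """Divide every cell by the total of its row."""
    return [[cell / sum(row) for cell in row] for row in matrix]


def column_normalize(matrix):
    """Divide every cell by the total of its column."""
    column_totals = [sum(column) for column in zip(*matrix)]
    return [
        [cell / total for cell, total in zip(row, column_totals)]
        for row in matrix
    ]


total = sum(sum(row) for row in occurrences)
correct = sum(occurrences[i][i] for i in range(len(occurrences)))

reliability = row_normalize(occurrences)
precision = column_normalize(occurrences)
total_precision = correct / total
```

The diagonal holds 287 of the 426 observations, which yields the Total Precision of 67.37% shown in the report.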
Even without further comparison, the reported values appear reasonable and suggest that we can proceed
with evaluating this network.
Model Evaluation
We have accepted the network as a plausible representation of this domain and will now interpret the structure we obtained. To make it easier to understand the structure, we will first apply one of BayesiaLab's automatic layout algorithms, which quite literally "disentangles" the network and therefore provides a clearer picture. Selecting View | Automatic Layout achieves this; alternatively, press the P key as a shortcut.
The "naive" and the "augmented" parts of this network, shown in blue and black respectively, are now much easier to tell apart in this layout.
Given that the naive structure was given by definition, only the presence or absence of black arcs provides information about the existence of relationships between the predictors. Much more can be understood when we examine the magnitude and the sign of all relationships in the network.
Although correlation, as we will later emphasize, is not a central metric for network analysis in BayesiaLab, we will use it for a first look, especially since all readers will be familiar with this measure. Selecting Analysis | Graphic | Pearson's Correlation provides this information directly in the network graph.
The colors of the arcs indicate the sign of the relationship and the arc labels provide the correlation value.
Many of the shown relationships seem intuitive, for instance that Number of Stores and both kinds of Advertising have a positive association with Sales. Equally plausible is the fact that Temperature is associated with Sales (although one of the co-authors of this paper believes that one can eat ice cream rain or shine). The negative association between Competitive Adv. and Sales also seems expected. Less clear is the negative correlation between Sales and Weekday, but the small value suggests either a very weak link or perhaps a nonlinear relationship.
Mutual Information
Given that correlation is a strictly linear metric, its ability to characterize all these relationships is inherently limited. We will now turn to Mutual Information as a new measure, which can help overcome this limitation.
In contrast to correlation, Mutual Information does not reflect the sign of the relationship. However, this measure captures the strength of relationships between variables, even if they are highly nonlinear.
More specifically, Mutual Information I(X,Y) measures how much (on average) the observation of a random variable Y tells us about the uncertainty of X, i.e. by how much the entropy of X is reduced if we have information on Y. Mutual Information is a symmetric metric, which reflects the uncertainty reduction of X by knowing Y as well as of Y by knowing X.
In our example, knowing the value of Weekday reduces the uncertainty of the value of Sales by 0.4802 bits on average, which corresponds to 26.3% of the entropy of Sales (shown in blue, in the opposite direction of the arc). Conversely, knowing Sales reduces the uncertainty of Weekday by the same 0.4802 bits, which corresponds to 17.11% of the entropy of Weekday (shown in red, in the opposite direction of the arc). It is interesting to see that, by looking at Mutual Information, Weekday and Sales now have a very strong relationship, whereas previously the correlation coefficient was near zero.
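The definitions above can be made concrete with a short sketch. It computes entropy and Mutual Information from a joint probability table and then relates the reported 0.4802 bits to the entropies of the two variables. Two assumptions are made for this illustration: the distribution of Sales is approximated by the state counts (53, 142, 172, 59) from the occurrence table of the cross-validation report, and Weekday is taken to be uniform over seven days. The function names are our own.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def mutual_information(joint):
    """I(X,Y) in bits from a joint probability table joint[x][y]."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(column) for column in zip(*joint)]
    return sum(
        p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
        for i, row in enumerate(joint)
        for j, p_xy in enumerate(row)
        if p_xy > 0
    )


# Two toy tables: perfect dependence and independence.
mi_dependent = mutual_information([[0.5, 0.0], [0.0, 0.5]])
mi_independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])

# Relative Mutual Information for Weekday and Sales (assumptions as stated above).
sales_counts = [53, 142, 172, 59]
n = sum(sales_counts)
h_sales = entropy([count / n for count in sales_counts])
h_weekday = math.log2(7)  # assumes a uniform distribution over 7 weekdays
mi_bits = 0.4802          # value reported in the text

share_of_sales_entropy = mi_bits / h_sales
share_of_weekday_entropy = mi_bits / h_weekday
```

Under these assumptions, 0.4802 bits amount to roughly 26.3% of the entropy of Sales (about 1.83 bits) and roughly 17.1% of the entropy of Weekday (about 2.81 bits). The symmetry of Mutual Information is visible in the fact that the same number of bits is divided by two different entropies.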
To explore the nature of this relationship further, we can perform Target Mean Analysis with Sales and Weekday (Analysis | Visual | Target Mean Analysis | Standard).
This prompts us to select the way we want to examine this relationship. In this context, it seems appropriate to look at the delta mean of the target as a function of the mean of Weekday. Here, the value of Weekday is simply Monday through Sunday recoded into discrete numerical states, 1 through 7.
The resulting plot confirms the previous hypothesis of nonlinearity.
For instance, we can interpret this as follows: given that Weekday=Friday, we observe that Sales reaches its highest value. Furthermore, we can infer that, given Weekday=Sunday, Sales has its lowest value. We can speculate that consumers perhaps purchase more ice cream on Fridays, in preparation for leisure activities over the weekend, than during the rest of the week.
Returning to our interpretation of Mutual Information, it is now obvious why Weekday reduces the uncertainty of Sales by over 25%. There is apparently an intra-week seasonality. Another interpretation of Mutual Information is "importance", and we can use Analysis | Report | Target Analysis | Correlations with the Target Node to obtain an overview of the importance of all nodes in the network.
Node significance with respect to the information gain brought by the node to the knowledge of Sales
Node              Mutual       Mutual           Relative      Mean Value  G-test    Degrees of  p-value  G-test    Degrees of      p-value
                  information  information (%)  significance                        Freedom              (Data)    Freedom (Data)  (Data)
Weekday           0.4802       26.30%           1             4.0047      283.5916  18          0.00%    283.5916  18              0.00%
Competitive Adv.  0.1293       7.08%            0.2692        514.9959    76.332    9           0.00%    76.332    9               0.00%
Trad. Adv.        0.0835       4.57%            0.1739        483.8701    49.307    9           0.00%    49.307    9               0.00%
No. of Stores     0.081        4.44%            0.1686        3096.5023   47.8213   9           0.00%    47.8213   9               0.00%
Online Adv.       0.0764       4.18%            0.159         181.6759    45.0943   9           0.00%    45.0943   9               0.00%
Temperature       0.0592       3.24%            0.1233        14.5441     34.9654   9           0.01%    34.9654   9               0.01%
Mapping
As of version 5.1, BayesiaLab offers a Mapping function, which can visualize several of the above metrics, such as Correlation and Mutual Information, in a single graphic (Analysis | Visual | Mapping):

This screenshot shows Mapping with the following metrics displayed:
• Node Analysis: Mutual Information with the Target Node
• Arc Analysis: Pearson's Correlation
Total Effects on Target
For the following analysis we need to emphasize that we perform observational inference, i.e. that none of
the relationships implies a causal relationship with Sales.
Total Effects on Target (Analysis | Report | Target Analysis | Total Effects on Target):
Total Effect is a linearized measure that shows the impact[7] of a one-unit change of each Node on the Target.[8]
Total Effects on Target Sales
Node              Standardized  Total Effect  G-test    Degrees of  p-value  G-test    Degrees of      p-value
                  Total Effect                          Freedom              (Data)    Freedom (Data)  (Data)
Competitive Adv.  -0.3456       -32.0159      76.332    9           0.00%    76.332    9               0.00%
Trad. Adv.        0.2567        6.2351        49.307    9           0.00%    49.307    9               0.00%
No. of Stores     0.1679        48.0703       47.8213   9           0.00%    47.8213   9               0.00%
Online Adv.       0.1482        22.4707       45.0943   9           0.00%    45.0943   9               0.00%
Temperature       0.1291        323.6881      34.9654   9           0.01%    34.9654   9               0.01%
Weekday           -0.0501       -583.078      283.5916  18          0.00%    283.5916  18              0.00%
[7] Once again, "impact" should not be interpreted as a causal effect, but rather as an associated change in value.
[8] The effect of a one-unit change of a node is computed at its mean value.
We can speculate that some of these nodes may cause Sales, but from this table we can only infer associa-
tion, not causation.
This can be illustrated by performing the computation manually in the Monitor Panel. By default, the Monitors show the marginal frequency distributions of the states of the nodes plus the mean value (expected value) of those distributions:
As stated above, the Total Effect is computed on the basis of a one-unit change of each node. We can simulate this by setting Competitive Adv. to a new mean value, i.e. changing its mean from 514.996 to 515.996. It must be noted that there are infinitely many distributions that achieve a mean value shifted by +1. BayesiaLab supports the analyst by choosing the particular distribution (of all possible distributions) that is closest to the original distribution while achieving the targeted mean value. We simply need to right-click on the Monitor for Competitive Adv. and select Distribution for Target Value/Mean.
This prompts us to type in our desired value, i.e. 515.996, to reflect the one-unit change.
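To illustrate what "closest distribution with a given mean" can mean, the following sketch shifts the mean of a discrete distribution by one unit while staying as close as possible to the original in the Kullback-Leibler sense. This is an assumption made for illustration; the text does not specify which distance BayesiaLab uses. Under that assumption the solution has the form q_i ∝ p_i·exp(λ·x_i), and λ can be found by bisection. The state values and probabilities are made up and are not those of Competitive Adv.

```python
import math


def tilt(values, probabilities, lam):
    """Exponentially tilted distribution q_i proportional to p_i * exp(lam * x_i)."""
    exponents = [lam * v for v in values]
    top = max(exponents)  # subtract the maximum to avoid overflow
    weights = [p * math.exp(e - top) for p, e in zip(probabilities, exponents)]
    total = sum(weights)
    return [w / total for w in weights]


def mean(values, probabilities):
    return sum(v * p for v, p in zip(values, probabilities))


def shift_mean(values, probabilities, target_mean):
    """Minimum-KL distribution with the requested mean (all p_i assumed > 0)."""
    if not min(values) < target_mean < max(values):
        raise ValueError("target mean must lie strictly inside the value range")
    low, high = -1.0, 1.0
    # Widen the bracket until the tilted means enclose the target.
    while mean(values, tilt(values, probabilities, low)) > target_mean:
        low *= 2
    while mean(values, tilt(values, probabilities, high)) < target_mean:
        high *= 2
    # The tilted mean increases with lam, so bisection converges.
    for _ in range(200):
        mid = (low + high) / 2
        if mean(values, tilt(values, probabilities, mid)) < target_mean:
            low = mid
        else:
            high = mid
    return tilt(values, probabilities, (low + high) / 2)


# Hypothetical node with four states and a mean of 510.
state_values = [300.0, 450.0, 600.0, 750.0]
original = [0.2, 0.35, 0.3, 0.15]
shifted = shift_mean(state_values, original, mean(state_values, original) + 1.0)
```

The shifted distribution differs only slightly from the original one: a little probability mass moves from the lower states to the higher states, just enough to raise the mean by one unit.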
We can now observe the impact on Sales as a result of changing Competitive Adv. by one unit. The resulting delta of -32.104 is shown in parentheses. This confirms (within the possible numeric precision) the value reported in the Total Effects table.
However, you will notice that not only Sales was affected, but also most of the other nodes, albeit with very small changes. This means that, given that we observe a one-unit change of Competitive Adv., we will also observe a change in other nodes, which are connected to the target and may thus contribute to a change in the target. This reflects the Bayesian network property of omnidirectional inference. As such, the one-unit change in Competitive Adv. is not an orthogonal impulse, which is very important to bear in mind for interpretation purposes.
Direct Effects on Target
However, knowing the exclusive contribution of every single component is essential, as this exercise is ultimately about determining the optimum marketing mix of different instruments.
We will now briefly introduce the Likelihood Matching (LM) algorithm, which was originally implemented in the BayesiaLab software package for "fixing" probability distributions of an arbitrary set of variables, which in turn makes it easy to define complex sets of soft evidence. The LM algorithm searches for a set of likelihood distributions which, when applied to the Joint Probability Distribution (JPD) encoded by the Bayesian network, yields the posterior probability distributions defined (as constraints) by the user.
This allows us to perform matching across all covariates while taking into account all their interactions, and thus to estimate the exclusive effect of a node on the target.
Utilizing the LM algorithm, we can now compute Direct Effects (Analysis | Report | Target Analysis | Direct Effects on Target).
The resulting table provides us with Standardized Direct Effect, Direct Effect, Contribution, Elasticity, etc.
Direct Effects on Target Sales
Node              Value/Mean  Standardized   Direct Effect  Contribution  Elasticity
                              Direct Effect
No. of Stores     3,096.38    0.1832         32.9893        27.49         17.30
Trad. Adv.        486.1939    0.1639         4.0222         24.63         17.37
Competitive Adv.  313.0276    -0.1393        -12.9373       20.71         -10.39
Online Adv.       181.8383    0.0691         10.4733        10.26         6.33
Temperature       14.3466     0.0643         161.8894       9.38          3.00
Weekday           4.0047      -0.0494        -373.2306      7.34          -4.13
Interestingly, we now have a different rank order of effects compared to the Total Effects shown earlier. The Direct Effect column can now be interpreted as the exclusive direct effect of a unit change while all other node distributions are maintained (matched). For instance, a one-unit change in No. of Stores is linked to +73.5 delta in Sales.
The Contribution column provides a breakdown of the individual contributions in percent (summing up to 100%). This means that an observed change in Sales (without any other observations) should be attributed to the individual nodes as per the Contribution values.
Elasticity
Another important measure in this context is Elasticity. The definition of Elasticity is based on the mathematical notion of point elasticity. In general, the "x-elasticity of y", also called the "elasticity of y with respect to x", is:
\[ E_{y,x} = \frac{\partial \ln y}{\partial \ln x} = \frac{\partial y}{\partial x} \cdot \frac{x}{y} = \frac{\%\Delta y}{\%\Delta x} \]
The values for Elasticity are automatically provided as part of the Direct Effects table.
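As a numerical illustration of the point elasticity formula, the sketch below approximates (dy/dx)·(x/y) with a central finite difference. The response function is hypothetical and chosen so that the true elasticity is known: for y = a·x^b the elasticity equals b at every x. This example is independent of how BayesiaLab derives the Elasticity column, which the text does not detail.

```python
def point_elasticity(f, x, relative_step=1e-4):
    """Approximate (dy/dx) * (x / y) at x with a central finite difference."""
    dx = x * relative_step
    slope = (f(x + dx) - f(x - dx)) / (2 * dx)
    return slope * x / f(x)


def response(x):
    """Hypothetical constant-elasticity response: y = 1000 * x^0.2."""
    return 1000.0 * x ** 0.2


elasticity_at_500 = point_elasticity(response, 500.0)
elasticity_at_50 = point_elasticity(response, 50.0)
```

An elasticity of 0.2 means that a 1% increase in x is associated with an increase of approximately 0.2% in y, regardless of the units in which x and y are measured.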
Caveats
While the Direct Effects Analysis is a convenient tool for the analyst, a number of caveats must be added. First, the original assumption of linearity (see Total Effects) around the mean values of the nodes is still in place. We are, in fact, interpreting small changes around the mean values, which may or may not reflect the impact of much larger changes in inputs, which are often nonlinear. Secondly, the LM algorithm may not be able to match all covariate distributions across all values of all nodes. BayesiaLab will report via the Console if an acceptable convergence cannot be achieved. Additionally, a red warning symbol will flash in the bottom right-hand corner of the screen to alert the analyst.
Target Mean Analysis by Direct Effects
In order to overcome the limitations of the "linearized" view of this domain, we proceed to the Target Mean Analysis by Direct Effects. As opposed to merely assessing the impact of a small change around the mean of a node, we now perform this analysis, with the LM algorithm, across the entire range of values of each node (Analysis | Visual | Target Mean Analysis | Direct Effects).
This means that, for all values of Online Advertising, BayesiaLab will attempt to find matching distributions for all other covariates. So, for each value of each node, matching will be performed for all other covariate nodes.
Those familiar with statistical matching techniques will know that this represents a significant computational effort, even though it seemingly happens in the background in BayesiaLab.
The above plot shows the Direct Effects curves, which can be interpreted as the response functions of the
nodes with regard to the Target. The x-values are normalized so that the nodes, which all have different
scales, can be meaningfully compared.
We can immediately see that No. of Stores and Competitive Adv. appear to be nearly linear with Sales, with
a positive and a negative sign respectively. The curves for Trad. Adv., Online Adv., and Temperature show
nonlinearities, however. Of particular interest is that Trad. Adv. plateaus around 75% of its maximum
value and then declines beyond that point.
Marketing Mix Optimization
Assuming a causal direction from all nodes towards Sales, we would possibly conclude that we should increase Trad. Adv. up to the 75% level or, conversely, decrease it to the 75% level, if we had already exceeded that point. Furthermore, No. of Stores, Online Adv., and Temperature should be maximized while Competitive Adv. should be minimized. Optimization of Sales would be rather straightforward that way.
It is self-evident that Temperature and Weekday are not subject to anybody's manipulation and that Competitive Adv. is not under our control either. Knowing that the retail outlets, from boardwalk kiosks to supermarkets, are all independently owned and operated, we further determine that No. of Stores, i.e. the number of retail outlets open on a given day, is beyond our control, too. To deal with this in the context of optimization, we have the ability in BayesiaLab to specify which nodes are controllable by the agent on whose behalf we perform the optimization. This can be done via the Cost Editor, which allows setting the non-controllable nodes to "not observable". The Cost Editor is available from the contextual menu that can be activated by right-clicking on the Graph Panel background.
This new designation is also reflected in the node colors, as non-observable nodes are now shown in a light shade of purple.
We have now defined, by exclusion of all the others, what marketing instruments we can, in fact, manage. However, we need to recognize another important issue. While we cannot control certain nodes at will, some of them may be influenced by our actions. Weekday and Temperature are clearly out of the question, but Competitive Adv. and No. of Stores could be influenced by our marketing efforts or even by the Target Node. Competitors will naturally react with their advertising to our advertising, and some retail outlets, such as ice cream trucks, will adjust their business hours as a function of Sales.
BayesiaLab provides us with a convenient way to handle such "responsive", yet non-controllable nodes. We can assign to them the predefined class Non_Confounder (right-click on node: Properties | Classes | Add).
If we repeat the Direct Effects on Target analysis on that basis, we will see that only the controllable variables are now included.
Direct Effects on Target Sales
Node         Value/Mean  Standardized   Direct Effect  Contribution  Elasticity
                         Direct Effect
Trad. Adv.   483.8701    0.1748         4.2464         64.01%        18.33%
Online Adv.  181.6759    0.0983         14.9004        35.99%        9.04%
The same applies when we perform Target Mean Analysis by Direct Effects. BayesiaLab now only displays
the response curves of the variables that we can control.
These response curves can now serve as the basis for the optimization of the Target Node. In essence, we
need to determine which combination of values of our nodes, i.e. Trad. Adv. and Online Adv., would yield
the maximum value for Sales.
Resource Allocation Optimization
In most real-world applications, this optimization problem is a question of the optimum allocation of limited resources. Hence, in BayesiaLab, this marketing mix problem falls under Resource Allocation Optimization (Analysis | Report | Target Analysis | Resource Allocation Optimization):
Resource Allocation Optimization is a variant of Target Dynamic Profile. Resource Allocation Optimization simulates the to-be-optimized actions within a domain as Soft Evidence. These actions are constrained by the value specified in Maximum Resources Allowed, i.e. the overall budget computed from the per-unit cost associated with each variable.
The option Minimize Used Resources First specifies that the optimization process begins its search by attempting to save resources, i.e. decreasing the local resources of nodes while optimizing the search criterion. The saved resources can then be reallocated in favor of the remaining variables.
You can now specify the search domain for each variable, either in terms of the Mean, the Domain, or the Progress Margin.
The Variation Editor now includes columns displaying the mean values for each node (highlighted in gray):
• The Current Mean corresponds to the marginal distribution of the node.
• The Minimum Mean is computed by using the Negative Variation and the selected Type of Variation (mean, domain, progression margin).
• The Maximum Mean is computed by using the Positive Variation and the selected Type of Variation (mean, domain, progression margin).
In the context of our example, we define the range of variations as ±75% versus the Current Mean.
Two criteria are available for stopping the optimization process:
• Maximum Resources Allowed: by default, this value is set to the sum of the resources used with the current marginal probability distributions of the drivers. Here, the value is 665.546, i.e. the sum of the values of Online Adv. and Trad. Adv.
• Minimum Joint Probability: the search is stopped when the joint probability of the Soft Evidence set in the optimization drops below the specified threshold.
In the context of our example, we wish to determine the optimum allocation of $900/day[9] in marketing budget. Therefore, we set Maximum Resources Allowed to 900.
[9] This assumes that the variables were both measured in Dollars on the same scale.
Given these parameters, plus the constraints, BayesiaLab now computes the optimum combination of the
two variables under study and reports the results in a table.
The column Initial Value/Mean shows the original, marginal values of the variables.
The column Final Value/Mean displays the optimized values.
In our case, 1 unit of measurement equals 1 Dollar; therefore, the Cost column represents the difference between the initial and the final values of the variables.
The Resources column shows the progression from the original level of the resources to the final maximum level, 895.6. This is as close as we can get to the desired level of 900 within this numerical optimization.
Perhaps the column of greatest interest is Value/Mean, which represents the Target Node. This means that the variable Sales has increased from 234,936 to 236,401.
We visualize the before-and-after comparison of the marketing mix in the chart below.
[Chart: marketing mix before and after optimization, vertical axis from $0 to $1,000. Initial Value/Mean: Online Adv. $182, Trad. Adv. $484. Final Value/Mean: Online Adv. $272, Trad. Adv. $624.]
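The structure of such a constrained allocation problem can be sketched with a small exhaustive search. This is a toy stand-in only: BayesiaLab searches over Soft Evidence on the learned network, whereas the sketch below uses a made-up response surface with diminishing returns. The coefficients, the rounded bounds (roughly ±75% around means of about 484 and 182) and all function names are assumptions for illustration.

```python
import math


def sales_response(trad, online):
    """Hypothetical response surface; not derived from the paper's network."""
    return 200_000 + 1_500 * math.sqrt(trad) + 900 * math.sqrt(online)


def grid(low, high, step):
    """Evenly spaced values from low to high (inclusive when the step fits)."""
    count = int((high - low) / step)
    return [low + i * step for i in range(count + 1)]


def allocate(budget, bounds, response, step=1.0):
    """Best (trad, online) pair on the grid with trad + online <= budget."""
    best_value, best_mix = float("-inf"), None
    for trad in grid(*bounds["trad"], step):
        for online in grid(*bounds["online"], step):
            if trad + online > budget:
                continue  # violates the resource constraint
            value = response(trad, online)
            if value > best_value:
                best_value, best_mix = value, (trad, online)
    return best_mix, best_value


# Assumed search ranges and the daily budget of 900 from the example.
bounds = {"trad": (121.0, 847.0), "online": (45.0, 318.0)}
best_mix, best_sales = allocate(900.0, bounds, sales_response)
baseline_sales = sales_response(484.0, 182.0)
```

Because the toy response increases in both instruments, the search spends the entire budget, and the split between the two instruments is determined by where their marginal returns are equal. With nonmonotonic response curves, such as the one described for Trad. Adv., the best allocation could leave part of the budget unused.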
Summary
We have demonstrated a practical approach for generating marketing mix models, and for evaluating and optimizing
them. Bayesian networks provide a practical framework for such modeling efforts, and BayesiaLab offers a wide range
of supporting functions.
Given the intuitive nature of Bayesian networks, all stakeholders in the marketing process can interpret BayesiaLab-
generated models and thus examine their plausibility. With that, we address the often-heard objection that marketing
mix models are like a black box, which cannot be validated independently. Most importantly, we have introduced the
concept of causality into marketing mix optimization, thus overcoming the severe limitations of observational inference
in the context of interventions.
Appendix
Framework: The Bayesian Network Paradigm[10]
Acyclic Graphs & Bayes' Rule
Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.
Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes' theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

\[ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \]
In Bayes' theorem, each probability has a conventional name:
P(A) is the prior probability (or "unconditional" or "marginal" probability) of A. It is "prior" in the sense
that it does not take into account any information about B; however, the event B need not occur after
event A. In the nineteenth century, the unconditional probability P(A) in Bayes's rule was called the
"antecedent" probability; in deductive logic, the antecedent set of propositions and the inference rule imply
consequences. The unconditional probability P(A) was called "a priori" by Ronald A. Fisher.
P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is
derived from or depends upon the specified value of B.
P(B|A) is the conditional probability of B, given A. It is also called the likelihood.
P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event
A given B is related to the converse conditional probability of B given A.
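As a minimal numeric sketch of the theorem, the following Python snippet computes a posterior from a prior and two likelihoods. The function name, the events, and all probabilities are hypothetical and chosen only for illustration; P(B) is expanded with the law of total probability.

```python
def posterior(prior_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) = P(B|A) P(A) / P(B) for a binary event A."""
    # Normalizing constant: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    if p_b == 0.0:
        raise ValueError("P(B) must not equal zero")
    return p_b_given_a * prior_a / p_b


# Hypothetical example: A = "customer buys", B = "customer saw the ad".
p_buy_given_ad = posterior(prior_a=0.10, p_b_given_a=0.60, p_b_given_not_a=0.20)
print(round(p_buy_given_ad, 4))
```

With these assumed inputs, observing B raises the probability of A from the prior of 0.10 to a posterior of 0.25, because P(B) works out to 0.24 and P(B|A)P(A) to 0.06.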
The initial development of Bayesian networks in the late 1970s was motivated by the need to model the
top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for
bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of
Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier,
ad hoc rule-based schemes.
[10] Adapted from Pearl (2000), used with permission.
The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the
gender of a patient, a feature of an object, the occurrence of an event) and the links represent statistical
(informational) or causal dependencies among the variables. The dependencies are quantified by conditional
probabilities for each node given its parents in the network. The network supports the computation of the
posterior probabilities of any subset of variables given evidence about any other subset.
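To make this concrete, here is a small Python sketch of a three-node network in which two parent nodes point to one child node. The variable names (ad, promo, sales) and all probabilities are hypothetical and are not taken from the dataset in this paper. The posterior is computed by brute-force enumeration of the joint distribution, which is practical only for very small networks and is not meant to represent BayesiaLab's inference algorithms.

```python
from itertools import product

# Hypothetical network: ad -> sales <- promo (all variables binary).
p_ad = {True: 0.3, False: 0.7}          # P(ad)
p_promo = {True: 0.2, False: 0.8}       # P(promo)
p_sales_high = {                        # P(sales = high | ad, promo)
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}

NAMES = ("ad", "promo", "sales")


def joint(ad: bool, promo: bool, sales: bool) -> float:
    """Joint probability as the product of each node's conditional probability."""
    p_s = p_sales_high[(ad, promo)]
    return p_ad[ad] * p_promo[promo] * (p_s if sales else 1.0 - p_s)


def query(target: str, evidence: dict) -> float:
    """Return P(target = True | evidence) by summing over all consistent states."""
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # state contradicts the evidence
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


# Inference works in both directions along the links:
print(round(query("sales", {"ad": True}), 3))   # from cause to effect
print(round(query("ad", {"sales": True}), 3))   # from effect back to cause
```

The same `query` function answers both questions, which illustrates the bidirectional inference described above: evidence can be set on any subset of nodes and the posterior read off any other node.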
Compact Representation of the Joint Probability Distribution
The central paradigm of probabilistic reasoning is to identify all relevant variables x1, . . . , xN in the
environment [i.e. the domain under study], and make a probabilistic model p(x1, . . . , xN) of their
interaction [i.e. represent the variables' joint probability distribution].
Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly
represent the joint probability distribution of all variables.
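The factorization can be written out as follows. This is the standard form for a directed acyclic graph; the notation pa(x_i) for the parents of x_i is introduced here for illustration and does not appear in the original text.

$$ p(x_1, \dots, x_N) = \prod_{i=1}^{N} p\big(x_i \mid \mathrm{pa}(x_i)\big) $$

As a rough illustration of the savings, assume N = 20 binary variables. A full joint probability table requires 2^20 - 1 = 1,048,575 free parameters. If each node has at most three parents, each conditional probability table has at most 2^3 = 8 free parameters, so the whole network needs at most 20 x 8 = 160.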
Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and
subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability,
combined with Bayes' rule, make for a complete reasoning system, one which includes traditional deductive
logic as a special case. (Barber, 2012)
References
Allenby, Greg M., and Peter E. Rossi. Marketing Models of Consumer Heterogeneity. Journal of Econometrics 89,
no. 1–2 (November 26, 1998): 57–78.
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011.
Bell, D.R., J. Chiang, and V. Padmanabhan. The Decomposition of Promotional Response: An Empirical
Generalization. Marketing Science (1999): 504–526.
Bowman, Douglas. Market Response and Marketing Mix Models: Trends and Research Opportunities. Boston: Now,
2010.
Chandy, R.K., G.J. Tellis, D.J. MacInnis, and P. Thaivanich. What to Say When: Advertising Appeals in Evolving
Markets. Journal of Marketing Research 38, no. 4 (2001): 399–414.
Conrady, Stefan, and Lionel Jouffe. Driver Analysis & Product Optimization, A Case Study from the Perfume
Industry, December 1, 2010. http://www.bayesia.us/index.php/driver-analysis.
Conrady, Stefan, and Lionel Jouffe. Missing Values Imputation - A New Approach to Missing Values Processing
with Bayesian Networks, January 4, 2012. http://bayesia.us/index.php/missingvalues.
Conrady, Stefan, and Lionel Jouffe. Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks
(2010). http://bayesia.us/index.php/market-share-simulation.
Dekimpe, M.G., and D.M. Hanssens. Persistence Modeling for Assessing Marketing Strategy Performance. Erasmus
Research Institute of Management, Erasmus University, 2003.
Dekimpe, M.G., and D.M. Hanssens. Time-series Models in Marketing: Past, Present and Future. International
Journal of Research in Marketing 17, no. 2–3 (2000): 183–193.
Dekimpe, Marnik G., and Dominique M. Hanssens. The Persistence of Marketing Effects on Sales. Marketing Science
14, no. 1 (January 1, 1995): 1–21.
Dorfman, Robert, and Peter O. Steiner. Optimal Advertising and Optimal Quality. The American Economic Review
44, no. 5 (December 1, 1954): 826–836.
Dotson, Jeff, and Stefan Conrady. Investigating the Dynamic Impact of Advertising Through Online Search and
Online Sales. Presented at the Advanced Research Techniques Forum, Palm Desert, CA, June 7, 2011.
Erasmus, A.C., E. Boshoff, and G.G. Rousseau. Consumer Decision-making Models Within the Discipline of Consumer
Science: a Critical Approach. Journal of Family Ecology and Consumer Sciences/Tydskrif Vir Gesinsekologie En
Verbruikerswetenskappe 29 (2010).
Freo, M. The Impact of Sales Promotions on Store Performance: a Structural Vector Autoregressive (SVAR)
Approach (2005).
Gatignon, H., and D.M. Hanssens. Modeling Marketing Interactions with Application to Salesforce Effectiveness.
Journal of Marketing Research (1987): 247–257.
Gelman, Andrew, and Jennifer Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. 1st ed.
Cambridge University Press, 2006.
Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov. The MIDAS Touch: Mixed Data Sampling Regression Models.
Anderson Graduate School of Management, UCLA, June 2004. http://ideas.repec.org/p/cdl/anderf/4852.html.
Gowrisankaran, G., and M. Rysman. Dynamics of Consumer Demand for New Durable Goods. National Bureau of
Economic Research Cambridge, Mass., USA, 2009.
Gupta, S., D. Hanssens, B. Hardie, W. Kahn, V. Kumar, N. Lin, N. Ravishanker, and S. Sriram. Modeling Customer
Lifetime Value. Journal of Service Research 9, no. 2 (2006): 139.
Hagmayer, Y., S.A. Sloman, D.A. Lagnado, and M.R. Waldmann. Causal Reasoning Through Intervention. Causal
Learning: Psychology, Philosophy, and Computation (2007): 86–100.
Hanssens, Dominique M., Peter S. H. Leeflang, and Dick R. Wittink. Market Response Models and Marketing
Practice. Applied Stochastic Models in Business and Industry 21 (July 2005): 423–434.
Hanssens, Dominique. Market Response Models: Econometric and Time Series Analysis. 2. ed. Dordrecht: Kluwer
Academic Publishers, 2003.
Heckerman, D. A Tutorial on Learning with Bayesian Networks. Innovations in Bayesian Networks (2008): 33–82.
Hoover, K., and S. Demiralp. Searching for the Causal Structure of a Vector Autoregression. Feedback 212 (1842):
338.
Joseph, J. Understanding Advertising Adstock Transformations (2006).
Joseph, J.V. Non-Stationarity Effects in Causal Sales Forecasting Models (n.d.).
Koppelman, F.S., and C. Bhat. A Self Instructing Course in Mode Choice Modeling: Multinomial and Nested Logit
Models. Prepared for US Department of Transportation Federal Transit Administration (2006).
Manchanda, Puneet, Peter E. Rossi, and Pradeep K. Chintagunta. Response Modeling with Non-Random Marketing
Mix Variables. SSRN eLibrary (January 2003). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=371360.
Manchanda, Puneet, Peter E. Rossi, and Pradeep K. Chintagunta. Response Modeling with Nonrandom Marketing-Mix
Variables. Journal of Marketing Research 41, no. 4 (November 1, 2004): 467–478.
Meyer, R., T. Erdem, F. Feinberg, I. Gilboa, W. Hutchinson, A. Krishna, S. Lippman, et al. Dynamic Influences on
Individual Choice Behavior. Marketing Letters 8, no. 3 (1997): 349–360.
Morgan, Neil A., Rebecca J. Slotegraaf, and Douglas W. Vorhies. Linking Marketing Capabilities with Profit
Growth. International Journal of Research in Marketing 26, no. 4 (December 2009): 284–293.
Naik, P.A., K. Raman, and R.S. Winer. Planning Marketing-mix Strategies in the Presence of Interaction Effects.
Marketing Science (2005): 25–34.
Nijs, V.R., M.G. Dekimpe, J.B.E.M. Steenkamp, and D.M. Hanssens. The Category-demand Effects of Price
Promotions. Marketing Science (2001): 1–22.
Korenok, Oleg, George E. Hoffer, and Edward L. Millner. Non-Price Determinants of Automotive Demand: Restyling
Matters Most. VCU School of Business, Department of Economics, September 2009.
http://ideas.repec.org/p/vcu/wpaper/0903.html.
Pauwels, Koen, Imran Currim, Marnik G. Dekimpe, Dominique M. Hanssens, Natalie Mizik, Eric Ghysels, and Prasad
Naik. Modeling Marketing Dynamics by Time Series Econometrics. Marketing Letters 15 (January 1, 2005):
167–183.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot. Bayesian Networks: A Practical Guide to Applications. 1st ed.
Wiley, 2008.
Ramaswami, Sridhar N., Rajendra K. Srivastava, and Mukesh Bhargava. Market-based Capabilities and Financial
Performance of Firms: Insights into Marketing's Contribution to Firm Value. Journal of the Academy of
Marketing Science 37 (October 2, 2008): 97–116.
Robinson, William T. Marketing Mix Reactions to Entry. Marketing Science 7, no. 4 (October 1, 1988): 368–385.
Rossi, Peter. Bayesian Statistics and Marketing. Hoboken NJ: Wiley, 2005.
Rossi, Peter E., and Greg M. Allenby. Bayesian Statistics and Marketing. Marketing Science 22, no. 3 (July 1, 2003):
304–328.
Rossi, Peter E., Robert E. McCulloch, and Greg M. Allenby. The Value of Purchase History Data in Target
Marketing. Marketing Science 15, no. 4 (January 1, 1996): 321–340.
Rubin, Donald B. Matched Sampling for Causal Effects. 1st ed. Cambridge University Press, 2006.
Sethuraman, R., and G.J. Tellis. An Analysis of the Tradeoff Between Advertising and Price Discounting. Journal of
Marketing Research (1991): 160–174.
Sethuraman, R., G.J. Tellis, and R.A. Briesch. How Well Does Advertising Work? Generalizations from Meta-Analysis
of Brand Advertising Elasticities. Journal of Marketing Research 48, no. 3 (2011): 457–471.
Silva-Risso, J., W. V. Shearin, I. Ionova, A. Khavaev, and D. Borrego. Chrysler and J. D. Power: Pioneering Scientific
Price Customization in the Automobile Industry. Interfaces 38 (January 1, 2008): 26–39.
Slotegraaf, R. J. The Paradox of a Marketing Planning Capability. Journal of the Academy of Marketing Science 32
(October 1, 2004): 371–385.
Song, L., M. Kolar, and E.P. Xing. Time-Varying Dynamic Bayesian Networks. Advances in Neural Information
Processing Systems 22 (n.d.): 1732–1740.
Srinivasan, S., and D.M. Hanssens. Marketing and Firm Value: Metrics, Methods, Findings, and Future Directions.
Journal of Marketing Research 46, no. 3 (2009): 293–312.
Srinivasan, S., K. Pauwels, D.M. Hanssens, and M.G. Dekimpe. Do Promotions Benefit Manufacturers, Retailers, or
Both? Management Science (2004): 617–629.
Srinivasan, S., K. Pauwels, J. Silva-Risso, and D.M. Hanssens. Product Innovations, Marketing Investments and Stock
Returns. Journal of Marketing 73, no. 1 (2009): 24–43.
Srinivasan, S. Staying Ahead in the Innovation Race: New-product Introductions and Relative Firm Value. School of
Management, University of California, Los Angeles, CA, 2004.
Steenkamp, J.B.E.M., V.R. Nijs, D.M. Hanssens, and M.G. Dekimpe. Competitive Reactions to Advertising and
Promotion Attacks. Marketing Science (2005): 35–54.
Szymanski, D.M., and R.T. Hise. E-satisfaction: An Initial Examination. Journal of Retailing 76, no. 3 (2000):
309–322.
Tellis, G.J. Advertising's Role in Capitalist Markets: What Do We Know and Where Do We Go from Here? Journal
of Advertising Research 45, no. 2 (2005): 162.
Tellis, G.J., R.K. Chandy, and P. Thaivanich. Which Ad Works, When, Where, and How Often? Modeling the Effects
of Direct Television Advertising. Journal of Marketing Research 37, no. 1 (2000): 32–46.
Tellis, G.J. Modeling Marketing Mix. Handbook of Marketing Research (2006): 506–522.
Tellis, G.J., and D.L. Weiss. Does TV Advertising Really Affect Sales? The Role of Measures, Models, and Data
Aggregation. Journal of Advertising (1995): 1–12.
Villanueva, J., S. Yoo, and D.M. Hanssens. The Impact of Marketing-induced Versus Word-of-mouth Customer
Acquisition on Customer Equity Growth. Journal of Marketing Research 45, no. 1 (2008): 48–59.
Wierenga, B. Handbook of Marketing Decision Models. Springer, 2008.
Wright, M. A New Theorem for Optimizing the Advertising Budget (2008).
Yoo, B., N. Donthu, and S. Lee. An Examination of Selected Marketing Mix Elements and Brand Equity. Journal of
the Academy of Marketing Science 28 (April 1, 2000): 195–211.
Zenor, M.J., B.J. Bronnenberg, and L. McAlister. The Impact of Marketing Policy on Promotional Price Elasticities
and Baseline Sales. Journal of Retailing and Consumer Services 5, no. 1 (1998): 25–32.
Contact Information
Bayesia USA
312 Hamlet's End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us
Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg
Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com
Copyright
© 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.
