Dr. Lionel Jouffe, jouffe@bayesia.com
May 8, 2013

Table of Contents
Introduction
Example & Dataset
Model Development
Data Import
Supervised Learning
Network Performance
Model Evaluation
Mutual Information
Mapping
Total Effects on Target
Direct Effects on Target
Elasticity
Caveats
Target Mean Analysis by Direct Effects
Marketing Mix Optimization
Resource Allocation Optimization
Summary
Appendix
Framework: The Bayesian Network Paradigm
Acyclic Graphs & Bayes's Rule
Compact Representation of the Joint Probability Distribution
References
Contact Information
Bayesia USA
Bayesia Singapore Pte. Ltd.
Bayesia S.A.S.
Copyright

Marketing Mix Models with BayesiaLab
www.bayesia.us | www.bayesia.sg | www.bayesia.com

Introduction

To many business executives, marketing mix models remain shrouded in mystery. There are many advertising agencies and market research companies, plus countless online media firms, which promote their particular marketing mix model in an effort to support marketing decision makers. However, there is a remarkable lack of universally accepted and standardized methods that marketing practitioners can draw upon. In fact, a search for books on Amazon.com regarding marketing mix models yields only three relevant titles and, as it turns out, they are mostly geared towards an academic audience. In contrast to the sparse array of books, there is no shortage of academic papers on all aspects of marketing mix modeling and optimization.[1] Without doubt, many of these peer-reviewed journal articles have advanced the field of marketing science, but the often abstract nature of the proposed methods keeps them far removed from practical implementation.
Given this lack of textbook references in this field, plus the fairly inaccessible nature of the academic literature, decision makers have to rely almost exclusively on the persuasion skills of research vendors and consultants in determining the validity of any proposed marketing mix model. Marketing models based on Bayesian networks are not automatically a solution to this quandary, but their inherently visual nature plus their computational transparency make them much more accessible to a broad range of stakeholders. To interpret and validate a Bayesian network requires, most importantly, a common-sense understanding of the domain and not necessarily a degree in statistics.

It is our objective to use the framework of Bayesian networks, plus the features of the BayesiaLab software package, to create sound marketing mix models that can be implemented by many and interpreted by all. More specifically, we will focus on how to optimize marketing mix models with BayesiaLab's algorithms and how to derive policy recommendations for decision makers.

[1] A fairly broad selection of marketing science papers is provided in the appendix.

Example & Dataset

To illustrate this approach we study daily ice cream sales of a European food distributor as a function of environmental variables and marketing efforts.[2]

[2] In order to keep the data source confidential, we have obfuscated both the industry and locale, while maintaining the actual market dynamics of the original domain.

Our sample data set includes the following time-series variables:
- Seasonally-adjusted daily sales in the local currency
- Traditional advertising, such as print advertising (incl. coupons), TV, radio, in-store promotions, etc.
- Online advertising, including banner ads, search engine marketing, online coupons
- Competitive advertising (estimate of all marketing efforts combined)
- Temperature in °C
- Number of retail outlets
- Weekday

Model Development

While the focus of this white paper is to evaluate and interpret a given marketing mix model, we will briefly recap the steps one would take to generate such a model with BayesiaLab.

Data Import

We use BayesiaLab's Data Import Wizard to load all 7 time series[3] into memory from a comma-separated file (CSV). BayesiaLab automatically detects the column headers, which contain the variable names.

The next step identifies the data types contained in the dataset. BayesiaLab will attempt to detect the type of variables in the dataset. In this case, BayesiaLab identifies all variables to be continuous, which is indicated by the turquoise background color of all columns.

[3] Although the dataset has a temporal ordering, for expository simplicity we will treat each time interval as an independent observation, without taking into account any temporal dynamics.
However, in our case the Weekday variable, although coded as a number, should be treated as discrete so as to avoid binning in the subsequent discretization function.

As our dataset contains missing values, we need to specify the type of Missing Values Imputation. Given the small size of the dataset, we will choose the Structural EM method.[4]

[4] For more details on missing values imputation with Bayesian networks, see Conrady and Jouffe (2012).

The following discretization step is very important for all models in BayesiaLab and thus we provide a bit more detail here.

Our objective for this model is to establish Sales as a function of the marketing instruments and other external factors. Thus, we can take this objective into account for the discretization process. More specifically, we will split the process into two parts. First, we will discretize the target variable, i.e. Sales, on its own. We highlight the Sales column in the data table and then choose Manual as the Discretization Type. This provides us with the probability density function of Sales.

By clicking Generate a Discretization, we are prompted to select the discretization type. We choose Type: K-Means and Intervals: 4.[5] The chart will now display the results of this discretization.

Now that we have discretized the target variable by itself, we will discretize the remaining continuous variables with the Decision Tree algorithm and use Sales as the target. This allows binning the continuous variables in such a way that we gain a maximum amount of information from these variables with respect to the target.

[5] For a discussion of discretization algorithms and a guide for interval selection, please see the papers referenced in the appendix.
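The K-Means discretization of the target described above can be sketched outside of BayesiaLab as follows. This is a minimal illustration under stated assumptions, not BayesiaLab's implementation: the function names (`kmeans_1d`, `thresholds_from_centers`, `discretize`) are hypothetical, the Sales values are synthetic, and interval boundaries are assumed to lie halfway between neighboring cluster centers.

```python
import random

def kmeans_1d(values, k, iterations=100, seed=0):
    """Lloyd's algorithm on scalar data; returns the sorted cluster centers."""
    rng = random.Random(seed)
    centers = sorted(rng.sample(values, k))
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as its cluster mean (keep it if the cluster is empty).
        new_centers = sorted(
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        )
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def thresholds_from_centers(centers):
    """Interval boundaries halfway between neighboring centers (an assumption)."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def discretize(value, thresholds):
    """Index of the interval (0 .. len(thresholds)) that contains value."""
    return sum(value > t for t in thresholds)

# Synthetic stand-in for the Sales column (hypothetical numbers, not the paper's data).
rng = random.Random(42)
sales = [
    rng.gauss(mu, 8_000)
    for mu in (200_000, 225_000, 245_000, 270_000)
    for _ in range(50)
]

centers = kmeans_1d(sales, k=4)
thresholds = thresholds_from_centers(centers)
states = [discretize(s, thresholds) for s in sales]
print(thresholds)
```

With four intervals, three thresholds result, which mirrors the four Sales states used in the remainder of this paper. The target-aware Decision Tree discretization of the predictors is a different technique and is not covered by this sketch.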
Upon completion of the discretization, BayesiaLab will present all variables as nodes in an unconnected network in the graph panel. The small question mark icons associated with three of the nodes indicate that they contain missing values.

Supervised Learning

Now that we have an initial network, albeit unconnected, we can perform our first Supervised Learning algorithm with the objective of characterizing the target node. However, we do need to specify the target by right-clicking on Sales and selecting Set As Target Node. Once this is set, the Sales node will appear as a bull's-eye.

We have an array of Supervised Learning algorithms available to apply here. Given the small number of nodes, variable selection is not an issue and hence this should not influence our choice. Furthermore, the relatively small number of observations does not create a challenge in terms of computational effort. With these considerations, and without going into further detail, we select the Augmented Naive Bayes algorithm. The "augmented" part in the name of this algorithm refers to the additional unsupervised search that is performed on the basis of the given naive structure.

Upon learning, the newly generated network is displayed in the graph panel. The predefined naive structure is highlighted by the blue arcs, while the additional (augmented) arcs from the unsupervised learning are shown in black.

Network Performance

Once again, this is not meant to be a complete treatment of how to build a marketing mix model. Thus, we will not fine-tune this model by evaluating a range of specifications[6] or algorithms, or by experimenting with different types of variable discretization.
However, we do wish to cover a few performance measures to assure the reader that the model presented here is a reasonable characterization of the underlying domain.

A quick and straightforward way to test the out-of-sample network performance is to carry out Cross Validation by selecting (from within the Validation Mode) Tools | Cross Validation | Targeted | K-Folds.

[6] Given the inherently dynamic nature of marketing effects, it would be very appropriate to model this as a temporal Bayesian network. For instance, this would enable us to capture potential lags in the effects of marketing activities on the target variable. The BayesiaLab framework can easily accommodate such a temporal specification, but for expositional clarity, we chose to model the contemporaneous interactions only.

In terms of parameters for the Cross Validation, we select the same learning algorithm as before, i.e. Augmented Naive Bayes. Also, using a 10-fold validation is a typical choice in this context.

The resulting Global Report provides a variety of metrics, including precision and R².
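To make the Reliability and Precision figures of such a report concrete, the following sketch re-derives them from the Occurrences counts of the Global Report for this example. It assumes, based on the row and column totals, that rows correspond to predicted Sales states and columns to observed Sales states; the helper names are hypothetical.

```python
# Occurrences from the Global Report: rows = predicted state, columns = observed state
# (this orientation is an interpretation of the row and column totals).
occurrences = [
    [37, 18, 1, 0],
    [15, 86, 22, 1],
    [1, 38, 140, 34],
    [0, 0, 9, 24],
]

def row_normalize(matrix):
    """Divide each cell by its row total."""
    return [[cell / sum(row) for cell in row] for row in matrix]

def column_normalize(matrix):
    """Divide each cell by its column total."""
    column_sums = [sum(column) for column in zip(*matrix)]
    return [[cell / s for cell, s in zip(row, column_sums)] for row in matrix]

# Reliability: share of each predicted state that falls into each observed state.
reliability = row_normalize(occurrences)
# Precision (as the report uses the term): share of each observed state
# that was assigned to each predicted state.
precision = column_normalize(occurrences)

total = sum(sum(row) for row in occurrences)
total_precision = sum(occurrences[i][i] for i in range(len(occurrences))) / total

print(f"Total Precision: {total_precision:.2%}")
print(f"Reliability, first state: {reliability[0][0]:.2%}")
print(f"Precision, first state: {precision[0][0]:.2%}")
```

The diagonal cells are the correctly classified observations, so Total Precision is simply the diagonal sum divided by the number of observations.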
Sampling Method: K-Folds
Learning Algorithm: Augmented Naive Bayes
Target: Sales

Value | <=207556.406 | <=233877.375 | <=259145.594 | >259145.594
Gini Index | 66% | 41.75% | 38.03% | 69.52%
Relative Gini Index | 75.25% | 62.92% | 63.76% | 80.63%
Mean Lift | 2.49 | 1.64 | 1.52 | 2.49
Relative Lift Index | 81.50% | 78.29% | 80.11% | 84.09%

Relative Gini Global Mean: 70.64%
Relative Lift Global Mean: 81%
Total Precision: 67.37%
R: 0.76104342242
R2: 0.57918709081

Occurrences
Value | <=207556.406 (53) | <=233877.375 (142) | <=259145.594 (172) | >259145.594 (59)
<=207556.406 (56) | 37 | 18 | 1 | 0
<=233877.375 (124) | 15 | 86 | 22 | 1
<=259145.594 (213) | 1 | 38 | 140 | 34
>259145.594 (33) | 0 | 0 | 9 | 24

Reliability
Value | <=207556.406 (53) | <=233877.375 (142) | <=259145.594 (172) | >259145.594 (59)
<=207556.406 (56) | 66.07% | 32.14% | 1.79% | 0%
<=233877.375 (124) | 12.10% | 69.35% | 17.74% | 0.81%
<=259145.594 (213) | 0.47% | 17.84% | 65.73% | 15.96%
>259145.594 (33) | 0% | 0% | 27.27% | 72.73%

Precision
Value | <=207556.406 (53) | <=233877.375 (142) | <=259145.594 (172) | >259145.594 (59)
<=207556.406 (56) | 69.81% | 12.68% | 0.58% | 0%
<=233877.375 (124) | 28.30% | 60.56% | 12.79% | 1.69%
<=259145.594 (213) | 1.89% | 26.76% | 81.40% | 57.63%
>259145.594 (33) | 0% | 0% | 5.23% | 40.68%

Even without further comparison, the reported values appear reasonable and suggest that we can proceed with evaluating this network.

Model Evaluation

We have accepted the network as a plausible representation of this domain and will now interpret the structure we obtained.

To make it easier to understand the structure, we will first apply one of BayesiaLab's automatic layout algorithms, which quite literally disentangles the network and therefore provides a clearer picture. This is achieved by selecting View | Automatic Layout or, alternatively, by pressing the P key as a shortcut.

The Naive Bayes and the Augmented parts of this network, shown in blue and black respectively, are now much easier to distinguish in this layout.
Since the naive structure is imposed by definition, only the presence or absence of black arcs provides information about the existence of relationships between the predictors. Much more can be understood when we examine the magnitude and the sign of all relationships in the network.

Although correlation, as we will later emphasize, is not a central metric for network analysis in BayesiaLab, we will use it for a first look, especially since all readers will be familiar with this measure. Selecting Analysis | Graphic | Pearson's Correlation provides this information directly in the network graph.

The colors of the arcs indicate the sign of the relationship and the arc labels provide the correlation value.

Many of the shown relationships seem intuitive, for instance that Number of Stores and both kinds of Advertising have a positive association with Sales. Equally plausible is the fact that Temperature is associated with Sales (although one of the co-authors of this paper believes that one can eat ice cream rain or shine). The negative association between Competitive Adv. and Sales also seems expected. Less clear is the negative correlation between Sales and Weekday, but the small value suggests either a very weak link or perhaps a nonlinear relationship.

Mutual Information

Given that correlation is a strictly linear metric, its ability to characterize all these relationships is inherently limited. We will now turn to Mutual Information as a new measure, which can help overcome this limitation. In contrast to correlation, Mutual Information does not reflect the sign of the relationship; however, this measure captures the strength of relationships between variables, even if they are highly nonlinear.
More specifically, Mutual Information I(X,Y) measures how much (on average) the observation of a random variable Y tells us about the uncertainty of X, i.e. by how much the entropy of X is reduced if we have information on Y. Mutual Information is a symmetric metric, which reflects the uncertainty reduction of X by knowing Y as well as of Y by knowing X.

In our example, knowing the value of Weekday on average reduces the uncertainty of the value of Sales by 0.4802 bits, which means that it reduces its uncertainty by 26.3% (shown in red, in the opposite direction of the arc). Conversely, knowing Sales reduces the uncertainty of Weekday by 17.11% (shown in blue, in the direction of the arc).

It is interesting to see that, by looking at Mutual Information, Weekday and Sales now have a very strong relationship, whereas previously, the correlation coefficient was near zero. To explore the nature of this relationship further, we can perform Target Mean Analysis with Sales and Weekday (Analysis | Visual | Target Mean Analysis | Standard).

This prompts us to select the way we want to examine this relationship. In this context, it seems appropriate to look at the delta mean of the target as a function of the mean of Weekday. Here, the value of Weekday is simply Monday through Sunday recoded into discrete numerical states, 1 through 7. The resulting plot confirms the previous hypothesis of nonlinearity.

For instance, we can interpret this as follows: given that Weekday=Friday, we observe that Sales reaches the highest value. Furthermore we can infer that, given Weekday=Sunday, Sales has its lowest value. We can speculate that consumers perhaps purchase more ice cream on Fridays, in preparation for leisure activities over the weekend, than during the week.
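The contrast between correlation and Mutual Information discussed above can be reproduced on a toy example. The sketch below uses a hypothetical joint distribution (X uniform on -1, 0, 1 and Y = X squared), not the ice cream data; it shows a covariance of exactly zero alongside a substantial Mutual Information, and it illustrates why the relative (percentage) figure differs by direction even though Mutual Information itself is symmetric.

```python
from math import log2

def entropy(distribution):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in distribution if p > 0)

def mutual_information(joint):
    """I(X,Y) in bits from a joint table joint[i][j] = P(X=x_i, Y=y_j)."""
    px = [sum(row) for row in joint]
    py = [sum(column) for column in zip(*joint)]
    return sum(
        pxy * log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Hypothetical toy example: X uniform on {-1, 0, 1}, Y = X**2.
x_values = [-1, 0, 1]
y_values = [0, 1]
joint = [
    [0.0, 1 / 3],  # X = -1 -> Y = 1
    [1 / 3, 0.0],  # X =  0 -> Y = 0
    [0.0, 1 / 3],  # X =  1 -> Y = 1
]

px = [sum(row) for row in joint]
py = [sum(column) for column in zip(*joint)]
mean_x = sum(x * p for x, p in zip(x_values, px))
mean_y = sum(y * p for y, p in zip(y_values, py))
covariance = sum(
    joint[i][j] * (x - mean_x) * (y - mean_y)
    for i, x in enumerate(x_values)
    for j, y in enumerate(y_values)
)

mi = mutual_information(joint)
relative_to_y = mi / entropy(py)  # share of Y's uncertainty removed by knowing X
relative_to_x = mi / entropy(px)  # share of X's uncertainty removed by knowing Y

print(covariance, mi, relative_to_y, relative_to_x)
```

In the toy example, Y is fully determined by X, so all of Y's entropy is removed, while only part of X's entropy is removed by knowing Y. For comparison, a variable with seven equally likely states has an entropy of log2(7), roughly 2.807 bits, so 0.4802 bits would amount to about 17.1% of it.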
Returning to our interpretation of Mutual Information, it is now obvious why Weekday reduces the uncertainty of Sales by over 25%. There is apparently an intra-week seasonality. Another interpretation of Mutual Information is "importance", and we can use Analysis | Report | Target Analysis | Correlations with the Target Node to obtain an overview of the importance of all nodes in the network.

Node significance with respect to the information gain brought by the node to the knowledge of Sales

Node | Mutual information | Mutual information (%) | Relative significance | Mean Value | G-test | Degrees of Freedom | p-value | G-test (Data) | Degrees of Freedom (Data) | p-value (Data)
Weekday | 0.4802 | 26.30% | 1 | 4.0047 | 283.5916 | 18 | 0.00% | 283.5916 | 18 | 0.00%
Competitive Adv. | 0.1293 | 7.08% | 0.2692 | 514.9959 | 76.332 | 9 | 0.00% | 76.332 | 9 | 0.00%
Trad. Adv. | 0.0835 | 4.57% | 0.1739 | 483.8701 | 49.307 | 9 | 0.00% | 49.307 | 9 | 0.00%
No. of Stores | 0.081 | 4.44% | 0.1686 | 3096.5023 | 47.8213 | 9 | 0.00% | 47.8213 | 9 | 0.00%
Online Adv. | 0.0764 | 4.18% | 0.159 | 181.6759 | 45.0943 | 9 | 0.00% | 45.0943 | 9 | 0.00%
Temperature | 0.0592 | 3.24% | 0.1233 | 14.5441 | 34.9654 | 9 | 0.01% | 34.9654 | 9 | 0.01%

Mapping

As of version 5.1, BayesiaLab offers a Mapping function, which can visualize several of the above metrics, such as correlation and Mutual Information, in a single graphic (Analysis | Visual | Mapping):
This screenshot shows Mapping with the following metrics displayed:
- Node Analysis: Mutual Information with the Target Node
- Arc Analysis: Pearson's Correlation

Total Effects on Target

For the following analysis we need to emphasize that we perform observational inference, i.e. that none of the relationships implies a causal relationship with Sales.

Total Effects on Target (Analysis | Report | Target Analysis | Total Effects on Target): the Total Effect is a linearized measure that shows the impact[7] of a one-unit change of each node on the Target.[8]

Total Effects on Target Sales

Node | Standardized Total Effect | Total Effect | G-test | Degrees of Freedom | p-value | G-test (Data) | Degrees of Freedom (Data) | p-value (Data)
Competitive Adv. | -0.3456 | -32.0159 | 76.332 | 9 | 0.00% | 76.332 | 9 | 0.00%
Trad. Adv. | 0.2567 | 6.2351 | 49.307 | 9 | 0.00% | 49.307 | 9 | 0.00%
No. of Stores | 0.1679 | 48.0703 | 47.8213 | 9 | 0.00% | 47.8213 | 9 | 0.00%
Online Adv. | 0.1482 | 22.4707 | 45.0943 | 9 | 0.00% | 45.0943 | 9 | 0.00%
Temperature | 0.1291 | 323.6881 | 34.9654 | 9 | 0.01% | 34.9654 | 9 | 0.01%
Weekday | -0.0501 | -583.078 | 283.5916 | 18 | 0.00% | 283.5916 | 18 | 0.00%

[7] Once again, impact should not be interpreted as a causal effect, but rather as an associated change in value.
[8] The effect of a one-unit change of a node is computed at its mean value.

We can speculate that some of these nodes may cause Sales, but from this table we can only infer association, not causation.

This can be illustrated by performing the computation manually in the Monitor Panel. By default, the Monitors show the marginal frequency distributions of the states of the nodes plus the mean value (expected value) of those distributions.

As stated above, the Total Effect is computed on the basis of a one-unit change of each node. We can simulate this by setting Competitive Adv. to a new mean value, i.e.
changing its mean from 514.996 to 515.996. It must be noted that there are infinitely many distributions that shift the mean by +1. BayesiaLab supports the analyst by choosing the particular distribution (of all possible distributions) that is closest to the original distribution while achieving the targeted mean shift of +1. We simply need to right-click on the Monitor for Competitive Adv. and select Distribution for Target Value/Mean.

This prompts us to type in our desired value, i.e. 515.996, to reflect the one-unit change.

We can now observe the impact on Sales as a result of changing Competitive Adv. by one unit. The resulting delta of -32.104 is shown in parentheses. This confirms (within the possible numeric precision) the value reported in the Total Effects table.

However, you will notice that not only Sales was affected, but also most of the other nodes, albeit with very small changes. This means that, given that we observe a one-unit change of Competitive Adv., we will also observe a change in other nodes, which are connected to the target and may thus contribute to a change in the target. This reflects the Bayesian network property of omnidirectional inference. As such, the one-unit change in Competitive Adv. is not an orthogonal impulse, which is very important to bear in mind for interpretation purposes.

Direct Effects on Target

However, knowing the exclusive contribution of every single component is essential, as this exercise is ultimately about determining the optimum marketing mix of different instruments.

We will now briefly introduce the Likelihood Matching (LM) algorithm, which was originally implemented in the BayesiaLab software package for fixing the probability distributions of an arbitrary set of variables, thereby making it easy to define complex sets of soft evidence. The LM algorithm searches for a set of likelihood distributions which, when applied to the Joint Probability Distribution (JPD) encoded by the Bayesian network, yields the posterior probability distributions defined (as constraints) by the user.

This allows us to perform matching across all covariates while taking into account all their interactions, and thus to estimate the exclusive effect of a node on the target.

Utilizing the LM algorithm, we can now compute Direct Effects (Analysis | Report | Target Analysis | Direct Effects on Target). The resulting table provides us with Standardized Direct Effect, Direct Effect, Contribution, Elasticity, etc.

Direct Effects on Target Sales

Node | Value/Mean | Standardized Direct Effect | Direct Effect | Contribution | Elasticity
No. of Stores | 3,096.38 | 0.1832 | 32.9893 | 27.49 | 17.30
Trad. Adv. | 486.1939 | 0.1639 | 4.0222 | 24.63 | 17.37
Competitive Adv. | 313.0276 | -0.1393 | -12.9373 | 20.71 | -10.39
Online Adv. | 181.8383 | 0.0691 | 10.4733 | 10.26 | 6.33
Temperature | 14.3466 | 0.0643 | 161.8894 | 9.38 | 3.00
Weekday | 4.0047 | -0.0494 | -373.2306 | 7.34 | -4.13

Interestingly, we now have a different rank order of effects compared to the Total Effects shown earlier. The Direct Effect column can now be interpreted as the exclusive direct effect of a unit change while all other node distributions are maintained (matched). For instance, a one-unit change in No. of Stores is linked to +73.5 delta in Sales.

The Contribution column provides a breakdown of the individual contributions in percent (summing up to 100%). This means that an observed change in Sales (without any other observations) should be attributed to the individual nodes as per the Contribution values.

Elasticity

Another important measure in this context is Elasticity. The definition of Elasticity is based on the mathematical notion of point elasticity.
In general, the "x-elasticity of y", also called the "elasticity of y with respect to x", is:

\[
E_{y,x} = \frac{\partial \ln y}{\partial \ln x} = \frac{\partial y}{\partial x} \cdot \frac{x}{y} = \frac{\%\Delta y}{\%\Delta x}
\]

The values for Elasticity are automatically provided as part of the Direct Effects table.

Caveats

While the Direct Effects Analysis is a convenient tool for the analyst, a number of caveats must be added.

First, the original assumption of linearity (see Total Effects) around the mean values of the nodes is still in place. We are, in fact, interpreting small changes around the mean values, which may or may not reflect the impact of much larger changes in inputs, whose effects are often nonlinear.

Secondly, the LM algorithm may not be able to match all covariate distributions across all values of all nodes. BayesiaLab will report via the Console if an acceptable convergence cannot be achieved. Additionally, a red warning symbol will flash in the bottom right-hand corner of the screen to alert the analyst.

Target Mean Analysis by Direct Effects

In order to overcome the limitations of the linearized view of this domain, we proceed to the Target Mean Analysis by Direct Effects. As opposed to merely assessing the impact of a small change around the mean of a node, we now perform this analysis, with the LM algorithm, across the entire range of values of each node (Analysis | Visual | Target Mean Analysis | Direct Effects).

This means that, for all values of Online Advertising, BayesiaLab will attempt to find matching distributions for all other covariates. So, for each value of each node, matching will be performed for all other covariate nodes. Those familiar with statistical matching techniques will know that this represents a significant computational effort, even though it seemingly happens in the background in BayesiaLab.
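Why a single elasticity computed at the mean cannot describe an entire response curve can be illustrated numerically. The sketch below evaluates the point elasticity, (dy/dx)(x/y), by central differences at several points of a hypothetical saturating response function; the function `response` and all of its parameters are invented for illustration and are not derived from the model in this paper.

```python
import math

def response(x):
    """Hypothetical saturating response curve (illustrative only, not from the model)."""
    return 200_000 + 40_000 * (1 - math.exp(-x / 300))

def point_elasticity(f, x, h=1e-4):
    """Point elasticity (dy/dx) * (x / y), with dy/dx from a central difference."""
    dy_dx = (f(x + h) - f(x - h)) / (2 * h)
    return dy_dx * x / f(x)

# Elasticity evaluated at several input levels instead of at the mean only.
elasticities = {x: point_elasticity(response, x) for x in (100, 300, 600, 900)}
for x, e in elasticities.items():
    print(f"x = {x:4d}  elasticity = {e:.4f}")
```

On such a curve the elasticity changes considerably along the input range, which is the motivation for examining response curves over the full domain rather than relying on a local, linearized measure.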
The Direct Effects plot shows the response curves of the nodes with regard to the Target. The x-values are normalized so that the nodes, which all have different scales, can be meaningfully compared.

We can immediately see that No. of Stores and Competitive Adv. appear to be nearly linear with Sales, with a positive and a negative sign respectively. The curves for Trad. Adv., Online Adv., and Temperature show nonlinearities, however. Of particular interest is that Trad. Adv. plateaus around 75% of its maximum value and then declines beyond that point.

Marketing Mix Optimization

Assuming a causal direction from all nodes towards Sales, we would possibly conclude that we should increase Trad. Adv. up to the 75% level or, conversely, decrease it to the 75% level, if we had already exceeded that point. Furthermore, No. of Stores, Online Adv., and Temperature should be maximized while Competitive Adv. should be minimized. Optimization of Sales would be rather straightforward that way.

It is self-evident that Temperature and Weekday are not subject to anybody's manipulation and that Competitive Adv. is not under our control either. Knowing that the retail outlets, from boardwalk kiosks to supermarkets, are all independently owned and operated, we further determine that No. of Stores, i.e. the number of retail outlets open on a given day, is beyond our control, too.

To deal with this in the context of optimization, we have the ability in BayesiaLab to specify which nodes are controllable by the agent on whose behalf we perform the optimization. This can be done via the Cost Editor, which allows setting the non-controllable nodes to "not observable". The Cost Editor is available from the contextual menu that can be activated by right-clicking on the Graph Panel background.
This new designation is also reflected in the node colors, as non-observable nodes are now shown in a light shade of purple. We have now defined, by exclusion of all the others, which marketing instruments we can, in fact, manage.

However, we need to recognize another important issue. While we cannot control certain nodes at will, some of them may be influenced by our actions. Weekday and Temperature are clearly out of the question, but Competitive Adv. and No. of Stores could be influenced by our marketing efforts or even by the Target Node. Competitors will naturally react with their advertising to our advertising, and some retail outlets, such as ice cream trucks, will adjust their business hours as a function of Sales.

BayesiaLab provides us with a convenient way to handle such responsive, yet non-controllable nodes. We can assign to them the predefined class Non_Confounder (right-click on node: Properties | Classes | Add).

If we repeat the Direct Effects on Target analysis on that basis, we will see that only the controllable variables are now included.

Direct Effects on Target Sales

Node | Value/Mean | Standardized Direct Effect | Direct Effect | Contribution | Elasticity
Trad. Adv. | 483.8701 | 0.1748 | 4.2464 | 64.01% | 18.33%
Online Adv. | 181.6759 | 0.0983 | 14.9004 | 35.99% | 9.04%

The same applies when we perform Target Mean Analysis by Direct Effects. BayesiaLab now only displays the response curves of the variables that we can control.

These response curves can now serve as the basis for the optimization of the Target Node. In essence, we need to determine which combination of values of our nodes, i.e. Trad. Adv. and Online Adv., would yield the maximum value for Sales.
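The search problem stated above, i.e. finding the combination of two controllable inputs that maximizes the target under a spending limit, can be sketched as a simple grid search. This is only an illustration under strong assumptions: the response function `sales_lift` is hypothetical, additive, and concave, whereas BayesiaLab evaluates candidate settings through inference in the network; the budget of 900 and the step size are likewise arbitrary choices for the sketch.

```python
import math

def sales_lift(trad, online):
    """Hypothetical additive, concave response (illustrative only, not from the model)."""
    return 3000 * math.log1p(trad / 200) + 2000 * math.log1p(online / 100)

def best_allocation(budget, step=5):
    """Exhaustive grid search over all (trad, online) pairs with trad + online <= budget."""
    best = (float("-inf"), 0, 0)
    for trad in range(0, budget + 1, step):
        for online in range(0, budget - trad + 1, step):
            candidate = (sales_lift(trad, online), trad, online)
            best = max(best, candidate)
    return best

lift, trad, online = best_allocation(900)
print(f"trad = {trad}, online = {online}, lift = {lift:.1f}")
```

Because both hypothetical curves are increasing, the search exhausts the budget, and the split between the two instruments is determined by where their marginal returns become equal. With non-monotonic curves, such as a plateau followed by a decline, the best allocation may leave part of the budget unspent.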
Resource Allocation Optimization

In most real-world applications, this optimization problem is a question of the optimum allocation of limited resources. Hence, in BayesiaLab, this marketing mix problem falls under Resource Allocation Optimization (Analysis | Report | Target Analysis | Resource Allocation Optimization):

Resource Allocation Optimization is a variant of Target Dynamic Profile. Resource Allocation Optimization simulates the to-be-optimized actions within a domain as Soft Evidence. These actions are constrained by the value specified in Maximum Resources Allowed, i.e. the overall budget computed from the per-unit cost associated with each variable.

The option Minimize Used Resources First can specify that the optimization process begins its search by attempting to save resources, i.e. decreasing the local resources of nodes while optimizing the search criterion. The saved resources can then be reallocated in favor of the remaining variables.

You can now specify the search domain for each variable, either in terms of the Mean, the Domain, or the Progress Margin.

The Variation Editor now includes columns displaying the mean values for each node (highlighted in gray):
- The Current Mean corresponds to the marginal distribution of the node.
- The Minimum Mean is computed by using the Negative Variation and the selected Type of Variation (mean, domain, progression margin).
- The Maximum Mean is computed by using the Positive Variation and the selected Type of Variation (mean, domain, progression margin).

In the context of our example, we define the range of variations as 75% relative to the Current Mean.
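The search range implied by this setting can be sketched with simple arithmetic. The sketch assumes that, for the Mean type of variation, the Minimum and Maximum Mean are obtained by scaling the Current Mean by (1 - variation) and (1 + variation); this reading of the Variation Editor is an assumption, not a documented formula. The Current Mean values are those reported for the two controllable nodes in this example.

```python
# Current Mean values of the two controllable nodes in this example.
current_means = {"Trad. Adv.": 483.8701, "Online Adv.": 181.6759}

# Assumption: the same 75% is used as Negative and Positive Variation of type "mean".
variation = 0.75

bounds = {
    node: (mean * (1 - variation), mean * (1 + variation))
    for node, mean in current_means.items()
}

current_total = sum(current_means.values())
max_total = sum(upper for _, upper in bounds.values())

for node, (lower, upper) in bounds.items():
    print(f"{node}: minimum mean {lower:.2f}, maximum mean {upper:.2f}")
print(f"current total {current_total:.3f}, maximum total {max_total:.2f}")
```

Under this assumption, the two maximum means together exceed a budget of 900, so a budget at that level would act as a binding constraint in the optimization rather than the variation limits alone.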
Two criteria are available for stopping the optimization process:
- Maximum Resources Allowed: by default, this value is set to the sum of the resources used with the current marginal probability distributions of the drivers. Here, the value is 665.546, i.e. the sum of the mean values of Online Adv. and Trad. Adv.
- Minimum Joint Probability: the search is stopped when the joint probability of the Soft Evidence set in the optimization drops below the specified threshold.

In the context of our example, we wish to determine the optimum allocation of $900/day[9] in marketing budget. Therefore, we set Maximum Resources Allowed to 900.

[9] This assumes that the variables were both measured in Dollars on the same scale.

Given these parameters, plus the constraints, BayesiaLab now computes the optimum combination of the two variables under study and reports the results in a table.

The column Initial Value/Mean shows the original, marginal values of the variables. The column Final Value/Mean displays the optimized values.

In our case 1 unit of measurement equals 1 Dollar; therefore, the Cost column represents the difference between the initial and the final values of the variables.

The Resources column shows the progression from the original level of the resources to the final maximum level, 895.6. This is as close as we can get to the desired level of 900 within this numerical optimization.

Perhaps the column of greatest interest is Value/Mean, which represents the Target Node. This means that the variable Sales has increased from 234,936 to 236,401.

We visualize the before-and-after comparison of the marketing mix in the chart below.

Node | Initial Value/Mean | Final Value/Mean
Trad. Adv. | $484 | $624
Online Adv. | $182 | $272

Summary

We have demonstrated a practical approach for generating marketing mix models, and for evaluating and optimizing them. Bayesian networks provide a practical framework for such modeling efforts, and BayesiaLab offers a wide range of supporting functions.
Given the intuitive nature of Bayesian networks, all stakeholders in the marketing process can interpret BayesiaLab-generated models and thus examine their plausibility. With that, we address the often-heard objection that marketing mix models are like a "black box" that cannot be validated independently.

Most importantly, we have introduced the concept of causality into marketing mix optimization, thus overcoming the severe limitations of observational inference in the context of interventions.

Appendix

Framework: The Bayesian Network Paradigm [10]

Acyclic Graphs & Bayes's Rule

Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.

Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes' theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:
\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]

In Bayes' theorem, each probability has a conventional name:

P(A) is the prior probability (or "unconditional" or "marginal" probability) of A. It is "prior" in the sense that it does not take into account any information about B; however, the event B need not occur after event A. In the nineteenth century, the unconditional probability P(A) in Bayes's rule was called the "antecedent" probability; in deductive logic, the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was called "a priori" by Ronald A. Fisher.

P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.

P(B|A) is the conditional probability of B given A. It is also called the likelihood.

P(B) is the prior or marginal probability of B, and acts as a normalizing constant.

Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.

The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes.

[10] Adapted from Pearl (2000), used with permission.

The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event) and the links represent statistical (informational) or causal dependencies among the variables.
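As a small numerical illustration of Bayes' theorem, the following Python sketch treats A and B as the two nodes of a minimal network A → B and computes the posterior P(A|B) from the prior P(A) and the likelihood P(B|A). The function name `posterior` and all probability values are hypothetical and chosen only for illustration; they are not taken from the ice cream example.

```python
from fractions import Fraction


def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) for a binary event A using Bayes' theorem."""
    # P(B) via the law of total probability; it is the normalizing constant.
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    if p_b == 0:
        # Bayes' theorem is only defined when P(B) is non-zero.
        raise ValueError("P(B) must not equal zero")
    return p_b_given_a * prior_a / p_b


# Hypothetical inputs (assumptions for illustration only).
p_a = Fraction(1, 5)              # prior P(A)
p_b_given_a = Fraction(3, 4)      # likelihood P(B|A)
p_b_given_not_a = Fraction(1, 4)  # P(B|not A)

p_a_given_b = posterior(p_a, p_b_given_a, p_b_given_not_a)
print(p_a_given_b)  # 3/7
```

Exact fractions are used so that the result can be checked by hand: P(B) = 3/4 · 1/5 + 1/4 · 4/5 = 7/20, and therefore P(A|B) = (3/20) / (7/20) = 3/7. Observing B raises the probability of A from 0.2 to roughly 0.43.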
The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.

Compact Representation of the Joint Probability Distribution

The central paradigm of probabilistic reasoning is to identify all relevant variables x1, ..., xN in the environment [i.e. the domain under study], and make a probabilistic model p(x1, ..., xN) of their interaction [i.e. represent the variables' joint probability distribution].

Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly represent the joint probability distribution of all variables.

Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability, combined with Bayes' rule, make for a complete reasoning system, one which includes traditional deductive logic as a special case. (Barber, 2012)

References

Allenby, Greg M., and Peter E. Rossi. "Marketing Models of Consumer Heterogeneity." Journal of Econometrics 89, no. 1-2 (November 26, 1998): 57-78.
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011.
Bell, D.R., J. Chiang, and V. Padmanabhan. "The Decomposition of Promotional Response: An Empirical Generalization." Marketing Science (1999): 504-526.
Bowman, Douglas. Market Response and Marketing Mix Models: Trends and Research Opportunities. Boston: Now, 2010.
Chandy, R.K., G.J. Tellis, D.J. MacInnis, and P. Thaivanich. "What to Say When: Advertising Appeals in Evolving Markets." Journal of Marketing Research 38, no. 4 (2001): 399-414.
Conrady, Stefan, and Lionel Jouffe. "Driver Analysis & Product Optimization, A Case Study from the Perfume Industry," December 1, 2010. http://www.bayesia.us/index.php/driver-analysis.
Conrady, Stefan, and Lionel Jouffe. "Missing Values Imputation - A New Approach to Missing Values Processing with Bayesian Networks," January 4, 2012. http://bayesia.us/index.php/missingvalues.
Conrady, Stefan, and Lionel Jouffe. "Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks" (2010). http://bayesia.us/index.php/market-share-simulation.
Dekimpe, M.G., and D.M. Hanssens. Persistence Modeling for Assessing Marketing Strategy Performance. Erasmus Research Institute of Management, Erasmus University, 2003.
Dekimpe, M.G., and D.M. Hanssens. "Time-series Models in Marketing: Past, Present and Future." International Journal of Research in Marketing 17, no. 2-3 (2000): 183-193.
Dekimpe, Marnik G., and Dominique M. Hanssens. "The Persistence of Marketing Effects on Sales." Marketing Science 14, no. 1 (January 1, 1995): 1-21.
Dorfman, Robert, and Peter O. Steiner. "Optimal Advertising and Optimal Quality." The American Economic Review 44, no. 5 (December 1, 1954): 826-836.
Dotson, Jeff, and Stefan Conrady. "Investigating the Dynamic Impact of Advertising Through Online Search and Online Sales." Presented at the Advanced Research Techniques Forum, Palm Desert, CA, June 7, 2011.
Erasmus, A.C., E. Boshoff, and G.G. Rousseau. "Consumer Decision-making Models Within the Discipline of Consumer Science: A Critical Approach." Journal of Family Ecology and Consumer Sciences/Tydskrif Vir Gesinsekologie En Verbruikerswetenskappe 29 (2010).
Freo, M. "The Impact of Sales Promotions on Store Performance: A Structural Vector Autoregressive (SVAR) Approach" (2005).
Gatignon, H., and D.M. Hanssens. "Modeling Marketing Interactions with Application to Salesforce Effectiveness." Journal of Marketing Research (1987): 247-257.
Gelman, Andrew, and Jennifer Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. 1st ed. Cambridge University Press, 2006.
Ghysels, Eric, Pedro Santa-Clara, and Rossen Valkanov. "The MIDAS Touch: Mixed Data Sampling Regression Models." Anderson Graduate School of Management, UCLA, June 2004. http://ideas.repec.org/p/cdl/anderf/4852.html.
Gowrisankaran, G., and M. Rysman. Dynamics of Consumer Demand for New Durable Goods. National Bureau of Economic Research, Cambridge, Mass., USA, 2009.
Gupta, S., D. Hanssens, B. Hardie, W. Kahn, V. Kumar, N. Lin, N. Ravishanker, and S. Sriram. "Modeling Customer Lifetime Value." Journal of Service Research 9, no. 2 (2006): 139.
Hagmayer, Y., S.A. Sloman, D.A. Lagnado, and M.R. Waldmann. "Causal Reasoning Through Intervention." Causal Learning: Psychology, Philosophy, and Computation (2007): 86-100.
Hanssens, Dominique M., Peter S. H. Leeflang, and Dick R. Wittink. "Market Response Models and Marketing Practice." Applied Stochastic Models in Business and Industry 21 (July 2005): 423-434.
Hanssens, Dominique. Market Response Models: Econometric and Time Series Analysis. 2nd ed. Dordrecht: Kluwer Academic Publishers, 2003.
Heckerman, D. "A Tutorial on Learning with Bayesian Networks." Innovations in Bayesian Networks (2008): 33-82.
Hoover, K., and S. Demiralp. "Searching for the Causal Structure of a Vector Autoregression." Feedback 212 (1842): 338.
Joseph, J. "Understanding Advertising Adstock Transformations" (2006).
Joseph, J.V. "Non-Stationarity Effects in Causal Sales Forecasting Models" (n.d.).
Koppelman, F.S., and C. Bhat. "A Self Instructing Course in Mode Choice Modeling: Multinomial and Nested Logit Models." Prepared for US Department of Transportation Federal Transit Administration (2006).
Korenok, Oleg, George E. Hoffer, and Edward L. Millner. "Non-Price Determinants of Automotive Demand: Restyling Matters Most." VCU School of Business, Department of Economics, September 2009. http://ideas.repec.org/p/vcu/wpaper/0903.html.
Manchanda, Puneet, Peter E. Rossi, and Pradeep K. Chintagunta. "Response Modeling with Non-Random Marketing Mix Variables." SSRN eLibrary (January 2003). http://papers.ssrn.com/sol3/papers.cfm?abstract_id=371360.
Manchanda, Puneet, Peter E. Rossi, and Pradeep K. Chintagunta. "Response Modeling with Nonrandom Marketing-Mix Variables." Journal of Marketing Research 41, no. 4 (November 1, 2004): 467-478.
Meyer, R., T. Erdem, F. Feinberg, I. Gilboa, W. Hutchinson, A. Krishna, S. Lippman, et al. "Dynamic Influences on Individual Choice Behavior." Marketing Letters 8, no. 3 (1997): 349-360.
Morgan, Neil A., Rebecca J. Slotegraaf, and Douglas W. Vorhies. "Linking Marketing Capabilities with Profit Growth." International Journal of Research in Marketing 26, no. 4 (December 2009): 284-293.
Naik, P.A., K. Raman, and R.S. Winer. "Planning Marketing-mix Strategies in the Presence of Interaction Effects." Marketing Science (2005): 25-34.
Nijs, V.R., M.G. Dekimpe, J.B.E.M. Steenkamp, and D.M. Hanssens. "The Category-demand Effects of Price Promotions." Marketing Science (2001): 1-22.
Pauwels, Koen, Imran Currim, Marnik G. Dekimpe, Dominique M. Hanssens, Natalie Mizik, Eric Ghysels, and Prasad Naik. "Modeling Marketing Dynamics by Time Series Econometrics." Marketing Letters 15 (January 1, 2005): 167-183.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
Pourret, Olivier, Patrick Naïm, and Bruce Marcot. Bayesian Networks: A Practical Guide to Applications. 1st ed. Wiley, 2008.
Ramaswami, Sridhar N., Rajendra K. Srivastava, and Mukesh Bhargava. "Market-based Capabilities and Financial Performance of Firms: Insights into Marketing's Contribution to Firm Value." Journal of the Academy of Marketing Science 37 (October 2, 2008): 97-116.
Robinson, William T. "Marketing Mix Reactions to Entry." Marketing Science 7, no. 4 (October 1, 1988): 368-385.
Rossi, Peter. Bayesian Statistics and Marketing. Hoboken NJ: Wiley, 2005.
Rossi, Peter E., and Greg M. Allenby. "Bayesian Statistics and Marketing." Marketing Science 22, no. 3 (July 1, 2003): 304-328.
Rossi, Peter E., Robert E. McCulloch, and Greg M. Allenby. "The Value of Purchase History Data in Target Marketing." Marketing Science 15, no. 4 (January 1, 1996): 321-340.
Rubin, Donald B. Matched Sampling for Causal Effects. 1st ed. Cambridge University Press, 2006.
Sethuraman, R., and G.J. Tellis. "An Analysis of the Tradeoff Between Advertising and Price Discounting." Journal of Marketing Research (1991): 160-174.
Sethuraman, R., G.J. Tellis, and R.A. Briesch. "How Well Does Advertising Work? Generalizations from Meta-Analysis of Brand Advertising Elasticities." Journal of Marketing Research 48, no. 3 (2011): 457-471.
Silva-Risso, J., W. V. Shearin, I. Ionova, A. Khavaev, and D. Borrego. "Chrysler and J. D. Power: Pioneering Scientific Price Customization in the Automobile Industry." Interfaces 38 (January 1, 2008): 26-39.
Slotegraaf, R. J. "The Paradox of a Marketing Planning Capability." Journal of the Academy of Marketing Science 32 (October 1, 2004): 371-385.
Le Song, M.K., and E.P. Xing. "Time-Varying Dynamic Bayesian Networks." Advances in Neural Information Processing Systems 22 (n.d.): 1732-1740.
Srinivasan, S., and D.M. Hanssens. "Marketing and Firm Value: Metrics, Methods, Findings, and Future Directions." Journal of Marketing Research 46, no. 3 (2009): 293-312.
Srinivasan, S., K. Pauwels, D.M. Hanssens, and M.G. Dekimpe. "Do Promotions Benefit Manufacturers, Retailers, or Both?" Management Science (2004): 617-629.
Srinivasan, S., K. Pauwels, J. Silva-Risso, and D.M. Hanssens. "Product Innovations, Marketing Investments and Stock Returns." Journal of Marketing 73, no. 1 (2009): 24-43.
Srinivasan, S. Staying Ahead in the Innovation Race: New-product Introductions and Relative Firm Value. School of Management, University of California, Los Angeles, CA, 2004.
Steenkamp, J.B.E.M., V.R. Nijs, D.M. Hanssens, and M.G. Dekimpe. "Competitive Reactions to Advertising and Promotion Attacks." Marketing Science (2005): 35-54.
Szymanski, D.M., and R.T. Hise. "E-satisfaction: An Initial Examination." Journal of Retailing 76, no. 3 (2000): 309-322.
Tellis, G.J. "Advertising's Role in Capitalist Markets: What Do We Know and Where Do We Go from Here?" Journal of Advertising Research 45, no. 2 (2005): 162.
Tellis, G.J., R.K. Chandy, and P. Thaivanich. "Which Ad Works, When, Where, and How Often? Modeling the Effects of Direct Television Advertising." Journal of Marketing Research 37, no. 1 (2000): 32-46.
Tellis, G.J. "Modeling Marketing Mix." Handbook of Marketing Research (2006): 506-522.
Tellis, G.J., and D.L. Weiss. "Does TV Advertising Really Affect Sales? The Role of Measures, Models, and Data Aggregation." Journal of Advertising (1995): 1-12.
Villanueva, J., S. Yoo, and D.M. Hanssens. "The Impact of Marketing-induced Versus Word-of-mouth Customer Acquisition on Customer Equity Growth." Journal of Marketing Research 45, no. 1 (2008): 48-59.
Wierenga, B. Handbook of Marketing Decision Models. Springer, 2008.
Wright, M. "A New Theorem for Optimizing the Advertising Budget" (2008).
Yoo, B., N. Donthu, and S. Lee. "An Examination of Selected Marketing Mix Elements and Brand Equity." Journal of the Academy of Marketing Science 28 (April 1, 2000): 195-211.
Zenor, M.J., B.J. Bronnenberg, and L. McAlister. "The Impact of Marketing Policy on Promotional Price Elasticities and Baseline Sales." Journal of Retailing and Consumer Services 5, no. 1 (1998): 25-32.

Contact Information

Bayesia USA
312 Hamlet's End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us

Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg

Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33 (0)2 43 49 75 69
info@bayesia.com
www.bayesia.com

Copyright

© 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.