White Paper
Rajesh Kavadiki
Rajesh is part of the Analytics and Insights team at Tata Consultancy Services (TCS).
He has over ten years of experience in Big Data analytics and machine learning,
and is currently working on parallelizing machine learning algorithms on the Big
Data platform. He has worked extensively on Java, MapReduce, and Big Data
related technologies with specialization in retail and social media.
Abstract
Intense competition, seasonality, and other such factors compel retailers and e-commerce vendors to
amend prices of stock keeping units (SKUs) on a daily basis. There is also great pressure on store
managers to predict the sale quantity before the price changes in order to forecast demand and inform
suppliers or vendors accordingly to prevent out of stock situations.
Price elasticity is a statistical measure that can be incorporated into a software model to predict the sale
quantity for a given percentage change in price. The model is designed to predict sale
quantities based on price changes, promotional offers, seasonality, and changes in competitors'
SKU prices.
Determining price elasticity is extremely important for supporting retailer pricing decisions aimed at
reducing price resistance. The extent to which a product is price elastic impacts the volume of sales and
the revenues of a retailer. Therefore, marketing managers need to consider price sensitivity of the market
and the impact of price change on the sales revenue.
We have the example of an e-commerce retailer who offered huge discounts, resulting in most of the
products going out of stock within minutes. This happened because the retailer was not able to judge the
surge in demand caused by the drastic reduction in price. This in turn led to customer dissatisfaction and
negative feedback all over social media. Retailers thus need to consider building price elasticity models
for every SKU across stores to avoid such scenarios.
With multiple SKUs in their portfolio, retailers deal with large data sets. Retail stores, particularly
those spread across countries, record millions of transactions every day, with data sizes running into hundreds
of terabytes. To process this amount of data, traditional statistical software or systems would need to run
for days and, in most cases, would run out of memory during the analysis. Traditionally, therefore,
retailers have used models built to handle only a few sets of SKUs. Alternatively, store managers rely on their
intuition and experience to predict sales volumes. Retailers can now realize better outcomes and
evaluate the price elasticity of products more accurately by leveraging a statistical model and framework that uses
distributed computing or Big Data.
Interpreting Price Elasticity
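The following formula is generally used to determine whether a product is price elastic, inelastic, or unit elastic:
Price elasticity of demand = Percentage change in quantity demanded / Percentage change in price
$$\text{Percentage change in price} = \frac{P_2 - P_1}{(P_1 + P_2)/2}$$
where $P_1$ and $P_2$ are the prices before and after the change. The percentage change in quantity demanded is computed in the same way from the corresponding quantities.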
If price elasticity is less than -1, the product is price elastic; that is, if the price increases
the total revenue falls, and if the price decreases the total revenue increases.
If price elasticity is greater than -1 and less than 0, the product is price inelastic; that is, if
the price increases the total revenue increases, and if the price decreases the total revenue decreases.
If price elasticity is equal to -1, the product is unit elastic. This means the total revenue remains
unchanged whether the price increases or decreases.
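The three cases above can be checked numerically. The following is a minimal Python sketch, assuming the midpoint (arc) form of the percentage change; the function names and the sample prices and quantities are hypothetical and chosen only for illustration.

```python
def midpoint_pct_change(old, new):
    """Percentage change relative to the midpoint of old and new (assumed form)."""
    return (new - old) / ((old + new) / 2)


def price_elasticity(q_old, q_new, p_old, p_new):
    """Percentage change in quantity demanded divided by percentage change in price."""
    return midpoint_pct_change(q_old, q_new) / midpoint_pct_change(p_old, p_new)


def classify(elasticity, tol=1e-9):
    """Map an elasticity value to the categories described in the text."""
    if abs(elasticity + 1) <= tol:
        return "unit elastic"
    if elasticity < -1:
        return "price elastic"
    if elasticity < 0:
        return "price inelastic"
    return "outside the ranges discussed"


# Hypothetical SKU: the price rises from 10 to 12 and daily demand
# drops from 100 units to either 80 or 95 units.
for q_new in (80, 95):
    e = price_elasticity(100, q_new, 10, 12)
    revenue_before, revenue_after = 10 * 100, 12 * q_new
    print(q_new, round(e, 3), classify(e), revenue_before, revenue_after)
```

In the first hypothetical case revenue falls from 1000 to 960 (price elastic), while in the second it rises from 1000 to 1140 (price inelastic), which matches the revenue behavior described for each category.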
The products that belong to the price inelastic category are those for which there are very few
competitive alternatives, such as unique irreplaceable spare parts. So, there is an opportunity for a retailer
to increase the price of these commodities and gain more revenue.
The products that belong to the price elastic category are those for which there are competitive
alternatives available and an increase in price could cause the customers to switch to other brands. An
example of this could be consumer goods such as personal grooming products or household products.
The total revenue here is a function of price and quantity (price multiplied by quantity sold).
The products that belong to the unit price elastic category are the ones for which a given percentage increase or
decrease in price would cause the same percentage decrease or increase in demand. Hence a price change
would not make any impact on the revenues.
Price elasticity is not only applicable to retail, but also to various other industries and sectors such as
government and taxation. Cigarettes as a product category would fall under the price inelastic category,
since there are no alternatives to cigarettes as a product. So if the government tries to increase the price
of the cigarettes, the demand would almost remain the same or would decrease by a negligible margin,
thus increasing revenue. Another example would be an increase in tax by the income tax department,
which affects everyone in the supply chain differently. If there is an increase in tax on the purchase of raw
materials, the supplier might choose to pass on the price change to the manufacturer. Now the
manufacturer or retailer might decide to pass on the tax burden to the customer or may try to find
different ways to absorb the price change, which would be completely dependent on the price elasticity
of the product.
A distributed computing model using the MapReduce framework can be used to measure the price elasticity of
items across stores and millions of transactions. Since big data sets are difficult to process with traditional
methods and systems (such as SAS, R, and MATLAB), this approach uses the Hadoop Distributed File System
(HDFS) along with MapReduce. Hadoop can scale horizontally, enabling retailers to use a cluster of
systems to build models for every SKU in the store. These predictive models can then be used to predict
the sales quantity for a price change, and thus help store managers ensure stock availability before the
price change occurs.
Furthermore, the recommended solution uses Mahout, an open source distributed machine learning
library, to overcome the limitations of traditional systems. The design and implementation
methodology demonstrates the feasibility of solving the price elasticity model using log-linear
regression. The solution is generic and can be used with various predictor variables as per business needs.
In addition, we suggest a parallel implementation of the log-linear regression model using the
MapReduce programming concept. Log-linear regression is a powerful technique that has
become popular for solving price elasticity problems for retailers.
Price elasticity is generally better described by log-linear models than by linear models, since
quantity does not change linearly with price. Log-linear models are similar to linear models; however,
both the independent and the dependent variables are log transformed. Since the log is applied to both dependent
and independent variables, some researchers call it the log-log model.
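One way to see why the log-log form suits elasticity is the short derivation below. It is a sketch for a single price variable, with $Q$ for quantity, $P$ for price, and $\alpha$, $\beta$ as generic coefficients (notation assumed here for illustration): the coefficient on log price is itself the elasticity.

```latex
\ln Q = \alpha + \beta \ln P
\quad\Longrightarrow\quad
\beta = \frac{d \ln Q}{d \ln P} = \frac{dQ / Q}{dP / P}
```

The right-hand side is the ratio of the proportional change in quantity to the proportional change in price, so a fitted $\beta$ of -1.5 would mean that a 1% price increase is associated with roughly a 1.5% drop in quantity.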
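$$\ln S = \alpha + \beta_1 \ln(\text{Target SKU Price}) + \sum_{j=2}^{5} \beta_j\, I(X_j) + \sum_{i=1}^{n} \beta_{5+i} \ln(\text{Competitor SKU Price}_i)$$
where
$S$: Sales quantity for the SKU in a store on a particular day
$X_j$: Total Price Reduction (TPR), front page, coupons, and ads are some of the promotional offers that are
applicable for a store on a particular day
$I(X)$: Indicator function that takes the value 1 or 0 depending on the promo offer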
Here, $Y$ is a column vector of log-normalized sales, and $X_{0..n}$ is an n-dimensional matrix
in which each column holds one log-normalized feature. The $\alpha$ and $\beta$ coefficients, collected in the vector $\hat{\beta}$, can be found using the ordinary
least squares formula:
$$\hat{\beta} = (X^{T}X)^{-1}X^{T}Y \qquad (1.3)$$
Equation 1.3 contains a matrix inversion, which is highly computation intensive for a very
large matrix. This can be avoided by using a parallelizable singular value decomposition.
Singular value decomposition (SVD) is the factorization of a matrix into three matrices ($U$, $D$, and
$V$) whose product, $U D V^{T}$, reproduces the initial matrix. Applying SVD to $X^{T}X$, the coefficient vector in Equation 1.3 can
be further decomposed as:
$$\hat{\beta} = (V D^{-1} U^{T})\, X^{T} Y \qquad (1.4)$$
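The step from the inverse to the SVD factors can be spelled out as follows. This sketch assumes that the matrix being decomposed is the square matrix $X^{T}X$, with orthogonal $U$ and $V$ and an invertible diagonal $D$; these are assumptions adopted for this illustration.

```latex
X^{T}X = U D V^{T}, \qquad U^{T}U = I, \quad V^{T}V = I
\;\Longrightarrow\;
(X^{T}X)^{-1} = (V^{T})^{-1} D^{-1} U^{-1} = V D^{-1} U^{T}
\;\Longrightarrow\;
(X^{T}X)^{-1} X^{T} Y = V D^{-1} U^{T} X^{T} Y
```

Inverting $D$ only requires taking the reciprocal of each diagonal entry, so no general matrix inversion is needed once the three factors are available.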
6. MapReduce job to take the transpose of $U$
7. MapReduce job to take the transpose of $V^{T}$
8. MapReduce job to perform the multiplication $V D^{-1} U^{T} X^{T} Y$
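A regression of this kind parallelizes well because $X^{T}X$ and $X^{T}Y$ are sums over rows, so each data split can compute its own partial sums independently before they are combined. The sketch below imitates that map and reduce pattern in plain Python rather than on Hadoop. The function names and the synthetic data are hypothetical, and the final small system is solved with Gaussian elimination instead of the SVD steps listed above, purely to keep the example dependency-free.

```python
import math
from functools import reduce


def map_partial(rows):
    """Mapper: compute the partial X^T X and X^T Y for one split of rows."""
    n = len(rows[0][0])
    xtx = [[0.0] * n for _ in range(n)]
    xty = [0.0] * n
    for x, y in rows:
        for i in range(n):
            xty[i] += x[i] * y
            for j in range(n):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_partials(a, b):
    """Reducer: add two partial results element by element."""
    xtx = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [p + q for p, q in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Synthetic store-day rows following ln(sales) = 5 - 1.5 * ln(price) exactly.
prices = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
rows = [([1.0, math.log(p)], 5.0 - 1.5 * math.log(p)) for p in prices]

splits = [rows[:3], rows[3:]]                 # stand-in for HDFS input splits
partials = [map_partial(split) for split in splits]
xtx, xty = reduce(reduce_partials, partials)
alpha, beta = solve(xtx, xty)
print(alpha, beta)
```

Because the combined matrices are only as large as the number of features, the expensive pass over millions of rows happens in the mappers, and the final solve is cheap regardless of the method chosen for it.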
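$$
\begin{aligned}
\ln S = \alpha &+ \beta_1 \ln(\text{Target SKU Price}) && \text{(SKU price factor)} \\
&+ \beta_2\, I(\text{TPR}) + \beta_3\, I(\text{Front Page}) + \beta_4\, I(\text{Coupons}) + \beta_5\, I(\text{Ads}) && \text{(promotion offers)} \\
&+ \sum_{i=1}^{n} \beta_{5+i} \ln(\text{Competitor SKU Price}_i) && \text{(competitor prices)}
\end{aligned}
$$
where
$S$ is the true sales for the day and for a store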
As price elasticity is solved using log-linear regression, all the continuous variables are log normalized and
indicator variables are left intact. The prices of targeted SKU and competitor SKU are log normalized. The
promotion offers are not log normalized as these are discrete. Each row of the data describes day level
characteristics for a store (see Table 1).
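The encoding rule described above (log-normalize continuous variables, leave indicators intact) can be sketched as follows. The column names and the sample record are hypothetical and stand in for whatever layout the real dataset uses.

```python
import math

# Hypothetical column names; the real dataset layout may differ.
PROMO_COLUMNS = ("tpr", "front_page", "coupons", "ads")
COMPETITOR_PRICE_COLUMNS = ("tr_sku1_price", "tr_sku2_price", "tr_sku3_price")


def to_model_row(record):
    """Return (features, target) for one store-day record."""
    features = [1.0]                                          # intercept term
    features.append(math.log(record["target_sku_price"]))     # log-normalized price
    features.extend(float(record[c]) for c in PROMO_COLUMNS)  # indicators stay 0/1
    features.extend(math.log(record[c]) for c in COMPETITOR_PRICE_COLUMNS)
    target = math.log(record["total_sales"])                  # log-normalized sales
    return features, target


example = {
    "total_sales": 120, "target_sku_price": 4.0,
    "tpr": 1, "front_page": 0, "coupons": 1, "ads": 0,
    "tr_sku1_price": 4.5, "tr_sku2_price": 3.8, "tr_sku3_price": 5.0,
}
features, target = to_model_row(example)
print(features, target)
```

A store-day with zero sales would need separate handling before this step, since the logarithm of zero is undefined.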
Table 1: Indicative Dataset Row with Day Level Characteristics for a Store
The total sales in Table 1 reflect the sales quantity for the SKU in a store on a particular day. Total Price
Reduction (TPR), front page, coupons, and ads are some of the promotional offers that are applicable for a
store on a particular day. TR_SKU1 price, TR_SKU2 price, and TR_SKU3 price are the competitor SKU prices on
that day. The recommended model generates a Hive SQL script, which in turn runs a MapReduce job to join
different tables and aggregate sales on a particular day for a store from millions of transactions.
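The aggregation itself is a group-by over store, SKU, and day. The following is an in-memory Python analogue of that grouping, not the Hive script; the field names and sample transactions are hypothetical.

```python
from collections import defaultdict


def aggregate_daily_sales(transactions):
    """Sum the quantity sold per (store, SKU, day) key."""
    totals = defaultdict(int)
    for t in transactions:
        key = (t["store_id"], t["sku"], t["date"])
        totals[key] += t["quantity"]
    return dict(totals)


transactions = [
    {"store_id": "S1", "sku": "SKU1", "date": "2015-01-01", "quantity": 2},
    {"store_id": "S1", "sku": "SKU1", "date": "2015-01-01", "quantity": 3},
    {"store_id": "S2", "sku": "SKU1", "date": "2015-01-01", "quantity": 1},
    {"store_id": "S1", "sku": "SKU1", "date": "2015-01-02", "quantity": 4},
]
daily = aggregate_daily_sales(transactions)
print(daily)
```

Each resulting key corresponds to one row of the kind shown in Table 1, before prices and promotional indicators are joined in.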
Table 1 is for representation purposes only, and real-world scenarios could be different. In the real
world, in the case of large multinational retailers, millions of transactions are conducted in each store and there
could be thousands of such stores. In such cases, the model needs to:
- Aggregate millions of transactions with respect to sales for each store and for each SKU every day
- Aggregate the total number of daily sales for every SKU across different stores in each row (example:
  with approximately 1000 stores and for 730 days, the model would generate (1000*730)
  approximately 0.7 million rows)
- Perform the above step for every SKU (example: with approximately 10,000 SKUs, the model would
  have 10,000 different data sets each with approximately 0.7 million rows)
- Perform log-linear regression for each of the datasets (approximately 10,000), which would result in
  alpha and beta coefficients
- Substitute the alpha and beta coefficients in real time to predict the sales quantity
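The last step, substituting fitted coefficients to predict a sales quantity, can be sketched as below. The coefficient values, the feature order (log price, then a TPR indicator), and the function name are hypothetical and used only to show the mechanics of a log-linear prediction.

```python
import math


def predict_sales(alpha, betas, features):
    """Evaluate the log-linear model and convert log sales back to units."""
    log_sales = alpha + sum(b * x for b, x in zip(betas, features))
    return math.exp(log_sales)


# Hypothetical fitted coefficients for one SKU: [ln(price), I(TPR)].
alpha = math.log(100.0)
betas = [-1.5, 0.2]

baseline = predict_sales(alpha, betas, [math.log(10.0), 0])   # price 10, no promotion
after_cut = predict_sales(alpha, betas, [math.log(9.0), 1])   # price 9, TPR active
print(baseline, after_cut)
```

Comparing the two predictions gives the expected uplift for the planned price change, which is the figure a store manager would use to check stock availability in advance.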
The analyses recommended in this paper were carried out using a three-stage analytical framework
comprising data aggregation, model building, and validation. Log-linear regression has been employed to build
models at the SKU-day level for computing sales. The columns or independent variables used in this model
are indicative, and they can be extended to include other characteristics such as
seasonality, store level characteristics, and indicators for weekend or holiday sales. Retailers can use these
characteristics and many more, depending on their business model, to accurately predict sales for a given
price change. This accuracy could help separate the winners from the losers in an increasingly competitive
retail environment.
About TCS Business Process Services Unit
Enterprises seek to drive business growth and agility through innovation in an increasingly
regulated, competitive, and global market. TCS helps clients achieve these goals by managing and
executing their business operations effectively and efficiently.
TCS Business Process Services (BPS) include core industry-specific processes, analytics and insights,
and enterprise services such as finance and accounting, HR, and supply chain management. TCS
creates value through its FORE™ simplification and transformation methodology, backed by its deep
domain expertise, extensive technology experience, and TRAPEZE™ governance enablers and
solutions. TCS complements its experience and expertise with innovative delivery models such as
using robotic automation and providing Business Processes as a Service (BPaaS).
TCS' BPS unit has been positioned in the leaders' quadrant for various service lines by many leading
analyst firms. With over four decades of global experience and a delivery footprint spanning six
continents, TCS is one of the largest BPS providers today.
Contact
For more information about TCS' Business Process Services Unit, visit: www.tcs.com/bps
Email: bps.connect@tcs.com
All content / information present here is the exclusive property of Tata Consultancy Services Limited (TCS). The content / information contained here
is correct at the time of publishing. No material from here may be copied, modified, reproduced, republished, uploaded, transmitted, posted or
distributed in any form without prior written permission from TCS. Unauthorized use of the content / information appearing here may violate
copyright, trademark and other applicable laws, and could result in criminal or civil penalties. Copyright 2015 Tata Consultancy Services Limited