

Mahima Hada


Marketing Analyst Team’s Purpose:
Your firm is going to roll out an expensive free trial campaign, and the firm wants to restrict this
promotion to only its valuable customers, a smaller number than their usual 256 “key customers”
who get free trials. Your marketing analyst team has been given purchase data to find out which
group of these 256 customers should be targeted in the next campaign. You know that the firm has
had success using RFM (recency, frequency, monetary) analysis in the past. You intend to build on
the RFM philosophy, and produce a better model using segmentation and regression.
Data: Your team has purchase data of the 256 “key customers”. File: RFMData_raw.xlsx
Submission Requirements:
Groups will submit a presentation of 5 slides or fewer (not counting the title slide) in the format of a
business presentation, along with the Excel file and Tableau workbook containing the solution.
ONE submission per group.
Presentation guidelines
The main deck of the presentation should contain only *insights* from the data, such that an
executive who doesn’t know the sample or the data or how to do segmentation and regression,
can still walk away with an understanding of:
1. the key 256 customers,
2. the segments within these customers,
3. who the firm should target (which segment in terms of RFM metrics) for their trial
promotion, and why.
Do not exceed 5 slides (for the presentation slides).
Use Appendix slides/hidden slides to show your “work”:
1. Tableau snapshots of clusters found (different types of clusters found; different types
2. Regression results: the different types of regression you tried, and why you chose the
specific aspects you did
3. Any supporting work you want to show
Even though “descriptive analytics” is not *formally* a part of the deliverables in this project,
you cannot do any data science project without doing basic descriptive analytics first.
Prof. Mahima Hada
Project Details and tips:
In this project, you will conduct one of the most common analyses in marketing with customer
purchase data: RFM analysis. RFM (recency, frequency, monetary) analysis is a marketing
technique used to determine quantitatively which customers are the best ones by examining how
recently a customer has purchased (recency), how often they purchase (frequency), and how
much the customer spends (monetary).
The project consists of three activities:
1. Data Manipulation: adding variables to the dataset, changing data orientations (from “thin
and long” to “fat and wide”).
2. Segmentation in Tableau
3. Regression analysis in Excel (with clusters created in Tableau)
Important steps and tips:
1. Note that the dataset does not give you data in terms of recency and frequency. So first,
add a column for the Recency metric: how many days ago the recorded sale took place (tip: you
can subtract dates in Excel), and one for Frequency: how many purchases the
customer has made (you can do this manually in Excel or automatically in Tableau Prep;
the latter is shown in Step 9 below).
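
For teams working outside Excel, the Recency and Frequency calculation in Step 1 can be sketched in Python. This is only an illustration: the records, the tuple layout, and the reference date `as_of` are hypothetical and are not taken from RFMData_raw.xlsx.

```python
from collections import Counter
from datetime import date

# Hypothetical long-format records: (customer_id, purchase_date, size in $).
# The real column names and values in RFMData_raw.xlsx may differ.
purchases = [
    (1, date(2020, 1, 10), 50.0),
    (1, date(2020, 3, 5), 70.0),
    (2, date(2020, 2, 20), 30.0),
]

# Assumed reference date ("today" for the analysis); choose your own.
as_of = date(2020, 4, 1)

# Recency for each purchase row: days between the sale and the reference date.
# Subtracting two dates yields a timedelta, much like subtracting dates in Excel.
recency_days = [(as_of - purchase_date).days for _, purchase_date, _ in purchases]

# Frequency for each customer: number of purchase rows with that customer id.
frequency = Counter(customer_id for customer_id, _, _ in purchases)

print(recency_days)
print(dict(frequency))
```

Recency here is still a purchase-level value (one per row); it becomes a customer-level value only when the data is aggregated later.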

2. Purchase data is typically available in long format: multiple rows for each customer. This
format is ideal for Tableau (recall the survey data which you had to convert to long
format for Tableau). You can read the Excel sheet into Tableau; as the data has a number
of records per customer, you should be able to see the following data (note that this is
actually frequency data…)


3. In Tableau, explore multiple clustering solutions (the online videos given for Tableau
clustering will teach you how to do that). Multiple clustering solutions are
possible, based on the variables you pick (but make sure you include at least some of the
RFM variables). Some examples of clustering solutions are shown below (these are just
examples, you can likely do better):

4. Save the clustering solutions you deem “reasonable” as a variable in the data sheet in
Tableau (the online videos given for Tableau clustering will teach you how to do that). At
the end of your clustering exercise, your data source should include Clusters in it;
something like the example given below. As you see, the data is still in long format, so
“cluster 1” (for one of the solutions), and “cluster 3” (for the second clustering solution)
is repeated for each purchasing record for Customer 1.

5. Export the above data into a .csv file. Now you can analyze it in Excel.

6. As you would want to see which clustering solution predicts customers’ purchases better
(to choose between the different clustering solutions), you need to do a regression
analysis. Note that the Tableau data is in long format – in which each customer has
multiple entries. If you put this data into a regression, the regression analysis will treat
each row as a separate datapoint, and ignore the fact that a group of records belongs to one
customer. Therefore, you need to transpose this data into a “wide” format (one row per
customer and multiple columns used to represent data in rows).

7. Before you transpose the data, think about which variables you want. This is the stage at
which you add variables that aggregate a customer’s multiple purchases into one variable.
For example, you have “Size” (i.e., purchase size in $) as a purchase-level variable – you
can aggregate it for each customer as Average (average money customer spends in each
purchase), Total (total amount of money spent by customer over all purchases), and/or
most recent amount paid.
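
The move from long to wide format described in Steps 6 and 7 can be sketched in Python as a group-and-aggregate pass. The rows, field names, and cluster labels below are hypothetical; the point is only that fields constant within a customer (id, cluster) are kept once, while purchase-level fields are summarised.

```python
from collections import defaultdict

# Hypothetical long-format rows: (customer_id, cluster, recency_days, size in $).
long_rows = [
    (1, "Cluster 1", 82, 50.0),
    (1, "Cluster 1", 27, 70.0),
    (2, "Cluster 2", 41, 30.0),
]

# Collect every purchase row under its customer id.
grouped = defaultdict(list)
for customer_id, cluster, recency, size in long_rows:
    grouped[customer_id].append((cluster, recency, size))

# Build one wide record per customer.
wide = {}
for customer_id, rows in grouped.items():
    sizes = [size for _, _, size in rows]
    # The row with the smallest recency is the customer's latest purchase.
    latest = min(rows, key=lambda row: row[1])
    wide[customer_id] = {
        "cluster": rows[0][0],                # same value on every row of a customer
        "frequency": len(rows),               # number of purchases
        "recency": latest[1],                 # days since the latest purchase
        "size_avg": sum(sizes) / len(sizes),  # average spend per purchase
        "size_total": sum(sizes),             # total spend over all purchases
        "size_latest": latest[2],             # most recent amount paid
    }

for customer_id, record in wide.items():
    print(customer_id, record)
```

Which aggregates to keep is a modelling choice; the three versions of Size shown here are the ones named in Step 7.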

8. At this point, you have multiple ways you can proceed. You can create the “wide” data in
Excel (you can google some automatic ways to do it, or do it manually), in
R/Stata/Python or in Tableau Prep.
9. If you decide to do it in Tableau Prep, your Tableau license includes Tableau Prep as
well. Download Tableau Prep. In Tableau Prep, you will need to use the “Aggregate”
function and divide your variables (or fields) into “Grouped Fields” (fields with the same value
on every transaction of a customer: Customer id, Cluster) and “Aggregated Fields” (fields you
want aggregated within each customer: recency, number of responses, size average, size
total etc.). An example is given below:

10. Once you switch the data into wide format, your data should look similar to the one
shown below. You will have different columns based on what clusters you chose, the
variables you chose etc., but each customer should be in one row only, with all data for
that customer in columns.

11. Once you have data in this format, you can analyze it using regression analysis.

12. The purpose of the regression analysis is to figure out which Clustering solution is the
best in estimating Sales.
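
One way to make the comparison in Step 12 concrete is to compare the R² of each clustering solution. The Python sketch below assumes the simplest possible regression: sales regressed on cluster dummies only, with an intercept. In that special case the fitted value for each customer is the mean of their cluster, so R² can be computed without a regression library. The sales figures, the two solutions, and the function name `r_squared_for_clusters` are hypothetical, and your own regressions in Excel may include further variables.

```python
from collections import defaultdict

def r_squared_for_clusters(sales, labels):
    """R^2 of regressing sales on cluster dummies (with intercept).

    With only cluster dummies as predictors, each fitted value equals
    the mean sales of that customer's cluster.
    """
    overall_mean = sum(sales) / len(sales)
    by_cluster = defaultdict(list)
    for value, label in zip(sales, labels):
        by_cluster[label].append(value)

    # Total variation around the overall mean.
    ss_total = sum((value - overall_mean) ** 2 for value in sales)
    # Unexplained variation: deviations from each cluster's own mean.
    ss_resid = sum(
        (value - sum(values) / len(values)) ** 2
        for values in by_cluster.values()
        for value in values
    )
    return 1 - ss_resid / ss_total

# Hypothetical wide data: one total-sales value and two cluster labels per customer.
size_total = [100.0, 120.0, 300.0, 320.0, 500.0, 520.0]
solution_a = [1, 1, 2, 2, 3, 3]
solution_b = [1, 2, 1, 2, 1, 2]

r2 = {
    "A": r_squared_for_clusters(size_total, solution_a),
    "B": r_squared_for_clusters(size_total, solution_b),
}
best = max(r2, key=r2.get)
print(r2, best)
```

Plain R² never decreases when more clusters are added, so when the solutions being compared have different numbers of clusters, adjusted R² is the fairer yardstick.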
