
Classification: Internal Use

Predictive analytics using SAP Design Studio and SAP HANA - Part 1

With advanced analytics finding a place in every business function, heavy scripting and programming become too
much of a hassle when you are bound by deadlines and tough competition. This is where SAP makes a difference
by providing a minimal-scripting approach to predictive analytics.

This is a two-part blog. Part 1 discusses the fundamentals of SAP HANA PAL and HANA flowgraphs, and
part 2 discusses how flowgraphs can be integrated with SAP Design Studio to derive actionable insights on the
fly.

SAP HANA PAL (Predictive Analysis Library), which is part of the HANA AFL (Application Function Library),
is a large collection of functions for predictive modelling. The same set of functions is available in
SAP Predictive Analytics Expert Mode when connecting to a HANA data source. Earlier on, using these functions
meant a multi-step process in SQL script, with every required artefact created manually. An example of this
approach can be found in this blog: http://scn.sap.com/docs/DOC-5013

Since HANA SPS 6, things have gotten a lot easier. PAL functions can now be consumed using the HANA AFM
(Application Function Modeler), which I would call a drag-and-drop interface for building complex stored
procedures. HANA flowgraphs are development artefacts that are activated as stored procedures, which can
then be consumed in other applications or simply scheduled as a recurring job, say every other day.

An overview of AFM and HANA flowgraphs can be found here:

http://help.sap.com/saphelp_hanaplatform/helpdata/en/29/de6754ef9646999b6261819bd802cb/content.htm?frameset=/en/93/b3e3191ae34508a4d92dff9b6d350c/frameset.htm&current_toc=/en/00/0ca1e3486640ef8b884cdf1a050fbb/plain.htm&node_id=869&show_children=false

Let's build a simple flowgraph to understand how things work, taking the case of a simple sales forecast
based on historical trend. I have a table containing the daily sales data of a small retail chain with 4 stores
(S01, S02, S03, S04) in a particular region. For the sake of simplicity, I have chosen the 4 top-selling products
(P01, P02, P03, P04) of the chain. So we have 4 stores, 4 products and the daily sales value for each of these
combinations.

Image-1: Sample data in consideration



First, we are going to forecast the overall sales of all stores and products put together. A little analysis of the
data tells us that there is a seasonal pattern that repeats every year, along with a slight increasing trend and
some cyclicity. Hence, we will use the Triple Exponential Smoothing model from PAL, which models level, trend
and seasonality together. The flowgraph that we built can be seen below.

Image-2: Flowgraph model
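
To make the choice of model a little more concrete, here is a minimal Python sketch of additive triple exponential smoothing (Holt-Winters). It is not the PAL implementation and it is not what the flowgraph executes. The function name, the simple initialization and the parameter names (alpha, beta, gamma, season_length, horizon) are assumptions made purely for illustration.

```python
from typing import List


def triple_exponential_smoothing(
    series: List[float],
    season_length: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Additive Holt-Winters forecast; returns `horizon` future values."""
    if len(series) < 2 * season_length:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season (a deliberately simple choice).
    level = sum(series[:season_length]) / season_length
    # Initial trend: average per-step change between the first two seasons.
    trend = sum(
        (series[season_length + i] - series[i]) / season_length
        for i in range(season_length)
    ) / season_length
    # Initial seasonal offsets: deviation of each point from the initial level.
    seasonal = [series[i] - level for i in range(season_length)]

    for t in range(season_length, len(series)):
        value = series[t]
        prev_level = level
        s = seasonal[t % season_length]
        # Level: blend the de-seasonalized observation with the previous projection.
        level = alpha * (value - s) + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonality: blend the latest seasonal deviation with the stored offset.
        seasonal[t % season_length] = gamma * (value - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + (h + 1) * trend + seasonal[(n + h) % season_length]
        for h in range(horizon)
    ]
```

The three smoothing weights control how quickly the level, trend and seasonal components react to new observations. These are the kind of model parameters that are set on the algorithm node, although the parameter names used by PAL may differ from the ones in this sketch.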



In this case, we use the aggregation node to aggregate the daily sales over all stores and products, which gives
the overall sales number for each day of the year. Following this, the data needs to be manipulated into a form
that is acceptable to the algorithm node. The data is then passed through the algorithm to get the forecasted
values, which are post-processed into a presentable form and written back to a table.

Image-3: Flowgraph development workflow

Image-4: Flowgraph output table
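
The aggregation and pre-processing steps described above can be sketched in plain Python as follows. The row layout (date, store, product, sales value) and the (ID, value) shape handed to the algorithm are assumptions for illustration. In the flowgraph these steps are configured as nodes, not written by hand.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical row layout: (sale_date, store, product, sales_value)
SalesRow = Tuple[date, str, str, float]


def aggregate_daily_sales(rows: List[SalesRow]) -> Dict[date, float]:
    """Aggregation node: sum sales over all stores and products per day."""
    totals: Dict[date, float] = defaultdict(float)
    for sale_date, _store, _product, value in rows:
        totals[sale_date] += value
    return dict(totals)


def to_algorithm_input(totals: Dict[date, float]) -> List[Tuple[int, float]]:
    """Pre-processing: order by date and replace each date with a running ID."""
    ordered = sorted(totals.items())
    return [(i, value) for i, (_day, value) in enumerate(ordered, start=1)]
```

Replacing the date with a running integer ID keeps the series ordered while giving the algorithm a plain two-column input. Mapping the IDs back to dates would then be part of the post-processing step.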



What goes on in the background?

Each node in the flowgraph is transformed into a SQL statement, and the whole set of statements executes as a
stored procedure. The predictive algorithm node is executed as a call to another stored procedure, which
contains the PAL function call for the data being processed. The properties and configuration of each node and
each connection (the arrow from one node to the other) can be modified using the Properties View in the AFM.
Meta-data such as the model configuration parameters and the input and output table signatures is stored
as catalog objects once the flowgraph is activated.

The activation of a flowgraph triggers the generation of a number of catalog objects:

1. A stored procedure in AFLLANG that calls the PAL function.
2. A wrapper procedure that incorporates all of the data manipulation steps in the flowgraph as nodes with the
help of table variables, and also calls the AFLLANG procedure.
3. A number of table types, depending on the predictive algorithm used.
4. Parameter tables which were used to define the predictive algorithm properties.
5. Write-back tables (if any).

Image-5 & 6: Procedures generated by the flowgraph
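
The relationship between the wrapper procedure and the inner algorithm procedure can be pictured with the rough Python analogy below. Everything in it is hypothetical: the function names, the parameter name FORECAST_NUM and the naive last-value forecast are placeholders, not the generated SQL or the PAL function itself.

```python
from typing import Dict, List, Tuple

Series = List[Tuple[int, float]]  # (ID, value) pairs


def algorithm_procedure(data: Series, params: Dict[str, float]) -> Series:
    """Stand-in for the generated procedure that calls the PAL function.

    A naive last-value forecast is used purely as a placeholder.
    """
    horizon = int(params["FORECAST_NUM"])  # hypothetical parameter name
    last_id, last_value = data[-1]
    return [(last_id + h, last_value) for h in range(1, horizon + 1)]


def wrapper_procedure(raw: List[float], params: Dict[str, float]) -> List[Dict[str, float]]:
    """Chain the flowgraph steps; each local plays the role of a table variable."""
    # Pre-processing node: attach a running ID to every value.
    prepared = [(i, value) for i, value in enumerate(raw, start=1)]
    # Algorithm node: delegate to the inner procedure with the parameter table.
    forecast = algorithm_procedure(prepared, params)
    # Post-processing node: shape the result for the write-back table.
    return [{"PERIOD": i, "FORECAST": value} for i, value in forecast]
```

The point of the analogy is the layering: the wrapper owns the data manipulation, while the algorithm call and its parameters live in a separate object.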

While this is a simple example, flowgraphs can be made quite complex and several kinds of analyses can be performed.
With the wide range of algorithms and data processing nodes in the AFM, examples include running multiple algorithms
on the same data set in a single flowgraph, model comparison and error calculations. In addition, custom models can
be built using R to suit specific use cases.

Extending Flowgraphs to SAP Design Studio

A typical real-life situation would be a lot more complex than the simple example that we just explored. One
requirement would be to provide this information in a consumable format to the business users, so they can put
the analytics to use and make informed decisions. Another would be to provide interactivity and allow users to
modify various parameters, such as dynamically changing the stores and products selected or modifying model
parameters to get the best fit. In the next blog, you can read how to work through some of these
complexities, taking SAP Design Studio as an example.

Predictive analytics using SAP Design Studio and SAP HANA - Part 2

Thanks to Radha D for her contributions in building the model.

In part 1, we covered the basics of predictive analytics in SAP HANA using SAP HANA PAL and HANA
flowgraph modelling. The drag-and-drop interface allows us to focus on the analysis without spending too much
time writing out complex SQL scripts.

In this blog, we will look at how to integrate flowgraphs with SAP Design Studio and provide a readily
consumable output to the decision makers.

Extending our example from the previous blog, we would like to select different store and product combinations
so that we can view the sales forecast at a granular level. This kind of analysis helps with inventory planning,
promotional offers for low-selling products and so on. We would also like to choose the number of forecast
periods and be able to modify the forecast parameters to ensure a good fit.

Image-1: Visualizing SAP HANA Flowgraph output using SAP Design Studio

Above is a sample application built using SAP Design Studio to present our case, with text inputs, dropdowns
and radio buttons for entering the parameters. This application must be able to interact with the predictive
algorithms running on HANA and pass the various parameters to the prediction model.

Solution 1

A simple approach would be to include a filter node in the flowgraph created in part 1 and iterate the
steps over all store and product combinations. We can then write the results back to a table and consume it in
Design Studio using a Calculation View. This approach can be tedious, and the job has to be scheduled to run
during downtime hours. It would also apply the same model parameters to all store and product combinations,
which may not yield the best results.

Image-2: Using filter node to select a particular store and product
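
As a rough illustration of this batch approach, the Python sketch below loops over every store and product combination, filters the data and runs one forecast per pair. The row layout, the function names and the forecast_fn callback are hypothetical. In the actual solution the filter is a flowgraph node and the forecast is the PAL algorithm.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical row layout: (day_index, store, product, sales_value)
SalesRow = Tuple[int, str, str, float]
ForecastFn = Callable[[List[float], Dict[str, float]], List[float]]


def forecast_all_combinations(
    rows: List[SalesRow],
    stores: List[str],
    products: List[str],
    params: Dict[str, float],
    forecast_fn: ForecastFn,
) -> Dict[Tuple[str, str], List[float]]:
    """Run one forecast per store/product pair with the SAME parameters."""
    ordered = sorted(rows, key=lambda row: row[0])
    results: Dict[Tuple[str, str], List[float]] = {}
    for store, prod in product(stores, products):
        # Filter node: keep the series of a single store/product pair.
        series = [v for _day, s, p, v in ordered if s == store and p == prod]
        if series:
            results[(store, prod)] = forecast_fn(series, params)
    return results
```

Because params is shared by every iteration, the sketch also shows the drawback mentioned above: each combination is forecast with identical model parameters, whether or not they fit that particular series.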



Solution 2: Doing things on the fly

To visualize something in SAP Design Studio, it has to be a column view (from the SAP HANA perspective). From
the sample application shown above, it is clear that we need to pass the various parameters to this column view
as variable values. The drawbacks of Solution 1 tell us that this column view cannot be built on tables alone,
because we want to run the prediction algorithm and the flowgraph procedure each time a parameter changes in
the front end. To achieve this, we create a scripted calculation view in HANA and call the flowgraph procedure
within it.

The problem does not end there. Each time a parameter changes, the meta-data table where the model parameters
are stored needs to be updated, but write operations are not permitted within scripted views. To work around
this, we tweak the flowgraph procedure to replace the corresponding parameter values at runtime, using a simple
CASE statement within the SELECT statement that picks up values from the parameter table.

Image-3: Modified Select statement on the model parameter table



Image-4: Actual Select statement on the model parameter table
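
The idea behind the modified SELECT can be tried out with the small Python sketch below, where SQLite stands in for HANA. The table name MODEL_PARAMS, its columns (NAME, INTARGS, DOUBLEARGS, STRINGARGS) and the parameter names are assumptions for illustration. The statement generated by the flowgraph will look different.

```python
import sqlite3


def read_parameters(conn: sqlite3.Connection, alpha: float, forecast_num: int):
    """Read the parameter table, overriding two values at read time.

    The stored rows are never updated; the CASE expressions swap in the
    runtime values only in the result set.
    """
    query = """
        SELECT NAME,
               CASE WHEN NAME = 'FORECAST_NUM' THEN :forecast_num
                    ELSE INTARGS END AS INTARGS,
               CASE WHEN NAME = 'ALPHA' THEN :alpha
                    ELSE DOUBLEARGS END AS DOUBLEARGS,
               STRINGARGS
        FROM MODEL_PARAMS
        ORDER BY NAME
    """
    params = {"alpha": alpha, "forecast_num": forecast_num}
    return conn.execute(query, params).fetchall()


# Stand-in for the parameter table created when the flowgraph is activated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MODEL_PARAMS "
    "(NAME TEXT, INTARGS INTEGER, DOUBLEARGS REAL, STRINGARGS TEXT)"
)
conn.executemany(
    "INSERT INTO MODEL_PARAMS VALUES (?, ?, ?, ?)",
    [
        ("ALPHA", None, 0.3, None),
        ("FORECAST_NUM", 12, None, None),
        ("CYCLE", 4, None, None),
    ],
)

# Values chosen in the front end replace the stored ones in the result only.
rows = read_parameters(conn, alpha=0.8, forecast_num=30)
```

Only the table setup writes anything here. The read path is a pure SELECT, which is what makes this pattern usable where write operations are not allowed.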

Lastly, to change the dimension members (Store and Product) dynamically, we create a custom procedure with
two scalar input parameters, one for Store and one for Product, and use it to replace the static filter in the
data pre-processing step of the flowgraph.

Image-5: Modified Dynamic Dimension Filter
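
A dynamic filter of this kind can be sketched as follows, again with SQLite standing in for HANA. The table DAILY_SALES, its columns and the function name are hypothetical. The two arguments play the role of the scalar input parameters of the custom procedure.

```python
import sqlite3
from typing import List, Tuple


def filtered_series(
    conn: sqlite3.Connection, store: str, product: str
) -> List[Tuple[str, float]]:
    """Return the daily sales of one store/product pair, ordered by date."""
    query = """
        SELECT SALE_DATE, SALES
        FROM DAILY_SALES
        WHERE STORE = :store AND PRODUCT = :product
        ORDER BY SALE_DATE
    """
    return conn.execute(query, {"store": store, "product": product}).fetchall()


# Stand-in for the sales table used in the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DAILY_SALES (SALE_DATE TEXT, STORE TEXT, PRODUCT TEXT, SALES REAL)"
)
conn.executemany(
    "INSERT INTO DAILY_SALES VALUES (?, ?, ?, ?)",
    [
        ("2015-01-02", "S01", "P01", 12.0),
        ("2015-01-01", "S01", "P01", 10.0),
        ("2015-01-01", "S02", "P01", 7.0),
        ("2015-01-01", "S01", "P02", 3.0),
    ],
)

series = filtered_series(conn, "S01", "P01")
```

Binding the store and product as parameters, instead of hard-coding them in the filter, is what lets the front end change the selection without touching the procedure.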

Image-6: Flowgraph – Design Studio integration workflow



Finally, this column view can be consumed in SAP Design Studio, where we can select model parameters as well as
dimension parameters on the fly and visualize the predictions accordingly.

This approach can be extended to other predictive algorithms, such as classification, regression and decision
trees, with great visualizations using SAP Design Studio. It can also be extended to other visualization and
data discovery tools like SAP Lumira and WebI, giving decision makers the power of predictive analytics.