
Performance Tuning in SAP BW.

19/05/2015

Training Material
BlackBerry Projects

Ekta Singh
SAP BI/BW Consultant
singh.ekta1@tcs.com
Table of Contents

Abstract
About the Domain
1. Overview
   1.1 Purpose
2. Modelling
   2.1 DSO
   2.2 Cube
      2.2.1 Line Item
      2.2.2 High Cardinality
      2.2.3 Remodelling - Example
      2.2.4 Logical Partitioning
3. Extraction
   3.1.1 PSA Partitioning
4. Compression and Aggregates
5. Attribute Change Run
6. Conclusion

Abstract

In a data warehouse, data volumes can grow so large that performance problems and optimization issues arise even on high-performance hardware. To overcome this, SAP BW offers options to improve data analysis performance through various performance tuning techniques, which we will discuss later in detail.

The performance of the various data warehouse processes depends on several distinct factors. Efficient system performance starts with the design of an effective data model. Other important aspects that affect BW performance are the technical infrastructure and the system hardware, which contribute significantly to the performance optimization of BW.

Other factors that significantly affect system performance include the number of current and future users, the volume of data to be processed, and the regular growth in the amount of data transferred to BW from the source systems.

If the above measures have already been implemented in BW and hardware constraints are still found to affect system performance, then a hardware upgrade should be considered in order to provide effective and efficient system performance.

About the Domain

SAP is the number one vendor of standard business application software and the third largest software supplier in the world. SAP delivers scalable solutions that enable its customers to further advance industry best practices. SAP is constantly developing new products to help its customers respond to dynamic market conditions.

SAP Business Warehouse (also known as SAP NetWeaver Business Warehouse or SAP BW) is the cornerstone of SAP's strategic Enterprise Data Warehouse solutions and runs on industry-standard RDBMS as well as SAP's HANA in-memory DBMS. It delivers reporting, analysis, and interpretation of business data that is crucial to preserving and enhancing the competitive edge of companies by optimizing processes and enabling them to react quickly to market opportunities. In SAP BW, we can integrate, transform, and consolidate relevant business information from productive SAP applications and external data sources. SAP BW provides us with a high-performance infrastructure that helps us evaluate and interpret data. Decision makers can make well-founded decisions and identify target-oriented activities on the basis of the analysed data.

1. Overview

From a data warehouse management perspective, performance optimization within SAP BW can be broadly categorized into the following areas:

Modelling
Extraction
Compression and Aggregates

1.1 Purpose
The purpose of this document is to highlight the key features that serve as a great aid in tuning the performance of the data warehouse system. Once implemented, the highlighted features can contribute greatly to efficient system performance and thus help in the overall optimization of the system.

2. Modelling


Cubes and DSOs form the major part of our BW landscape. They should be modelled efficiently so that data loading as well as reading is as fast as possible. In this section we will see how we can fine-tune these objects for better performance.

2.1 DSO
The DSO settings can be changed only when the DSO contains no data. The SID generation option defines whether SIDs are created for new characteristic values when data in the DSO is activated. There are two options:

During Activation: Generally we use the During Activation setting for DSOs on which queries are built directly, as it does not allow junk data to pass through. The SIDs are generated during the activation process, which reduces query execution time.

Never Generate SIDs: Uncheck SID generation upon activation in the Settings tab of the DSO if no query is built on the DSO. This option makes sense when the DSO is used only for further loading into a cube or another DSO.

2.2 Cube
Dimensions in a cube should be modelled so that they make logical sense to users creating queries on them, while also allowing fast loading and reading. As a rule of thumb, the dimension table to fact table ratio should never exceed 30% for any dimension. If this ratio is greater than 30% for a dimension, that dimension needs to be remodelled.
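For example (hypothetical figures): if the fact table holds 10 million rows and a customer dimension table holds 4 million rows, the ratio is 40%, well above the 30% guideline, so that dimension is a candidate for remodelling.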

In case of multiple characteristics in a dimension:

o Split the dimension into smaller dimensions such that the dimension table to fact table ratio is less than 30%. As there is a limit on the number of dimensions a cube can have, this option is possible only when the number of dimensions is less than 16.
o If the cube already has 16 dimensions, we can change the properties of a dimension to High Cardinality.

In case of a single characteristic in a dimension, it can be changed to a Line Item Dimension.

2.2.1 Line item


A line item dimension is selected when a dimension has only one characteristic. When a dimension is set as a line item dimension, no dimension table is created. Although the dimension table does not exist physically, it is still available as a view on the SID table of the master data.

Note: This setting is possible only when there is no data in the cube.

2.2.2 High cardinality

In the extended star schema the fact table is connected to the SID tables through dimension tables. Usually the fact table is much larger than the dimension tables, but in some scenarios their sizes are quite comparable. For example, an InfoCube may contain a characteristic Internal Vehicle Number for which every fact table entry is assigned a different vehicle number; the size of the dimension table is then comparable to the size of the fact table.

The general rule is to flag a dimension as a high cardinality dimension when the size of the dimension table reaches 20% of the size of the fact table.
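For example (again hypothetical figures): if the fact table has 1 million rows and nearly every row carries a different internal vehicle number, the vehicle dimension table also approaches 1 million rows, far above the 20% threshold, so that dimension should be flagged as high cardinality.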

2.2.3 Remodelling - Example:

In SE38, execute the program SAP_INFOCUBE_DESIGNS and search for your cube. In the screenshot below we can see that the dimension to fact table ratio is well above the allowed 20%.

Here we can see that the loading time for the cube is around 20 minutes.

In LISTCUBE we can see the data dimension-wise; here we observe that Internal Vehicle Number, Vehicle Number and Document Number are unique and not repetitive (all fields sorted).

We split the dimension into multiple dimensions, placing the unique characteristics in separate dimensions, and declared them as Line Item Dimensions.

In the screenshot we can observe that the dimension to fact table ratio has now been reduced drastically.

With a few more changes in the cube, creating new line item dimensions, we were able to reduce the ratio to much lower values.

NOTE: If you cannot find your cube in this program, go to the cube's Manage screen, open the Performance tab and refresh the statistics.

Only if the status of the statistics check is green will the cube be visible in the program SAP_INFOCUBE_DESIGNS.
The loading time is reduced from 20 minutes to around 12 minutes.

Note: If a dimension has been flagged as high cardinality, make sure its characteristics are not used heavily in reports. For normal dimensions the system creates bitmap indexes, which are best for reading; when high cardinality is checked, the system creates B-tree indexes instead, which give worse query performance.

2.2.4 Logical Partitioning
Scenario: A report user belongs to one region and does not want to access data from other regions. Normally, a report built on top of one cube that holds data for multiple regions puts an overhead on query performance, because the query has to search the entire cube containing data from all regions. Instead, we can logically divide the InfoCube by region.

In logical partitioning we partition our cube by region or time, i.e. we divide the cube into several identically structured cubes and create a MultiProvider on top of them. Because the data is partitioned across identical cubes, the time spent searching for data belonging to a particular time period or region is reduced.

Partitioning of cube according to region.

[Diagram: a MultiProvider built on top of identical region cubes (EUR, USA, ASIA); each cube is loaded through its own DTP with a filter on Region (e.g. UK, US, DE).]

In the query we can have a filter which selects data region-wise. Because the cube data is now divided regionally, the query hits only the desired cube, and since the volume of data in that cube is low, the query runs faster.
For this we need a customer exit which takes the user input dynamically and directs the query to the desired cube only.
Here you can see that on the TEST DSO, three identical cubes have been built with identical transformations; only the DTP filter has been changed, with a restriction on REGION.

TEST1 (C_1) - FILTER REGION = UK
TEST2 (C_2) - FILTER REGION = US
TEST3 (C_3) - FILTER REGION = DE

Cube1 output

Variable1: MMREGION

Variable2: TSTINFPVDR

GOTO: TCODE CMOD

Create new project.

Click on enhancement assignments.

Add the component if not present.

Click on components

Double click on the EXIT_SAPLRRS0_001.

Write the following logic in EXIT_SAPLRRS0_001.
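The original coding is not reproduced here; as a minimal sketch, assuming MMREGION is the user-entry region variable and TSTINFPVDR is a customer-exit variable on 0INFOPROV that selects the cube (technical names C_1, C_2, C_3 as in the example above), the logic could look like this:

* Minimal sketch only: variable names, cube technical names (C_1/C_2/C_3) and
* region values are taken from the example above; adapt them to your own objects.
DATA: l_s_range     LIKE LINE OF e_t_range,
      l_s_var_range LIKE LINE OF i_t_var_range.

CASE i_vnam.
  WHEN 'TSTINFPVDR'.                        " exit variable on 0INFOPROV
    IF i_step = 2.                          " after the user has entered values
      READ TABLE i_t_var_range INTO l_s_var_range
           WITH KEY vnam = 'MMREGION'.      " region entered by the user
      IF sy-subrc = 0.
        l_s_range-sign = 'I'.
        l_s_range-opt  = 'EQ'.
        CASE l_s_var_range-low.
          WHEN 'UK'. l_s_range-low = 'C_1'.
          WHEN 'US'. l_s_range-low = 'C_2'.
          WHEN 'DE'. l_s_range-low = 'C_3'.
        ENDCASE.
        APPEND l_s_range TO e_t_range.
      ENDIF.
    ENDIF.
ENDCASE.

The exit fires at i_step = 2, i.e. after the user has entered the region; it reads that value from i_t_var_range and returns the matching cube as the value of the InfoProvider variable, so the query on the MultiProvider reads only that one cube.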


3. Extraction:
While loading data from source systems using an InfoPackage, the data can be updated to the data targets using several methods, as shown in the screenshot below.

To reduce data loading times, we can select the Data Targets Only option. This reduces the loading time further because the PSA is not filled first, so no separate DTP needs to be triggered; instead, this setting loads the data directly into the InfoProvider.

Pros: Faster loading time.


Cons: The PSA is not available, so we cannot correct the data in case of errors or junk values. If a request is deleted from the InfoProvider, we have to fetch the data from the R/3 side again.

3.1.1 PSA Partitioning


When you extract data using options other than the one mentioned above, the data is written into PSA tables in the BW system. If your data volume is on the order of tens of millions of records, consider partitioning these PSA tables for better performance, but pay attention to the partition sizes. Partitioning PSA tables improves data-load performance because it is faster to insert data into smaller database tables. Partitioning also improves the performance of PSA maintenance; for example, a portion of the data can be deleted faster.

This can be done in TCODE: RSCUSTV6

Frequency of Status IDocs describes how many data IDocs are covered by one Info IDoc.
A frequency of 1 means one Info IDoc is created for every data IDoc. In general we should choose a frequency between 5 and 10, but not greater than 20. By default it is 10.

Partition size: the number of records a PSA partition should hold. If we set it to 50,000, a new partition is created for every 50,000 records. By default it is set to 1 lakh (100,000) records.
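For example (hypothetical figures): with a partition size of 50,000, a request of 500,000 records is spread over roughly 10 partitions, and deleting an old request then only touches the partitions it occupies instead of the whole PSA table.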


4. Compression and Aggregates


Compress cubes and aggregates regularly.
Rebuild the indexes on the aggregate tables.
Do not build statistics on the F table of the cube.

4.1 Aggregates: An aggregate is a redundant data store for a basis cube that contains only a subset of the basis cube's data.

Aggregates are memory intensive; however, they are highly flexible and can be adjusted to a large extent to the reporting requirements. With high volumes of data they are the most important tuning measure for data analysis.

An aggregate can be used in a report whenever the report requires no information beyond what is available in the aggregate. The decision on whether or not an aggregate is used for analysis is not visible to the user; it is made by the analytic engine.

For each basis cube, any number of aggregates can be created with transaction RSDDV or via the context menu of the basis cube.

4.2 Initial filling of aggregates

Aggregates are created when the respective basis cubes contain data. Right after the creation of an aggregate, it has to be filled initially so that it has the same dataset as the respective basis cube. This can be done in the aggregate maintenance under the menu item Aggregate -> Activate and Fill.

Depending on the size of the basis cube, reading the F fact table can be very time consuming and may not really be required, because there might already be other aggregates which can be used as the data basis.

There are several limitations while an aggregate is being built:

No roll-up into the aggregate is possible.

No change run is possible if the aggregate contains master data attributes.

As these limitations may persist for several hours, it is advisable to use a dedicated time slot to initially build the aggregate.

After a new creation, the aggregates are filled from the respective basis cube. Data newly added to the basis cube is transferred to the aggregates via a process called roll-up.

The request ID of the data to be transferred to the aggregates can be entered under Request ID on the Roll Up tab.

4.3 Working of Aggregates

The reduction of data volume in an aggregate can be achieved by a reduction in granularity or by the formation of subsets. Usually both options are combined.

The reduction in granularity is achieved when, of the InfoObjects that define the granularity of the cube, only a subset is included in the aggregate.

4.4 Aggregates for Characteristics

In the example below there are two fact tables: the F fact table of the cube and the E fact table of the aggregate. All characteristics that are defined in the cube but not included in the aggregate are aggregated away, so that the level of detail of the aggregate is limited to the characteristics that are included in the aggregate.

F Fact table (Cube)

Month Customer Material Sales


01.2002 1000 A 17
01.2002 2000 B 15
01.2002 2000 C 44
02.2002 2000 D 30

E Fact table (aggregate)

Month Customer Sales


01.2002 1000 17
01.2002 2000 59
02.2002 2000 30
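Conceptually, filling this aggregate corresponds to grouping the cube's fact data by the characteristics kept in the aggregate (Month, Customer) and summing the key figure. A minimal ABAP sketch of this grouping, using hypothetical table and field names (CALMONTH, CUSTOMER, SALES on a fact table /BIC/FSALESCUBE):

* Hypothetical target structure for the aggregated rows (Month, Customer, Sales).
TYPES: BEGIN OF ty_agg,
         calmonth TYPE n LENGTH 6,
         customer TYPE c LENGTH 10,
         sales    TYPE p LENGTH 15 DECIMALS 2,
       END OF ty_agg.
DATA lt_aggregate TYPE STANDARD TABLE OF ty_agg.

* Group by the retained characteristics and sum the key figure; characteristics
* not listed here (e.g. Material) are aggregated away, as in the E table above.
SELECT calmonth customer SUM( sales )
  INTO TABLE lt_aggregate
  FROM /bic/fsalescube                    " hypothetical F fact table name
  GROUP BY calmonth customer.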

5. Attribute Change Run
Whenever there is a change in master data, we have to execute a change run, because changes in master data cause changes in navigational attributes or hierarchies. To ensure consistent reporting results, the data in aggregates has to be adjusted after the master data load.

By executing the change run, the data in the aggregates is adjusted and the modified version of the navigational attributes and hierarchies becomes the active version.

Aggregate (before change run):

Attribute Sales
X 80
Y 40

Aggregate (after change run):

Attribute Sales
X 100
Y 20

It is carried out from the Tools menu by selecting Apply Hierarchy/Attribute Changes.

The changes in master data become effective only after the change run has been executed; during this process, reporting can still be done on the old master data and hierarchies.

6. Conclusion
By following the above-mentioned techniques we can efficiently fine-tune the performance of the data warehouse system. Once implemented, the highlighted features contribute greatly to efficient system performance and thus help in the overall optimization of the system.

Thank You
