19/05/2015
Training Material
Ekta Singh
SAP BI/BW Consultant
singh.ekta1@tcs.com
Table of Contents
Abstract
About the Domain
1. Overview
   1.1 Purpose
2. Modelling
   2.1 DSO
   2.2 CUBE
      2.2.1 Line Item
      2.2.2 High Cardinality
      2.2.3 Remodelling - Example
      2.2.4 Logical Partitioning
3. Extraction
   3.1.1 PSA Partitioning
4. Compression and Aggregates
5. Attribute Change Run
6. Conclusion
Abstract
In a data warehouse, the data volume can grow to a size at which it causes performance and
optimization problems even on high-performance hardware. To overcome this, SAP BW offers
options to improve data analysis performance through various performance tuning techniques,
which we will discuss later in detail.
The performance of the various data warehouse processes depends on several distinct factors.
Efficient system performance starts with the design of an effective data model. Other important
aspects that affect BW performance are the technical infrastructure and system hardware, which
contribute significantly to the performance optimization of BW.
Further factors that significantly affect system performance include the number of current and
future users, the volume of data to be processed, and the steady growth in the size of the data
transferred to BW from the source systems.
If BW already follows the above guidelines and hardware constraints are still found to affect
system performance, then a hardware upgrade should be considered in order to deliver effective
and efficient system performance.
About the Domain
SAP is the number one vendor of standard business application software and the third largest software
supplier in the world. SAP delivers scalable solutions that enable its customers to further advance
industry best practices. SAP is constantly developing new products to help their customers respond to
dynamic market conditions.
SAP Business Warehouse (also known as SAP NetWeaver Business Warehouse or SAP BW) is the
cornerstone of SAP's strategic Enterprise Data Warehouse solutions and runs on industry-standard
RDBMSs as well as SAP's HANA in-memory DBMS. It delivers the reporting, analysis, and
interpretation of business data that is crucial for preserving and enhancing the competitive edge
of companies by optimizing processes and enabling them to react quickly to market opportunities. In SAP BW, we
can integrate, transform, and consolidate relevant business information from productive SAP
applications and external data sources. SAP BW provides us with a high-performance infrastructure
that helps us evaluate and interpret data. Decision makers can make well-founded decisions and
identify target-oriented activities on the basis of the analysed data.
1. Overview
From a data warehouse management perspective, performance optimization within SAP BW can be
broadly categorized into the following areas:
Modelling
Extraction
Compression and Aggregates.
1.1 Purpose
The purpose of this document is to highlight the key features that serve as a great aid in tuning
the performance of the data warehouse system. Once implemented, these features can contribute
greatly to efficient system performance and thus help in the overall optimization of the system.
2. Modelling
Cubes and DSOs form the major part of our BW landscape. They should be modelled efficiently so
that data loading as well as reading is as fast as possible. In this section we will see how to
fine-tune these objects for better performance.
2.1 DSO
The SID generation setting of a DSO can be changed only when the DSO contains no data. It
controls whether SIDs are created for new characteristic values when DSO data is activated. There
are two options:
During Activation: Generally we use this setting for DSOs on which queries are built
directly, as it does not allow junk data to pass through. The SIDs are generated during
the activation process, which reduces query execution time.
Never Generate SIDs: Uncheck SID generation upon activation in the Settings tab of the
DSO if no query is built on it. This option makes sense when the DSO is used only for
further loading into a cube or another DSO.
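The trade-off between the two settings can be pictured with a small sketch (plain Python, not SAP code; the characteristic name and SID values are invented): with SID generation on, activation does an extra SID-table lookup and insert for every new characteristic value, which is exactly the work that is skipped for a pass-through DSO.

```python
# Illustrative sketch only -- not SAP code. Shows why "During Activation"
# adds per-value work to DSO activation: each new characteristic value
# needs a lookup/insert in the SID table.

def activate(records, sid_table, generate_sids=True):
    """Activate DSO records; optionally assign SIDs to new values."""
    for rec in records:
        if generate_sids:
            value = rec["customer"]          # hypothetical characteristic
            if value not in sid_table:       # new value -> allocate a SID
                sid_table[value] = len(sid_table) + 1
    return records

sids = {"C100": 1}                           # pre-existing SID
rows = [{"customer": "C100"}, {"customer": "C200"}, {"customer": "C300"}]

activate(rows, sids, generate_sids=True)
print(sids)                                  # {'C100': 1, 'C200': 2, 'C300': 3}
```

With `generate_sids=False` the loop does no SID work at all, which is the behaviour to prefer when the DSO only feeds further data targets.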
2.2 CUBE
Dimensions in a cube should be modelled so that they make logical sense to users creating queries
on them, and so that loading and reading are fast. As a rule of thumb, the dimension table to
fact table ratio should never exceed 30% for any dimension. If this ratio is greater than 30% for
any dimension, that dimension needs to be remodelled.
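This rule of thumb can be sketched as follows (plain Python; the row counts are invented, and the 20% threshold anticipates the high cardinality rule in section 2.2.2 - in a real system, SAP_INFOCUBE_DESIGNS reports these ratios per dimension):

```python
# Sketch of the dimension-sizing rule of thumb. Row counts are invented.

def classify_dimension(dim_rows, fact_rows):
    """Classify a dimension by its dimension-to-fact table ratio."""
    ratio = dim_rows / fact_rows
    if ratio > 0.30:
        return "remodel (line item candidate)"
    if ratio > 0.20:
        return "consider high cardinality"
    return "ok"

fact_count = 1_000_000
dimensions = {"TIME": 3_650, "MATERIAL": 120_000, "VEHICLE": 990_000}

for name, rows in dimensions.items():
    print(f"{name}: {rows / fact_count:.0%} -> {classify_dimension(rows, fact_count)}")
```

A dimension like VEHICLE above, with nearly one entry per fact row, is exactly the case the remodelling example below addresses.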
2.2.2 High cardinality
In the extended star schema, the fact table is connected to the SID tables through dimension
tables. Usually the fact table is much larger than the dimension tables, but in some scenarios
their sizes are quite comparable; for example, an InfoCube may contain a characteristic such as
Internal Vehicle Number for which every fact table entry is assigned a different vehicle number.
In that case the size of the dimension table is comparable to the size of the fact table.
The general rule is to flag a dimension as a high cardinality dimension when the size of its
dimension table exceeds 20% of the size of the fact table.
2.2.3 Remodelling - Example:
In SE38, execute the program SAP_INFOCUBE_DESIGNS and search for your cube.
In the screenshot below we can see that the dimension to fact table ratio is far above the
recommended 20%.
Here we can see that the loading time of the cube is around 20 minutes.
In the ListCube output we can see the data dimension-wise; here we observe that Internal Vehicle
Number, Vehicle Number and Document Number are unique and not repetitive (all fields sorted).
We split the dimension into multiple dimensions, placing each unique characteristic in a separate
dimension, and declared them line item dimensions.
In the screenshot we can observe that the dimension to fact table ratio has now reduced
drastically.
With a few more changes to the cube, creating new line item dimensions, we were able to reduce
the ratio to much lower values.
NOTE: If you cannot find your cube in this program, go to the cube's Manage screen, open the
Performance tab, and refresh the statistics.
Only if the status of the statistics check is green will the cube be visible in the program
SAP_INFOCUBE_DESIGNS.
The loading time is reduced from 20 minutes to around 12 minutes.
Note: If any characteristic in a dimension has been flagged as high cardinality, make sure it is
not used in reports. For normal dimension characteristics, bitmap indexes are created, which are
best for reading. When high cardinality is checked, the system creates B-tree indexes instead of
bitmap indexes, which gives worse query performance.
2.2.4 Logical Partitioning
Scenario: A report user belongs to one region and does not want to access data from other
regions. Normally, a report built on top of one cube holding data for multiple regions is a
performance overhead, because the query has to search the entire cube containing data from all
regions. Instead, we can logically divide the InfoCube by region.
In logical partitioning we partition our cube by region or time, that is, we divide the cube
into several identical cubes and create a MultiProvider on top of them. Because the data is
partitioned across identical cubes, the time spent searching for data belonging to a particular
time slice or region is reduced.
[Figure: A MultiProvider on top of three identical cubes (EUR, USA, ASIA), each loaded through
its own DTP with a filter on Region (e.g. Region = US for the USA cube).]
In the query we can have a filter that selects data region-wise. Because the cube data is now
divided by region, the query hits only the desired cube, and since the volume of data per cube
is lower, the query runs faster.
For this we need a customer exit that takes the user input dynamically and directs the query to
the desired cube only.
Here you can see that on top of TEST DSO, three identical cubes have been built with identical
transformations; only the DTP filter has been changed, with a restriction on REGION.
Cube1 output
Variable1: MMREGION
Variable2: TSTINFPVDR
Go to transaction CMOD.
Click on Components and add the component if it is not already present.
Write the following statement in the customer exit EXIT_SAPLRRS0_001.
Modelling Extraction Compression and Aggregates
3. Extraction:
While loading data from source systems using an InfoPackage, the data can be updated to data
targets using several methods, as shown in the screenshot below.
To reduce data loading times, we can select the "Data Targets Only" option. This reduces the
loading time because the data is not written to the PSA first, after which a DTP would have to be
triggered; instead, the data is loaded directly into the InfoProvider.
3.1.1 PSA Partitioning
Frequency of Status IDocs describes after how many data IDocs one Info IDoc (status IDoc) is
sent. A frequency of 1 means one Info IDoc for every data IDoc. In general we should choose a
frequency between 5 and 10, but not greater than 20. The default is 10.
Partition size: the number of records a PSA partition should hold. If we set it to 50,000, a
new partition is created for every 50,000 records. By default it is set to 100,000 records.
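As a quick sanity check of what the setting means (plain Python; the load volumes are example numbers only):

```python
# How the PSA partition size setting maps a load volume to a number of
# database partitions. Record counts below are examples.
import math

def psa_partition_count(total_records, partition_size):
    """Number of PSA partitions needed for a given load volume."""
    return math.ceil(total_records / partition_size)

print(psa_partition_count(1_250_000, 50_000))    # 25 partitions
print(psa_partition_count(1_250_000, 100_000))   # 13 partitions (default size)
```

A smaller partition size means more, smaller partitions, which can speed up selective reads and deletions from the PSA.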
4. Compression and Aggregates
4.1 Aggregate: An aggregate is a redundant storage of basis cube data containing only a subset
of that data.
Aggregates are memory intensive; however, they are highly flexible and can be adjusted closely
to the reporting requirements. With high data volumes they are the most important tuning measure
for data analysis.
An aggregate can always be used in a report when the report requires no information beyond what
is available in the aggregate. Whether or not an aggregate is used for analysis is not
transparent to the user; it is decided by the analytical engine.
For each basis cube, an arbitrary number of aggregates can be created with transaction RSDDV or
via the context menu of the basis cube.
4.2 Initial filling of aggregates
Aggregates are created when the respective basis cubes contain data. Right after the creation of
an aggregate, it has to be filled initially so that it holds the same dataset as the respective
basis cube. This is done in the aggregate maintenance under the menu item Aggregate -> Activate
and Fill.
Depending on the size of the basis cube, reading the F fact table can be very time consuming and
may not really be required, because there might already be other aggregates that can be used as
a data source.
There are several limitations while an aggregate is being built. As these limitations may
persist for several hours, it is advisable to use a specific time slot for the initial build of
an aggregate.
On creation, the aggregates are filled from the respective basis cube. Data newly added to the
basis cube afterwards is transferred to the aggregates via a process called roll-up. For the
data to be transferred to the aggregates, the corresponding request ID can be entered on the
Roll Up tab.
The reduction of data volume in an aggregate can be achieved by reducing granularity or by
accumulating subsets; usually both options are combined.
Granularity is reduced when, of the set of InfoObjects that define the granularity of the cube,
only a subset is filled into the aggregate. An InfoCube has two fact tables, the F fact table
(uncompressed requests) and the E fact table (compressed data), and aggregates have their own
fact tables as well. All characteristics that are defined in the cube but not included in an
aggregate are aggregated away, so that the level of detail of the aggregate is limited to the
characteristics that are filled into it.
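The granularity reduction can be sketched as a simple group-by (plain Python; characteristic names and values are invented): every characteristic not kept in the aggregate is summed away.

```python
# Sketch of how an aggregate reduces granularity: only the characteristics
# kept in the aggregate survive; the key figure is summed over the rest.
from collections import defaultdict

fact_rows = [   # invented fact-table rows: 3 characteristics, 1 key figure
    {"region": "EUR", "material": "M1", "day": "20150501", "sales": 100},
    {"region": "EUR", "material": "M2", "day": "20150502", "sales": 50},
    {"region": "USA", "material": "M1", "day": "20150501", "sales": 70},
]

def build_aggregate(rows, keep):
    """Aggregate 'sales' over every characteristic not listed in `keep`."""
    agg = defaultdict(int)
    for row in rows:
        agg[tuple(row[c] for c in keep)] += row["sales"]
    return dict(agg)

print(build_aggregate(fact_rows, keep=("region",)))
# {('EUR',): 150, ('USA',): 70}
```

A query that only asks for sales by region can be answered from these two aggregate rows instead of reading all the fact rows.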
F Fact table (Cube)
5. Attribute Change Run
Whenever master data changes, we have to execute a change run, because changes in master data
affect navigational attributes and hierarchies. To ensure consistent reporting results, the data
in aggregates has to be adjusted after the master data load.
By executing the change run, the data in the aggregates is adjusted and the modified version of
the navigational attributes and hierarchies becomes the active version.
Before change run          After change run
Attribute   Sales          Attribute   Sales
X           80             X           100
Y           40             Y           20
It is carried out from the Tools menu by selecting Apply Hierarchy/Attribute Changes.
The changes to master data become effective only after the change run has been executed; while
it runs, reporting can still be done on the old master data and hierarchies.
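The adjustment the change run performs can be sketched like this (plain Python; the customer-to-attribute assignments are invented so that the totals match the example table above): once a master data load reassigns a value to a different navigational attribute, the aggregate grouped by that attribute must be re-summed.

```python
# Sketch of why aggregates need a change run: a key figure aggregated by a
# navigational attribute becomes stale when master data is reassigned.
# Customers and values are invented.

master_data = {"C1": "X", "C2": "Y", "C3": "Y"}   # customer -> attribute
fact_rows = [("C1", 80), ("C2", 20), ("C3", 20)]  # (customer, sales)

def aggregate_by_attribute(rows, attr_of):
    """Sum sales per navigational attribute value."""
    agg = {}
    for customer, sales in rows:
        attr = attr_of[customer]
        agg[attr] = agg.get(attr, 0) + sales
    return agg

print(aggregate_by_attribute(fact_rows, master_data))  # {'X': 80, 'Y': 40}

master_data["C3"] = "X"   # master data load: C3 moves from attribute Y to X
# The change run re-derives the aggregate from the new assignments:
print(aggregate_by_attribute(fact_rows, master_data))  # {'X': 100, 'Y': 20}
```

The fact rows themselves never change; only the attribute assignment does, which is why the stored aggregate must be rebuilt rather than the cube reloaded.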
6. Conclusion
By following the techniques described above, we can efficiently fine-tune the performance of the
data warehouse system. Once implemented, these features can contribute greatly to efficient
system performance and thus help in the overall optimization of the system.
Thank You