Optimizing BW Query Performance | Running SAP Applications on the Microsoft Platform
SAP BW query performance depends on many factors: hardware, database configuration, BW configuration and, last but not least, BW cube design and BW query design. When running a query, several caches are involved: disk caches, database caches and the BW OLAP cache. You should keep this in mind when comparing BW query performance of different configurations or even different systems. In the following, we discuss the configuration options that are specific to Microsoft SQL Server.
Prerequisites
First of all, you have to make sure that there is no bottleneck in the most important system resources: CPU, memory and I/O. You can configure the maximum number of CPU threads used for a single database query (see below). However, SQL Server may reduce the actual number of threads used if there are not enough free worker threads. Therefore, the runtime of the same query can vary greatly, depending on the current system load. A memory bottleneck on SQL Server may result in additional I/O. When there are sufficient CPU and memory resources, repeatedly running queries are fully cached. In this case the performance of the I/O system is not crucial.
A huge part of the overall runtime of a BW query can be consumed on the SAP application server, not on the database server. Therefore, the system resources on the application server are important, too. A simple BW query typically consists of 3 parts:
A database query running against the F fact table of the cube
A parallel running database query against the E fact table of the cube
An aggregation of the two result sets running in the OLAP processor on the application server (this process has nothing to do with BW Aggregates, see below)
If you have never run BW cube compression (see below), then all data is in the F fact table. If you run BW cube compression after each data load, then all data is in the E fact table. In both cases, there is no need to aggregate the two result sets, which reduces the BW query runtime on the application server.
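The aggregation step can be illustrated with a minimal Python sketch; the characteristic keys and key figures below are purely hypothetical:

```python
from collections import defaultdict

def merge_results(f_rows, e_rows):
    """Aggregate the partial result sets of the F and E fact table
    queries by their characteristic key, summing the key figure.
    This mimics what the OLAP processor has to do on the application
    server when both fact tables contain data."""
    totals = defaultdict(float)
    for key, amount in f_rows + e_rows:
        totals[key] += amount
    return dict(totals)

# Hypothetical partial results: (characteristic combination, key figure)
f_rows = [("2013/PLANT01", 100.0), ("2013/PLANT02", 50.0)]
e_rows = [("2013/PLANT01", 200.0)]

print(merge_results(f_rows, e_rows))
# If all data sat in only one of the two fact tables, this merge
# step on the application server would be unnecessary.
```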
https://blogs.msdn.microsoft.com/saponsqlserver/2013/03/19/optimizingbwqueryperformance/
In SAP BW, data is typically loaded into cubes using BW process chains. These chains contain steps for dropping/recreating indexes and updating the database statistics for small dimension tables. If one of these steps fails, you may clearly see a BW query performance issue.
Number of partitions    10      100     200     500
Query suite 1           100%    125%    152%    202%
Query suite 2           100%    110%    135%    142%

Average query runtime compared with 10 partitions in percent, lower number is better
Since the partitions are created and dropped automatically on the F fact table, a BW administrator has no direct influence on the number of partitions. However, there are two ways to reduce the number of partitions. First of all, you should keep the number of data loads (DTPs) into a cube as low as possible by avoiding small requests. You can combine many small requests into a single, large one by loading the requests first into a Data Store Object (DSO). The following DTP from the DSO to the cube then creates fewer but larger requests in the cube.
Secondly, you can perform BW cube compression, which again reduces the number of partitions. For best query performance you should compress all requests anyway, which deletes all rows in the F fact table. BW query performance is still good even if just most of the requests are compressed. Some customers keep new requests uncompressed in the F fact table for at least one week. Thereby they can easily delete faulty requests, which were loaded into the cube by mistake.
2. BW cube compression
The E fact table of a cube is optimized for query performance. The process of BW cube compression moves (aggregates) single requests from the F fact table to the E fact table. Depending on the kind of data, the total number of rows can be dramatically reduced by this aggregation. SAP generally recommends BW cube compression for performance reasons, see
http://help.sap.com/saphelp_nw73/helpdata/en/4a/8f4e8463dd3891e10000000a42189c/content.htm. BW cube compression has further advantages for inventory cubes. As a side effect, it reduces the number of partitions of the F fact table.
For Microsoft SQL Server, we did not always benefit from BW cube compression in the past. The reason was the index layout, which uses a heap for the E fact table when conventional b-tree indexes are in place.
However, when using the SQL Server 2012 columnstore index, we strongly benefit from BW cube compression for BW query performance. The process of cube compression became much faster, although it contains an additional step: it fully reorganizes the columnstore index. Since the creation of a columnstore index scales very well, we use 8 CPU threads for this by default. You can change the default by setting the RSADMIN parameter MSS_MAXDOP_INDEXING using report SAP_RSADMIN_MAINTAIN.
Database compression    NONE    ROW     PAGE
Query suite 1           100%    76%     75%
Query suite 2           100%    101%    110%

Average query runtime compared with NONE compressed F fact table in percent, lower number is better
The disk space savings of ROW compression were as expected. The additional space savings of PAGE compression were only moderate, because the F fact table contains only numeric fields. The best compression ratios are seen with string fields. A large number of partitions also results in increased space usage per table.
[Figure: Average disk space usage compared with NONE compressed F fact table, 100 partitions, in percent; lower number is better]
4. Degree of parallelism
By default, SAP BW requests two SQL Server threads per query by using a MaxDop query hint. You can change this default behavior by setting the RSADMIN parameter MSS_MAXDOP_QUERY using report SAP_RSADMIN_MAINTAIN.
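The following Python sketch only illustrates the shape of the resulting query hint; the SELECT statement and table name are hypothetical, while OPTION (MAXDOP n) is the actual SQL Server hint syntax:

```python
def bw_query_with_maxdop(select_stmt, maxdop=2):
    """Append a SQL Server MAXDOP query hint, as SAP BW does for its
    fact-table queries (default MSS_MAXDOP_QUERY = 2)."""
    return f"{select_stmt} OPTION (MAXDOP {maxdop})"

# Hypothetical fact-table query; only the hint syntax is real T-SQL.
sql = bw_query_with_maxdop(
    'SELECT "KEY", SUM("AMOUNT") FROM "/BIC/FCUBE" GROUP BY "KEY"')
print(sql)
```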
We measured the performance when running BW queries on BW compressed cubes with a columnstore index, depending on the degree of parallelism used. Two suites of queries were run against two different cubes. The biggest improvement was seen when moving from 1 CPU thread to 2 threads. This is typically only the case for columnstore indexes, not for conventional b-tree indexes. Increasing to 4 threads improved performance noticeably. A further increase to 8 threads did not have any impact in many cases:
MaxDop                  1       2       4       8
Query suite 1           1.00    8.13    12.26   15.61
Query suite 2           1.00    4.11    4.62    4.39

Query speed factor compared with MaxDop 1, higher number is better
The impact of MaxDop depends on many factors like cube design, query design, actual data and the hardware used. However, in all cases we have clearly seen the negative impact of using only a single CPU thread per query. That is why you should never set MSS_MAXDOP_QUERY to 1.
When there is a temporary CPU bottleneck, SQL Server can reduce the actual number of threads used for a query. In extreme cases, this could end up in one thread, even when MSS_MAXDOP_QUERY is set to 4 or higher. In particular, you can run into this issue when running BW queries against multiproviders. A BW multiprovider is a logical cube, which retrieves the data from multiple basis cubes at the same point in time. This results in many simultaneously running SQL queries: 2 SQL queries (one on the F fact table and one on the E fact table) per basis cube. When using MaxDop 4 for a multiprovider consisting of 4 basis cubes, SQL Server may need up to 32 threads (2 tables * 4 cubes * MaxDop 4 = 32 threads). On a database server with fewer than 32 CPU threads, this can result in actually using MaxDop 1 for at least one SQL query. Keep in mind that the response time of a BW query is determined by the slowest participating SQL query.
That is why the default value of MSS_MAXDOP_QUERY = 2 was chosen relatively low. On a database server with more than 64 CPU threads, you may increase MSS_MAXDOP_QUERY to 3 or 4, depending on your workload while BW queries are running.
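The worst-case thread demand from the multiprovider example above can be sketched as a one-line calculation:

```python
def threads_needed(basis_cubes, maxdop, fact_tables_per_cube=2):
    """Worst-case number of SQL Server worker threads a BW query on a
    multiprovider can request: one SQL query per fact table per basis
    cube, each running with the configured degree of parallelism."""
    return basis_cubes * fact_tables_per_cube * maxdop

# The example from the text: MaxDop 4, multiprovider with 4 basis cubes
print(threads_needed(basis_cubes=4, maxdop=4))  # 32
# A single basis cube with the default MSS_MAXDOP_QUERY = 2
print(threads_needed(basis_cubes=1, maxdop=2))  # 4
```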
For best BW query performance, you should avoid running BW queries during periods of high workload. For example, BW cube compression can be very CPU-intensive, since it includes the recreation of the columnstore index using 8 CPU threads by default.
Index type              b-tree              columnstore
Partition type          nonpart.   part.    nonpart.   part.
Query suite 1           1.00       1.06     4.77       9.83
Query suite 2           1.00       0.92     6.78       7.64
Query suite 1                               1.00       2.03
Query suite 2                               1.00       1.14

Query speed factor (first two rows relative to the nonpartitioned b-tree, last two rows relative to the nonpartitioned columnstore), higher number is better
For SQL Server columnstore indexes, we see consistent query performance improvements when partitioning the E fact table. The performance improvement compared with a nonpartitioned columnstore is moderate (factor 1.14) if the data is already in cache. However, this is an additional performance increase compared with conventional b-trees. For example, when a nonpartitioned columnstore index was 6.78 times faster than a b-tree index, the partitioned columnstore index was 7.64 times faster.
A columnstore index is optimized for large tables with several million rows. Internally, a columnstore index is divided into segments. Each segment contains up to one million rows. When using 8 CPU threads for creating the columnstore index, you typically see 8 segments per column, which are not fully filled with one million rows. When using partitioning for small tables, you further decrease the average segment size of the columnstore index. Having too-small segments decreases query performance. Therefore, you should consider partitioning only for tables with at least a few dozen million rows.
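The effect of partitioning on the average segment size can be illustrated with a simplified model; it assumes each build thread opens at least one segment per partition, which is a rough approximation rather than SQL Server's exact behavior:

```python
SEGMENT_CAPACITY = 1_000_000  # max rows per columnstore segment

def avg_segment_rows(total_rows, partitions, build_threads=8):
    """Rough average rows per columnstore segment, assuming each of the
    build threads opens at least one segment per partition and a segment
    never exceeds one million rows."""
    min_segments = partitions * build_threads
    # Number of segments is at least the capacity-driven minimum.
    segments = max(min_segments, -(-total_rows // SEGMENT_CAPACITY))
    return total_rows // segments

# A small table split over many partitions yields tiny segments:
print(avg_segment_rows(2_000_000, partitions=100))  # 2500 rows/segment
# The same data without partitioning stays much closer to full segments:
print(avg_segment_rows(2_000_000, partitions=1))    # 250000 rows/segment
```

This is why partitioning small tables hurts: the fixed per-partition, per-thread segment overhead fragments the data into segments far below the one-million-row capacity.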
To fully benefit from the performance improvements of the partitioned columnstore, you first have to apply the latest version of SAP note 1771177 (which will be released in April 2013). Then you should recreate the indexes of existing, partitioned cubes. The new code improvements optimize columnstore segment elimination, in addition to having partitions. Therefore, you get a performance benefit on BW queries containing a filter on a time characteristic, even when creating only a single partition.
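Segment elimination itself can be sketched as follows; the segment ranges are hypothetical, standing in for the min/max metadata that real columnstore segments store per column:

```python
def segments_to_scan(segments, lo, hi):
    """Segment elimination: each columnstore segment records the min and
    max value per column; segments whose [min, max] range does not
    overlap the filter range [lo, hi] are skipped entirely."""
    return [s for s in segments if not (s[1] < lo or s[0] > hi)]

# Hypothetical segments of a time column (min, max) as YYYYMM integers
segments = [(201201, 201206), (201207, 201212), (201301, 201306)]
# A query filtering on year 2013 only has to read the last segment:
print(segments_to_scan(segments, 201301, 201312))
```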
7. BW Aggregates
A BW aggregate is neither a configuration option nor is it SQL Server specific. However, we want to discuss
aggregates here, since they were the preferred means to increase BW query performance in the past. A BW
aggregate is a copy of an existing BW basis cube with a restricted number of characteristics and/or applied
filters. BW aggregates are optimized for one or a few BW queries. Therefore you typically have to create many
aggregates in order to support all BW queries running against a single basis cube. Technically, a BW
aggregate looks like a conventional cube. It has two fact tables, each of them having its own database
indexes. Since BW aggregates are logical copies of the cube, they have to be manually loaded and
compressed each time data is loaded into the basis cube.
In contrast, a columnstore index on a basis cube is maintained automatically. The size of the cube decreases when using the columnstore, rather than increasing through additional copies of the cube. There is no need to create new aggregates to support new or ad-hoc BW queries when using the columnstore. Therefore, the columnstore is the new preferred means of increasing BW query performance. Once you define a columnstore index on a BW cube for Microsoft SQL Server, all existing aggregates of this cube are deactivated. This is done because the BW OLAP processor is not aware of the columnstore. Thereby, using an aggregate, which never has a columnstore index, is avoided once the basis cube has a columnstore index.
than the database performance. In the worst case, the database is not accessed at all if the OLAP cache can be used. You can turn off the OLAP cache at BW cube level using the InfoProvider Properties in SAP transaction RSA1.
Alternatively, you can turn off the OLAP cache at BW query level using SAP transaction RSRT. However, in productive customer systems the OLAP cache is typically turned on. So why should you turn it off for performance tests? There are two reasons for this: firstly, the likelihood of a fully filled OLAP cache is much higher in a test environment than in a productive system. Therefore, you would benefit much more from the OLAP cache in a test system, which results in unrealistic measurements. Secondly, you typically want to tune the slowest BW queries running under the worst conditions, when the OLAP cache does not by chance contain fitting entries.
Some BW queries are independent of database performance by nature. When there is a big result set with millions of rows, a huge part of the runtime is consumed by transferring the result set from the database server to the SAP application server, and finally to the BW client. In this case you are measuring the network throughput rather than the database performance.
Comments
4 years ago
Hi Martin,
we have been live with the columnstore index for 10 days. Approximately 80% of all cubes were converted; the rest are daily full loads with a data volume below 2 million rows each. The biggest 6 cubes are in the 60 million rows range. We have eliminated nearly all aggregates; the few remaining aggregates all have a reduction factor > 20 and are used for highly aggregated daily reports.
User response is great. Naturally, the ones with formerly long running queries say it is remarkably faster, and the others with former response times below 10 sec see no significant improvement. For some it opens new possibilities of analyzing and controlling data quality. If you can execute two or three steps in the same time it took one step before the upgrade, you are more interested and motivated to do so.
The effect is not so prominent for users in an Excel environment because of transfer time and Excel overhead, but significant enough to be recognized.
The columnstore index in SAP BW may not be as impressively fast as SAP BW on HANA, but it is a really big step performance-wise. And you can get it for only the cost of properly configured standard hardware, with little upgrade and conversion effort and, most importantly, no license overhead.
Thanks
NK
3 years ago
Hi Martin,
one of your blogs mentions the number of indexes dropping from around 10 to 2 for an E fact table. We have set up a POC environment and have executed MSSCSTORE for a test cube. The E table now contains the earlier dimension indexes created by the DDIC, a primary index P and the CS index.
My question is whether we need to drop the dimension indexes in the E fact table. If yes, then won't they get recreated again during the next transport? Is this what you allude to as the reduction of indexes?
Cheers
Martin Merdes
3 years ago
Hi NK,
the indexes 010, 020, still exist in the DDIC, but not on the DB once you create a CS index. If the indexes still exist on the DB (which I have never seen or heard of), then open a support message at SAP in component BW-SYS-DB-MSS.
Thanks
Martin
NK
3 years ago
Thank you for the response, Martin. Understood, my mistake. Cameron clarified too.
They do not exist on the database.
Cheers
Naresh Anakamatla
3 years ago