The efficient handling of internal tables is one of the most important performance factors in ABAP programs. Therefore it is essential to know the runtime behavior of internal table statements. This blog describes how to measure operations on internal tables and how to ensure that the measurement results are reliable.

1. Introduction

In ABAP there are three table types: standard table, sorted table and hashed table; and two main types of access, read with index and read with key (which has some subtypes). Regarding performance, the expectations for the table types and access types are the following:

The fastest accesses should be independent of the table size. This behavior should be realized by the index reads on standard and sorted tables. The hashed table allows no index read; instead it calculates a hash value from the table key, which also allows direct access to the searched line.

A binary search algorithm splits the search area into two parts at every step and checks which part contains the wanted entry. It can be applied if the table has a sort order, i.e. it is either a sorted table or a sorted standard table. The binary search should show a logarithmic dependence on the table size. It is used automatically by the read on a sorted table with table key. It can also be used on standard tables by adding BINARY SEARCH at the end of the READ statement. Here you must take care that the standard table is sorted in ascending order according to the used key. If the sort order is not fulfilled, the binary search still works, but it can miss entries. Please also be aware that a sort is an expensive operation; it should never be executed inside a large loop.
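The "first record according to the sort order" property of the binary search can be illustrated outside ABAP. The following Python sketch is illustrative only; the list, function and key names are invented, and `bisect_left` stands in for the kernel's search. It locates the first record whose leading key field matches:

```python
from bisect import bisect_left

# Table sorted ascending by (key1, key2) -- the precondition BINARY SEARCH relies on.
table = [(1, 1), (2, 1), (2, 2), (2, 3), (5, 1)]

def read_with_leading_key(tab, key1):
    """Return the index of the FIRST record whose leading key field equals key1,
    or None if there is no such record. Needs only O(log n) probes."""
    i = bisect_left(tab, (key1,))          # (key1,) sorts before every (key1, key2)
    if i < len(tab) and tab[i][0] == key1:
        return i
    return None

first = read_with_leading_key(table, 2)    # -> 1, the first of the three key1 = 2 records
miss  = read_with_leading_key(table, 4)    # -> None: sort order intact, so a miss is genuine
```

If the list were not sorted, `bisect_left` could skip over matching entries without reporting an error, which mirrors the warning above that a binary search on an unsorted standard table can miss entries.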
In principle, a table should be sorted exactly once during the execution of a program.

Both realizations of the binary search find not just any record fulfilling the search condition, but the first record according to the sort order. Therefore they also speed up a read with key where the key is not the complete table key but only a leading part of it.

All other reads must scan the whole table sequentially and need an average runtime per record which is directly proportional to the size of the table. These are: all reads from standard tables where no binary search is added and no index is used, all reads from sorted tables which specify no leading fields of the table key and no index, and all reads from hashed tables which do not specify the complete table key.

This blog has two goals: first, to demonstrate that the behavior is qualitatively as described above; and second, to establish a reliable measurement method. In a previous blog (Runtimes of Reads and Loops on Internal Tables) the exact measurement results for several reads from internal tables have been shown.

2. Measurement Program

One might assume that the measurement of a READ from an internal table is in principle very simple. You simply use the ABAP command GET RUN TIME:

GET RUN TIME FIELD start.
READ TABLE sort1 WITH TABLE KEY key1 = k1 key2 = k2 INTO wa1.
GET RUN TIME FIELD stop.

It is absolutely essential that the measurement does not contain operations other than the one you want to measure. This is really obvious but still often overlooked. Additionally, you will soon recognize that the results of such simple measurements show huge variation and do not match the expected behavior. The following program has all the ingredients to measure internal table operations in a reliable way.
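The same pattern can be sketched in Python to stress the one rule that matters here: nothing but the measured statement may stand between start and stop. The dictionary and the names sort1, k1, k2 are invented stand-ins for the ABAP table and keys, and `perf_counter_ns` stands in for GET RUN TIME:

```python
import time

sort1 = {("A", 1): "payload"}          # stand-in for the internal table
k1, k2 = "A", 1

start = time.perf_counter_ns()
wa1 = sort1[(k1, k2)]                  # ONLY the read sits between start and stop
stop = time.perf_counter_ns()

elapsed_us = (stop - start) / 1000.0   # a single shot like this scatters heavily
```

A single shot sits right at the timer's resolution; the parameters discussed in the next section exist precisely to tame that.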
How to Measure Operations on Internal Tables
Posted by Siegfried Boes in ABAP Testing and Troubleshooting on Nov 9, 2007

*&---------------------------------------------------------------------*
*& Report Z_ITAB_TEST
*&---------------------------------------------------------------------*
* Measures runtimes of reads from internal tables
* for hashed table with table key and
* for sorted table with table key
*
* Several variation parameters are built in:
* N_max    : to cover different table sizes 10, 20, 50, ... 10,000
* Pre-Read : to exclude first executions from measurement
* I_max    : to increase time resolution
* S_max    : to get better statistics
* L_max    : to read lines from different locations in the table
*
* Last change: Nov 2007
*----------------------------------------------------------------------*
REPORT zsb_itab_test LINE-SIZE 220.

TYPES: BEGIN OF st_tab,
         keyfield(30) TYPE n,
         c(2)         TYPE n,
         ctext(168)   TYPE c,
       END OF st_tab.

TYPES: tab  TYPE STANDARD TABLE OF st_tab WITH KEY keyfield c,
       sort TYPE SORTED TABLE OF st_tab WITH UNIQUE KEY keyfield c,
       hash TYPE HASHED TABLE OF st_tab WITH UNIQUE KEY keyfield c.

DATA: n_itab      TYPE tab,
      s_itab      TYPE tab,
      c10(10)     TYPE c,
      c50(50)     TYPE c,
      c250(250)   TYPE c,
      c1000(1000) TYPE c,
      c5000(5000) TYPE c,
      textx       TYPE st_tab-ctext,
      j_max       TYPE i VALUE '20',
      n_i         TYPE i,
      nn          TYPE i,
      ll          TYPE i,
      l_inc       TYPE i,
      l10         TYPE i,
      start       TYPE i,
      stop        TYPE i,
      t_i         TYPE p DECIMALS 3,
      t1_s        TYPE p DECIMALS 3,
      t2_s        TYPE p DECIMALS 3,
      t3_s        TYPE p DECIMALS 3,
      t1_s_min    TYPE p DECIMALS 3,
      t2_s_min    TYPE p DECIMALS 3,
      t3_s_min    TYPE p DECIMALS 3,
      t1_l        TYPE p DECIMALS 3,
      t2_l        TYPE p DECIMALS 3,
      t3_l        TYPE p DECIMALS 3,
      tsum1_l     TYPE p DECIMALS 3,
      tsum2_l     TYPE p DECIMALS 3,
      tsum3_l     TYPE p DECIMALS 3,
      t1_n        TYPE p DECIMALS 3,
      t2_n        TYPE p DECIMALS 3,
      t3_n        TYPE p DECIMALS 3.

c10 = '1234567890'.
CONCATENATE c10 c10 c10 c10 c10 INTO c50.
CONCATENATE c50 c50 c50 c50 c50 INTO c250.
CONCATENATE c250 c250 c250 c250 INTO c1000.
CONCATENATE c1000 c1000 c1000 c1000 c1000 INTO c5000.
textx = c5000.
*----------------------------------------------------------------------*
PARAMETERS: n_max   TYPE i DEFAULT '10',
            preread TYPE c AS CHECKBOX,
            i_max   TYPE i DEFAULT '1',
            s_max   TYPE i DEFAULT '1',
            l_max   TYPE i DEFAULT '1'.

START-OF-SELECTION.

FORMAT COLOR COL_KEY INTENSIFIED ON.
WRITE: / ' '.
WRITE AT 30 'Runtime (micro-seconds)'.
WRITE: / ' '.
WRITE AT 10 'N'.
WRITE AT 30 'Read_1'.
WRITE AT 50 'Read_2'.
WRITE AT 70 'Offset'.
FORMAT RESET.
*----------------------------------------------------------------------*
n_i = 0.
* 4. variation: size of internal tables -------------------------------
DO n_max TIMES.
  n_i = n_i + 1.
* fill internal tables of certain size:
  PERFORM fill_itab USING n_i CHANGING nn.
  CLEAR tsum1_l. CLEAR tsum2_l. CLEAR tsum3_l.
  CLEAR ll.
  l_inc = nn / ( l_max + 1 ).
* 3. variation: different locations -----------------------------------
  WHILE ( ll < l_max ).
    ll = ll + 1.
    l10 = ( l_inc * ll ) * 10.
    t1_s_min = '9999999.9'.
    t2_s_min = '9999999.9'.
    t3_s_min = '9999999.9'.
* 2. variation: the statistical repeats -------------------------------
    DO s_max TIMES.
* 1. variation: internal repeats --------------------------------------
      IF ( i_max EQ '1' ).
        IF ( preread IS INITIAL ).
          PERFORM read_hashed_tabkey_1. t1_s = t_i.
          PERFORM read_sorted_tabkey_1. t2_s = t_i.
        ELSE.
          PERFORM read_hashed_tabkey_p. t1_s = t_i.
          PERFORM read_sorted_tabkey_p. t2_s = t_i.
        ENDIF.
      ELSE.
        PERFORM read_hashed_tabkey. t1_s = t_i.
        PERFORM read_sorted_tabkey. t2_s = t_i.
        PERFORM empty_do.           t3_s = t_i.
      ENDIF.
* end of 1. variation -------------------------------------------------
      IF ( t1_s < t1_s_min ). t1_s_min = t1_s. ENDIF.
      IF ( t2_s < t2_s_min ). t2_s_min = t2_s. ENDIF.
      IF ( t3_s < t3_s_min ). t3_s_min = t3_s. ENDIF.
    ENDDO.
* end of 2. variation -------------------------------------------------
    t1_l = t1_s_min. t2_l = t2_s_min. t3_l = t3_s_min.
    tsum1_l = tsum1_l + t1_l.
    tsum2_l = tsum2_l + t2_l.
    tsum3_l = tsum3_l + t3_l.
  ENDWHILE.
* end of 3. variation: location average -------------------------------
  t1_n = tsum1_l / l_max.
  t2_n = tsum2_l / l_max.
  t3_n = tsum3_l / l_max.
  FORMAT COLOR COL_NEGATIVE.
  WRITE: / nn.
  WRITE AT 30 t1_n.
  WRITE AT 50 t2_n.
  WRITE AT 70 t3_n.
  FORMAT RESET.
ENDDO.
*----------------------------------------------------------------------*
* fill_itab: fills standard table
* Note: the filled table should have no special order
*----------------------------------------------------------------------*
FORM fill_itab USING n_i TYPE i CHANGING nn TYPE i.

  DATA: itab  TYPE tab,
        wa1   TYPE st_tab,
        count TYPE i.

* predefined sizes
  IF ( n_i GE 10 ).   nn = 10000.
  ELSEIF ( n_i = 9 ). nn = 5000.
  ELSEIF ( n_i = 8 ). nn = 2000.
  ELSEIF ( n_i = 7 ). nn = 1000.
  ELSEIF ( n_i = 6 ). nn = 500.
  ELSEIF ( n_i = 5 ). nn = 200.
  ELSEIF ( n_i = 4 ). nn = 100.
  ELSEIF ( n_i = 3 ). nn = 50.
  ELSEIF ( n_i = 2 ). nn = 20.
  ELSEIF ( n_i = 1 ). nn = 10.
  ENDIF.

  REFRESH itab. REFRESH n_itab. REFRESH s_itab.
  CLEAR wa1. CLEAR count.
*------------------------------------------------
* itab is built sorted!
  DO nn TIMES.
    count = count + 1.
    wa1-keyfield
The explanation of the parameters, their effect and their proper setting is discussed step by step below.

3. The Effect of the Parameters

3.a. Variation of the Size n_max

The runtime of an operation on an internal table depends on the number of lines in the table, the machine power, the table width and other factors. So if we measure only one fixed table of size n, for example n = 1000, then we do not learn much. We are mainly interested in the dependence of the runtime on the size n of the internal table, while all other parameters are kept unchanged. Therefore a variation of n is included in the test program. The effect can be seen by running the test program with n_max = 10, pre-read off, i_max = 1, s_max = 1, l_max = 1, i.e. in the default setting. For simplicity, 10 values have been predefined to cover the range from 10 to 10,000. Execute the tests several times to check whether this setting already leads to reliable data or not. The results are shown as black lines in figures 1 and 2. It is obvious that there is a lot of variation between the measurements. Also, the dependence on the table size is far from what we expect. So before we draw wrong conclusions, let us check whether the measurement can be improved any further.

3.b. Pre-read: The Cost of Initial Reads

The strangest effect in these first measurements is the rather strong increase of the runtime with the number of lines in the table: for the hashed table we expect no increase, and for the sorted table maybe a small one. It seems that a first read needs much more time than the subsequent ones, which are therefore a better measure for our needs. For this reason a pre-read was added, i.e. 20 reads on the table are executed before the actual measurement is done. The effect can be seen by running the test program with n_max = 10, pre-read on, i_max = 1, s_max = 1, l_max = 1. The results are shown as orange lines in figures 1 and 2.
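The pre-read idea translates directly into other languages: perform a few unmeasured accesses first, so that one-time costs of the very first touch stay out of the measured window. A Python sketch with invented names; the 20 warm-up reads mirror the program above:

```python
import time

table = {i: str(i) for i in range(10_000)}   # stand-in for the internal table

for _ in range(20):                  # pre-read: deliberately NOT timed
    _ = table[5000]

start = time.perf_counter_ns()
value = table[5000]                  # the measured read now hits warm structures
stop = time.perf_counter_ns()
```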
The results are now much smaller than those of the first test, but there is still a lot of variation between the measurements.

3.c. Time Resolution: Repeated Execution i_max

The measurements are now in the range of a few microseconds and therefore extremely close to the time resolution of GET RUN TIME, which is one microsecond. So measuring one execution is not reliable; the operation must be repeated several times to get runtimes in the range of 50 or more microseconds. This can be done by adding a DO ... ENDDO loop. The cost of the empty DO ... ENDDO must be deducted from the measurement.

GET RUN TIME FIELD start.
DO i_max TIMES.
  READ TABLE sort1 WITH TABLE KEY key1 = k1 key2 = k2 INTO wa1.
ENDDO.
GET RUN TIME FIELD stop.

The effect can be seen by running the test program with n_max = 10, pre-read on, i_max = 1000, s_max = 1, l_max = 1. Note that i_max was increased until the results no longer changed. The results are shown in figures 1 and 2 as green lines. They are much smaller than the previous results; therefore the detail view was added. There is still a bit of variation in the results.

3.d. Repeats for Better Statistics s_max

To reduce the variation of the results even further, it helps to repeat the measurements several times. It can be assumed that the variations are caused by some uncontrollable effects which have only a negative impact, i.e. they can make the execution slower but not faster. Therefore we do not average over the different executions but use the fastest execution out of several measurements. This can be done by running the test program with s_max larger than 1, i.e. with n_max = 10, pre-read on, i_max = 1000, s_max = 20, l_max = 1. The results are shown in figures 1 and 2 as blue lines. The variation decreases again.

3.e. Location Dependence l_max

It is obvious that an operation like the sequential read, which scans the table from start to end, will find an entry at the beginning faster than one at the end. In this case it is also obvious that the runtime for an entry in the middle of the table equals the averaged runtime. However, in the case of a read using a binary search, it is not clear which line would represent the averaged runtime. In general it is much better to average over the runtimes of several reads accessing different parts of the table. This can be done by running the test program with l_max larger than 1, i.e. with n_max = 10, pre-read on, i_max = 1000, s_max = 20 and l_max = 20. The program accesses l_max different lines equidistantly distributed over the whole table. The results are shown in figures 1 and 2 as red lines.
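Putting 3.b through 3.e together: warm up, repeat the operation i_max times, deduct the cost of an empty loop, keep the fastest of s_max runs, and average over l_max probe locations. The following Python sketch shows the combined recipe on a plain dict; every name is invented, and it is a methodological illustration, not a port of the ABAP program:

```python
import time

def measure_lookup_us(table, keys, i_max=1000, s_max=20):
    """Best-of-s_max runtime per lookup in microseconds, averaged over the probe keys."""
    per_key = []
    for key in keys:                        # l_max locations spread over the table
        _ = table[key]                      # pre-read: keep first-touch costs out
        best = float("inf")
        for _ in range(s_max):              # keep the fastest run: noise only slows us down
            start = time.perf_counter_ns()
            for _ in range(i_max):          # repeat to rise above the timer resolution
                _ = table[key]
            stop = time.perf_counter_ns()
            e_start = time.perf_counter_ns()
            for _ in range(i_max):          # ... and deduct the empty loop's own cost
                pass
            e_stop = time.perf_counter_ns()
            best = min(best, (stop - start) - (e_stop - e_start))
        per_key.append(best / i_max / 1000.0)
    return sum(per_key) / len(per_key)

table = {i: i for i in range(10_000)}
probes = list(range(500, 10_000, 500))      # equidistant locations, like l_inc in the report
t_us = measure_lookup_us(table, probes)     # typically a small fraction of a microsecond
```

Taking the minimum over repeats rather than the mean follows the reasoning of 3.d: interference can only slow a run down, so the fastest run is the cleanest estimate.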
These are the best results and resemble our expectations quite well.

4. Results

Figure 1 and Detail: Averaged runtime (in microseconds) of the hashed read with table key for different table sizes N, according to the method explained above. The different colors display different settings of the parameter values (n_max, pre-read, i_max, s_max, l_max). The black lines have only the table size variation (10, off, 1, 1, 1), the orange lines add a pre-read before the measurements (10, on, 1, 1, 1), the green lines an internal repeat because of the restricted time resolution (10, on, 1000, 1, 1), the blue lines also the statistical repeats (10, on, 1000, 20, 1), and the red lines also a location variation (10, on, 1000, 20, 20).
Figure 2 and Detail: Averaged runtime (in microseconds) of the sorted read with table key for different table sizes N, according to the method explained above. The different colors display different settings of the parameter values (n_max, pre-read, i_max, s_max, l_max). The black lines have only the table size variation (10, off, 1, 1, 1), the orange lines add a pre-read before the measurements (10, on, 1, 1, 1), the green lines an internal repeat because of the restricted time resolution (10, on, 1000, 1, 1), the blue lines also the statistical repeats (10, on, 1000, 20, 1), and the red lines also a location variation (10, on, 1000, 20, 20).
Measuring operations on internal tables seems in principle very simple. However, to get really reliable data, a bit more effort must be put into the measurement. How this should be done was discussed here.

Further Reading: Performance-Optimierung von ABAP-Programmen (in German!)

More information on performance topics can be found in my new textbook on performance (published Nov 2009). Please note, however, that it is currently only available in German.

Chapter Overview:
1. Introduction
2. Performance Tools
3. Database Know-How
4. Optimal Database Programming
5. Buffers
6. ABAP - Internal Tables
7. Analysis and Optimization
8. Programs and Processes
9. Further Topics
10. Appendix

In the book you will find detailed descriptions of all relevant performance tools. An introduction to database processing, indexes, optimizers etc. is also given. Many database statements are discussed and different alternatives are compared. The resulting recommendations are supported by ABAP test programs which you can download from the publisher's webpage (see below). The importance of the buffers in the SAP system is discussed in chapter five. Of all ABAP statements, mainly the usage of internal tables is important for good performance. With all the presented knowledge you will be able to analyze your programs and optimize them. The performance implications of further topics, such as modularization, work processes, remote function calls (RFCs), locks and enqueues, update tasks and parallelization, are explained in the eighth chapter. Even more information - including the test programs - can be found on the webpage of the publisher. I would especially recommend the examples for the different database statements. The file with the test programs (K4a) and the overview with the input numbers (K4b) can be used even if you do not speak German!
Comments:

Rui Pedro Dantas (Nov 9, 2007): Very good posts, both this and the previous one you linked to. I have one doubt that maybe you already answered somewhere else: why did you prefer to use GET RUN TIME in the code instead of measuring with SE30? Was it not reliable?

Siegfried Boes (Nov 9, 2007, in response to Rui Pedro Dantas): Thank you for your interest. I took GET RUN TIME for two reasons: the SE30 has more overhead than GET RUN TIME, and for usability - how should I extract the results of the different measurements from the SE30 and display them in the result list? Siegfried

Sreenivas Mamidi (Nov 11, 2009): Hi, your blogs are extremely useful. Thank you very much for the clear explanations. Cheers, Sreenivas.