Standard DataStore Objects:
- Data provided using a data transfer process
- SID values can be generated
- Data records with the same key are aggregated during activation
- Data is available for reporting after activation
The BAPI BAPI_ODSO_READ_DATA_UC for reading data enables you to make DataStore data available to external systems.

A standard DataStore object consists of three tables:
- Activation queue: Used to save DataStore object data records that need to be updated but have not yet been activated. After activation, this data is deleted once all requests in the activation queue have been activated. See: Example of Activating and Updating Data.
- Active data: A table containing the active data (A table).
- Change log: Contains the change history for the delta update from the DataStore object into other data targets, such as DataStore objects or InfoCubes.

The table of active data is built according to the DataStore object definition; that is, key fields and data fields are specified when the DataStore object is defined. The activation queue and the change log are almost identical in structure: the activation queue has an SID, the package ID, and the record number as its key; the change log has the request ID, the package ID, and the record number as its key.
Data can be loaded from several source systems at the same time because a queuing mechanism enables a parallel INSERT. The key allows records to be labeled consistently in the activation queue.
The data arrives in the change log from the activation queue and is written to the table of active data upon activation. During activation, the requests are sorted according to their logical keys. This ensures that the data is updated to the table of active data in the correct request sequence.
...
1. Request 1 with amount 10 and request 2 with amount 30 are loaded in parallel into the DataStore object. They are written to the activation queue, where each is given a unique request ID.
2. When you carry out the activation step, the requests are sorted by key, transferred into the table containing the active data, and immediately deleted from the activation queue. In the table containing the active data, the amount 10 is replaced by 30 (since Overwrite is set as the update type).
3. When you activate the data, the changes are also written to the change log: the old record from the active table is saved as a negative (-10) and the new record is stored as a positive (+30).
4. Once all the records are activated, you can update the changes to the data records for the DataStore object in the related InfoProviders in a separate step. The amount in this example is increased in the related InfoProviders by 20.
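The steps above can be illustrated with a minimal Python sketch (this is not SAP code; the table names, the document key "doc-4711", and the list-based change log are illustrative assumptions). It shows overwrite semantics plus the before/after images that make the +20 delta possible:

```python
# Simulated standard DataStore object: active table keyed by logical key,
# change log holding before images (negated) and after images.
active_table = {}   # logical key -> amount
change_log = []     # (key, amount) pairs

def activate(activation_queue):
    """Apply queued requests to the active table in request-ID order."""
    for request_id, key, amount in sorted(activation_queue):
        if key in active_table:
            # Before image: the old value is recorded as a negative.
            change_log.append((key, -active_table[key]))
        # After image: the new value is recorded as a positive.
        change_log.append((key, amount))
        active_table[key] = amount  # "Overwrite" update type
    activation_queue.clear()        # queue is emptied after activation

# Request 1 loads amount 10, request 2 loads amount 30 for the same key.
queue = [(1, "doc-4711", 10), (2, "doc-4711", 30)]
activate(queue)

assert active_table == {"doc-4711": 30}
# Change log: +10 (request 1), then -10/+30 (request 2) -> net +20 on top of +10.
assert change_log == [("doc-4711", 10), ("doc-4711", -10), ("doc-4711", 30)]
```

Summing the change-log amounts reproduces the active value (30), which is why connected InfoProviders can be updated purely from the change log.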
1. Loading the data into the BI system and storing it in the PSA

The data requested by the BI system is stored initially in the PSA. A PSA is created for each DataSource and each source system. The PSA is the storage location for incoming data in the BI system. Requested data is saved unchanged, exactly as it was delivered by the source system.
2. Processing and storing the data in DataStore objects
In the second step, the DataStore objects are used on two different levels.

a. On level one, the data from multiple source systems is stored in DataStore objects. Transformation rules permit you to store the consolidated and cleansed data in the technical format of the BI system. On level one, the data is stored on the document level (for example, orders and deliveries) and constitutes the consolidated database for further processing in the BI system. Data analysis is therefore not usually performed on the DataStore objects at this level.

b. On level two, transfer rules subsequently combine the data from several DataStore objects into a single DataStore object in accordance with business-related criteria. The data is very detailed; for example, information such as the delivery quantity, the delivery delay in days, and the order status is calculated and stored per order item. Level two is used specifically for operative analysis issues, for example, determining which orders from the last week are still open. Unlike multidimensional analysis, where very large quantities of data are selected, here data is displayed and analyzed selectively.
3. Storing data in the InfoCube
In the final step, the data is aggregated from the DataStore object on level two into an InfoCube. In this scenario, this means that the InfoCube does not contain the order number but saves the data, for example, on the levels of customer, product, and month. Multidimensional analysis is performed on this data using a BEx query. You can still display the detailed document data from the DataStore object whenever you need to: use the report/report interface from a BEx query. This allows you to analyze the aggregated data from the InfoCube and to target the specific level of detail you want to access in the data.
Write-Optimized DataStore Objects:
- Consists of the table of active data only
- Data from a data transfer process

Details:
Use
Data that is loaded into write-optimized DataStore objects is available immediately for further processing. Write-optimized DataStore objects can be used in the following scenarios:
- You use a write-optimized DataStore object as a temporary storage area for large sets of data if you are executing complex transformations for this data before it is written to the DataStore object. The data can then be updated to further (smaller) InfoProviders. You only have to create the complex transformations once for all data.
- You use write-optimized DataStore objects as the EDW layer for saving data. Business rules are only applied when the data is updated to additional InfoProviders.
The system does not generate SIDs for write-optimized DataStore objects and you do not need to activate them. This means that you can save and further process data quickly. Reporting is possible on the basis of these DataStore objects. However, we recommend that you use them as a consolidation layer, and update the data to additional InfoProviders, standard DataStore objects, or InfoCubes.
Structure
Since the write-optimized DataStore object consists only of the table of active data, you do not have to activate the data, as is necessary with the standard DataStore object. This means that you can process data more quickly.

The loaded data is not aggregated; the history of the data is retained. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object. The record mode responsible for aggregation remains, however, so that the aggregation of data can take place later in standard DataStore objects.

The system generates a unique technical key for the write-optimized DataStore object. The standard key fields are not necessary with this type of DataStore object. If standard key fields exist anyway, they are called semantic keys so that they can be distinguished from the technical key. The technical key consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID), and the Data Record Number field (0RECORD). Only new data records are loaded to this key.

You can specify that you do not want to run a check to ensure that the data is unique. If you do not check the uniqueness of the data, the DataStore object table may contain several records with the same key. If you do not set this indicator, that is, if you do check the uniqueness of the data, the system generates a unique index on the semantic key of the DataStore object. This index has the technical name "KEY".

Since write-optimized DataStore objects do not have a change log, the system does not create deltas in the sense of a before image and an after image. When you update data into the connected InfoProviders, the system only updates the requests that have not yet been posted.

Use in BEx Queries

For performance reasons, SID values are not created for the characteristics that are loaded. The data is still available for BEx queries. However, in comparison to standard DataStore objects, you can expect slightly worse performance because the SID values have to be created during reporting. If you want to use write-optimized DataStore objects in BEx queries, we recommend that they have a semantic key and that you run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, you may experience unexpected results when the data is aggregated in the query.

DataStore Data and External Applications

The BAPI BAPI_ODSO_READ_DATA_UC for reading data enables you to make DataStore data available to external systems.
In the previous release, BAPI BAPI_ODSO_READ_DATA was used for this. It is now obsolete.
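The insert-only behavior of the technical key described above can be sketched in a few lines of Python (a hypothetical illustration, not SAP code; the request GUID "REQ_A" and the order rows are invented). Two records with the same semantic key are both kept, because each receives its own (request, package, record) technical key:

```python
import itertools

table = []                       # the single table of active data
record_counter = itertools.count(1)

def load(request_guid, package_id, rows):
    """Append rows under a generated technical key; never aggregate."""
    for semantic_key, value in rows:
        technical_key = (request_guid, package_id, next(record_counter))
        table.append((technical_key, semantic_key, value))

# Same semantic key ("order-1") arrives twice in one request.
load("REQ_A", 1, [("order-1", 100), ("order-1", 120)])

assert len(table) == 2                            # both versions retained
assert len({tech for tech, _, _ in table}) == 2   # technical keys are unique
```

Because nothing is overwritten, the history of the data survives at request level, which is exactly why no activation step is needed.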
Example:
1. Loading the data into the BI system and storing it in the PSA

At first, the data requested by the BI system is stored in the PSA. A PSA is created for each DataSource and each source system. The PSA is the storage location for incoming data in the BI system. Requested data is saved unchanged, exactly as it was delivered by the source system.
2. Processing and storing the data in DataStore objects
In the second step, the data is posted at the document level to a write-optimized DataStore object (the pass-through). From here, the data is posted to another write-optimized DataStore object, known as the corporate memory. The data is then distributed from the pass-through to three standard DataStore objects, one for each region in this example. The data records are deleted after posting.
3. Storing data in InfoCubes
In the final step, the data is aggregated from the DataStore objects to various InfoCubes depending on the purpose of the query, for example for different distribution channels. Modeling the various partitions individually means that they can be transformed, loaded and deleted flexibly.
DataStore Objects for Direct Update:
- Consists of the table of active data only
- Data from APIs

Details:
Structure
The DataStore object for direct update consists of a table for active data only. It retrieves its data from external systems via fill or delete APIs. The following APIs exist:
- RSDRI_ODSO_INSERT: inserts new data (with keys not yet in the system)
- RSDRI_ODSO_INSERT_RFC: as above, can be called remotely
- RSDRI_ODSO_MODIFY: inserts data with new keys; for data with keys already in the system, the data is changed
- RSDRI_ODSO_MODIFY_RFC: as above, can be called remotely
- RSDRI_ODSO_UPDATE: changes data with keys already in the system
- RSDRI_ODSO_UPDATE_RFC: as above, can be called remotely
- RSDRI_ODSO_DELETE_RFC: deletes data
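The differing key semantics of insert, modify, and update can be sketched as follows (a hypothetical Python illustration of the behavior described above, not the actual RSDRI_ODSO_* function module interfaces; the function and key names are invented):

```python
def odso_insert(table, key, data):
    """Insert only: fails if the key already exists."""
    if key in table:
        raise KeyError(f"key {key!r} already exists")
    table[key] = data

def odso_update(table, key, data):
    """Update only: fails if the key does not exist."""
    if key not in table:
        raise KeyError(f"key {key!r} not found")
    table[key] = data

def odso_modify(table, key, data):
    """Upsert: inserts new keys, changes existing ones."""
    table[key] = data

active = {}
odso_insert(active, "k1", {"amount": 10})
odso_modify(active, "k1", {"amount": 20})   # existing key -> changed
odso_modify(active, "k2", {"amount": 5})    # new key -> inserted
odso_update(active, "k2", {"amount": 7})

assert active == {"k1": {"amount": 20}, "k2": {"amount": 7}}
```

The _RFC variants differ only in being remotely callable; the key semantics are the same.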
The loading process is not supported by the BI system. The advantage of this structure is that the data is easy to access: data is made available for analysis and reporting immediately after it is loaded.
Creating a DataStore Object for Direct Update

When you create a DataStore object, you can change the DataStore object type under Settings in the context menu. The default setting is Standard. You can only switch between the DataStore object types Standard and Direct Update if data does not yet exist in the DataStore object.
Integration
Since you cannot use the loading process to fill DataStore objects for direct update with BI data (DataSources do not provide the data), these DataStore objects are not displayed in the administration or in the monitor. However, you can update the data in DataStore objects for direct update to additional InfoProviders. If you switch a standard DataStore object that already has update rules to direct update, the update rules are set to inactive and can no longer be processed. Since a change log is not generated, you cannot perform a delta update to the InfoProviders at the end of this process. The DataStore object for direct update is available as an InfoProvider in BEx Query Designer and can be used for analysis purposes.

Example:
DataStore objects for direct update ensure that the data is available quickly. The data from this kind of DataStore object is accessed transactionally: the data is written to the DataStore object (possibly by several users at the same time) and reread as soon as possible. It is not a replacement for the standard DataStore object, but an additional function that can be used in special application contexts. The DataStore object for direct update consists of a table for active data only. It retrieves its data from external systems via fill or delete APIs. See DataStore Data and External Applications. The loading process is not supported by the BI system. The advantage of its structure is the faster availability of the data: data is made available for analysis and reporting immediately after it is loaded.
Delete by Request
Use
This function allows you to delete both inactive and active requests from DataStore objects. It enables you to delete incorrect requests, which is useful because the system usually only recognizes errors in the data or in the update rules after the request has been activated. The request is deleted both from the table of active data and from the change log.
Integration
An error message appears if the request has already been updated into additional InfoProviders. In this case, you first have to delete the request from the data targets into which it was updated. See also Deleting from Already Updated Data. Afterwards, you have to manually reset the data mart status in the DataStore object. Then you can delete the request, and you can load further deltas afterwards. If you do not reset the data mart status, the delta update in the connected InfoProvider is deactivated when the deletion is performed.
Features
You can only delete directly those requests that have not yet been activated. The system uses a rollback for requests that have already been activated. With a rollback, the system reverts to the status of the DataStore object before you updated the incorrect request. This means that all requests that were updated after the incorrect request are also deleted. You can repost requests that are still available in the PSA afterwards. Deletion can be processed in parallel on a package-by-package basis: the packages in a request are processed in parallel, which is possible because each package has a unique key. Deletion is always performed serially for requests that have been loaded and activated using a DTP for real-time data acquisition.
If you post three requests and want to delete the middle one, the final request is also deleted. It is also possible to combine three requests into one single request in the DataStore object when you activate. Requests 1, 2, and 3 in the source DataStore object correspond to request 1 in the target DataStore object. If you want to delete request 3 from the source DataStore object, you have to delete request 1 from the target DataStore object because the three requests are combined in this one. Subsequently you also have to delete all three requests from the source DataStore object in order to remain consistent.
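The rollback behavior can be illustrated with a small Python sketch (hypothetical, not SAP code; keeping a full table snapshot per request is a simplification of how the change log enables the rollback). Deleting the incorrect request 2 also discards request 3 and restores the state as it was after request 1:

```python
snapshots = {}   # request ID -> copy of the active table after activation
active = {}

def activate_request(req_id, rows):
    """Apply a request and remember the resulting table state."""
    active.update(rows)
    snapshots[req_id] = dict(active)

def rollback(bad_req):
    """Delete the incorrect request and every request posted after it."""
    for req in sorted(r for r in snapshots if r >= bad_req):
        del snapshots[req]
    active.clear()
    if snapshots:  # restore the latest surviving state
        active.update(snapshots[max(snapshots)])

activate_request(1, {"doc-1": 10})
activate_request(2, {"doc-1": 99})   # incorrect load
activate_request(3, {"doc-2": 5})

rollback(2)
assert active == {"doc-1": 10}       # state before the incorrect request
assert sorted(snapshots) == [1]      # requests 2 and 3 are both gone
```

This is why requests available in the PSA can simply be reposted afterwards: the rollback guarantees a clean pre-error state.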
Procedure
...
1. In the administration of the InfoProvider, choose the tab page Requests. For the request to be deleted, choose Data Mart Status for the Data Target. The dialog box that follows displays the request that was updated into additional data targets. Keep the description of this request in mind.
2. Choose Monitor in this dialog box. You arrive at the monitor for this request.
3. Choose Manage InfoProviders. You arrive at the administration for the connected InfoProvider.
4. Delete the respective request.
5. Return to the source DataStore object administration screen.
6. Reset the delta administration. To do this, choose Data Mart Status for the InfoProvider and, in the subsequent dialog box, choose Reset Delta Administration.
7. Now you can also delete the respective request from the source DataStore object. You can now load the data again, as a full or delta upload.
Features
The system determines the SIDs before the process of activating the data starts. The process of activating the data only begins after the SIDs have been determined.
If the activation process terminates while the SIDs are being determined, the data remains inactive and stays in the activation queue.
When the data is activated, it is written to the table of active data, where it is then available for reporting. Requests are sorted by the key of the DataStore object, the request ID, the data package ID, and the data record number. This ensures that the data is updated to the table of active data in the correct request sequence. During an activation session, packages (from several DTP requests) are created that can be activated in parallel. Only one activation session can run at a time; when one activation session ends, the next one in the sequence is triggered. This is, however, only relevant when data is activated automatically. When you activate data manually, the pushbutton that you use to trigger the process disappears from the toolbar and is available again only after the current activation run is complete.
If an activation process is canceled, you cannot activate any subsequent requests. You have to keep repeating the activation process that was canceled until it is completed successfully.
If you set this indicator, when the activation is complete a request is displayed in the change log for each of the requests that has been loaded. This means you can delete requests individually to restore a previous status of the DataStore object. However, when you update to another InfoProvider, all requests that are active but have not yet been updated are combined into one single request.
If you want to update requests to connected InfoProviders individually, you have to update the requests immediately after you have activated them. You can do this using process chains.
If you do not set this indicator, all the requests activated in this process are compressed into one change log request. Only this request can be rolled back fully from the DataStore object.
Settings for parallel processing:
By choosing Extras PSA -> Parallel DTA, you can set the maximum number of parallel processes for the update from a 3.x DataSource of this InfoPackage to the defined data targets. For determining SIDs, as well as for activating requests, processing is set by default to run in three parallel processes. You can change this setting; if you set it to 1, processing is serial. Processing is controlled by BI background management. More information: Setting Parallel Processing of BI Processes.
This blog describes a new DataStore object in BI 7.0, the write-optimized DataStore, which supports history tracking at the most detailed level, retains the document status, and offers a faster upload without activation.
In a database system, read operations are much more common than write operations, and consequently most database systems have been read-optimized. As the size of main memory increases, more of the database read requests are satisfied from the buffer system, and the proportion of disk writes relative to total disk operations increases. This has turned the focus to write-optimized database systems. In SAP Business Warehouse, it is necessary to activate the data loaded into a DataStore object to make it visible for reporting or to update it to further InfoProviders. As of SAP NetWeaver 2004s, a new type of DataStore object was introduced: the write-optimized DataStore object. The objective of this new DataStore object is to save data as efficiently as possible so that it can be processed further without any activation, without the additional effort of generating SIDs, and without aggregation or data-record-based delta handling. It is a staging DataStore object used for a faster upload.

In BI 7.0, three types of DataStore objects exist:
1. Standard DataStore object (regular ODS)
2. DataStore object for direct update (APD ODS)
3. Write-optimized DataStore object (new)
In this weblog, I would like to focus on the features, usage, and advantages of the write-optimized DataStore object. The write-optimized DSO has been primarily designed to be the initial staging area for source system data, from where the data can be transferred to a standard DSO or an InfoCube.

- The data is saved in the write-optimized DataStore object quickly. Data is stored in its most granular form. Document headers and items are extracted using a DataSource and stored in the DataStore object.
- The data is then immediately written to the further data targets in the architected data mart layer for optimized multidimensional analysis.

The key benefit of using a write-optimized DataStore object is that the data is immediately available for further processing in the active version: you save activation time across the landscape. The system does not generate SIDs for write-optimized DataStore objects, to achieve a faster upload. Reporting is also possible on the basis of these DataStore objects. However, SAP recommends using the write-optimized DataStore object as an EDW inbound layer and updating the data into further targets such as standard DataStore objects or InfoCubes.

Fast EDW inbound layer - an introduction

Data warehousing has developed into an advanced and complex technology. For some time it was assumed that it is sufficient to store data in a star schema optimized for reporting. However, this does not adequately meet the needs of consistency and flexibility in the long run. Therefore, data warehouses are structured using a layer architecture comprising an enterprise data warehouse layer and an architected data mart layer. These different layers contain data at different levels of granularity, as shown in Figure 1.
Figure 1: The Enterprise Data Warehouse Layer is a corporate information repository.

The benefits of the Enterprise Data Warehouse Layer include the following:
- Reliability and traceability - prevent silos
  - "Single point of truth".
  - All data has to pass this layer on its path from the source to the summarized, EDW-managed data marts.
- Controlled extraction and data staging (transformations, cleansing)
  - Data is extracted only once and deployed many times.
  - Merging of data that is commonly used together.
- Flexibility, reusability, and completeness
  - The data is not manipulated to suit specific project scopes.
  - Coverage of unexpected ad hoc requirements.
  - The data is not aggregated.
  - Normally not used for reporting; used once for staging, cleansing, and transformation.
  - Old versions, such as document statuses, are not overwritten or changed, but useful information may be added.
  - Historical completeness: different levels of completeness are possible, from availability of the latest version with change date to a change history of all versions including the extraction history.
  - Modeled using write-optimized DataStore objects or standard DataStore objects.
- Integration
  - Data is integrated.
  - Realization of the corporate data integration strategy.

Architected data marts serve as the analysis and reporting layer, hold aggregated data with business logic applied, and can be modeled using InfoCubes or MultiCubes.

When is it recommended to use a write-optimized DataStore object?

Here are the scenarios for the write-optimized DataStore object (as shown in Figure 2):

- Fast EDW inbound layer. SAP recommends the write-optimized DSO as the first layer, the enterprise data warehouse layer. As not all Business Content comes with this DSO layer, you may need to build your own. You can check table RSDODSO for version D and type "write-optimized".
- There is always the need for faster data loads. DSOs can be configured to be write-optimized, so the data load happens faster and the load window is shorter. Use it where fast loads are essential, for example multiple loads per day or short source system access times (worldwide system landscapes).
- If the DataSource is not delta-enabled. In this case, you would want a write-optimized DataStore object to be the first stage in BI and then pull the delta request to a cube.
- As a temporary storage area for large sets of data when executing complex transformations for this data before it is written to the DataStore object. Subsequently, the data can be updated to further InfoProviders. You only have to create the complex transformations once for all incoming data.
- As the staging layer for saving data. Business rules are only applied when the data is updated to additional InfoProviders.
- If you want to retain history at the request level. In this case you may not need a PSA archive; instead you can use a write-optimized DataStore object.
- If multidimensional analysis is not required and you want operational reports, you might use a write-optimized DataStore object first and then feed the data into a standard DataStore object.
- As a preliminary landing area for incoming data from different sources.
- If you want to report on daily refreshed data without activation. In this case it can be used in the reporting layer with an InfoSet or MultiProvider.

I have discussed possible scenarios, but I ask you to decide where this DataStore object fits in your data flow.

Typical data flow using a write-optimized DataStore object
Figure 2: Typical data flow using a write-optimized DataStore object.

Functionality of the write-optimized DataStore object (as shown in Figure 3):

- Only an active data table (DSO key: request ID, packet number, and record number):
  - No change log table and no activation queue.
  - The size of the DataStore object is maintainable.
  - The technical key is unique.
  - Every record has a new technical key; there are only inserts.
  - Data is stored at the request level, as in a PSA table.
- No SID generation:
  - Reporting is possible (but you need to make sure performance is optimized).
  - The BEx Reporting flag is switched off.
  - Can be included in an InfoSet or MultiProvider.
  - Performance improvement during data load.
- Fully integrated in the data flow:
  - Used as a data source and a data target.
  - Export into InfoProviders via request delta.
- Uniqueness of data:
  - Checkbox "Do not check uniqueness of data".
  - If this indicator is set, the active table of the DataStore object can contain several records with the same key.
- Allows parallel loads.
- Can be included in a process chain without an activation step.
- Supports archiving.

You cannot use reclustering for write-optimized DataStore objects, since this DataStore data is not meant for querying. You can only use reclustering for standard DataStore objects and DataStore objects for direct update.

The PSA and the write-optimized DSO are different entities in the data flow, as each has its own features and usage. The write-optimized DSO will not replace the PSA in a data flow, but it allows you to stage or store the data without activation and to apply business rules.

The write-optimized DataStore object is automatically partitioned. Manual partitioning can be done according to SAP Notes 565725 and 742243. Optimized write performance is achieved by request-level insertions, similar to the F table in an InfoCube: the F fact table is write-optimized, while the E fact table is read-optimized.
Figure 3: Overview of the various DataStore object types in BI 7.0.

To define a write-optimized DataStore object, just change the Type of DataStore Object to Write-Optimized, as shown in Figure 4.
Figure 4: Technical settings for the write-optimized DataStore object.

Understanding write-optimized DataStore keys:

Since data is written directly into the active table of the write-optimized DataStore object, you do not need to activate the request, as is necessary with the standard DataStore object. The loaded data is not aggregated; the history of the data is retained at the request level. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object. The record mode responsible for aggregation remains, however, so the aggregation of data can take place later in standard DataStore objects. The system generates a unique technical key for the write-optimized DataStore object. The technical key consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID), and the Data Record Number field (0RECORD), as shown in Figure 4. Only new data records are loaded to this key. The standard key fields are not necessary with this type of DataStore object, and you can define a write-optimized DataStore object without a standard key. If standard key fields exist anyway, they are called semantic keys so that they can be distinguished from the technical key.
Semantic keys can be defined as primary keys in a further target DataStore object, depending on the requirement. For example, if you are loading data into a schedule-line-level ODS through a write-optimized DSO, you can have header, item, and scl as the semantic keys in your write-optimized DSO. The purpose of the semantic key is to identify errors or duplicates in the incoming records. All subsequent data records with the same key are written to the error stack along with the incorrect data records; they are not updated to the data targets. A maximum of 16 key fields and 749 data fields are permitted. Semantic keys protect the data quality, and they do not appear as keys at the database level. In order to process error records or duplicate records, you must define a semantic group in the data transfer process (DTP), which is used to define a key for the evaluation, as shown in Figure 5. If you assume that there are no incoming duplicates or error records, there is no need to define a semantic group; it is not mandatory. The semantic key determines which records should be detained during processing. For example, if you define order number and item as the key, and you have one erroneous record with order number 123456, item 7, then any other records received in that same request or subsequent requests with order number 123456, item 7 will also be detained. This is applicable for duplicate records as well.
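The detention rule just described can be sketched in Python (a hypothetical illustration of the behavior, not the DTP implementation; the record layout and the `ok` flag standing in for "passed validation" are assumptions):

```python
error_stack = set()   # semantic keys currently held in the error stack

def route(records):
    """Split records into those posted to the target and those detained."""
    posted, detained = [], []
    for order, item, ok in records:
        key = (order, item)
        if not ok:
            error_stack.add(key)       # erroneous record lands in the stack
        if key in error_stack:
            detained.append((order, item))   # same key -> detained too
        else:
            posted.append((order, item))
    return posted, detained

batch = [(123456, 7, False),   # erroneous record -> error stack
         (123456, 7, True),    # same semantic key -> detained as well
         (123456, 8, True)]    # different item -> posted normally
posted, detained = route(batch)

assert posted == [(123456, 8)]
assert detained == [(123456, 7), (123456, 7)]
```

Detaining all followers with the same semantic key is what preserves the posting order once the error record is corrected and reprocessed.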
Figure 5: Semantic group in the data transfer process.

The semantic key definition integrates the write-optimized DataStore object and the error stack through the semantic group in the DTP, as shown in Figure 5. As of SAP NetWeaver 2004s BI SPS 10, the write-optimized DataStore object is fully connected to the DTP error stack function. If you want to use a write-optimized DataStore object in BEx queries, it is recommended that you define a semantic key and that you run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, unexpected results may be produced when the data is aggregated in the query.

Delta administration:

Data that is loaded into write-optimized DataStore objects is available immediately for further processing; the activation step that was previously necessary is no longer required. Note that the loaded data is not aggregated: if two data records with the same logical key are extracted from the source, both records are saved in the DataStore object, since each record receives its own unique technical key. The record mode (InfoObject 0RECORDMODE, with values space, X, A, D, R) responsible for aggregation remains, however, so the aggregation of data can take place at a later time in standard DataStore objects or InfoCubes. The write-optimized DataStore object does not support an image-based delta (via the record mode); it supports a request-level delta, and you get a brand-new delta request for each data load. When you load from a write-optimized DataStore object, the delta administration is supplied with the load request and not with a change log request. Since write-optimized DataStore objects do not have a change log, the system does not create deltas in the sense of a before image and an after image. When you update data into the connected InfoProviders, the system only updates the requests that have not yet been posted.
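The request-level delta can be sketched as follows (a hypothetical Python illustration, not SAP code; the pointer variable `last_posted` stands in for the data mart status kept per connected target):

```python
source_requests = {1: ["rows of request 1"], 2: ["rows of request 2"]}
last_posted = 0   # delta pointer kept for the connected InfoProvider

def delta_update():
    """Return the requests not yet posted and advance the pointer."""
    global last_posted
    new = {r: rows for r, rows in source_requests.items() if r > last_posted}
    if new:
        last_posted = max(new)
    return new

first = delta_update()
assert sorted(first) == [1, 2]       # initial run takes over all requests

source_requests[3] = ["rows of request 3"]
second = delta_update()
assert sorted(second) == [3]         # only the request loaded since the last run
```

Each delta run therefore hands over whole requests, not record-level before/after images, which is why image-based aggregation must happen later in the standard DataStore object or InfoCube.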
The write-optimized DataStore object supports a request-level delta. In order to capture a before-and-after-image delta, you must post the latest request into further targets such as standard DataStore objects or InfoCubes.

Extraction method: transformations via DTP, or update rules via an InfoSource. Prior to using a DTP, you must migrate the 3.x DataSource to a BI 7.0 DataSource using transaction code RSDS, as shown in Figure 6.
Figure 6 Migration of a 3.x DataSource to a BI 7.0 DataSource using transaction RSDS; the DataSource is then replicated into BI 7.0. After replicating the DataSource into BI 7.0, you create a data transfer process (DTP) to load data into the write-optimized DataStore object. Write-optimized DataStore objects can force a check of the semantic key for uniqueness when data is stored. If this option is active and records that are duplicates with regard to the semantic key are loaded, these are logged in the error stack of the data transfer process (DTP) for further evaluation. In BI 7.0 you have the option to create an error DTP. If any errors occur in the data, the erroneous records are stored in the error stack. You can correct the errors in the stack, and when you schedule the error DTP, the corrected data is loaded to the target. Otherwise, you have to delete the error request from the target and reschedule the DTP. To integrate a write-optimized DataStore object into the error stack, you must define semantic keys in the DataStore definition and create a semantic group in the DTP, as shown in Figure 5. The semantic group definition is necessary for parallel loads to a write-optimized DataStore object; you can update write-optimized DataStore objects in parallel after implementing OSS Note 1007769. When you include a DTP for a write-optimized DataStore object in a process chain, make sure that there is no subsequent activation step for this DataStore object. Alternatively, you can link this DSO through an InfoSource with update rules, using the 3.x functionality. Reporting write-optimized DataStore data: For performance reasons, SID values are not created for the characteristics that are loaded. The data is still available for BEx queries. However, in comparison to standard DataStore objects, you can expect slightly worse query performance, because the SID values have to be created during reporting.
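The semantic-key uniqueness check that routes duplicates to the error stack can be pictured roughly as follows. This is a sketch, not the actual DTP implementation; the field names (REQUEST, DOC_NUMBER) are invented for illustration:

```python
# Sketch of a semantic-key uniqueness check routing duplicates to an
# error stack. Field names are hypothetical, for illustration only.

def route_duplicates(records, semantic_key):
    seen = set()
    accepted, error_stack = [], []
    for rec in records:
        key = tuple(rec[f] for f in semantic_key)
        if key in seen:
            error_stack.append(rec)   # duplicate with regard to the semantic key
        else:
            seen.add(key)
            accepted.append(rec)
    return accepted, error_stack

records = [
    {"REQUEST": 1, "DOC_NUMBER": "4711", "AMOUNT": 10},
    {"REQUEST": 1, "DOC_NUMBER": "4712", "AMOUNT": 20},
    {"REQUEST": 2, "DOC_NUMBER": "4711", "AMOUNT": 30},  # duplicate semantic key
]
ok, errors = route_duplicates(records, ["DOC_NUMBER"])
print(len(ok), len(errors))  # 2 1
```

After the run, the duplicate record would sit in the error stack for correction and a later error-DTP load.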
However, it is recommended that you use write-optimized DataStore objects as a staging layer and update the data to standard DataStore objects or InfoCubes. From an OLAP/BEx query perspective, there is no big difference between a write-optimized DataStore object and a standard DataStore object: the technical key is not visible for reporting, so the look and feel is just like a regular DataStore object. If you want to use a write-optimized DataStore object in BEx queries, it is recommended that it have a semantic key and that you run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, unexpected results may be produced when the data is aggregated in the query. In a nutshell, a write-optimized DSO is not intended for reporting unless otherwise required; it is a staging DataStore used for faster uploads. Direct reporting on this object is possible without activation, but with performance in mind you can use an InfoSet or MultiProvider instead. Conclusion: Using a write-optimized DataStore object, you have a snapshot for each extraction. This data can be used for trending old KPIs or deriving new KPIs at any time, because the data is stored at request level. This most granular data, by calendar day/time, can be used for slice and dice, data mining, root-cause analysis, and behavioral analysis, which helps in better decision making. Moreover, you need not worry about the status of documents extracted into BI, since data is stored as of the extraction date/time. For example, an Order-to-Cash or Spend Analysis life cycle can be monitored in detail to identify the bottlenecks in the process.
Although help documentation on the write-optimized DataStore object is available from SAP, I thought it would be useful to write this blog to give a clear view of the write-optimized DataStore concept and the typical scenarios of where, when, and how to use it; you can customize the data flow and data model as per your reporting or downstream requirements. A more detailed step-by-step technical document will be released soon. Useful OSS notes:
Please check the latest OSS notes and support packages from SAP to overcome any technical difficulties that may occur, and make sure to implement them. OSS 1077308: In a write-optimized DataStore object, 0FISCVARNT is treated as a key, even though it is only a semantic key. OSS 1007769: Parallel updating in write-optimized DataStore objects. OSS 1128082: P17:DSO:DTP: Write-optimized DSO and parallel DTP loading. OSS 966002: Integration of write-optimized DataStore in DTP error stack. OSS 1054065: Archiving support.
The Generation of SID Values flag should not be set if you are using the DataStore object for data storage purposes only. If you do set this flag, SIDs are created for all new characteristic values. If you are using line items (document number or time stamp for example) as characteristics in the DataStore object, set the flag in characteristic maintenance to show that they are Attribute Only.
SID values can be generated and parallelized on activation, irrespective of the settings. More information: Runtime Parameters of DataStore Objects.
Indexing
For queries based on DataStore objects, use selection criteria. If key fields are specified, the existing primary index is used. The more frequently accessed characteristic should appear on the left. If you have not specified the key fields completely in the selection criteria (you can check this in the SQL trace), you can improve the runtime of the query by creating additional indexes. You create these secondary indexes in DataStore object maintenance. However, you should note that load performance is also affected if you have too many secondary indexes.
The saving in runtime is influenced primarily by the SID determination. Other factors that have a favourable influence on runtime are a low number of characteristics and a low number of disjoint characteristic attributes. The percentages shown here are minimum values and can be higher if favourable factors, like the ones mentioned above, apply.
If you use the DataStore object as the consolidation level, we recommend that you use the write-optimized DataStore object. This makes it possible to provide data in the Data Warehouse Layer up to 2.7 times faster than with a standard DataStore object with unique data records. More information: Scenarios for Using Write-Optimized DataStore Objects.
To improve performance when processing data in the DataStore object, you can make a number of settings for each DataStore object in the runtime parameters or maintain general default values for DataStore objects. You navigate to runtime parameter maintenance either in Customizing by choosing SAP Customizing Implementation Guide → SAP NetWeaver → Business Intelligence → Performance Settings → Maintain Runtime Parameters of DataStore Objects, or in the Administration area of the Data Warehousing Workbench by choosing Current Settings → DataStore Objects. You can define the following: Parameters for activation:
The minimum number of data records for each data package when the data in the DataStore object is activated. In this way, you define the size of the data packages that are activated. The maximum wait time in seconds when data in the DataStore object is activated. This is the time that the main process (batch process) waits for the split process before it determines that it has failed.
Minimum number of data records for each data package when SIDs are generated in the DataStore object. Maximum wait time in seconds when SIDs are generated in the DataStore object.
Process parameters:
Type of processing: serial, parallel (batch), parallel (dialog). Maximum number of parallel processes. Server group to be used for parallel processing of data in DataStore objects. You have to create the server groups first using the following path: SAP Easy Access → Tools → Administration → Network → RFC Destination, then RFC → RFC Groups. If you do not specify anything here, processing runs on the server on which the batch activation process was started. If a server in the server group is not active, processing is terminated.
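As a rough illustration of the minimum-package-size parameter, the following sketch cuts a set of records into activation packages that each hold at least the configured minimum. This is a simplification; the real package building also honours the parallel-process settings:

```python
# Sketch: greedily cut packages of at least `min_size` records; the
# remainder is merged into the last package so that no package falls
# below the configured minimum. Simplified illustration only.

def split_into_packages(records, min_size):
    packages = []
    i = 0
    while len(records) - i >= 2 * min_size:
        packages.append(records[i:i + min_size])
        i += min_size
    packages.append(records[i:])
    return packages

pkgs = split_into_packages(list(range(25)), 10)
print([len(p) for p in pkgs])  # [10, 15]
```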
Definition of Clustering
Prerequisites
You can only change clustering if the DataStore object does not contain any data. You can change the clustering of DataStore objects that are already filled using the Reclustering function. For more information, see Reclustering.
Features
In the DataStore maintenance, select Extras → DB Performance → Clustering. You can select MDC dimensions for the DataStore object on the Multidimensional Clustering screen. Select one or more InfoObjects as MDC dimensions and assign them consecutive sequence numbers, beginning with 1. The sequence number shows whether a field has been selected as an MDC dimension and determines the order of the MDC dimensions in the combined block index.
In addition to block indexes for the different MDC dimensions within the database, the system creates the combined block index. The combined block index contains the fields of all the MDC dimensions. The order of the MDC dimensions can slightly affect the performance of table queries that are restricted to all MDC dimensions and those that are used to access the combined block index. When selecting, proceed as follows:
Select InfoObjects that you use to restrict your queries. For example, you can use a time characteristic as an MDC dimension to restrict your queries. Select InfoObjects with a low cardinality. For example, the time characteristic 0CALMONTH instead of 0CALDAY. Assign sequence numbers using the following criteria:
Sort the InfoObjects according to how often they occur in queries (assign the lowest sequence number to the InfoObject that occurs most often in queries). Sort the InfoObjects according to selectivity (assign the lowest sequence number to the InfoObject with the most distinct values).
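The two sorting criteria can be combined into a single ordering, as in this sketch (the InfoObjects and their statistics are invented for illustration):

```python
# Sketch: order candidate MDC dimensions by query frequency first,
# then by selectivity (number of distinct values). The statistics
# below are hypothetical example figures.
candidates = [
    {"iobj": "0REGION",   "query_freq": 40, "distinct_values": 8},
    {"iobj": "0CALMONTH", "query_freq": 90, "distinct_values": 72},
    {"iobj": "0PLANT",    "query_freq": 90, "distinct_values": 15},
]
ordered = sorted(candidates,
                 key=lambda c: (-c["query_freq"], -c["distinct_values"]))
for seq, c in enumerate(ordered, start=1):
    print(seq, c["iobj"])
# 1 0CALMONTH
# 2 0PLANT
# 3 0REGION
```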
Note: At least one block is created for each value combination in the MDC dimensions. This memory area is reserved independently of the number of data records that have the same value combination in the MDC dimensions. If there are not enough data records with the same value combination to completely fill a block, the free memory remains unused. This is so that data records with a different value combination in the MDC dimensions cannot be written to the block. If, for each combination that exists in the DataStore object, only a few data records exist in the selected MDC dimensions, most blocks have unused free memory. This means that the active tables use an unnecessarily large amount of memory space. The performance of table queries also deteriorates, as many pages with little information must be read.
Example
The size of a block depends on the PAGESIZE and the EXTENTSIZE of the tablespace. The standard PAGESIZE of the DataStore tablespace with the assigned data class DODS is 16K. Up to Release SAP BW 3.5, the default EXTENTSIZE value was 16; as of Release SAP NetWeaver 7.0, the new default EXTENTSIZE value is 2. With an EXTENTSIZE of 2 and a PAGESIZE of 16K, a memory area of 2 x 16K = 32K is reserved for each block. The width of a data record depends on the width and number of key fields and data fields in the DataStore object. If, for example, a DataStore object has 10 key fields of 10 bytes each and 30 data fields with an average of 9 bytes each, a data record needs 10 x 10 bytes + 30 x 9 bytes = 370 bytes. A 32K block can therefore hold 32768 bytes / 370 bytes ≈ 88 data records. At least 80 data records should exist for each value combination in the MDC dimensions. This allows optimal use of the memory space in the active table.
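The arithmetic of the example can be reproduced directly:

```python
# Block-size example from the text: default values as of SAP NetWeaver 7.0.
PAGESIZE = 16 * 1024   # 16K pages (tablespace with data class DODS)
EXTENTSIZE = 2         # default extent size as of NetWeaver 7.0
block_bytes = EXTENTSIZE * PAGESIZE            # 32K reserved per block

key_fields, key_width = 10, 10                 # 10 key fields, 10 bytes each
data_fields, data_width = 30, 9                # 30 data fields, avg. 9 bytes
record_bytes = key_fields * key_width + data_fields * data_width  # 370 bytes

records_per_block = block_bytes // record_bytes
print(block_bytes, record_bytes, records_per_block)  # 32768 370 88
```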
Multidimensional Clustering
Use
Multidimensional clustering (MDC) allows you to save the data records in the active table of a DataStore object in sorted order. Data records with the same key field values are saved in the same extents (related database storage units). This prevents data records with the same key values from being spread over a large memory area and thereby reduces the number of extents to be read when accessing the table. Multidimensional clustering can greatly improve the performance of queries on the active table.
Prerequisites
Currently, the function is only supported by the database platform IBM DB2 Universal Database for UNIX and Windows.
Features
Multidimensional clustering organizes the data records of the active table of a DataStore object according to one or more fields of your choice. The selected fields are also referred to as MDC dimensions. Only data records with the same values in the MDC dimensions are saved in an extent. In the context of MDC, an extent is called a block. The system creates block indexes within the database for the selected fields. Block indexes link to extents instead of data record numbers and are therefore much smaller than row-based secondary indexes. They save memory space and can be searched more quickly. This particularly improves the performance of table queries that are restricted to these fields. You can select the key fields of the active table of a DataStore object as MDC dimensions. Multidimensional clustering was introduced with Release SAP NetWeaver 7.0 and can be set up separately for each DataStore object. For the procedure, see Definition of Clustering.
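Conceptually, clustering groups records into blocks by their MDC dimension values, as this simplified sketch shows (field values are illustrative):

```python
from collections import defaultdict

# Sketch: records are clustered into blocks by their MDC dimension
# values; a block index would map each value combination to blocks,
# not to individual rows. Simplified in-memory model.
records = [
    {"0CALMONTH": "2024-01", "0PLANT": "P1", "AMOUNT": 10},
    {"0CALMONTH": "2024-01", "0PLANT": "P1", "AMOUNT": 20},
    {"0CALMONTH": "2024-02", "0PLANT": "P1", "AMOUNT": 30},
]
mdc_dimensions = ("0CALMONTH", "0PLANT")
blocks = defaultdict(list)
for rec in records:
    blocks[tuple(rec[d] for d in mdc_dimensions)].append(rec)
print(len(blocks))  # 2 blocks: one per value combination
```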
Reclustering
Use
Reclustering allows you to change the clustering of InfoCubes and DataStore objects that already contain data. You may need to make a correction if, for example, there are only a few data records for each of the value combinations of the selected MDC dimension and as a result the table uses an excessive amount of memory space. To improve the performance of database queries, you may want to introduce multidimensional clustering for InfoCubes or DataStore objects.
Integration
This function is only available for the database platform DB2 for Linux, UNIX, and Windows. You can use partitioning to improve the performance of other databases. For more information, see Partitioning.
Features
Reclustering InfoCubes
With reclustering, the InfoCube fact tables are always completely converted. The system creates shadow tables with a new clustering schema and copies all of the data from the original tables into the shadow tables. As soon as the data is copied, the system creates indexes and the original table replaces the shadow table. After the reclustering request has been successfully completed, both fact tables exist in their original state (name of shadow table) as well as in their modified state with the new clustering schema (name of original table). You can only use reclustering for InfoCubes. Reclustering deactivates the active aggregates of the InfoCubes; they are reactivated after the conversion.
Reclustering DataStore Objects
Reclustering completely converts the active table of the DataStore object. The system creates a shadow table with a new clustering schema and copies all of the data from the original table into the shadow table. As soon as the data is copied, the system creates indexes and the original table replaces the shadow table. After the reclustering request has been successfully completed, both active tables exist in their original state (name of shadow table) as well as in their modified state with the new clustering schema (name of original table). You can only use reclustering for standard DataStore objects and DataStore objects for direct update. You cannot use reclustering for write-optimized DataStore objects. User-defined multidimensional clustering is not available for write-optimized DataStore objects.
Monitoring
You can monitor the clustering request using a monitor. The monitor shows you the current status of the processing steps. When you double-click, the relevant logs appear. The following functions are available in the context menu of the request or editing step:
Delete: You delete the clustering request. It no longer appears in the monitor and you cannot restart. All tables remain in their current state. This may result in inconsistencies in the InfoCube or DataStore object. Reset Request: You reset the clustering request. This deletes all the locks for the InfoCube and all its shadow tables. Reset Step: You reset the canceled editing steps so that they are reset to their original state. Restart: You restart the clustering request in the background.
Activities
You access reclustering in the Data Warehousing Workbench under Administration or in the context menu of your InfoCube or DataStore object. You can schedule reclustering in the background by choosing Initialize. You can monitor the clustering requests by choosing Monitor.
Partitioning
Use
You use partitioning to split the total dataset for an InfoProvider into several smaller, physically independent, redundancy-free units. This separation improves system performance when you analyze data or delete data from the InfoProvider.
Integration
All database providers except DB2 for Linux, UNIX, and Windows support partitioning. You can use clustering to improve the performance for DB2 for Linux, UNIX, and Windows. If you are using IBM DB2 for i5/OS as the DB platform, you require database version V5R3M0 or higher and an installation of component DB2 Multi System. Note that with this system constellation the BI system with active partitioning can only be copied to other IBM iSeries with an SAVLIB/RSTLIB operation (homogeneous system copy). If you are using this database you can also partition PSA tables. You first have to activate this function using RSADMIN parameter DB4_PSA_PARTITIONING = 'X'. SAP Note 815186 includes more comprehensive information on this.
Prerequisites
You can only partition a dataset using one of the two partitioning criteria calendar month (0CALMONTH) or fiscal year/period (0FISCPER). At least one of the two InfoObjects must be contained in the InfoProvider.
If you want to partition an InfoCube using the fiscal year/period (0FISCPER) characteristic, you have to set the fiscal year variant characteristic to constant. See Partitioning InfoCubes using Characteristic 0FISCPER.
Features
When you activate the InfoProvider, the system creates the table on the database with a number of partitions corresponding to the value range. You can set the value range yourself.
Choose the partitioning criterion 0CALMONTH and determine the value range, for example 01.1998 to 12.2003:
6 years x 12 months + 2 = 74 partitions are created (2 partitions for values that lie outside the range, that is, < 01.1998 or > 12.2003). You can also determine the maximum number of partitions created on the database for this table.
Choose the partitioning criterion 0CALMONTH, determine the value range as before, and additionally limit the number of partitions:
Choose 30 as the maximum number of partitions. The value range yields 6 years x 12 calendar months + 2 marginal partitions (before 01.1998, after 12.2003) = 74 single values. The system therefore groups three months at a time into a partition (so that a partition corresponds to exactly one quarter); in this way, 6 years x 4 partitions/year + 2 marginal partitions = 26 partitions are created on the database. The performance gain is only achieved for the partitioned InfoProvider if the time characteristics of the InfoProvider are consistent. This means that, with partitioning by 0CALMONTH, all values of the 0CALx characteristics of a data record have to match.
In the following example, only record 1 is consistent. Records 2 and 3 are not consistent:
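The two partition counts from the example above (74 without a maximum, 26 with a maximum of 30) can be derived as follows. The grouping logic is a simplified assumption of how the system coarsens the granularity:

```python
# Sketch: derive the number of 0CALMONTH partitions for a value range
# of `years` full years, optionally capped at `max_partitions`.
# The grouping heuristic is an assumption for illustration.

def count_partitions(years, max_partitions=None):
    single = years * 12 + 2          # one per month + 2 marginal partitions
    if max_partitions is None or single <= max_partitions:
        return single
    # Group months so the count fits under the maximum (e.g. quarters).
    months = single - 2
    group = -(-months // (max_partitions - 2))   # ceil division
    return -(-months // group) + 2

print(count_partitions(6))      # 74
print(count_partitions(6, 30))  # 26 (three months per partition)
```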
Note that you can only change the value range when the InfoProvider does not contain data. If data has already been loaded to the InfoProvider, you have to perform repartitioning. For more information, see Repartitioning.
We recommend partitioning on demand: do not create partitions that are too large or too small. If you choose a time period that is too small, the partitions become too large; if you choose a time period that ranges too far into the future, the number of partitions becomes too great. We therefore recommend that you create partitions for a year, for example, and repartition the InfoProvider after that time.
Activities
In InfoProvider maintenance, choose Extras → DB Performance → Partitioning and specify the value range. Where necessary, limit the maximum number of partitions.
Repartitioning
Use
Repartitioning can be useful if you have already loaded data to your InfoCube, and:
You did not partition the InfoCube when you created it. You loaded more data into your InfoCube than you had planned when you partitioned it. You did not choose a long enough period of time for partitioning. Some partitions contain no data or little data due to data archiving over a period of time.
Integration
All database providers support this function except DB2 for Linux, UNIX, and Windows, and MaxDB. For DB2 for Linux, UNIX, and Windows, you can use clustering or reclustering instead. For more information, see Clustering.
Features
Merging and Adding Partitions
When you merge and add partitions, InfoCube partitions are either merged at the bottom end of the partitioning schema (merge), or added at the top (split). Ideally, this operation is only executed for the database catalog. This is the case if all the partitions that you want to merge are empty and no data has been loaded outside of the time period you initially defined. The runtime of the action is only a few minutes. If there is still data in the partitions you want to merge, or if data has been loaded beyond the time period you initially defined, the system saves the data in a shadow table and then copies it back to the original table. The runtime depends on the amount of data to be copied. With InfoCubes for non-cumulatives, all markers are either in the bottom partition or top partition of the E fact table. Whether mass data also has to be copied depends on the editing options. For this reason, the partitions of non-cumulative InfoCubes cannot be merged if all of the markers are in the bottom partition. If all of the markers are in the top partition, adding partitions is not permitted. If this is the case, use the Complete Repartitioning editing option. You can merge and add partitions for aggregates as well as for InfoCubes. Alternatively, you can reactivate all of the aggregates after you have changed the InfoCube. Since this function only changes the DB memory parameters of fact tables, you can continue to use the available aggregates without having to modify them.
We recommend that you completely back up the database before you execute this function. This ensures that, if an error occurs (for example, during a DB catalog operation), you can restore the system to its previous status.
Complete Partitioning
Complete Partitioning fully converts the fact tables of the InfoCube. The system creates shadow tables with the new partitioning schema and copies all of the data from the original tables into the shadow tables. As soon as the data is copied, the system creates indexes and the original table replaces the shadow table. After the system has successfully completed the partitioning request, both fact tables exist in the original state (shadow table), as well as in the modified state with the new partitioning schema (original table). You can manually delete the shadow tables after repartitioning has been successfully completed to free up the memory. Shadow tables have the namespace /BIC/4F<Name of InfoCube> or /BIC/4E<Name of InfoCube>.
You can only use complete repartitioning for InfoCubes. A heterogeneous state is possible: for example, a partitioned InfoCube with non-partitioned aggregates. This does not have an adverse effect on functionality. You can modify all of the active aggregates automatically by reactivating them.
Monitor
You can monitor the repartitioning requests using a monitor. The monitor shows you the current status of the processing steps. When you double-click, the relevant logs appear. The following functions are available in the context menu of the request or editing step:
Delete: You delete the repartitioning request; it no longer appears in the monitor, and you cannot restart. All tables remain in their current state. The InfoCube may be inconsistent. Reset Request: You reset the repartitioning request. This deletes all the locks for the InfoCube and all its shadow tables. Reset Step: You reset the canceled editing steps so that they are reset to their original state. Restart: You restart the repartitioning request in the background. You cannot restart a repartitioning request if it still has status Active (yellow) in the monitor. Check whether the request is still active (transaction SM37) and, if necessary, reset the current editing step before you restart.
Transport
When you transport InfoCubes, the metadata in the target system is adjusted without the DB tables being converted. Repartitioned InfoCubes may therefore only be transported once the repartitioning has already taken place in the target system; otherwise, inconsistencies that can only be corrected manually occur in the target system.
Activities
You can access repartitioning in the Data Warehousing Workbench using Administration, or in the context menu of your InfoCube.
You can schedule repartitioning in the background by choosing Initialize. You can monitor the repartitioning requests by choosing Monitor.
Prerequisites
When partitioning using 0FISCPER values, values are calculated within the partitioning interval that you specified in the InfoCube maintenance. To do this, the value for 0FISCVARNT must be known at the time of partitioning; it must be set to constant.
Procedure
...
1. The InfoCube maintenance is displayed. Set the value for the 0FISCVARNT characteristic to constant. Carry out the following steps:
a. Choose the Time Characteristics tab page.
b. In the context menu of the dimension folder, choose Object-Specific InfoObject Properties.
c. Specify a constant for the characteristic 0FISCVARNT. Choose Continue.
2. Choose Extras → DB Performance → Partitioning. The Determine Partitioning Conditions dialog box appears. You can now select the 0FISCPER characteristic under Slctn. Choose Continue.
3. The Value Range (Partitioning Condition) dialog box appears. Enter the required data.
4. For more information, see Partitioning.
Indexes
You can search a table for data records that satisfy certain search criteria faster using an index. An index can be considered a copy of a database table that has been reduced to certain fields. This copy is always in sorted form. Sorting provides faster access to the data records of the table, for example using a binary search. The index also contains a pointer to the corresponding record of the actual table so that the fields not contained in the index can also be read. The primary index is distinguished from the secondary indexes of a table. The primary index contains the key fields of the table and a pointer to the non-key fields of the table. The primary index is created automatically when the table is created in the database.
Table SCOUNTER in the flight model contains the assignment of the carrier counters to airports. The primary index on this table therefore consists of the key fields of the table and a pointer to the original data records.
You can also create further indexes on a table in the ABAP Dictionary. These are called secondary indexes. This is necessary if the table is frequently accessed in a way that does not take advantage of the sorting of the primary index for the access. Different indexes on the same table are distinguished with a three-place index identifier.
For flight bookings, all the counters of the carriers at a certain airport are often searched for. Such an access uses the airport ID to search for counters, so the sorting of the primary index is of no use in speeding it up. Since table SCOUNTER has a large number of entries, a secondary index on the field AIRPORT (ID of the airport) must be created to support access via the airport ID.
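The idea of a secondary index as a sorted copy searched with a binary search can be sketched like this (a simplified in-memory model, not the actual database implementation):

```python
import bisect

# Sketch: a secondary index on AIRPORT modeled as a sorted list of
# (value, row pointer) pairs, searched with binary search via bisect.
rows = [
    {"CARRID": "LH", "COUNTNUM": "1", "AIRPORT": "FRA"},
    {"CARRID": "AA", "COUNTNUM": "2", "AIRPORT": "JFK"},
    {"CARRID": "LH", "COUNTNUM": "3", "AIRPORT": "JFK"},
]
index = sorted((r["AIRPORT"], i) for i, r in enumerate(rows))
keys = [k for k, _ in index]

# Binary search for all counters at airport "JFK".
lo = bisect.bisect_left(keys, "JFK")
hi = bisect.bisect_right(keys, "JFK")
hits = [rows[i] for _, i in index[lo:hi]]
print(len(hits))  # 2 counters at JFK
```

The row pointer lets the remaining fields (CARRID, COUNTNUM) be read from the table itself, mirroring how an index entry points back to the full record.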
The optimizer of the database system decides whether an index should be used for a concrete table access (see How to Check if an Index is Used?). This means that an index might only result in a gain in performance for certain database systems. You can therefore define the database systems on which an index should be created when you define the index in the ABAP Dictionary (see Creating Secondary Indexes).
All the indexes existing in the ABAP Dictionary for a table are normally created in the database when the table is created if this was not excluded in the index definition for this database system.
If the index fields have key function, that is, if they already uniquely identify each record of the table, an index can be defined as a unique index.
See also:
1. Open a second session and choose System. The Trace Requests screen appears.
2. Switch the trace on.
3. In the first window, perform the action in which the index should be used. If your database system uses a cost-based optimizer, you should perform this action with data that is as representative as possible, since a cost-based optimizer tries to determine the best index based on the statistics.
4. In the second session, choose Trace off and then Trace list.
Result
The format of the generated output depends on the database system used. You can determine the index that the database used for your action with the EXPLAIN function for the critical statements (PREPARE, OPEN, REOPEN).
Index IDs
Several indexes on the same table are distinguished by a three-place index identifier. The index identifier may only contain letters and digits. The ID 0 is reserved for the primary index. The index name on the database adheres to the convention <Table name>~<Index ID>.
TEST~A is the name of the corresponding database index in the database for table TEST and the secondary index with ID A.
Since the convention for defining the index name in the database has changed several times, some of the indexes in the database might not follow this convention. Indexes created prior to Release 3.0 can have an 8-place name. The first 7 places (possibly filled with underlining) are for the table names, and the eighth place is for the (one-place) index identifier (for example TEST___A). Indexes introduced with Release 3.0 can have a 13-place name in the database. The first 10 places (possibly filled with underlining) are for the table names, and the 11th to 13th places are for the three-place index identifier (for example TEST______A).
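The three naming conventions can be captured in small helper functions (a sketch; the padding rules follow the description above):

```python
# Sketch of the index naming conventions described in the text.

def db_index_name(table, index_id):
    # Current convention: <table>~<id>; ID 0 is reserved for the primary index.
    return f"{table}~{index_id}"

def pre_30_name(table, index_id):
    # Pre-Release-3.0: table name padded to 7 places with '_', one-place ID.
    return table.ljust(7, "_") + index_id

def rel_30_name(table, index_id):
    # Release 3.0: table name padded to 10 places with '_', then the ID.
    return table.ljust(10, "_") + index_id

print(db_index_name("TEST", "A"))  # TEST~A
print(pre_30_name("TEST", "A"))    # TEST___A
print(rel_30_name("TEST", "A"))    # TEST______A
```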
1. If indexes already exist on the table, a list of these indexes is displayed. Choose .
2. In the next dialog box, enter the index ID and choose . The maintenance screen for indexes appears.
3. Enter an explanatory text in the field Short text. You can then use the short text to find the index at a later time, for example through the Information System.
4. Select the table fields to be inserted in the index using the input help for the Field name column. The order of the fields in the index is very important. See What to Keep in Mind for Secondary Indexes.
5. If the values in the index fields already uniquely identify each record of the table, select Unique index.
A unique index is always created in the database at activation because it also has a functional meaning (prevents double entries of the index fields).
6. If it is not a unique index, leave Non-unique index selected.
In this case you can use the radio buttons to define whether the index should be created for all database systems, for selected database systems or not at all in the database.
7. Select for selected database systems if the index should only be created for selected database systems.
Click on the arrow behind the radio buttons. A dialog box appears in which you can define up to 4 database systems with the input help. Select Selection list if the index should only be created on the given database systems. Select Exclusion list if the index should not be created on the given database systems. Choose .
8. Choose .
Result
The secondary index is automatically created on the database during activation if the corresponding table has already been created there and index creation was not excluded for the database system. You can find information about the activation flow in the activation log, which you can call with Utilities → Activation log. If errors occurred when activating the index, the activation log is displayed automatically.
See also:
Indexes
Unique Indexes
An entry in an index can refer to several records that have the same values for the index fields. A unique index does not permit these multiple entries. The index fields of a unique index thus have key function, that is they already uniquely identify each record of the table. The primary index of a table is always a unique index since the index fields form the key of the table, uniquely identifying each data record.
You can define a secondary index as a unique index when you create it. This ensures that there are no double records in the table fields contained in the index. An attempt to maintain an entry violating this condition in the table results in termination due to a database error.
The accessing speed does not depend on whether or not an index is defined as a unique index. A unique index is simply a means of defining that certain field combinations of data records in a table are unique.
A unique index for a client-dependent table must contain the client field.
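The unique-index behavior described above can be sketched in Python (a simulation only; the field names MANDT and CARRID are invented for illustration, and a real database enforces the condition with an error and termination rather than an exception):

```python
# Simulation of unique-index semantics: the index fields act as a key, so
# a second record with the same values in those fields is rejected. A real
# database terminates the operation with an error instead. The field names
# MANDT (client) and CARRID are invented for illustration.
class UniqueIndex:
    def __init__(self, *fields):
        self.fields = fields
        self.entries = {}

    def insert(self, record):
        key = tuple(record[f] for f in self.fields)
        if key in self.entries:
            raise ValueError("duplicate entry for index fields %s" % (key,))
        self.entries[key] = record

# Client-dependent table: the client field MANDT is part of the unique index.
idx = UniqueIndex("MANDT", "CARRID")
idx.insert({"MANDT": "100", "CARRID": "LH", "NAME": "Lufthansa"})
try:
    idx.insert({"MANDT": "100", "CARRID": "LH", "NAME": "Duplicate"})
except ValueError as err:
    print("rejected:", err)
```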
An index is defined on fields FIELD1, FIELD2, FIELD3, and FIELD4 of table BSPTAB, in this order. The table is accessed with the statement SELECT * FROM BSPTAB WHERE FIELD1 = X1 AND FIELD2 = X2 AND FIELD4 = X4. Since no condition is given for FIELD3, the index sorting is only of use up to FIELD2. If the database system accesses the data using this index, it quickly finds all records for which FIELD1 = X1 and FIELD2 = X2, but it then has to scan this result set for the records with FIELD4 = X4.

The order of the fields in an index is therefore very important for access speed. The first fields should be those that have constant values in a large number of selections: during a selection, an index is only of use up to the first unspecified field. Only fields that significantly restrict the result set of a selection belong in an index.
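The rule that an index helps only up to the first unspecified field can be sketched with a small Python helper (`usable_prefix` is an invented illustration, not database internals):

```python
# Sketch of the prefix rule: a selection can exploit the index only up to
# the first index field for which the WHERE clause gives no condition.
def usable_prefix(index_fields, specified_fields):
    count = 0
    for field in index_fields:
        if field not in specified_fields:
            break
        count += 1
    return count

# Index on FIELD1..FIELD4; the WHERE clause specifies FIELD1, FIELD2, FIELD4:
# only the first two index fields can be used.
print(usable_prefix(["FIELD1", "FIELD2", "FIELD3", "FIELD4"],
                    {"FIELD1", "FIELD2", "FIELD4"}))  # 2
```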
The following selection is frequently made on address table ADRTAB: SELECT * FROM ADRTAB WHERE TITEL = 'Prof.' AND NAME = X AND VORNAME = Y. In an index on NAME, VORNAME, and TITEL, the field TITEL would rarely restrict the set of records already specified by NAME and VORNAME, since there are probably not many people with the same name but different titles. Including TITEL in this index therefore makes little sense. An index on the field TITEL alone could make sense, however, if, for example, all professors are frequently selected.

Additional indexes also place a load on the system, since they must be adjusted each time the table contents change. Each additional index therefore slows down the insertion of records into the table. For this reason, tables whose entries are written very frequently should generally have only a few indexes.
The database system sometimes does not use a suitable index for a selection, even if one exists. Which index is used depends on the optimizer of the database system. You should therefore check whether the index you created is actually used for the selection (see How to Check if an Index is Used). Creating an additional index can also have side effects on performance: an index that was previously used successfully for a selection might no longer be chosen by the optimizer if the optimizer estimates (sometimes incorrectly) that the newly created index is more selective. The indexes on a table should therefore be as disjoint as possible, that is, they should have as few fields in common as possible. If two indexes on a table share a large number of fields, it can be harder for the optimizer to choose the most selective one.
Prerequisites
You have already created an InfoPackage for loading data into the DataSource for the DataStore object. You have created one data transfer process for loading data into the DataStore object and one data transfer process for updating the data from the DataStore object to another InfoProvider.
Procedure
...
1. Call the process chain maintenance: choose Process Chain Maintenance in the Administration area of the Data Warehousing Workbench. The Process Chain Maintenance Planning View screen appears.
2. In the left-hand screen area of the required display component, navigate to the process chain in which you want to insert the DataStore object and double-click it to select it. The system displays the process chain planning view in the right-hand screen area.
If no suitable process chain is available, create a new process chain. More information: Creating Process Chains
3. To insert the activities around the DataStore object as processes, select Process Types in the left-hand screen area. The system displays the available process categories.
4. In the Data Target Administration process category, choose the application process type Activate DataStore Object Data.
5. Use drag and drop to insert the application process type Activate DataStore Object Data into the process chain. The dialog box for inserting process variants appears.
6. Create a process variant and confirm your entries. An additional dialog box appears.
7. Use the input help to select the InfoPackage that you want to use to load the data into the DataSource. Confirm your entries.
8. Use the input help to select the data transfer process that you want to use to load data into the DataStore object. Confirm your entries.
9. The processes Load Data, Data Transfer Process, and Activate DataStore Object Data are inserted together into your process chain.
10. In the process category Load Process and Postprocessing, choose the application process type Data Transfer Process and use drag and drop to insert it into your process chain. The dialog box for inserting process variants appears.
11. Create a process variant and confirm your entries. An additional dialog box appears.
12. Use the input help to select the data transfer process that you want to use to load data from the DataStore object into the connected InfoProvider. Confirm your entries. The process Data Transfer Process is inserted into your process chain.
13. If you now connect the processes Activate DataStore Object Data and Data Transfer Process, you can select the conditions under which further updates are triggered.
Example
The following figure shows an example of a process chain that is used to load data into a DataStore object. The activation process makes the data available for reporting and enables the data to be updated to other InfoProviders.
Prerequisites
You have to differentiate between two situations:
...
1. The request or requests to be deleted are not yet activated. In this case, the requests are simply deleted from the activation queue during the deletion process. Because they were not yet activated, no adjustment is needed for the table of active data or the change log.
2. The request or requests to be deleted are already activated.
A prerequisite is that the requests to be deleted have not yet been updated, or are no longer updated, into the connected InfoProviders. In this case, see also Deleting from Already Updated Data. Since several requests can be activated in one activation run, and you have the option of combining them into one change log request, two cases have to be differentiated here:
For each loaded request (called a PSA request below), there is exactly one change log request. In this case, the change log request is deleted from the change log and from the table of active data.
Several PSA requests are combined into one change log request in an activation run. In this case, when deleting a PSA request, you also have to delete all other PSA requests that are included in the same change log request.
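This deletion constraint can be sketched as follows (a Python simulation; the mapping and the request names are invented for illustration):

```python
# Sketch of the deletion constraint: if several PSA requests were combined
# into one change log request during activation, deleting one of them means
# deleting every PSA request that shares that change log request.
# The request names below are invented for illustration.
def requests_to_delete(mapping, psa_request):
    for cl_request, psa_list in mapping.items():
        if psa_request in psa_list:
            return cl_request, list(psa_list)
    raise KeyError(psa_request)

mapping = {"CL1": ["PSA1"], "CL2": ["PSA2", "PSA3"]}
# Deleting PSA2 also forces the deletion of PSA3 (same change log request).
print(requests_to_delete(mapping, "PSA2"))  # ('CL2', ['PSA2', 'PSA3'])
```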
Features
In both of the cases named above, a rollback is executed during deletion. Rollback means that the status that prevailed before posting the request to be deleted is restored. As long as the request to be deleted is not the last one in the DataStore object, all requests that were activated after it also have to be deleted from the DataStore object. There are several options for deleting data from a DataStore object, all of which are available in the administration of the DataStore object. The option you choose depends on your application scenario.
...
1. If you want to delete entire requests that are incorrect, choose Delete by Request. With this option, requests that have not been activated can be deleted from the activation queue, and activated requests can be deleted from the active data table and the change log.
2. If you only want to delete individual fields instead of an entire request, choose Delete Selectively. With this option you can delete, for example, data from a time period that is no longer used. Only data from the table of active data is deleted; the change log remains unchanged.
3. If you only want to delete data from the change log, choose Delete from the Change Log. With this option you can reduce the size of the change log if several requests have already been loaded and only a limited history needs to be retained. The data volume of the change log is thus reduced.
Procedure
...
1. In the administration of the InfoProvider, choose the Requests tab page. For the request to be deleted, choose Data Mart Status for the Data Target. A dialog box displays the request that was updated into additional data targets. Note the description of this request.
2. In this dialog box, choose Monitor. The monitor for this request appears.
3. Choose Manage InfoProviders. The administration for the connected InfoProvider appears.
4. Delete the respective request.
5. This takes you to the administration screen of the source DataStore object.
6. Reset the delta administration. To do this, choose Data Mart Status for the InfoProvider and, in the subsequent dialog box, choose Reset Delta Administration.
7. Now you can also delete the respective request from the source DataStore object. You can then load the data again, as a full update or as a delta.
You use this function to fill a DataStore object with requests that have already been loaded into the BI system or into another DataStore object. This function is only necessary for DataStore objects that obtained their data from InfoPackages. More information: Reconstruction of DataStore Objects
Only switch on automatic activation and automatic update if you are sure that these processes do not overlap. More information: Functional Constraints of Processes
Features
On the Contents tab page, you can see a list of the InfoObjects of the DataStore object. Use this to check whether the data that you loaded into the DataStore object is free of technical errors. If the table contains more than 40 fields suitable for selection, you have to select the fields first. You can change this selection again in the main menu under Settings → Fields for Selection. Choose Settings → User Parameters to set up the table display so that all DataStore object columns are shown and none are missing.

Choose Logs to view the logs for requests that have been deleted, activated, reconstructed, or added.

You can display the contents of the activation queue table by choosing New Data. When this new data is activated, the system deletes it from the activation queue and moves it to the active data table. Choose Active Data to view the contents of the table of active data (A table). Choose Change Log to display the change log table; this table contains the change log requests that are created each time you activate data. The table contents are displayed in the data browser, which enables you to use functions such as downloading in various formats.
Selective Deletion
We recommend selective deletion only when individual fields are no longer required and need to be deleted from the request. Refer to Selective Deletion.
Features
Request Is Available for Reporting
The Request is available for reporting information is displayed as soon as activation has been started.
The system does not check whether the data has been activated successfully.
When a request from a DataStore object is updated into additional InfoProviders, the data mart status of the request is displayed. You use the corresponding pushbutton to manage the distribution of this request. To show the InfoProviders into which the request has been updated, choose InfoProvider. For more information, see Where-Used List for DTP Requests. The icon shows that the request has already been updated into other InfoProviders. However, in some cases you might have to repeat this request. If you think that data was not posted correctly, reset the monitor status and request a repeat so that the request can be posted again. To do this, choose Request Reverse Posting in the monitor. For more information, see the documentation in the monitor.
Request Status
The red traffic light means that problems occurred while processing the request, and these problems prevent a secure upload.
Only requests that have a green status after loading can be activated and displayed in the query. If a request was green after loading but is red during activation, you can restart activation. Data packages with a red or yellow traffic light are not taken into consideration when you execute a BEx query. In this case, subsequent data packages with a green traffic light are not used in the query either because the consistency of the data in the query can no longer be guaranteed. You can reset the original request status in the monitor by choosing the request status symbol in the QM Request Status after Update column, and selecting Delete Status, Back to Request Status.
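The consistency rule for queries can be sketched like this (a Python illustration of the rule stated above, not BEx internals):

```python
# Sketch of the consistency rule for queries: requests are used only up to
# the first one that is not green; later green requests are excluded too,
# because the consistency of the data can no longer be guaranteed.
def usable_request_count(statuses):
    count = 0
    for status in statuses:
        if status != "green":
            break
        count += 1
    return count

# The red request blocks itself and the green request loaded after it.
print(usable_request_count(["green", "green", "red", "green"]))  # 2
```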
You can display a detailed overview of request processing for all request operations by choosing Log. Here you can trace the start and end times of the steps in request processing, such as status changes and activation. You can check performance by looking at the runtimes. Technical information and detailed messages help you analyze any errors that occurred.
Deleting Requests
This tab page provides information about all the requests that have run in the DataStore object. You can also delete requests if you need to. You can only directly delete requests that have not yet been activated. The system uses rollback for requests that have already been activated. See Delete by Request.
In Subsequent Processing, you can specify events to execute once the process is complete. The Delete function triggers the corresponding background job or performs the deletion directly.
Activating Requests
Choose Activate to activate requests. You can specify whether the requests are to be compressed into one request in the change log when they are activated (this request is then rolled back from the DataStore object as a whole). You can also apply settings for processing. See Activation of Data in DataStore Objects.
Integration
You can call the where-used list for DTP requests in InfoCubes and DataStore objects if the request has already been loaded into other data targets. You can also call the where-used list for requests in the PSA tables of the DataSource (R3TR RSDS).
Features
The where-used list displays the source request and all related target requests in a hierarchy. You can delete requests directly from the where-used list display.
Delete by Request
Use
This function allows you to delete both inactive and active requests from DataStore objects. It enables you to delete incorrect requests as the system usually only recognizes errors in the data or update rules after the request has been activated. The request is deleted both from the table for active data and the change log.
Integration
An error message appears if the request has already been updated into additional InfoProviders. In this case, you first have to delete the request to be deleted from the data targets. See also Deleting from Already Updated Data.
Afterwards, you have to manually reset the data mart status in the DataStore object. Then you can delete the request. You can load more deltas after this. If you do not reset the data mart status, the delta update is deactivated in the connected InfoProvider when deletion is performed.
Features
You can only directly delete requests that have not yet been activated. The system uses rollback for requests that have already been activated.
Rollback:
With rollback, the system restores the status of the DataStore object before the incorrect requests were updated. This means that all requests that were updated after the incorrect request are also deleted. You can then repost requests that are still available in the PSA. Processing during deletion can be performed in parallel on a package-by-package basis: the packages in a request are processed in parallel, which is possible because each package has a unique key. For requests that have been loaded and activated using a DTP for real-time data acquisition, processing during deletion is always performed serially.
If you post three requests and want to delete the middle one, the final request is also deleted. It is also possible to combine three requests into one single request in the DataStore object when you activate. Requests 1, 2, and 3 in the source DataStore object correspond to request 1 in the target DataStore object. If you want to delete request 3 from the source DataStore object, you have to delete request 1 from the target DataStore object because the three requests are combined in this one. Subsequently you also have to delete all three requests from the source DataStore object in order to remain consistent.
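The rollback rule in this example can be sketched as follows (a Python illustration of the behavior described above):

```python
# Sketch of the rollback rule: deleting a request also deletes every
# request that was activated after it (the list is in activation order).
def rollback_delete(active_requests, request_to_delete):
    pos = active_requests.index(request_to_delete)
    return active_requests[:pos], active_requests[pos:]

# Three posted requests; deleting the middle one also removes the final one.
remaining, deleted = rollback_delete([1, 2, 3], 2)
print(remaining, deleted)  # [1] [2, 3]
```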
Procedure
...
1. In the administration of the InfoProvider, choose the tab page Requests. For your request to be deleted, choose Data Mart Status for the Data Target. The following dialog box displays the request that was updated in additional data targets. Keep the description of this request in mind. 2. Choose Monitor on this dialog box. You arrive at the monitor for this request. 3. Choose Manage InfoProviders. You arrive at the administration for the connected InfoProvider. 4. Delete the respective request. 5. This takes you to the source DataStore object administration screen. 6. Reset the delta administration. To do this, choose Data Mart Status for the InfoProvider and from the subsequent dialog box, choose Reset Delta Administration. 7. Now you can also delete the respective request from the source DataStore object. You can now load data as full or data again.
Features
The system determines the SIDs before the data activation process starts; activation only begins once the SIDs have been determined.
If the activation process terminates while the SIDs are being determined, the data remains inactive and stays in the activation queue.
When the data is activated, it is written to the table of active data, where it is available for reporting. Requests are sorted by the key of the DataStore object, the request ID, the data package ID, and the data record number. This ensures that the data is updated to the table of active data in the correct request sequence. During an activation run, packages (from several DTP requests) are created that can be activated in parallel. Only one activation run can be active at a time; when one run ends, the next one in the sequence is triggered. This is only relevant when data is activated automatically, however. When you activate data manually, the pushbutton that you use to trigger the process disappears from the toolbar and becomes available again only after the current activation run is complete.
If an activation process is canceled, you cannot activate any subsequent requests. You have to keep repeating the activation process that was canceled until it is completed successfully.
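The activation step with the Overwrite update type can be sketched as follows (a simplified Python simulation of the example in this documentation: an amount of 10 is overwritten by 30, and the change log receives a before image of -10 and an after image of +30):

```python
# Simplified simulation of activation with update type "Overwrite":
# the new value replaces the old one in the table of active data, and the
# change log records a before image (negated old value) and an after image.
def activate(active, change_log, key, amount):
    old = active.get(key)
    if old is not None:
        change_log.append((key, -old))  # before image
    change_log.append((key, amount))    # after image
    active[key] = amount

active, change_log = {}, []
activate(active, change_log, "doc1", 10)  # request 1: amount 10
activate(active, change_log, "doc1", 30)  # request 2: amount 30 overwrites 10
print(active)       # {'doc1': 30}
print(change_log)   # [('doc1', 10), ('doc1', -10), ('doc1', 30)]
# The second request contributes -10 + 30 = +20 to a connected InfoProvider.
```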
The following options are available for the activation process:
Do not compress requests into a single request when activation takes place.
If you set this indicator, when the activation is complete a request is displayed in the change log for each of the requests that has been loaded. This means you can delete requests individually to restore a previous status of the DataStore object. However, when you update to another InfoProvider, all requests that are active but have not yet been updated are combined into one single request.
If you want to update requests to connected InfoProviders individually, you have to update the requests immediately after you have activated them. You can do this using process chains.
If you do not set this indicator, all the requests activated in this process are compressed into one change log request. Only this request can be rolled back fully from the DataStore object.
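The effect of the compression indicator can be sketched like this (a Python illustration; `change_log_requests` is an invented helper):

```python
# Sketch of the compression indicator: without compression, each loaded
# request keeps its own change log request (individually deletable); with
# compression, all activated requests form one change log request that can
# only be rolled back as a whole. "change_log_requests" is invented.
def change_log_requests(loaded_requests, compress):
    if compress:
        return [loaded_requests]
    return [[r] for r in loaded_requests]

print(change_log_requests([1, 2, 3], compress=False))  # [[1], [2], [3]]
print(change_log_requests([1, 2, 3], compress=True))   # [[1, 2, 3]]
```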
Settings for parallel processing:
By choosing Extras → PSA → Parallel DTA, you can set the maximum number of parallel processes for the update from a DataSource 3.x of this InfoPackage to the defined data targets. For determining SIDs and for activating requests, processing is set to three parallel processes by default. You can change this setting; if you set it to 1, processing is serial. Processing is controlled by BI background management. More information: Setting Parallel Processing of BI Processes
The settings for parallel processing are stored in tables RSBATCHPARALLEL and RSBATCHSERVER. BI background management supports most of the load and administration processes in BI. More information: Process Types Supported by BI Background Management

There is one initial setting for parallel processing for each BI process type supported by BI background management. In BI background management, these initial settings are displayed for a process type as <BI Process> Without Process Chain Variant, for example Data Transfer Process Without Process Chain Variant. In the default setting, three parallel background work processes are set for the BI process types, and processing is divided among all the servers of the system. The default is 1 (serial processing) only when updating data from the PSA into an InfoProvider with PSAPROCESS.

You can change the initial setting for a process type in BI background management. The initial setting for a BI process type is valid for a BI process of this type as long as the process has no parallelization settings of its own. You can change the settings for parallelization for a BI process in the relevant maintenance dialog of the process in the Data Warehousing Workbench, or in the relevant variant maintenance of the process chain maintenance; this overrides the initial setting of the process type for exactly this process. Only when the settings for parallel processing for a BI process have been saved in the (variant) maintenance of the process can you display these settings in BI background management and change them there in the central mass maintenance. Likewise, the initial setting for a BI process type only appears there if it has been changed, or if settings were saved in the (variant) maintenance for a process of this type.

You thus have the following options for setting parallel processing of BI processes:

Set parallel processing for a process type in BI background management
Here you change the setting for a process type and thus for all processes of this type for which you did not make your own settings.
Set parallel processing for a specific BI process in the (variant) maintenance of the process
Here you set parallel processing for a specific BI process if the setting differs from the setting for the process type.
Set parallel processing for a BI process in the mass maintenance of BI background management
Here you can make the settings in a table overview for all the BI processes for which parallel processing settings were already saved in the (variant) maintenance of the process.
Prerequisites
Make sure that there are sufficient background processes (type BTC) available to process the BI processes in the system in parallel. More background work processes (approximately 50% more) are needed than without BI background management. More information: Activating Background Processing
If you enter 1, the BI process is processed serially. If you enter a number larger than 1, the BI process is processed in parallel.
Note that a degree of parallelization (number of processes) greater than or equal to 2 means that further work processes are split off from the main process, which also monitors the work processes and distributes the work packages.
5. In the group frame Parallel Processing, make the relevant settings for parallel processing in the background:
a. Enter a job class to define the job priority. The job priority defines how the jobs are distributed among the available background work processes.
You can define whether, and how many, background work processes should be reserved for class A jobs (high priority). Reserved work processes are then kept free for class A jobs. Class B jobs (medium priority) obtain the remaining free work processes before class C jobs (low priority). If you do not reserve any work processes for class A jobs, the free work processes go first to class A jobs, then to class B jobs, and finally to class C jobs. When defining the job class, consider in particular the effect of class A jobs on the use of background work processes.
More information:
Assigning a Priority to Class A Jobs
Rules for Work Process Distribution
b. To distribute the processing of the BI processes optimally across your system resources, define the server(s) on which the work processes should be executed under Server/Host/Group on Which Additional Processes Should Run. More information: Define Load Distribution for BI Processes in Background.
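The distribution rules for job classes can be sketched as follows (a simplified Python model; `dispatch` and the job tuples are invented for illustration):

```python
# Simplified model of work process distribution by job class (A > B > C).
# Reserved processes serve only class A jobs; the remaining free processes
# go to class A first, then B, then C. "dispatch" is an invented helper.
def dispatch(jobs, free_wps, reserved_for_a=0):
    running = []
    unreserved = free_wps - reserved_for_a
    reserved = reserved_for_a
    for job in sorted(jobs, key=lambda j: "ABC".index(j[1])):
        if job[1] == "A" and reserved > 0:
            reserved -= 1          # class A job takes a reserved process
        elif unreserved > 0:
            unreserved -= 1        # any class may take an unreserved process
        else:
            continue               # no free work process for this job
        running.append(job)
    return running

# 3 free processes, 1 reserved for class A: the A job runs on the reserved
# process, the two B jobs fill the rest, the C job has to wait.
print(dispatch([("j1", "C"), ("j2", "B"), ("j3", "A"), ("j4", "B")], 3, 1))
```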
6. In the group frame Parallel Processing, you can define whether parallel processing should take place in dialog work processes or in background work processes for the processes ODSACTIVAT, ODSSID and ODSREQUDEL for the DataStore object.
If you select Dialog, you can define the load distribution of an RFC server group as Server Group for Parallel Dialog Processing.
You can create RFC server groups in transaction RZ12. More information: Defining RFC Groups for Parallel Processing Jobs
7. To write the settings to a transport request, choose Transport. The entries in tables RSBATCHPARALLEL and RSBATCHSERVER (for hosts and server groups) are written to a transport request of the Change and Transport System.
8. Save your entries.
Set parallel processing for a specific BI process in the (variant) maintenance of the process
To overwrite the initial setting of the process type for a specific BI process:
...
1. Call the function in the process variant maintenance of a process chain or in the process maintenance. The function call varies between the different BI processes; in the data transfer process, for example, you call it with Goto → Background Manager Settings. The Settings for Parallel Processing dialog box appears.
2. Proceed as described above in steps 4 - 8.
Once you have saved the settings, this BI process is displayed in the BI background management.
Set parallel processing for a BI process in the mass maintenance of BI background management
...
1. In the Data Warehousing Workbench, choose Administration → Current Settings → Batch Manager, or call transaction RSBATCH.
The screen Maintain Parallel Processing Settings appears, in which all BI processes and their settings are displayed in a table, together with the initial settings for the process types. This user interface gives you a good overview of multiple BI processes and simplifies their maintenance.
2. With Process Type Selection, you can improve clarity by restricting the table display to a selected process type.
3. In the corresponding columns, make the settings described above in steps 4 - 8 concerning the number of parallel work processes, the work process type, the servers to be used, and possibly the job priority. Save your entries.
Result
The BI processes are processed according to your settings. If there is no work process free on the assigned servers for a BI process with parallel processing, it is normally processed serially.
Features
Choose the Reconstruct tab page. Reconstruction retrieves the data from the change log, so that you can load deltas into the reconstructed DataStore object. Choose Selection to restrict the display of requests to a DataStore object, an InfoCube, an InfoSource, or a source system. All requests that have already been loaded for your selection are displayed. You can select individual requests and use them to reconstruct the DataStore object; to do this, choose Reconstruct/Add. You can either start loading the DataStore object immediately or schedule the load as a background job. In Subsequent Processing, specify the events that you want to execute once the process is complete.
If you select more than one request for reconstruction, deactivate the functions for automatic activation and automatic update. You can also reconstruct using process chains. Using Parallelism, you can apply settings for processing the requests. The default setting is serial processing; if you enter a number of processes greater than 1, parallel processing occurs. Processing is controlled by BI background management. For more information, see BI Background Management.
To ensure that the various processes do not lock each other out, SAP recommends that you use process chains. See also the example for Including DataStore Objects in Process Chains. When designing complex process chains, you yourself have to ensure that these processes do not lock each other.
While Loading
It is possible to:
During Activation
It is not possible to:
Delete the contents of the DataStore object
Delete data request by request
Delete data selectively
Delete master data
Archive data
Reactivate data
Update data to other InfoProviders
Delete the contents of the DataStore object
Start deleting more requests
Delete data selectively
Activate data
Archive data
Update data to other InfoProviders
Delete the contents of the DataStore object
Delete data request by request
Start deleting more data selectively
Activate data
Archive data
Update data to other InfoProviders
Features
The Load Status column shows whether the data was loaded successfully to the DataStore object or if errors occurred. If there are errors shown in the load status (red traffic light), you can branch straight to the Manage DataStore Object function. The Activation column shows whether all the data has been activated yet. If the data has not yet been activated, you can start activation by choosing Activate DataStore Object Data.
The Update column shows whether all DataStore object data has been updated to the connected InfoProviders yet. If the data has not yet been updated, you can update the data to the InfoProviders by choosing the pushbutton. In addition, you can jump to the DataStore object display or maintenance screen from here. Select one or more DataStore objects and choose Display (or double-click on the name), or Manage.
Activities
In the Data Warehousing Workbench, choose Administration → DataStore Objects.