
Product Subject Area

This document provides an overview of the Product Subject Area design followed in RBC DATA
MART (RBC) version 3.1.

Background
The re-architecture and migration of the classic BANK CORE SYSTEM Enterprise Warehouse (REW)
to RBC involves extending the warehouse functionality. A significant step in this regard is the move
from a passive data warehouse to an active data warehouse model. Active data warehousing
provides an integrated, consistent repository of data to drive strategic and operational
decision support within an organisation.
Some of the benefits of using RBC are as follows:
Reduced costs through more efficient processes
Implementation of a regional model
Reduced reliance on legacy technology and a move towards newer technology
Easier installation of the BI warehouse in BANK CORE SYSTEM or non-BANK CORE
SYSTEM countries
National Language Support (NLS).
The purpose of One RBC DATA MART is to implement a single Group Enterprise Warehousing
strategy.
This was identified as a central component in the Global Business Intelligence Program initiated in
November 2002.
When the Data Centre of Excellence (DCOE) was created in 2003, one of its key responsibilities was
the creation and maintenance of the Group Enterprise Warehouse Model (GEWM).
The Group Enterprise Warehouse is the group standard for logical warehouse modelling. This model
is also based on IBM Information Framework (IFW), but has been further enhanced and customised
to suit BANK's requirements.
GEWM consists of the following nine data concepts:
Arrangement
Business Direction Item
Condition
Classification
Event
Involved Party
Location
Product
Resource Item
To align with the Group strategy, RBC is designed with the following considerations:
Single source of integrated data
Single source of cleansed, integrated data for decision support and analysis, where the data is
consolidated and updated in near-real time.
Regionalization
A common core software version for a region.
Internationalization
Second Language Description field for all reference tables.
De-linking from BANK CORE SYSTEM
Deployable at any RBC DATA MART site with or without BANK CORE SYSTEM; the
current BANK CORE SYSTEM-centric approach is removed
The existing data model is re-designed to be R2 compliant, which means that it adheres to the
technologies and methods that prepare RBC DATA MART for the future
A new platform replaces iSeries, and data standards pioneered by the Data Centre of Excellence
(DCOE) have been adopted to ensure group-wide consistency
The database construction features of ERwin are used to create the physical database on the DBMS.
Following this, the MetaStage tool takes the design into the metadata repository.
The DataStage tool then accesses this metadata for automated development of the appropriate
extraction, staging, population and distribution programs.
To describe the detailed relationships amongst data elements and entities, the logical database
design technique of Relational Modelling is used; it eliminates data redundancy by achieving a higher
normal form (the RBC warehouse is in 3+ Normal Form).
Each table in the database is normally controlled by a key (ID) that defines the uniqueness of each
record. This key is known as a Surrogate Key. These unique IDs are created using a DB2 function based
on the unique source identifier key.
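The following Python sketch illustrates the idea behind surrogate key assignment: an ID is issued once per unique source identifier key and source system, and re-used thereafter. The in-memory cross-reference and names are illustrative only; in RBC the keys are actually generated by a DB2 function.

# Minimal sketch of surrogate key assignment, assuming an existing
# cross-reference (xref) of source keys to previously issued IDs.
# Names (xref, next_id) are illustrative, not the actual DB2 function.

def assign_surrogate_key(source_key: str, source_system_code: str,
                         xref: dict, next_id: list) -> int:
    """Return the surrogate key for a source record, issuing a new one
    if this (source system, source key) pair has not been seen before."""
    natural_key = (source_system_code, source_key)
    if natural_key not in xref:
        xref[natural_key] = next_id[0]
        next_id[0] += 1
    return xref[natural_key]

xref, counter = {}, [1]
print(assign_surrogate_key("ACCT0001", "HUB", xref, counter))   # 1 (new)
print(assign_surrogate_key("ACCT0001", "HUB", xref, counter))   # 1 (reused)
print(assign_surrogate_key("ACCT0002", "HUB", xref, counter))   # 2 (new)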
In compliance with the GEWM, the BI warehouse contains the following nine major data concepts:
Arrangement
Business Direction Item
Condition
Classification
Involved Party
Event
Product
Location
Resource Item
The scope of this document is to describe the design overview of the Product Subject Area. To satisfy
the standardized approach of feeding data to the RBC warehouse database, interface files are used.
These interface files are structured according to the logical model attributes of their respective
subject areas.
For example, the structure of the product interface file includes all the data items defined in the
logical data model rather than the physical model.
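As a purely illustrative example, a product interface record shaped by logical-model attributes might look like the following Python sketch; the field names here are assumptions, not the actual interface layout.

# Illustrative only: a hypothetical product interface record whose fields
# mirror logical-model attributes. The real layout is defined by the
# Product Subject Area logical model, not by this sketch.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProductInterfaceRecord:
    srce_sys_cde: str                 # source system code (mandatory)
    prod_srce_sys_id: str             # unique product ID in the source (mandatory)
    prod_class_code: Optional[str]    # classification code (optional)
    prod_desc: Optional[str]          # description (optional)
    second_lang_desc: Optional[str]   # NLS second-language description (optional)
    change_action: str = "INSERT"     # INSERT / UPDATE / DELETE indicator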

Overview
According to IFW Banking Data Warehouse Model (BDWM), Product can be defined as "a
dimension, which classifies measures by describing which values apply to specific goods or services
offered, sold, or purchased by the financial institution (Bank) and its other involved parties, or in
which the financial institution has an interest during the normal course of its business activity".
Accordingly, product identifies goods and services that can be offered, sold, or purchased by the
bank.
The Product Subject Area contains data relating to the different products RBC DATA MART can offer
for sale to its customers. This Subject Area holds Product, together with its attributive, associative,
classification and immediate subtype entities.
Following is a list of the important entities in Product Subject Area:
Product
Product class code
Product class code product (relationship between product and product class code)
These and other entities in product subject area are related through a unique surrogate key called
product ID.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in product subject area are fed from product
interface file.
The interface file consolidates data from various source systems and acts as standard input facilitating
a unified Extract, Transform and Load (ETL) process into BI warehouse database.
Data into this standard interface file can be fed from any source system (BANK CORE SYSTEM or
non-BANK CORE SYSTEM).

Accounting Unit Subject Area

Process flow
The following figure illustrates the process flow diagram for Accounting Unit and Product Related
Posting Entries.
Process flow diagram for Accounting Unit

Process flow diagram for Product Related Posting Entries

You can refer to the following points for a better understanding of the aforementioned process flow
diagram for the Accounting Unit Subject Area. Numbers in the diagram correspond to the numbered
points described in the following list:
1. Source extraction for Accounting Unit information from Group Standard Solution source systems
is carried out from their respective files
2. Through the CDC process, change information from Group Standard Solution source systems
is captured and propagated to the pSeries
3. For non-Group Standard Solution source systems, the local IT should use the ETL tool to
extract their source data into a DB table that has the standard interface layout
Note
The local IT may use any other programming solution as an
alternative to ETL tools to generate the source data into a
sequential file with the standard interface layout. However, this
depends upon the regional requirements.

4. Information coming from Group Standard (Core Source) Systems is populated into a DB2
table in the BI stage. Similar information coming from the non-standard source systems is fed
into a DB2 table or a sequential text file in the BI stage
5. These interface files, from the Group Standard Solution and non-Group Standard Solution
source systems, are consolidated, and local codes are converted to regional codes
6. Information is then regionalised and consolidated into a dataset format
7. The ETL process, triggered by the scheduler (a minimal sketch follows this list):
a. Extracts information from the interface file
b. Performs RI and data integrity checks
c. Generates Surrogate Keys (where applicable)
d. Performs data type changes, if required
e. Performs data cleansing, if required
f. Populates (loads) data into target tables
8. Data records that fail the RI and data integrity checks are then fed into the Audit and
Error handling process, where rejection and statistical information is fed back to the source
system for further verification
9. Information is then populated into the corresponding tables in the Accounting Unit Subject
Area.
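The following Python sketch outlines the shape of steps 7 and 8 above (validate, generate surrogate keys, load, and route rejects to audit and error handling). The record layout, lookup structures and helper names are illustrative assumptions, not the actual DataStage job design.

# Minimal sketch of the interface-to-target ETL pass (steps 7-8), assuming
# the interface has already been consolidated. All names are illustrative.

def run_etl(interface_records, code_table, xref, next_sk, target, rejects):
    for rec in interface_records:
        # RI / data integrity checks: mandatory fields and regional code lookup
        if not rec.get("srce_key") or not rec.get("srce_sys_cde"):
            rejects.append((rec, "mandatory field missing"))
            continue
        if rec["srce_sys_cde"] not in code_table:
            rejects.append((rec, "unknown source system code"))
            continue

        # Surrogate key generation (re-use the SK if the source key is known)
        natural_key = (rec["srce_sys_cde"], rec["srce_key"])
        sk = xref.setdefault(natural_key, next_sk[0])
        if sk == next_sk[0]:
            next_sk[0] += 1

        # Simple type change / cleansing before the load
        rec["amount"] = float(rec.get("amount") or 0.0)
        target[sk] = rec   # load into the target table keyed by the SK

interface = [{"srce_key": "AU01", "srce_sys_cde": "HUB", "amount": "12.50"},
             {"srce_key": "",      "srce_sys_cde": "HUB", "amount": "1.00"}]
target, rejects = {}, []
run_etl(interface, {"HUB"}, {}, [1], target, rejects)
print(len(target), len(rejects))   # 1 loaded, 1 rejected to audit/error handling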

Implementing RBC DATA MART for the first time


If the site is implementing OCBB for the first time, the following process is carried out:
After the environment setup, conversion jobs are run to migrate existing information, if any, from the
existing warehouse (for example, HEW3.5) to the BI warehouse
An initial run is executed, during which a full image from the source systems related to
Accounting Unit (but not to Posting Related Entries) is fed into the Accounting Unit
interface file. During the normal daily (or near real-time) runs, incremental changes are fed into the
Accounting Unit interface file.
From GE 3.2 onwards, all the Posting Related entries are moved to the month-end process instead of
the daily run, so there is no concept of a daily or initial execution for Product Related Posting Entries.
Information from the interface file is then fed into the ETL process for populating the target BI
warehouse.
Source extraction
This section provides information about the source system, its data validation process and key fields.

Source system
Accounting Unit data is sourced from Group Standard (Core Source) Systems and/or non-
Group Standard systems. While sourcing their data feed, the source systems should ensure the
following:
Provide data in mandatory fields
Provide Unique Identifier Keys from their source system
Verify local to regional conversion mapping

Data validation process


The OCBB ETL design has a built-in process to validate data accuracy; any record whose Mandatory
data fails the checking process is rejected, while Optional data fields are reset to their default values.
To avoid rejections, a cleanup and data enrichment process needs to be applied by the local site.
In all the standard Accounting Unit interfaces, fields are defined as either Mandatory or Optional.
For mandatory fields, source data is required and the value must be valid; key fields as well as
Date/Time fields must be correctly formatted.
If a data item is deemed optional, it can be left NULL, and the validation process is skipped in that
case.
The validation rules applied to the checking process are as explained in the following sections.

Key fields
The Accounting Unit Subject Area supports one Surrogate Key. All the source keys used to determine
the Surrogate Key are mandatory, and must be valid.
Accounting Unit Identifier SK is derived as follows:
Accounting Unit SK = Unique Accounting Unit Identifier Key in source + Source System Code
For example,
If the source system code is HUB, then HUB needs to be defined as a valid code in the source
system code table.
For Posting Entries there is a single surrogate key (Post_Entr_Seq_Num), and there should be no
duplicate entries at any point in time.
It is generated as YYYYMM (the current processing date) followed by a generated Sequence
Number, for example 20080800000000001.
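A minimal sketch of the two derivations described above, assuming an 11-digit zero-padded sequence number as implied by the example value:

# Sketch of the two key derivations. The helper names and the 11-digit
# sequence padding (inferred from the example 20080800000000001) are
# assumptions for illustration.

def accounting_unit_sk(unique_au_key: str, source_system_code: str) -> str:
    """Accounting Unit SK = unique Accounting Unit identifier key in source
    + source system code."""
    return f"{unique_au_key}{source_system_code}"

def posting_entry_seq_num(processing_date: str, sequence: int) -> str:
    """Post_Entr_Seq_Num = YYYYMM of the current processing date + sequence."""
    yyyymm = processing_date[:6]          # e.g. "20080815" -> "200808"
    return f"{yyyymm}{sequence:011d}"     # zero-padded sequence number

print(accounting_unit_sk("AU000123", "HUB"))    # AU000123HUB
print(posting_entry_seq_num("20080815", 1))     # 20080800000000001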

Foreign Keys and Referential Integrity
Most of the code fields are defined as optional; if their values have been provided, then these values
must be present in the corresponding Regional Code Table. In general, any value that is not blank or
NULL is subject to Referential Integrity check and is rejected if it fails to pass this test.
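A minimal sketch of this rule for an optional code field, with an illustrative Regional Code Table:

# Sketch of the Referential Integrity rule for optional code fields: blank or
# NULL values are skipped, any other value must exist in the Regional Code
# Table. The table contents here are illustrative.

REGIONAL_CODE_TABLE = {"HUB", "CARM", "WHIRL"}   # hypothetical valid codes

def passes_ri_check(code_value) -> bool:
    if code_value is None or str(code_value).strip() == "":
        return True            # optional and not provided: skip the check
    return code_value in REGIONAL_CODE_TABLE

print(passes_ri_check(None))    # True  (optional, left NULL)
print(passes_ri_check("HUB"))   # True  (present in the regional code table)
print(passes_ri_check("XYZ"))   # False (rejected by the RI check)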

Application Subject area

Introduction
This document provides an overview of the Application Subject Area design followed in One BANK
Business Intelligence (OCBB) Core 3.1.

Background
Based on the Basel II framework and FSA guidance, banks are required to maintain sufficient data
history to support in-depth analysis and monitoring of rating systems. This guidance also specifies that
banks must have a minimum of five years of internal data to calculate Probability of Default (PD) and
seven years of data for Loss Given Default (LGD) and Exposure at Default (EAD) estimates.
In order to comply with these requirements, there is a need to set up a Credit Database that facilitates
the calculation of these estimates. Following various discussions on this subject it was decided to use
the BANK Group data warehouse structure known as Business Intelligence R2 (BI R2) to support this
initiative. Business Intelligence R2 is targeted to be the standard Active Enterprise Data Warehouse
and analytical solution for BANK Group with BI R2 being one of its components.
BI R2 version 1 was developed to replace HEW (HUB Enterprise Warehouse 3.5), currently active in
over thirty HUB countries, and is projected to be the standard warehouse solution for all group
countries, including non-HUB countries. Version 1.6 of BI R2, currently being developed by BANK
Software House, adds to version 1 a set of Change Requests identified during the development of that
version, along with enhancements for performance optimization on data retrieval. Version 1.6 is the
base on which version 1.7 is developed; similarly, OCBB version 1.7 is the base on which version 3.1
is developed.
OCBB Core 3.1 is a subsequent version of OCBB R2 version 1.7. The scope of this project also
covers any local implementation judged necessary for the overall solution to be satisfactory for
HBME as well as the roll-out of this solution for the other countries in the region.

Overview
The Application Subject Area establishes a relational data infrastructure for Credit and Risk
information which comprises a centralised, consistent, and accurate information repository providing
the functionality of Subject Areas like the Application, Arrangement, Event and Involved Party in
OCBB data warehouse.
Application Subject Area is sourced from Credit Approvals and Risk Management (CARM) system.
Current and backfill data from CARM sources are extracted, consolidated, transformed, and loaded
into a single relational database in the AIX environment.
The proposed database allows Credit and Risk Analysts to employ a wider array of tools, including
the Cognos ReportNet Framework, for menu-driven query and ad-hoc reporting.
The design supports comprehensive analysis spanning seven years of Credit and Risk information in
OCBB data warehouse.
It also improves overall productivity of Credit and Risk Analysts by reducing the amount of time it
takes to find relevant information and answer ad-hoc questions without specialised knowledge of the
underlying data sources or structures.
This Subject Area holds Credit and Risk information, together with its attributive, associative,
classification and immediate sub-type entities.
Following is a list of the important entities in the Application Subject Area:
Application Arrangement
Application Customer
Application Customer Organization
Application Customer Security Detail
Application Customer Credit Facility Detail
Application Customer Credit Facility Security Relationship
Corporate Credit Facility Application Arrangement.
Corporate Credit Facility Application Product
Application Customer Security Security Relationship
These and other entities in the Application Subject Area are related through a unique Surrogate Key
called Application ID.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in the Application Subject Area are fed from three
Application interface files.
The interface files consolidate data from the CARM source systems and act as a standard input
facilitating a unified Extract, Transform and Load (ETL) process into the BI warehouse database.
The Application ETL process involves some transformation to handle field name mapping, data type
changes and simple data cleansing.
It also applies business rules to transform the amounts and customer percent (PCT) for a specific
period.
Only applications with an approved or declined application status code are considered for source
extraction.
The first data type change concerns the amount fields. Coming from CARM, the amount is expressed
as a DECIMAL with n decimal places, where n depends on the related currency.
The second data type change concerns some of the DATE attributes. They are expressed as true dates
in OCBB instead of a series of numbers, which also filters invalid DATE values from the source.
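The following Python sketch illustrates the two data type changes; the currency-to-decimal-places table and field handling are assumptions for illustration, not the actual CARM mapping.

# Sketch of the two data type changes: scaling a raw CARM amount by the
# currency's decimal places, and turning a numeric YYYYMMDD value into a
# true date. The currency table and field names are illustrative.
from datetime import date
from decimal import Decimal

CURRENCY_DECIMALS = {"USD": 2, "JPY": 0, "BHD": 3}   # hypothetical values

def to_amount(raw_units: int, currency: str) -> Decimal:
    scale = CURRENCY_DECIMALS.get(currency, 2)
    return Decimal(raw_units) / (10 ** scale)

def to_iso_date(yyyymmdd: int):
    """Return a real date, or None when the numeric value is not a valid date."""
    try:
        return date(yyyymmdd // 10000, yyyymmdd // 100 % 100, yyyymmdd % 100)
    except ValueError:
        return None            # invalid source dates are filtered out

print(to_amount(12345, "USD"))     # 123.45
print(to_iso_date(20080229))       # 2008-02-29 (leap year, valid)
print(to_iso_date(19990229))       # None (invalid, filtered)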
The interface tables are initiated as part of the OCBB architecture to satisfy the standardized approach
of feeding data to the OCBB database. This reduces the risk of dependency failures and change
impacts, and helps build a modular, loosely coupled solution.
The interface tables consolidate data from CARM source system and act as a stage. Bounded by the
data requirements of OCBB, the Application interface tables are based upon the major concepts of the
Application Subject Area and data cardinality.
The Application interface tables contain data related to the Application Subject Area and an indicator
representing the change action (INSERT, UPDATE or DELETE), as required.
These jobs are the processes that feed data from the Application interface tables into the
corresponding tables in the OCBB data warehouse. They involve some transformation to handle the
following functions:
Field name mapping
Data type changes
Simple data cleansing
Default value handling
Surrogate Key (SK) generation/conversion
The SK generation/conversion is required because the Application ID is unique at the regional level
within OCBB.
Finally, all validated data are mapped to their corresponding target tables with minimal data type
changes. Depending on the change action type, either an INSERT or UPDATE is performed against
the target tables.
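A minimal sketch of this final step, assuming an in-memory SK cross-reference and target table; only the INSERT/UPDATE change actions named above are handled.

# Sketch of the final load step: convert the Application ID to its surrogate
# key, then apply the record's change action against the target table. The
# in-memory "table" and helpers are illustrative.

def apply_change(record: dict, sk_xref: dict, target_table: dict) -> None:
    app_sk = sk_xref[record["application_id"]]        # SK generation/conversion
    if record["change_action"] == "INSERT":
        target_table[app_sk] = {k: v for k, v in record.items()
                                if k != "change_action"}
    elif record["change_action"] == "UPDATE":
        target_table[app_sk].update({k: v for k, v in record.items()
                                     if k != "change_action"})

sk_xref = {"CARM|0001": 101}
target = {}
apply_change({"application_id": "CARM|0001", "status": "APPROVED",
              "change_action": "INSERT"}, sk_xref, target)
apply_change({"application_id": "CARM|0001", "status": "DECLINED",
              "change_action": "UPDATE"}, sk_xref, target)
print(target)   # {101: {'application_id': 'CARM|0001', 'status': 'DECLINED'}}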

Event Subject Area

Overview
According to IFW Banking Data Warehouse Model (BDWM), Event includes communications,
accounting and maintenance transactions and posting entries. Bank customers, vendors, employees
and other Involved Parties initiate actions through communications with the Financial Institution in
order to make requests of, and participate in transactions with the Financial Institution. Information
about the action, such as when and at what location the action occurred and what if any additional
actions are required, is kept.
In compliance with GEWM, posting entries are made as a part of the Accounting Unit Subject Area.
The Event Subject Area contains data related to transactions, communications, campaigns, and
relationships between events. This Subject Area holds Event, together with its attributive, associative,
classification and immediate subtype entities. Accordingly, the following are the important entities in
the Event Subject Area:
Event (super type of the Subject Area)
Transaction
Communication
Contacts
Referrals
Campaign
Relationships
Event to product relationship
Event to IP relationship
Arrangement to Event relationship
Communication to Communication relationship
Campaign to Segment relationship
These and other entities in the Event Subject Area are related through a unique surrogate key called
Event ID. These Surrogate Keys are generated based on the unique Event Identifier Key using a DB2
function.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in the Event Subject Area are fed from Event
interface files.
The interface file consolidates data from various source systems and acts as a standard input,
facilitating a unified Extract, Transform and Load (ETL) process into BI warehouse database. Data
into this standard interface file is fed in from any source system (HUB or non-HUB).
Process flow
The following figure illustrates the process flow diagram for Event Subject Area.

Process flow diagram for Event Subject Area

The following points can be referred to for a better understanding of the aforementioned process flow
diagram for the Event Subject Area. Numbers in the diagram correspond to the numbered points
described in the following list:
1. Source extraction for event information from Group Standard Solution source systems is
carried out from their respective master files
2. The CDC process captures change information from Group Standard Solution source systems
and propagates it to the pSeries
3. For non-Group Standard Solution source systems, the local IT should use the ETL tool to
extract their source data into a DB table that has the standard Event interface layout
4. Information coming from Group Standard (Core Source) Systems is populated into a DB2
table in the BI stage. Similar information coming from non-standard source systems is fed into a DB2
table or a sequential text file in the BI stage
5. These interface files, from the Group Standard Solution and non-Group Standard
Solution source systems, are then consolidated. Appropriate data conversion is applied to convert
local codes and other information to their regional counterparts (a conversion sketch follows this
list). In addition, the Last HUB Processing Date is fetched from the HUB DATE control file, which is
maintained in the BI stage
6. Consolidated and regionalized information is then fed into Event interface files in DataStage
dataset format
7. The ETL process, triggered by the scheduler:
a. Extracts information from the interface file
b. Performs Referential Integrity (RI) and data integrity checks
c. Generates surrogate keys
d. Performs data type changes, if required
e. Performs data cleansing, if required
f. Populates (loads) data into target tables
8. Data records failing the RI and data integrity checks are then fed into the Audit and Error
handling process. Rejection and statistical information is fed back to the source system for further
verification
9. Information is then populated into the respective tables in the Event Subject Area.
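A minimal sketch of the local-to-regional code conversion mentioned in step 5; the mapping table is purely illustrative, as the real mappings are maintained and verified by the local site.

# Minimal sketch of local-to-regional code conversion (step 5). The mapping
# table below is purely illustrative.

LOCAL_TO_REGIONAL = {
    ("HUB", "TXN_TYPE", "01"): "CASH_DEPOSIT",
    ("HUB", "TXN_TYPE", "02"): "CASH_WITHDRAWAL",
}

def to_regional_code(source_system: str, code_type: str, local_code: str) -> str:
    """Return the regional code, or the local code unchanged when no mapping
    exists (such records would later fail the RI check and be rejected)."""
    return LOCAL_TO_REGIONAL.get((source_system, code_type, local_code),
                                 local_code)

print(to_regional_code("HUB", "TXN_TYPE", "01"))   # CASH_DEPOSIT
print(to_regional_code("HUB", "TXN_TYPE", "99"))   # 99 (unmapped, left as-is)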

Implementing OCBB for the first time


If the site is implementing OCBB for the first time, the following process is carried out:
After the environment setup, conversion jobs are run to migrate existing information, if any, from the
existing warehouse (for example, HEW3.5) to the BI warehouse

An initial run is executed, during which a full image from the source systems is fed into the interface
file. During the regular daily (or near real-time) runs, incremental changes are fed into the
interface file(s). Information from the interface file(s) is then fed into the ETL process for target BI
warehouse population.

Involved Party Subject Area

Overview
According to IFW Banking Data Warehouse Model (BDWM), IP is defined as "any individual,
group of individuals, organization, organization unit, or employment position about which the
financial institution wishes to keep information". This entity type provides the fundamental data
object required for references in maintaining relationships with accounts (Arrangement/Involved
Party Relationship), alternative identification schemes (Involved Party Alternative Identifier), and
various inter-relationships (Involved Party/Involved Party Relationship). IP is also the super-type for
a number of entity-types, including Individual and Organisation.
The IP Subject Area contains data related to the customer (retail, commercial, potential), employee,
employee position, organisation unit, segment and relationship between the IPs.
This subject area holds Involved Party, together with its attributive, associative, classification and
immediate subtype entities. Accordingly, following are the important entities in Involved Party
Subject Area:
Involved party (super type of the subject area)
Customer
Customer contact preferences
Customer assets and liabilities
Organization
Individual
Individual dependant and occupation
Employee
Household
Organization unit
IP to IP relationship
IP credit risk profile
Segment

The aforementioned entities and other entities in IP Subject Area are related through a unique
Surrogate Key called IP ID. These Surrogate Keys are generated based on the unique Involved Party
identifier key using the DB2 function.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in IP Subject Area are fed from IP interface files.
The interface file consolidates data from various source systems and acts as a standard input,
facilitating a unified Extract, Transform and Load (ETL) process into BI warehouse database.
Data into this standard interface file is fed in from any source system (HUB or non-HUB).

Location Subject Area

Overview
According to IFW Banking Data Warehouse Model (BDWM), Location is a place where something
can be found, a destination of information or a bounded area, such as a country or a state. Types of
location include postal addresses, phone numbers, email addresses and geographic areas. Usually these
locations are used for the purpose of locating Involved Parties.
The Location Subject Area contains data related to postal address, city-state-country information (as
code tables), geographic area, email address, telephone address and relationships between Location
and other entities like IP.
This subject area holds Location, together with its attributive, associative, classification and
immediate subtype entities. Accordingly, following are the important entities in Location Subject
Area:
Country and state (as code lookup tables)
Country to currency relationship (as code lookup table)
Electronic mail address
Postal address
Postal address line
Telephone address
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in Location Subject Area are fed from Location
interface files.
The interface file consolidates data from various source systems and acts as a standard input,
facilitating a unified Extract, Transform and Load (ETL) process into BI warehouse database. Data
into this standard interface file is fed in from any source system (HUB or non-HUB).

Condition Subject Area

Overview
According to IFW Banking Data Warehouse Model (BDWM), Condition "describes the specific
requirements that pertain to how the business of a financial institution is conducted and includes
information such as prerequisite or qualification criteria and restrictions or limits associated with
these requirements. Conditions can apply to various aspects of a financial institution's operations,
such as the sale and servicing of products, the determination of eligibility to purchase a product, the
authority to perform business transactions, the assignment of specific general ledger accounts
appropriate for different business transactions, the required file retention periods for various types of
information and the selection criteria for a market segment." (IBM Unique ID: BDW12594)
Accordingly, Condition describes data concerning conditions related to financial products and other
aspects of the business. Examples include the following financial aspects:
Interest rates
Terms
Pricing structures.
This Subject Area holds Condition, together with its attributive, associative, classification and
immediate subtype entities.
Following is a list of the important entities in Condition Subject Area:
Condition
Price Control
Credit Plan
Arrangement Condition Relationship
Condition Condition Relationship
These and other entities in Condition subject area are related through a unique surrogate key called
Condition ID.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in Condition Subject Area are fed from Condition
interface file.
The interface file consolidates data from various source systems and acts as standard input facilitating
a unified Extract, Transform, Load (ETL) process into BI warehouse database.
Data into this standard interface file can be fed from any source system (HUB or non-HUB).

Arrangement Subject Area

Overview
According to IFW Banking Data Warehouse Model (BDWM), Arrangement represents "an
agreement, either potential or actual, involving two or more Involved Parties, that provides and
affirms the rules and obligations associated with the sale, exchange or provision of goods and
services".
For example,
Arrangement number 123: a specific Certificate of Deposit agreement between the Financial
Institution and John Doe.
While the logical model for Arrangement contains many levels of sub-typing, the default physical
model reduces this to the following two types:
An Arrangement super-type
A set of specific Arrangement sub-types.
Every instance of an Arrangement subtype is represented by an instance of an Arrangement super-
type.
In this implementation, Arrangements are broadly equivalent to Accounts or other product instances
such as insurance policies. Four separate sub-types of Arrangement are used to represent different
variations from the BDW.
This Subject Area holds Arrangement, together with its attributive and associative classifications and
immediate sub-type entities.
Accordingly, following are the main entities in Arrangement Subject Area:
Arrangement (super-type of the subject area)
Account facility arrangement
Automated payment arrangement
Access facility arrangement
Service arrangement
Card access arrangement
Standing instructions arrangement
Product arrangement
Account arrangement (account arrangement is a sub-type of product arrangement)
Deposit arrangement
Loan arrangement
Automobile loan arrangement
Credit facility arrangement
Line of credit arrangement
Credit card arrangement
Insurance arrangement
Life insurance arrangement
Non-life insurance arrangement
Investment arrangement
Instrument holding
Financial market instrument
Treasury trading arrangement
Non account arrangement
Non account service arrangement
Merchant card machine arrangement
Security arrangement
Guarantee arrangement
Relationships
Arrangement to IP relationship
Arrangement to Arrangement relationship
These and other entities in the Arrangement Subject Area are related through a unique Surrogate Key
called Arrangement ID. This Surrogate Key is generated based upon the Unique Arrangement
Identifier Key using a DB2 function.
In adherence to the standardized approach of feeding data to the BI warehouse database, interface
files are used. Accordingly, data to those entities in the Arrangement Subject Area are fed using the
Arrangement interface files.
The interface file consolidates data from various source systems and acts as a standard input
facilitating a unified Extract, Transform, Load (ETL) process into BI warehouse database.
Data into this standard interface file is fed in from any source systems (Group standard solution
system like HUB or non-Group standard solution systems).
Process flow
The following figure illustrates the process flow diagram for Application Subject Area.

Process flow diagram for Application Subject Area

The following points can be referred to for a better understanding of the aforementioned process flow
diagram for the Application Subject Area. Numbers in the diagram correspond to the numbered points
described in the following list:
1. Credit and Risk source data and other necessary information are fed from the CARM system
2. The CARM Generic Interface XML messages are extracted from the Messaging Queue (MQ)
and placed in a staging area in the OCBB system
3. These messages are then accessed directly from DataStage, which extracts the Application
data from the XML messages. It performs the following transformations before feeding
them to the DB2 tables:
a. Only applications with an approved or declined application status code are considered for
the source extraction
b. Adjust the decimal format for the amount fields
c. Convert numeric dates into ISO DATE format
d. Cleanse data by setting the correct default values
e. Convert the CARM customer number (13 bytes) extracted from CARM to the standard HUB
customer number (15 bytes)
f. Look up CUST_START_DT
g. Apply business rules to calculate the period's amount and PCT for the customer
4. The following Application interface DB2 tables are fed with the data extracted from the CARM
system:
a. APP_INTERFACE_HEW
b. APP_CUST_CRED_FCL_DET_INTERFACE_HEW
c. APP_CUST_SEC_DET_INTERFACE_HEW
d. APP_CUST_SEC_SEC_REL_INTERFACE_HEW
5. These interface tables are the source of the Application interface-to-target ETL process
6. The SK generation process generates a new Surrogate Key for the new records; this must be
completed before the process can continue. All Application records from the CARM system
are inserted into the HEW_APP_ARR table with the new SK. An old Application with the same
REL_NUM is ended by setting its APP_END_DT to the APP_APPROVE_DT of the new
Application (a sketch of this follows the list)
7. Processes 7 to 9 look up the SK, perform the transformation and output data to the related
Application customer tables.
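A minimal sketch of step 6, in which a new Application row is inserted with its new SK and any open Application with the same REL_NUM is ended; the in-memory list below stands in for the HEW_APP_ARR DB2 table.

# Sketch of step 6: insert the new Application with a freshly generated SK and
# close any open Application that shares the same REL_NUM by setting its
# APP_END_DT to the new Application's APP_APPROVE_DT.
from datetime import date

def insert_application(hew_app_arr: list, new_sk: int, rel_num: str,
                       approve_dt: date) -> None:
    for row in hew_app_arr:
        if row["REL_NUM"] == rel_num and row["APP_END_DT"] is None:
            row["APP_END_DT"] = approve_dt        # end the old application
    hew_app_arr.append({"SK": new_sk, "REL_NUM": rel_num,
                        "APP_APPROVE_DT": approve_dt, "APP_END_DT": None})

table = []
insert_application(table, 1, "REL001", date(2008, 1, 15))
insert_application(table, 2, "REL001", date(2008, 6, 30))
print(table[0]["APP_END_DT"])   # 2008-06-30: the first application was ended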
Source extraction
This section provides information about the source system, its data validation process and key fields.

Source system
Application data is sourced from the Group Standard Solution Systems (for example, CARM). While
sourcing their data feed, the source systems should ensure the following:
Provide data in mandatory fields
Provide a unique Application identifier from their source system.

Data validation process


The OCBB ETL design has a built-in process to validate data accuracy; any record whose Mandatory
data fails the checking process is rejected, while Optional data fields are reset to their default values.
To avoid rejections, a cleanup and data enrichment process needs to be applied by the local site.
In all the standard Application interfaces, fields are defined as either Mandatory or Optional.
For mandatory fields, source data is required and the value must be valid; key fields as well as
Date/Time fields must be correctly formatted.
If a data item is deemed optional, it can be left NULL, and the validation process is skipped in that
case.
The validation rules applied to the checking process are as explained in the following sections.

Key Fields
Application Subject Area has three Surrogate Keys. All the source keys used for determining the
Surrogate Key are mandatory and are required to be valid.
Application Identifier (ARR_ID_APP) is determined by the Source System Code
(APP_SRCE_SYS_CDE) and Application Number (APP_NUM), which is concatenated from
REL_NUM and APP_SER_NUM. The Source System Code needs to be defined in the Regional code
table HEW_SRCE_SYS_CDE.
For example, if the Source System Code is CARM, then CARM needs to be defined as a valid
code in the Source System Code table.
The key fields APP_NUM, CUST_SRCE_KEY and APP_SRCE_SYS_CDE cannot be NULL or
empty. Invalid rows are sent for reject handling.
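A minimal sketch of the Application Identifier derivation and the key-field checks; the exact concatenation format of REL_NUM and APP_SER_NUM into APP_NUM is an assumption for illustration.

# Sketch of the Application Identifier derivation and the key-field checks.
# The concatenation/padding of REL_NUM and APP_SER_NUM is illustrative.

def application_number(rel_num: str, app_ser_num: str) -> str:
    return f"{rel_num}{app_ser_num}"          # APP_NUM = REL_NUM + APP_SER_NUM

def key_fields_valid(app_num, cust_srce_key, app_srce_sys_cde,
                     regional_codes) -> bool:
    """APP_NUM, CUST_SRCE_KEY and APP_SRCE_SYS_CDE cannot be NULL or empty,
    and the source system code must exist in HEW_SRCE_SYS_CDE."""
    if not app_num or not cust_srce_key or not app_srce_sys_cde:
        return False
    return app_srce_sys_cde in regional_codes

app_num = application_number("REL001", "001")
print(key_fields_valid(app_num, "CUST42", "CARM", {"CARM"}))   # True
print(key_fields_valid(app_num, "",       "CARM", {"CARM"}))   # False -> reject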
Process flow
The following figure illustrates the process flow diagram for Product Subject Area.

System Flow diagram for Product

(Diagram: Group Standard Solution and non-Group Standard Solution source systems feed Product
and related interface files, via the CDC process, into the BI stage on pSeries, where they are
consolidated and local codes are converted to regional codes; the ETL processes then load the
Product and Product-related tables in the BI warehouse, supported by an Xref/lookup file for
surrogate keys, with rejects routed to Audit and Error handling.)
Process flow diagram for Product Subject Area

You can refer to the following points for a better understanding of the aforementioned process flow
diagram for the Product Subject Area. Numbers in the diagram correspond to the numbered points
described in the following list:
1. Source extraction for the relevant master files from Group Standard Solution source systems
(for example, WHIRL, BANK CORE SYSTEM and so on) is converted to the standard
Product interface structure
2. Through the CDC process, change information from Group Standard Solution source systems
is captured and propagated to the pSeries
3. For non-Group Standard Solution source systems, the local IT should use the ETL tool to
extract their source data into a DB table that has the standard Product interface layout

Note
The local IT may use any other programming solution as an
alternative to ETL tools to generate the source data into a
sequential file with the standard Product interface layout. However,
this depends upon the regional requirements.

4. Information coming from Group Standard (Core Source) Systems is populated into a DB2
table in the BI stage. Similar information coming from the non-standard source systems is fed
into a DB2 table or a sequential text file in the BI stage
5. These interface files, from the Group Standard Solution and non-Group Standard
Solution source systems, are consolidated, and appropriate data conversion is applied to
convert local codes and other information to their regional counterparts
6. Consolidated and regionalised information is then fed into the Product interface dataset
7. The ETL process, triggered by the scheduler:
a. Extracts information from the interface file
b. Performs RI and data integrity checks
c. Generates Surrogate Keys
d. Performs data type changes, if required
e. Performs data cleansing, if required
f. Populates (loads) data into target tables
8. Data records that fail the RI and data integrity checks are then fed into the Audit and
Error handling process, where rejection and statistical information is fed back to the source
system for further verification
9. Corresponding entities and attributes are populated in the BI warehouse.
When the site is implementing the software (BI R2) for the first time, the following process is
carried out:
After the environment set-up, conversion jobs are run to migrate existing information, if any
(for example, from HEW3.5), to the BI warehouse.
An initial run is completed, during which a full image from the source systems is fed into the
interface file.
During the normal daily (or near real-time) runs, incremental changes are fed into the
interface file.
Information from the interface file is then fed into the ETL process for target BI warehouse
population.
Source extraction
This section provides information about the source system, its data validation process and key fields.

Source system
Product data is sourced from Group Standard Solution Systems (for example, BANK CORE
SYSTEM, WHIRL) and/or non-Group Standard Solution Systems. While sourcing their data feed, the
source systems should:
Provide data in mandatory fields
Provide a unique product identifier from their source system
Verify the local-to-regional conversion mapping.

Data validation process


The RBC ETL design has a built-in process to validate data accuracy; any record whose Mandatory
data fails the checking process is rejected, while Optional data fields are reset to their default values.
To avoid rejections, a cleanup and data enrichment process needs to be applied by the local site.
In all the standard Product interfaces, fields are defined as either Mandatory or Optional.
For mandatory fields, source data is required and the value must be valid; that is, RI has to be
satisfied, and key fields as well as Date/Time fields must be correctly formatted.
If a data item is deemed optional, it can be left NULL, and the validation process is skipped.
The validation rules applied to the checking process are as explained in the following sections.

Key Fields
The Product Subject Area has three surrogate keys. All the source keys used for determining the
surrogate key are mandatory and are required to be valid.
Product Identifier (PROD_ID) is determined by Source System Code (SRCE_SYS_CDE) and Source
Product ID (PROD_SRCE_SYS_ID). The Source System Code needs to be defined in the Regional
code table - HEW_SRCE_SYS_CDE.
Rate Cost of Funds Value of Funds Identifier is determined by Source System Code
(SRCE_SYS_CDE) and Source Rate Cost of Funds Value of Funds Identifier
(RATE_COF_VOF_ID). The Source System Code needs to be defined in the Regional code table -
HEW_SRCE_SYS_CDE.
Resource item identifier is determined by source system code (SRCE_SYS_CDE) and unique
identifier for the resource item from source. The Source System Code needs to be defined in the
Regional code table - HEW_SRCE_SYS_CDE.
For example, if the source system code is BANK CORE SYSTEM, then BANK CORE SYSTEM
needs to be defined as a valid code in the source system code table.
Date/Time/Timestamp field
RBC supports Date/Time/Timestamp values only in ISO format (for example, YYYY-MM-DD
HH:MM:SS.xxxxxx). All source fields defined as Date/Time/Timestamp are required to be
converted to this format.
The value of the Date/Time also needs to be valid. In the following dates the format is correct, but the
values are not valid: February 2000 has no day 30 (although, being a leap year, it does have day 29),
and February 1999 has no day 29. Therefore, 2000-02-30 and 1999-02-29 are both considered invalid
and will be rejected in the ETL process.
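A minimal sketch of this date validation, which rejects values that are well-formed but not real calendar dates; the helper name is illustrative.

# Minimal sketch of ISO date validation: a value must both match the ISO
# format and be a real calendar date.
from datetime import datetime

def is_valid_iso_date(value: str) -> bool:
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False          # wrong format or an impossible calendar date

print(is_valid_iso_date("2000-02-29"))   # True  (leap year)
print(is_valid_iso_date("2000-02-30"))   # False (rejected by the ETL process)
print(is_valid_iso_date("1999-02-29"))   # False (1999 is not a leap year)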
