
SAP BO Data Services

27/08/2013

TransCit DS Dev - Off


IS-Life Sciences1-Parent
Deepika Rai (202931)
SAP Business Objects Data Services (BODS)
Deepika.rai@tcs.com

Confidentiality Statement
Confidentiality and Non-Disclosure Notice
The information contained in this document is confidential and proprietary to TATA
Consultancy Services. This information may not be disclosed, duplicated or used for any
other purposes. The information contained in this document may not be released in
whole or in part outside TCS for any purpose without the express written permission of
TATA Consultancy Services.

Tata Code of Conduct


We, in our dealings, are self-regulated by a Code of Conduct as enshrined in the Tata
Code of Conduct. We request your support in helping us adhere to the Code in letter and
spirit. We request that any violation or potential violation of the Code by any person be
promptly brought to the notice of the Local Ethics Counsellor or the Principal Ethics
Counsellor or the CEO of TCS. All communication received in this regard will be treated
and kept as confidential.

Abstract
SAP Business Objects Data Services (BODS) is a software tool originally developed by Business Objects (a company
acquired by SAP in 2007). The tool pulls data from any system/database/table, applies changes to transform the data,
and loads the data into any other system/database. This process is known as Extraction, Transformation and
Loading (ETL).
This training material provides guidance for beginners to understand the BODS architecture, components, objects,
transformations, mappings, job execution, scheduling, and the monitoring of logs and errors.

Table of Contents
1. BO Data Services Introduction
2. BO Data Services Architecture
   2.1 BODS Architecture Components
3. Starting with DS4 Designer
   3.1 Login to DS4 Designer
   3.2 BODS Objects
   3.3 Defining Source and Target Metadata
   3.4 Defining File Format
   3.5 Create First Batch Job
4. Scheduling and Monitoring the Job through Admin Console
   4.1 Login to Admin Console of Data Services
   4.2 Monitoring Job Log
   4.3 Scheduling of Job
   4.4 Manual Execution of Job
5. BODS Benefits

1. BO Data Services Introduction


An ETL tool: Extract, Transform and Load
Business Objects Data Services is a GUI workspace that allows you to create jobs that extract
data from heterogeneous sources, transform that data using built-in transforms and
functions to meet business requirements, and then load the data into a single datastore or
data warehouse for further analysis.
Data Services is an all-in-one solution for Data Integration, Data Migration, Data
Warehousing and Data Quality.
It provides one development UI, metadata repository, data connectivity layer, run-time
environment and management console for the development, execution, scheduling and
monitoring of jobs.

Figure 1: ETL

2. BO Data Services Architecture


Below is the architecture diagram of Data Services.

Figure 2: BODS Architecture

2.1 BODS Architecture Components


Designer - The Designer is the graphical user interface that lets you create, test, execute and debug BODS jobs. A job
consists of data mappings, transformations and control logic.

Local Repository - The repository is like a database that stores predefined system objects and user-defined objects,
including source/target metadata and transformation rules. A local repository is mandatory for BODS to function.

Central Repository - A central repository is an optional component that can be used to support multi-user
development. The central repository provides a shared object library allowing developers to check objects in and
out of their local repositories.
Job Server and Engine - The Job Server starts the data movement engine processes that perform data extraction,
transformation and movement.

Access Server - The Access Server facilitates real-time job execution by passing messages between web applications
and the Data Services Job Server and engines.

Administrator - A web-based application for scheduling, monitoring and executing jobs, configuring/starting/stopping
real-time services, configuring Job Server/Access Server and repository usage, and managing users.

3. Starting with DS4 Designer


3.1 Login to DS4 Designer
Below is the login screen for BODS version 4. Enter the credentials and press the Log On button. It will
show the list of repositories to which that user ID has access. Then select the repository from the
list and press OK to log in to the Designer.

Figure 3: Data Services Login Screen

Below is the DS4 designer screen.

The object library tabs show the Project, Job, Workflow, Dataflow, Transform, Datastore and File Format objects.

Figure 4: Data Services Designer Screen

3.2 BODS Objects

Project - A Project is the highest-level object in the Designer. A Project is a single-use object that allows us to group
and organize jobs in the Designer. Only one project can be open and visible in the Project Area at a time.

Jobs - A job is composed of work flows and/or data flows. A job is the smallest unit of work that can be scheduled
independently for execution. Jobs must be associated with a project to display logs in the Admin Console; a job
will not appear in the job list of its repository in the Admin Console if it is not associated with a project.

Work Flows - A work flow is a collection of several data flows arranged in a sequence. A work flow orders data flows and the
operations that support them, and it defines the interdependencies between data flows. Work flows can be used
to define strategies for error handling or to define conditions for running the data flows. A work flow is optional.

Data Flow - A data flow is the process by which source data is transformed into target data. It describes how to process a
task.

Transforms - Transforms are the built-in transformation objects available in DS for transforming source data as per business
rules. The following is a list of the available transforms. The transforms that you can use depend on the software
package that you have purchased; if a transform belongs to a package that you have not purchased, it is greyed
out and cannot be used in a job.
Transform Category: Data Integrator
Data_Transfer - Allows a data flow to split its processing into two sub data flows and push down resource-consuming operations to the database server.
Date_Generation - Generates a column filled with date values based on the start and end dates and increment that you provide.
Effective_Date - Generates an additional "effective to" column based on the primary key's "effective date".
Hierarchy_Flattening - Flattens hierarchical data into relational tables so that it can participate in a star schema. Hierarchy flattening can be both vertical and horizontal.
History_Preserving - Converts rows flagged as UPDATE to UPDATE plus INSERT, so that the original values are preserved in the target. You specify in which column to look for updated data.
Key_Generation - Generates new keys for source data, starting from a value based on existing keys in the table you specify.
Map_CDC_Operation - Sorts input data, maps output data, and resolves before- and after-images for UPDATE rows.
Pivot (Columns to Rows) - Rotates the values in specified columns to rows.
Reverse Pivot (Rows to Columns) - Rotates the values in specified rows to columns.
Table_Comparison - Compares two data sets and produces the difference between them as a data set with rows flagged as INSERT and UPDATE.
XML_Pipeline - Processes large XML inputs in small batches.

Transform Category: Data Quality
Associate - Combines the results of two or more Match transforms or two or more Associate transforms, or any combination of the two, to find matches across match sets.
Country ID - Parses input data and then identifies the country of destination for each record.
Data Cleanse - Parses and manipulates various forms of international data, as well as operational and product data.
Global Address Cleanse - Identifies, parses, validates, and corrects global address data, such as primary number, primary name, primary type, directional, secondary identifier, and secondary number.
Global Suggestion List - Completes and populates addresses with minimal data, and can offer suggestions for possible matches.
Match - Identifies matching records based on your business rules. Also performs candidate selection, unique ID, best record, and other operations.
USA Regulatory Address Cleanse - Identifies, parses, validates, and corrects USA address data according to the U.S. Coding Accuracy Support System (CASS).
User-Defined - Does just about anything that you can write Python code to do. You can use the User-Defined transform to create new records and data sets, or populate a field with a specific value, to name a few possibilities.

Transform Category: Platform
Case - Simplifies branch logic in data flows by consolidating case or decision-making logic in one transform. Paths are defined in an expression table.
Map_Operation - Allows conversions between operation codes.
Merge - Unifies rows from two or more sources into a single target.
Query - Retrieves a data set that satisfies conditions that you specify. A Query transform is similar to a SQL SELECT statement.

Script - A Script is a single-use object that is used to call functions and assign values in a work flow. The DI scripting
language is used to apply decision-making and branch logic to work flows.
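For illustration, a script placed at the start of a work flow could set a global variable and write messages to the trace log. The lines below are only a rough sketch of the DI scripting syntax; they assume that $G_LOAD_DATE (datetime) and $G_ROW_COUNT (int) have been declared as global variables for the job, and they reuse the target table ZDD07T from the example in section 3.5 (DS_ORA is a hypothetical datastore name).

# Set a load date that a Query transform can later map into an audit column.
$G_LOAD_DATE = sysdate();
print('Load started at [$G_LOAD_DATE]');

# Simple branch logic: warn in the trace log if the target already holds rows.
$G_ROW_COUNT = sql('DS_ORA', 'SELECT COUNT(*) FROM ZDD07T');
if ($G_ROW_COUNT > 0)
begin
    print('Target ZDD07T already contains [$G_ROW_COUNT] rows.');
end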

3.3 Defining Source and Target Metadata


Datastores represent connections between Data Services and relational databases or application databases.
Through the datastore connection, Data Services can import the metadata from the data source.
DS uses these datastores to read data from source tables or load data to target tables.
Click on the Datastore tab, right-click in the window pane and click New. The screen below appears. Provide the
datastore information.

Figure 5: Data Store Creation Screen

Once the database type is selected, the screen below appears. Provide the credentials.

Click on OK to create the datastore. Create a datastore for both the source and the target database.
To import a table, right-click on the datastore name -> Import by Name.

Enter the table name and click Import.

3.4 Defining File Format


File formats are connections to flat files in the same way that datastores are connections to
databases. The Local Object Library stores file format templates that are used to define specific
file formats as sources and targets in data flows.
There are three types of file format objects: Delimited format, Fixed Width format and SAP
R/3 format (the pre-defined Transport_Format).
The file format editor is used to set the properties of source/target files. The editor has three
working areas: Property Value, Column Attributes and Data Preview.

Figure 6: File Format Editor Screen

3.5 Create First Batch Job


In BODS, both batch jobs and real-time jobs can be created. Batch jobs are those that run in
batches at a predefined time and at a predefined frequency. A batch job in BODS basically
contains one or more data flows or work flows; a work flow can contain one or more data flows.
A data flow is a single logical unit where the whole logic to transport data from one schema to
another is specified. A data flow, being a logical unit, cannot execute on its own; it must be
encapsulated inside a batch job in order to execute. Data flows can also be grouped under
one or more work flows, and those work flows can, in turn, be executed through the batch job.
Below are the steps for a simple mapping that extracts data from an SAP table and loads it into a target Oracle database.
Step 1 - First, import the source and target tables in Data Services. For example, the source table is DD07T and the target table is ZDD07T (the target
table should be created in the target database before importing).

Go to the Datastore tab -> expand the source datastore name -> right-click on Tables -> Import by Name.

Figure 7: Table Import

Provide the table name and click on Import. An instance of the table is now available in Data Services for pulling the data. Similarly, click on
the target datastore and import the target table.

Step 2 - Workflow creation

Click on the Workflow tab -> right-click in the workflow pane and click New to create a workflow. Give the workflow a name, e.g. C_DD07T.
Double-click on the workflow name. The screen below appears.

Figure 8: Workflow Creation

Step 3 - Dataflow creation

Drag the Dataflow icon from the tool palette on the right into the workflow.

Figure 9: Dataflow Creation

Step 4 - ABAP Dataflow creation

Double-click on the data flow. Select ABAP Data flow. The screen below appears.

Figure 10: R3/ABAP Flow Creation

Click on the Datastore tab and drag in the source table DD07T.

Drag the Query transform from the tool palette. Connect the source table with the Query transform.

Double-click on the Query transform. Drag the fields from Schema In to Schema Out.

Change the names of the columns in Schema Out according to the descriptions of the columns in the source table, if required. Here, the column names
should be the same as the columns in the target table.

Finally, the columns in Schema Out look as shown below:

Drag the Data Transport transform and join it with the Query transform.

Figure 11: R3/ABAP Flow Created

Double-click on the Data Transport transform and provide the name of the .dat file.

Now go back to the data flow and add a Query transform to add audit information such as LOAD_DATE. Double-click on the Query transform and map all
columns from Schema In to Schema Out.
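The mapping for an audit column such as LOAD_DATE is just an expression entered in the Query transform's Mapping field. As a small illustration (either expression would work; $G_LOAD_DATE assumes a global variable set by a script earlier in the job):

# Possible mapping expression for the LOAD_DATE output column:
sysdate()
# or, if a script earlier in the job has set a global load date:
$G_LOAD_DATE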

Go to the Datastores tab, click on the target database datastore and add the target table to the data flow.

Step 5 - Dataflow Validation

To check for errors, click on Validate All.

Figure 12: Dataflow Created

Now, for execution, the workflow has to be added to a job. Go to the Job tab.

Step 6 - Add Workflow to a Job

Add the workflow to the job and check for errors.

Figure 13: Job Created

Go to the Project tab. Right-click in the Project Area and select New.

Step 7 - Add Job to a Project

Now go to the Job tab and add the job to the Project Area.

Step 8 - Job Execution

Now, for execution, double-click the job, click Execute and then OK.

Figure 14: Job Execution Screen

Below is the trace log. The log viewer shows Trace and Monitor tabs; the Monitor log shows the record count.
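As a rough cross-check of the record count shown in the Monitor log, a script step placed after the data flow could count the loaded rows directly. This is a sketch only; it assumes the target datastore is named DS_ORA and that $G_TARGET_COUNT is declared as an int global variable for the job.

# Count the rows loaded into the target table and write the result to the trace log.
$G_TARGET_COUNT = sql('DS_ORA', 'SELECT COUNT(*) FROM ZDD07T');
print('ZDD07T now contains [$G_TARGET_COUNT] rows.');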

4. Scheduling and Monitoring the Job through Admin Console

4.1 Login to Admin Console of Data Services

Figure 15: Admin Console Login Screen

Click on Administrator

Click on Status and select the repository where the job was created.

4.2 Monitoring Job Log

Click on Trace, Job Monitor Log or Job Error Log to view the corresponding log.

4.3 Scheduling of Job


Click on Batch Job Configuration

Select the job to be scheduled and click on Add Schedule.

Provide the schedule name, select the day of execution and select the time of execution.

Figure 16: Schedule Created

Fill in the scheduling details and press Apply.

Go to Repository Schedules, select the schedule and press Activate. The schedule will execute only when it is in the active state.

Figure 17: Schedule Activated

4.4 Manual Execution of Job


Click on Batch Job Configuration. The screen below appears:

Click on Execute.

Click on Execute again; the job will be executed.

5. BODS Benefits
Below are the benefits that Data Services provides:

A single infrastructure for data movement, enabling faster and lower-cost implementation.
Integration of data across many systems, with reuse of that data for many purposes.
Pre-packaged data solutions for fast deployment and quick ROI.
Customized and managed data access that uniquely combines industry-leading
technologies for delivering data to analytic, supply-chain management, customer
relationship management and Web applications.

Contact
For more information, contact gsl.cdsfiodg@tcs.com (Email Id of ISU)

About Tata Consultancy Services (TCS)


Tata Consultancy Services is an IT services, consulting and business
solutions organization that delivers real results to global businesses,
ensuring a level of certainty no other firm can match. TCS offers a
consulting-led, integrated portfolio of IT and IT-enabled infrastructure,
engineering and assurance services. This is delivered through its unique
Global Network Delivery Model™, recognized as the benchmark of
excellence in software development. A part of the Tata Group, India's
largest industrial conglomerate, TCS has a global footprint and is listed on
the National Stock Exchange and Bombay Stock Exchange in India.
For more information, visit us at www.tcs.com.

IT Services
Business Solutions
Consulting
All content / information present here is the exclusive property of Tata Consultancy Services Limited (TCS). The content /
information contained here is correct at the time of publishing. No material from here may be copied, modified, reproduced,
republished, uploaded, transmitted, posted or distributed in any form without prior written permission from TCS.
Unauthorized use of the content / information appearing here may violate copyright, trademark and other applicable laws,
and could result in criminal or civil penalties. Copyright 2011 Tata Consultancy Services Limited
