Objectives
Having completed this module you will be able:
This information is correct at the time of writing but may change over time. Always consult
the IBM website for up-to-date information about supported operating systems. There are also
operating system version dependencies; for example, for Information Server version 8.0 only versions
9 and 10 of the Solaris operating system are supported. At the time of writing the website to
consult is http://www.ibm.com/software/data/infosphere/info-server/overview/requirements2.html
metadata delivery
connectivity
reporting
The services themselves are invisible to the products that use them. They
are delivered through a WebSphere Application Server (WAS) over what
is called the "Application Server Backbone" (ASB). If the product server,
such as a DataStage server, is on a different machine from the WAS,
then an ASB agent process must be started to deliver services to
the particular product. This, too, is invisible to the user of the product.
For example, to connect a DataStage client to a DataStage server, the user
supplies the name (or IP address) of the machine hosting the Information
Server, plus a user ID and password. The login/security service relays
these to the Information Server, which authenticates the user and then, if
the user is suitably authorized, sends back to the login screen (again via
the login/security service) a list of available DataStage servers and
possibly, depending on which client is being used, the projects associated
with each.
While this may seem quite complex, it actually simplifies administration
because, for example, there is a single place where security information
needs to be maintained, namely the Information Server. Figure 1-1 shows
the relationship between Information Server, its services and products.
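The login exchange described above can be pictured with a small toy model. This is only an illustrative sketch in Python, not the Information Server API: the class name, method name, user ID, server name and project names are all hypothetical, and the real exchange passes through the login/security service rather than a direct method call.

```python
from dataclasses import dataclass, field

# Toy model only: every name below is hypothetical and does not
# correspond to any real Information Server interface.


@dataclass
class InformationServer:
    # user ID -> password (plain text purely for illustration)
    credentials: dict
    # user ID -> {DataStage server name: [project names]}
    authority: dict = field(default_factory=dict)

    def login(self, user_id, password, include_projects=False):
        """Authenticate the user, then return what that user may see."""
        if self.credentials.get(user_id) != password:
            raise PermissionError("authentication failed")
        servers = self.authority.get(user_id, {})
        if include_projects:
            # Some clients also receive the projects on each server.
            return {name: list(projects) for name, projects in servers.items()}
        # Other clients receive only the list of available servers.
        return sorted(servers)


info_server = InformationServer(
    credentials={"dsdev": "secret"},
    authority={"dsdev": {"ds-engine-1": ["Sales", "Finance"]}},
)
print(info_server.login("dsdev", "secret"))
print(info_server.login("dsdev", "secret", include_projects=True))
```

The point of the sketch is the single place of maintenance: credentials and authority live in one object, and every client asks that object what it is allowed to see.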
Figure 1-1 Relationship between Information Server and its Product Suite
Product
Description
Business Glossary
Connectivity Software
DataStage
FastTrack
Federation Server
Information Analyzer
Information Services Director
Metadata Workbench
QualityStage
What Is DataStage?
We begin with a few words from IBM marketing.
2 Business Glossary Anywhere was introduced in version 8.1. This allows any term
selected in any Windows application to be looked up in the business glossary.
DataStage is, first and foremost, an "ETL" tool. In this context the
acronym stands for "extraction, transformation and load".
"Programming", or developing, with DataStage involves drawing a picture
of what you want to happen during the ETL processing, compiling that
into something that can be executed (a "job"), and then requesting its
execution and monitoring the results. Once the job has been properly
tested it can then be put into production.
Figure 1-2 shows the design of a simple parallel job. With knowledge of
what the various icons represent you could perform a walkthrough of
this code to explain it to someone else. For example:
Distribution data are received from the mainframe. Detail and header
records are extracted from these data separately and then joined so that
each detail row includes its header information.
One copy of these data is stored for further processing. The other copy of
these data is summarized and the summary is stored for further
processing.
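For readers who think more readily in code than in pictures, the walkthrough above can be approximated as follows. This is a loose analogy only: DataStage jobs are drawn and compiled, not written in Python, and the record layout, field names and sample values here are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the mainframe feed.
records = [
    {"type": "H", "order_id": 1, "region": "North"},
    {"type": "D", "order_id": 1, "item": "widget", "qty": 5},
    {"type": "D", "order_id": 1, "item": "gadget", "qty": 2},
    {"type": "H", "order_id": 2, "region": "South"},
    {"type": "D", "order_id": 2, "item": "widget", "qty": 4},
]


def split_records(rows):
    """Retrieve header and detail records separately."""
    headers = {r["order_id"]: r for r in rows if r["type"] == "H"}
    details = [r for r in rows if r["type"] == "D"]
    return headers, details


def join_detail_to_header(headers, details):
    """Attach header information to each detail row."""
    joined = []
    for d in details:
        h = headers[d["order_id"]]
        joined.append({"order_id": d["order_id"], "region": h["region"],
                       "item": d["item"], "qty": d["qty"]})
    return joined


def summarize(rows):
    """Total the quantity per region."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["region"]] += r["qty"]
    return dict(totals)


headers, details = split_records(records)
joined = join_detail_to_header(headers, details)
stored_detail = list(joined)        # one copy stored as is
stored_summary = summarize(joined)  # the other copy summarized, then stored
print(stored_summary)
```

Each function plays the part of one or more stages in the design, and the values passed between functions play the part of the links.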
DataStage Editions
There are three different editions of DataStage, indicating what job types
are available.
DataStage Clients
Installing the DataStage client software on a Windows PC installs three
DataStage client tools. These client tools are as follows (the illustrated
shortcut icons may also be placed on the desktop during installation).
Administrator client is used by an administrator to create, delete and
configure DataStage projects and may be used by authorized DataStage
developers to edit some of the properties of a project.
Designer client is used to draw the pictures of DataStage job designs (for
example Figure 1-2), to compile job designs and to manage metadata
associated with those designs.
Director client may be used to request execution of compiled DataStage
jobs, to
DataStage Terminology
DataStage is programmed by drawing a picture of the desired ETL
process. This is then compiled into an executable object called a job.
Each job is made up of stages, which perform specific tasks, and links
between them, which represent the flow of data between stages.
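The terminology above describes a directed graph: stages are the nodes and links are the edges along which data flows. A minimal sketch of that idea follows, assuming nothing about how DataStage actually stores a job design; the class, the job name and the stage names are hypothetical.

```python
from dataclasses import dataclass, field

# Illustrative data structure only; not DataStage's internal representation.


@dataclass
class Job:
    name: str
    stages: set = field(default_factory=set)
    links: list = field(default_factory=list)  # (source stage, target stage)

    def add_link(self, source, target):
        """Record a link and make sure both of its stages are known."""
        self.stages.update((source, target))
        self.links.append((source, target))

    def downstream(self, stage):
        """Return the stages that receive data directly from this stage."""
        return [target for source, target in self.links if source == stage]


job = Job("LoadDistribution")
job.add_link("Mainframe", "Join")
job.add_link("Join", "Copy")
job.add_link("Copy", "DetailStore")
job.add_link("Copy", "Aggregate")
job.add_link("Aggregate", "SummaryStore")
print(job.downstream("Copy"))
```

Following the links from stage to stage in this way is the same walkthrough one performs by eye on a job design.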
Review
There are three servers that must be available in order for DataStage to
execute jobs or job sequences. These servers are:
Further Reading
Introduction to IBM Information Server (in documentation set),
particularly Chapters 2 and 7.
IBM Information Server information center
(http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r0/index.jsp)
Parallel Job Developer Guide (in documentation set), Chapter 1.
The documentation set is found under
Start > All Programs > IBM Information Server > Documentation