DB2 Warehouse
Version 9.5
SC18-9801-04
Note: Before using this information and the product it supports, read the information in Notices on page 121.
This edition applies to Version 9.5 of the DB2 Warehouse products and to all subsequent releases and modifications
until otherwise indicated in new editions.
© Copyright International Business Machines Corporation 2007. All rights reserved.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents

DB2 Warehouse Tutorial, Version 9.5 . . . 1
Introduction to the DB2 Warehouse Tutorial . . . 1
  Running the tutorial in a Windows client-server environment . . . 7
  Running the tutorial in a Linux client-server environment . . . 8
  Running the tutorial in a mixed client-server environment . . . 8
Optional: Introduction to the Design Studio . . . 9
  Lesson 1: Design Studio perspectives, views, editors, and projects . . . 10
  Lesson 2: Customizing the Design Studio . . . 12
Module 1: Designing the physical data model for your data warehouse . . . 13
  Lesson 1: Creating a data design project in the Design Studio . . . 14
  Lesson 2: Creating a physical data model based on the DWESAMP database . . . 14
  Lesson 3: Adding foreign key constraints to the tables in the MARTS schema . . . 15
  Lesson 4: Validating your physical data model . . . 16
  Lesson 5: Updating the DWESAMP database with changes from the data model . . . 17
Module 2: Designing applications to build a data warehouse . . . 20
  Optional: Start the tutorial here . . . 20
  Lesson 1: Setting up the warehouse building environment . . . 22
  Lesson 2: Designing a data flow that loads a warehouse table . . . 24
  Lesson 3: Modifying a data flow that loads a dimension table in a data mart . . . 33
Module 3: Deploying and running an application that loads a data mart . . . 35
  Lesson 1: Designing the control flow for the data mart . . . 36
  Lesson 2: Preparing a data warehouse application for deployment . . . 38
  Lesson 3: Deploying the application that loads the MARTS tables . . . 39
  Lesson 4: Running and monitoring a process in a data warehouse application . . . 42
Module 4: Designing OLAP metadata . . . 43
  Optional: Start the tutorial here . . . 44
  Lesson 1: Creating a complete cube model . . . 45
  Lesson 2: Adding a hierarchy to the Time dimension . . . 50
  Lesson 3: Creating a cube . . . 51
  Lesson 4: Deploying your OLAP metadata to the DWESAMP sample database . . . 53
  Lesson 5: Creating MQT recommendations using the Optimization Advisor wizard . . . 53
Notices . . . 121
  Trademarks . . . 123
Contacting IBM . . . 125
  Product Information . . . 125
  Accessible documentation . . . 125
  Comments on the documentation . . . 126
Note: Module 8: Combining text analysis and OLAP on page 95 uses a different
scenario to show how the IT department at JK Superstore uses unstructured text
analysis to study trends in the IT job market.
The JK Superstore data warehousing team is committed to consolidating the company's data into a DB2 database and building a new warehouse that provides a consistent data source for analysis and reporting. The technical team's data architect has designed the data warehouse in the DWH schema and a data mart for analysis in the MARTS schema.
The DWH schema contains the transactional data for the JK Superstore retail chain.
Figure 1 shows the physical data model of the DWH schema.
The following table describes the nine tables that are in the physical model:

Table 1. Description of the tables in the physical data model of the DWH schema

Physical table name   Description
ITM_TXN               Item transaction
MKT_BSKT_TXN          Market basket transaction
OU                    Organization unit
PD                    Product
PD_X_GRP              Products by group
GRP                   Group
IP                    Involved party
MSR_PRD               Measurement period
CL                    Customer
The MARTS schema contains the aggregated data for the JK Superstore retail chain
that is required for sales and pricing analysis. Figure 2 shows the physical data
model of the MARTS schema.
The following table describes the four tables that are in the physical model of the MARTS schema:

Table 2. Description of the tables in the physical data model of the MARTS schema

Physical table name   Description
PRCHS_PRFL_ANLYSIS    Purchase profile analysis
STORE                 Store
TIME                  Time
PRODUCT               Product
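As later lessons show, PRCHS_PRFL_ANLYSIS is the fact table and STORE, TIME, and PRODUCT are dimension tables, so analysis queries join them in a star shape. The following sketch is illustrative only: the key and measure column names (TIME_ID, STORE_ID, SALES_AMT, CALENDAR_YEAR, STORE_NAME) are assumptions, not columns documented in this tutorial.

-- Illustrative star join over the MARTS schema; column names are assumed
SELECT T.CALENDAR_YEAR, S.STORE_NAME, SUM(F.SALES_AMT) AS SALES_AMOUNT
FROM MARTS.PRCHS_PRFL_ANLYSIS F
JOIN MARTS.TIME T ON F.TIME_ID = T.TIME_ID
JOIN MARTS.STORE S ON F.STORE_ID = S.STORE_ID
GROUP BY T.CALENDAR_YEAR, S.STORE_NAME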
This tutorial shows you how to use the main DB2 Warehouse features to
implement an end-to-end business intelligence solution for JK Superstore.
Learning objectives
The tutorial has the following learning objectives:
v Design and update a physical data model for a warehouse database
v Design applications for building warehouse and mart tables by using SQL-based
data flows and control flows
Time required
The complete tutorial should take approximately 11 hours to finish.
However, you can work on specific modules individually rather than completing the entire tutorial from start to finish. Most of the modules take approximately 60 to 90 minutes each to complete, depending on your familiarity with database, warehousing, and business intelligence concepts and practices. If you are experienced in these areas, the following table shows the estimated amount of time that is required to complete each module.
Table 3. Time required to complete each module

Module                                                Time required
Optional: Introduction to the Design Studio          20 minutes
Module 1: Designing the physical data model          60 minutes
Module 2: Designing applications to build a data warehouse   120 minutes
Module 3: Deploying and running an application that loads a data mart   60 minutes
Module 4: Designing OLAP metadata                     100 minutes
Module 5                                              90 minutes
Module 6                                              75 minutes
Module 7                                              60 minutes
Module 8: Combining text analysis and OLAP            90 minutes
You can start this tutorial from the optional introduction module, Module 1,
Module 2, or Module 4. To start from Module 2 or Module 4, complete the Start
the tutorial here lesson at the beginning of those modules.
You can also skip all of the lessons in Module 6 by completing a shortcut lesson at
the beginning of the module.
To see the results of most of the tutorial lessons, you can open the completed sample projects in the Design Studio. From the File menu, select New > Example > Data Warehousing Examples and complete the wizard. You can also access this wizard directly from the Design Studio Welcome view.
Audience
This tutorial covers a wide range of data warehousing and BI features, including
SQL warehousing flows, OLAP metadata and summary tables, mining models,
Miningblox applications, Alphablox reports, and unstructured text analysis. Many
enterprises divide the design and administration tasks and the domain areas (SQL
warehousing, OLAP, mining, reporting) among multiple people. Some of the
lessons in this tutorial might not apply to you. However, each lesson should apply
to someone on your team.
System requirements
To complete this tutorial from end to end, you must install several server, client,
and documentation components of DB2 Warehouse on one or more systems.
DB2 Warehouse server components
v DB2 Enterprise Server Edition, Version 9.5
v Intelligent Miner
v WebSphere Application Server
v Cubing Services
v IBM Alphablox
v Administration Console
DB2 Warehouse client components
v IBM Data Server Client
v Design Studio
  – Intelligent Miner plug-ins
  – SQL Warehousing Tool plug-ins
  – Cubing Services plug-ins
  – IBM Alphablox Blox Builder plug-ins
v Intelligent Miner Visualization
Documentation
v DB2 Warehouse Samples and Tutorial
Prerequisites
The tutorial setup scripts, which create the DWESAMP sample database, are
certified to run on a Windows or Linux instance of DB2. Complete the tutorial
by using one of the following client-server configurations:
v A Windows-only configuration, with the Design Studio and the runtime
environment installed on two separate Windows computers
v A Windows-to-Linux configuration, with the Design Studio installed on a
Windows client and the runtime environment installed on a 64-bit Linux server.
Expected results
If you finish the entire tutorial, you will have a complete working DB2 data
warehouse that is optimized for analysis. You will also have two web-based
reports. Each module lists its expected results to help you track your learning progress.
5. From the client, catalog the remote DWESAMP database as a local database with exactly the same name. You must use uppercase letters for the name DWESAMP. Use the CATALOG TCPIP NODE and CATALOG DATABASE commands, as described in the DB2 Information Center (an example command sketch follows these procedures).
6. Verify that the cataloged DWESAMP database is visible from the client:
db2 list database directory
7. From the client, create and load the tables in the DWESAMP database:
5. From the client, catalog the remote DWESAMP database as a local database
with exactly the same name. You must use uppercase letters for the name
DWESAMP. Use the CATALOG TCPIP NODE and CATALOG DATABASE
commands, as described in the DB2 Information Center.
6. From the client, create and load the tables in the DWESAMP database:
a. Go to the following directory: /opt/IBM/dwe/samples/data
b. Type ./setupdwesamp.sh -r DWESAMP db2inst1 password
When the script is complete, a list of row counts is displayed for the loaded
tables. For detailed information about what this script does, see the
readme_linux.txt file in the data directory.
5. From the client, catalog the remote DWESAMP database as a local database
with exactly the same name. You must use uppercase letters for the name
DWESAMP. Use the CATALOG TCPIP NODE and CATALOG DATABASE
commands, as described in the DB2 Information Center.
6. From the client, create and load the tables in the DWESAMP database:
a. Open a DB2 Command Window.
b. Go to the directory where the setup script was installed. The default installation path to the setup script is C:\Program Files\IBM\dwe\samples\data\setupdwesamp.bat. Do not try to run the script without going to the data directory first. For information about what this script does, see the Readme.txt file in the data directory.
c. Type setupdwesamp.bat -r DWESAMP db2inst1 password
When the script is complete, a list of row counts is displayed for the loaded
tables.
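The catalog step in the preceding procedures uses two DB2 CLP commands. The following sketch is illustrative only: the node name (dwenode), the server host name, and the port are placeholder values, not values from this tutorial; substitute the values for your server (50000 is a common DB2 default port).

db2 catalog tcpip node dwenode remote dweserver.example.com server 50000
db2 catalog database DWESAMP at node dwenode
db2 terminate
db2 list database directory

The db2 terminate command ends the CLP back-end process so that the final command shows the newly cataloged DWESAMP entry.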
Learning objectives
After completing the lessons in this module, you will:
v Be familiar with perspectives, views, editors, and projects in general
v Be familiar with the BI and Blox Builder perspectives, including the Data Project
Explorer, Database Explorer, and Properties views
v Understand how to customize your view of the Design Studio
Time required
This module should take approximately 20 minutes to complete.
Database Explorer
This view lists your database connections. To perform tasks that actually modify a database, you must have a DB2 user account that includes the appropriate authority and privileges. The DB2 databases (and aliases) that exist in your local catalog are listed automatically in the Database Explorer. You can set up connections to other databases as needed.
Properties
This view, and others such as Data Output and Problems, is open by default in the lower-right area of the Design Studio. Each of these views
has a title tab that you click to make it active, which brings it to the
foreground. You can use the Properties view to define and modify many of
the objects that you create. To open the Properties view when it is closed
or hidden, click Window > Show View > Properties.
Tip: If you cannot find an option or control that you expected to work with,
ensure that you have opened the correct view. Notice that the Data Project
Explorer options and controls differ from those in the Database Explorer.
Editors
An editor is a visual component of the Design Studio that you typically use to
browse or modify a resource, such as an object in a project. When you use an
editor to modify an object, you must explicitly save the changes, because the
Design Studio does not automatically save them.
The Design Studio displays the appropriate editor for the object type that you are
working with; for example, different editors are available for physical data models
and control flows. An editor typically has an associated custom palette. The Design
Studio opens the editor in the upper-right area of the canvas by default, and opens
its palette on the right side of the canvas.
Projects
A project is a set of objects that you create in the Design Studio as part of the data
transformation or warehouse building processes. You can build projects in the
Design Studio, and test their validity without impacting the database. Each project
that you build is represented by an icon in the Data Project Explorer, where you
can expand it, explore its contents, and access editors that enable you to work with
it. You create different types of objects according to the type of project you are
building.
You save your project files in the workspace directory of your file system.
Integration with Concurrent Versions System (CVS), an open-source version control and collaboration tool, allows you to share Design Studio projects and work with them in a coordinated development team environment.
To work with this tutorial, you primarily use the following project types:
Data design project (OLAP)
You use a data design project for database design and information
integration. A data design project can include physical data models, OLAP
objects, and scripts.
Data warehouse project
You use a data warehouse project for designing and building the
warehouse. This project type can include SQL warehousing objects such as
physical data models, data flows, control flows, and mining flows.
Lesson checkpoint
In the Design Studio, you work with perspectives, which provide a variety of
useful views and editors.
You learned about the following concepts:
v The Design Studio BI and Blox Builder perspectives and the views and editors
that you use in this tutorial
v Projects that you create in this tutorial, such as data design and data warehouse
projects
Lesson checkpoint
To use the Design Studio more efficiently, you can customize the arrangement of
the views and editors that you use.
You learned how to:
v Maximize and minimize windows
v Rearrange dockable windows
v Close and open windows
Module 1: Designing the physical data model for your data warehouse
In this module, you use the Design Studio to import the existing physical data
model for the new JK Superstore data warehouse and marts and complete the
design of the MARTS schema. You also update the DWESAMP database with the
MARTS schema changes.
In the previous module, you became familiar with the Design Studio GUI and
navigation features that you use in this tutorial. In this module, you learn how to
use the Design Studio to complete the following lessons:
v Creating a data design project
v Creating a physical data model based on the DWESAMP database
v Adding foreign key constraints to the tables in the MARTS schema
v Validating your physical data model
v Updating the DWESAMP database with changes from the data model
Learning objectives
After you complete the lessons in this module, you will be able to:
v Create a data design project in the Design Studio
v Reverse engineer a physical data model based on a database
v Use the editor to modify a schema by adding constraints
v Analyze a physical data model to ensure its validity
v Deploy your updated physical database design to a database
Time required
This module should take approximately 60 minutes to complete.
Prerequisites
You must have the Design Studio installed and you must meet all of the
prerequisites that are described in Introduction to the DB2 Warehouse Tutorial
on page 1.
Lesson checkpoint
You learned how to create a data design project, which you can use for physical
data modeling.
Lesson checkpoint
You created a physical data model in your data design project. This model is based
on an existing database.
You learned how to create a new physical data model by reverse engineering the
data model of an existing database.
Lesson checkpoint
You added the necessary foreign key constraints between the fact table and each of
the dimension tables.
You learned how to:
v Work with an overview schema diagram
v Add constraints using the diagram editor
3. In the Problems view, review any errors or warnings that resulted from the
data model analysis. In this case, the analysis process should not result in
errors or warnings.
Lesson checkpoint
You learned how to use the Analyze Model wizard to validate your model against
a set of rules.
b. Ensure that Schema is selected in the Structural Compare window and click the Copy from Left to Right icon in the Property Compare toolbar. The window shows that all of the changes now exist in the DWESAMP:DWESAMP.MARTS original source list. See Figure 4 on page 19.
Lesson checkpoint
You generated and ran a DDL script to update the DWESAMP database with the
database model changes that you made in the previous lesson.
You learned how to:
v Compare a database model to a source database
v Generate a delta DDL script that you run to propagate changes from a design
project to a database
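The delta DDL that you generated propagates the foreign key constraints from Lesson 3 to the database, typically as ALTER TABLE ... ADD CONSTRAINT statements. A minimal sketch for one fact-to-dimension constraint, assuming a hypothetical constraint name and key column (FK_PRCHS_TIME, TIME_ID) rather than the tutorial's actual ones:

-- Hypothetical example of one statement in a delta DDL script;
-- assumes TIME_ID is the primary key of MARTS.TIME
ALTER TABLE MARTS.PRCHS_PRFL_ANLYSIS
  ADD CONSTRAINT FK_PRCHS_TIME
  FOREIGN KEY (TIME_ID)
  REFERENCES MARTS.TIME (TIME_ID)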
Learning objectives
After you complete the lessons in this module, you will be able to:
v Create a data warehouse project that references an existing data design project
v Connect to a DB2 database from the Design Studio and check the contents of
tables
v Design data flows that use various SQL warehousing operators:
  – File import and export
  – Table source
  – Bulk load target
  – Table join, union, key lookup
Time required
This module should take approximately 120 minutes to complete.
Prerequisites
Check that your client computer contains the samples directory that supports this
tutorial:
Windows
C:\Program Files\IBM\dwe\samples
Linux
/opt/IBM/dwe/samples
You can skip the previous modules and start the tutorial here by completing a few
short steps. If you already completed the previous modules, do not complete these
steps; go to Lesson 1.
If you did not complete the earlier modules and want to start here, you need to
complete these steps.
To start the tutorial here:
1. Create the appropriate version of the DWESAMP database by opening a DB2
Command Window and running the following script:
Windows
C:\Program Files\IBM\dwe\samples\data\setupsqw.bat
Linux
/opt/IBM/dwe/samples/data/setupsqw.sh
For information about what the script does, see the Readme.txt or
readme_linux.txt file in the data directory. For general tutorial setup
information, see one of the following procedures:
v Running the tutorial in a Windows client-server environment on page 7
v Running the tutorial in a Linux client-server environment on page 8
v Running the tutorial in a mixed client-server environment on page 8
2. Open the Design Studio.
d. On the Select Connection page, select Use an existing connection and select
DWESAMP from the Existing connections list. Click Next.
e. On the User Information page, type your database username and password.
Click Next.
f. On the Schema page, select the check boxes for the DWH and MARTS
schemas. Click Next.
g. On the Database Elements page, click Next.
h. On the Options page, click Finish.
The DWH and MARTS schemas are now in the physical data model in your
data design project.
Lesson checkpoint
You completed the prerequisite steps for starting the tutorial here. You can now
continue with Lesson 1.
b. Open the model file and browse the tables in the DWH and MARTS
schemas.
3. In the Database Explorer, verify that you have a connection to the DWESAMP
database. This connection allows you to interact with the live database and see
a sample of the data in the tables.
4. Expand the DWESAMP database tree. Go to Schemas > DWH > Tables, right-click the PD table, and select Data > Sample Contents. A subset of the rows in the table is displayed in the Data Output view.
5. Create a variable group that contains two variables. These variables define
directories, which contain files that you need to reference when you build data
flows and control flows later in this tutorial:
a. In the Data Warehousing menu, select Manage Variables. The Manage
Variables window opens.
b. Select the dwhproj project that you created earlier in this lesson and click
Next.
c. Click New on the left side of the window to create a new variable group.
d. Name the group datadirs.
e. Click New on the right side of the window to create a new variable in the
datadirs group.
f. In the Variable Information window, define the new variable as follows and
click OK:
v Name: tempdir
v Type: Directory
v Current value:
Windows
C:\temp
Linux
/tmp
Important: Make sure that this directory exists on the client computer. If
the directory does not exist, create it.
v Final phase for value changes: EXECUTION_INSTANCE
g. Click New on the right side of the window to create a second new variable
in the datadirs group.
h. Define the second variable as follows and click OK:
v Name: sample_datadir
v Type: Directory
v Current value:
Windows
C:\Program Files\IBM\dwe\samples\data
Linux
/opt/IBM/dwe/samples/data
Important: If necessary, adjust the current value to match the path to the
DB2 Warehouse installation directory on your client computer.
v Final phase for value changes: EXECUTION_INSTANCE
i. Close the Manage Variables window.
Lesson checkpoint
You learned how to:
v Create a data warehouse project that references a data design project
v View the source and target metadata for your project
v Connect to a DB2 database and see a sample of its contents
v Create variables that can be used to design flexible flows
Figure 5. First piece of the data flow that you will build in this lesson
The second piece of the data flow moves data from the data station through
another series of transformations and finally loads the ITM_TXN table. The
following figure shows the second piece of the data flow.
Figure 6. Second piece of the data flow that you will build in this lesson
In the tutorial, these two pieces form one data flow, but in practice you might
choose to create two data flows instead. The data station operator represents a
persistent target table that could mark the end of the first data flow. That target
table could easily be used as a source table for the second data flow. (If you build
two data flows instead of one, you can use a control flow to run them in
sequence.)
The following instructions assume that you are designing your first complex data
flow in the Design Studio. However, the instructions do not explain very basic
tasks such as how to place operators on the canvas and how to connect them. If
you are unfamiliar with these basic tasks, play the Show Me viewlet for creating a
data flow before proceeding with the lesson. You can launch this viewlet from the Design Studio by selecting Help > Welcome > Overview > Tour the Design Studio > SQL Warehousing Demonstrations > Designing a data flow that loads a warehouse table.
To design and test-run the data flow that loads the ITM_TXN table:
1. Right-click the Data Flows folder in your data warehouse project and select New > Data Flow.
2. Name the data flow dwh-fact, select Work against data models (Offline) as the
working mode for the flow, and click Finish. The data flow editor opens.
3. In the Properties view underneath the empty canvas, type DWH in the SQL Execution Schema field and select DWESAMP from the SQL execution database list. The SQL execution database must be a DB2 database. This database runs the SQL code that the data flows generate; it need not be the same as the databases from which data is extracted or into which data is loaded.
4. Leave the two table space fields blank and accept the default setting for the Use DB2 Database Partitioning Feature (DPF) option. Because the DWESAMP database is not partitioned, this option is ignored.
5. Build part 1 and part 2 of the data flow by following the next two procedures
in the tutorial. You need at least one hour to build this flow from beginning to
end.
v Designing the ITM_TXN data flow (part 1)
v Designing the ITM_TXN data flow (part 2) on page 29
Lesson checkpoint
This lesson covered the end-to-end design process for a complex data flow that
loads a warehouse fact table.
You learned how to:
v Create a new data flow
v Define properties for various SQL warehousing operators:
  – File import and export
  – Table source
  – Bulk load target
  – Key lookup
  – Union
  – Distinct
v Use a data station operator to define a staging point in a data flow
v Create a new table as part of the data flow, add it to the physical model, and run its DDL script
v Use operator variables that can be replaced with actual values at run time
v Use the SQL Condition Builder to expedite the definition of conditions and expressions
v Validate and run a data flow directly from the Design Studio
The following figure shows how the first piece of the data flow should look when
it is complete:
Figure 7. First piece of the data flow that you will build
Note: The instructions in these lessons assume that you will use the Properties
view to define the specific details for each operator rather than the wizard pages
that open when you drag certain operators to the canvas. You can close the wizard
pages by clicking Finish without defining any properties. By using the Properties
view below the canvas, you will be able to see the properties and the highlighted
piece of the data flow at the same time.
To build the first piece of the ITM_TXN data flow:
1. Define two file import operators in the same way. These operators read data
from flat files, based on the format that you specify.
a. Drag two file import operators to the left side of the empty canvas.
b. On the General page of the Properties view for the first file import operator, click the icon next to the File name field and select Use Variable.
The File name values will look the same on Windows and Linux platforms
because you are using a variable for the platform-specific path to the
samples directory. The forward slash character (/) works on both platforms.
f. In the File location list, accept the default entry (Client).
g. On the File Format page, click Load from File Format and browse to the
sample fileformat file:
Windows
C:\Program Files\IBM\dwe\samples\data\sqw\dwh_itm_txn.fileformat
Linux
/opt/IBM/dwe/samples/data/sqw/dwh_itm_txn.fileformat
Repeat this step for the second file import operator.
Repeat this step for the second file import operator.
h. Do not change the list of selected columns in the Column Select page; all
columns are selected by default.
i. Do not define any properties on the Advanced Options and Partition
Options pages.
Tip: Use the default names of all of the operators when you build this data
flow. You do not need to use the Label field in the General page to rename the
operators. In some cases, if you rename the operators, it is difficult to identify
the source of virtual table columns when they pass through the data flow.
2. Define two distinct operators in the same way:
a. Drag two distinct operators to the canvas and connect each file import
operator to a distinct operator.
b. On the Column Select page of the Properties view, make sure that only the
following three columns are selected for both distinct operators:
v MKT_BSKT_TXN_ID
v PD_ID
v ITM_TXN_TMS
c. Do not define any properties on the Staging Table Settings page.
3. Define two file export operators in the same way:
a. Drag two file export operators to the canvas and connect the discard port of
each distinct operator to a different file export operator.
b. Define the same file format for both file export operators:
1) On the General page, click the icon next to the File name field and select Use Variable.
2) Click the push button next to the File name field, select the tempdir variable, and click Replace.
3) In the File name field, append /discard1.txt to the variable string (the note after this procedure shows the resulting value).
4) Repeat the variable selection process for the second file export operator,
but append /discard2.txt to the variable string.
5) On the File Format and Advanced Options pages, accept the default
settings.
4. Define a union operator:
a. Drag a union operator to the canvas and connect the result ports of the two
distinct operators to the input1 and input2 ports of the union operator.
b. On the Set Details page of the union operator, select UNION (not
UNION_ALL).
5. Drag a data station operator to the canvas, but do not define any of the operator's properties.
6. Connect the result port of the union operator to the input port of the data station operator.
7. In the Properties view, define the data station operator:
a. Set the station type to PERSISTENT_TABLE.
b. Type ITM_TXN_STAGE in the Table name field and DWH in the Schema name
field.
c. Select the Automatically create staging table check box but do not select
any of the other check boxes on the General page. Selecting the Delete all
rows check box is recommended only if you intend to run the data flow
multiple times and you want to clear the contents of the staging table after
each run. If the staging table is not empty when you run the data flow, the
performance will degrade significantly. One disadvantage to selecting this
option is that you will not be able to inspect the contents of the staging
table after each run.
d. Do not define any properties on the Staging Table Settings page.
8. Save your work.
Complete the data flow by following part 2 of this lesson.
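After step 3, the File name field for the first file export operator contains a variable reference of the following form; the ${group/variable} syntax is the same syntax that appears later in this tutorial for the sample_datadir variable:

${datadirs/tempdir}/discard1.txt

At run time, the reference resolves to C:\temp/discard1.txt on Windows or /tmp/discard1.txt on Linux, based on the tempdir values that you defined in Lesson 1.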
Figure 8. Second piece of the data flow that you will build in this lesson
The following figure shows what the Results Columns list will look like after
you add the DATA columns back into the list:
Figure 10. Bulk load target operator with the columns connected by name
Figure 11. Final three operators in the data flow that you are building
d. Select the Tutorial - Data Model data design project that you worked with
earlier in the tutorial and click OK. The sample data warehouse project is
linked to the data design project.
2. Select Data Warehousing > Manage Variables and make sure that the tempdir
and sample_datadir variables are set correctly for your platform and
installation. These two variables belong to the datadirs variable group. By
default, the correct variable values are set for the Windows platform. You might
also need to adjust the sample_datadir path to match the actual path to the
installation directory on your client computer.
v tempdir
Windows
C:\temp
Linux
/tmp
v sample_datadir
Windows
C:\Program Files\IBM\dwe\samples\data
Linux
/opt/IBM/dwe/samples/data
3. Navigate to the Data Flows folder and open the marts-store data flow.
4. On the General page of the Properties view for the data flow, make sure that
the database schema is set to MARTS and that the execution database is set to
DWESAMP. Leave the two table space fields blank and accept the default setting
for the Use DB2 Database Partitioning Feature (DPF) option. Because the
DWESAMP database is not partitioned, this option will be ignored.
5. On the Condition page of the Properties view for the final join operator in the
data flow, add the following condition to the end of the syntax in the Join
condition field: AND IN_034.OU_TP_ID = IN1_034.CL_ID. The virtual table names
might be different in your data flow. The complete set of join conditions is:
IN3_034."parentkey" = IN2_034."memberkey" AND
IN4_034."parentkey" = IN3_034."memberkey" AND
IN5_034."parentkey" = IN4_034."memberkey" AND
IN_034.OU_IP_ID = IN5_034."memberkey" AND
IN_034.OU_IP_ID = IN6_034.OU_IP_ID
AND IN_034.OU_TP_ID = IN1_034.CL_ID
Figure 12. Result Columns list shows the mapping of columns in the source table to columns in the target table
8. Save and validate the completed flow by selecting the data flow in the canvas and selecting Data Flow > Validate.
9. Run the data flow.
a. Select the data flow in the canvas and select Data Flow > Execute. The Flow Execution window opens.
b. Accept the default run profile, execution schema (MARTS), and execution
database (DWESAMP).
c. Click Execute. After a short time, an Execution succeeded message is
displayed.
Lesson checkpoint
In this lesson, you modified and completed an existing data flow by adding a
series of operators and defining their properties.
You learned how to:
v Define table joins and other SQL warehousing operators
v Build onto an existing data flow
Learning objectives
After you complete the lessons in this module, you will know how to:
v Design control flows that contain the following operators:
Data flow
Parallel container
Execute command
E-mail
v Create a deployment package for a data warehouse application (deployment
preparation)
v Deploy, run, and manage an application in the WebSphere Application Server
environment by using the Administration Console
Time required
This module should take approximately 60 minutes to complete.
Prerequisites
Complete Module 2: Designing applications to build a data warehouse on page
20.
8. Drag an email operator to the canvas and connect the On Failure link of the
parallel container to the email operator.
9. Define the properties of the email operator:
a. Using fixed values, type your own e-mail address for both the sender and
recipient. The default for these fields is Use Variable, so start by changing
the fields to Use Fixed Value.
b. In the Subject field, type One of the dimension table data flows failed.
10. Drag a second email operator to the canvas and connect the On Failure link
of the marts-fact data flow operator to the email operator.
11. Define the properties of the email operator:
a. Using fixed values, type your own e-mail address for both the sender and
recipient.
b. In the Subject field, type The fact table data flow failed.
12. Define a second command operator.
a. Drag a command operator next to the marts-fact data flow operator and
use the On Success link to connect the operators.
b. On the General page of the properties view, set the Command Type value
to DB2 SQL Script.
c. Set the SQL script location field to the sample_datadir variable, and then append /countMartTables.sql to the variable string: ${datadirs/sample_datadir}/countMartTables.sql (a sketch of a script of this kind follows this procedure).
d. Set the DB2 Connection field to DWESAMP.
e. On the Diagnostics page, accept the default levels for logging and tracing.
13. Save and validate the control flow by selecting the control flow in the canvas and selecting Control Flow > Validate. You should not see any errors.
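The exact contents of the sample countMartTables.sql script are not reproduced in this tutorial, but a row-count script of this kind is typically a handful of SELECT statements over the mart tables, as in this sketch:

-- Hedged sketch of a row-count script; the shipped countMartTables.sql may differ
SELECT 'PRCHS_PRFL_ANLYSIS' AS TABLE_NAME, COUNT(*) AS ROW_COUNT FROM MARTS.PRCHS_PRFL_ANLYSIS;
SELECT 'STORE' AS TABLE_NAME, COUNT(*) AS ROW_COUNT FROM MARTS.STORE;
SELECT 'TIME' AS TABLE_NAME, COUNT(*) AS ROW_COUNT FROM MARTS.TIME;
SELECT 'PRODUCT' AS TABLE_NAME, COUNT(*) AS ROW_COUNT FROM MARTS.PRODUCT;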
To test control flows before using them in a production environment, you can run or debug them directly in the Design Studio by selecting Control Flow > Execute or Control Flow > Debug. In this case, subsequent lessons explain how to run a
control flow by deploying a data warehouse application to the WebSphere
environment and using the Administration Console to start control flow processes.
Lesson checkpoint
This lesson explained the end-to-end design process for a control flow that loads a
data mart.
You learned how to connect and define the following control flow operators:
v Parallel container
v Data flow
v Command
v Email
A data warehouse application is based on a data warehouse project and contains one or more control flows. After you select the control flows that you need,
generate code, and package the results in a zip file, the zip file is ready for
deployment to the WebSphere Application Server. In this lesson, you will prepare
to deploy a simple application that consists of one control flow.
In addition to selecting flows, you need to define the resources and variables that
the application will use. These attributes represent the application profile. The
deployment preparation wizard has three sections: (1) create and save the profile,
(2) proceed with code generation, and (3) generate the final deployment package (a
zip file).
To prepare a data warehouse application for deployment:
1. Right-click SQWSamplePartial in the Data Project Explorer and select New > Data Warehouse Application. The Data Warehouse Application Deployment
Preparation wizard opens.
2. In the Project Selection page, select SQWSamplePartial and click Next.
3. Define the application profile, then generate the code and deployment package.
a. Type the profile name marts_load_profile, then click Next.
b. Move the marts_flow control flow to the Selected Control Flows list, then
click Next.
c. Click Next until you reach the Code Generation page. You can ignore the
intermediate pages. Optionally, browse the contents of the generated-code
folder on the Code Generation page.
d. Click Next to go to the Package Generation page, then specify a local
directory where you want to save the deployment zip file, such as C:\temp
on Windows platforms or /tmp on Linux platforms.
e. Click Finish to generate the package and complete the wizard.
4. Verify that the deployment zip file was created by checking the directory that
you specified.
Lesson checkpoint
This lesson showed how to prepare a data warehouse application for deployment.
You learned how to:
v Define an application profile
v Generate the code for an application
v Generate the deployment package for an application (a zip file that you can
deploy to the WebSphere environment)
To deploy an application, you install the contents of the deployment zip file on the computer where the WebSphere Application Server is running. Deployed applications are visible and executable from the Administration Console.
Before you deploy an application, you must define data sources that are referenced
as source and target databases inside data flows. In this lesson, you need to define
the DWESAMP database as a data source.
If global security is configured on your application server, you can do certain
console tasks only if you are logged in as a user with the appropriate role-based
privileges. The console supports three different roles: administrator, manager, and
operator.
To deploy the control flow application that loads the MARTS tables:
1. Ensure that the WebSphere Application Server software is running on the application server computer.
The new application is deployed and displayed in the list of applications on the
Manage Warehouse Applications page.
8. Click the underlined application name to display the properties for the
deployed application.
Lesson checkpoint
This lesson showed how to deploy a data warehouse application.
You learned how to:
v Start the DB2 Warehouse Administration Console and navigate to the Common
and SQL Warehousing pages.
v Create a data source that is required by an application.
v Deploy an application that is based on a deployment package that you created
in the previous lesson.
b. Click the underlined log file for the marts_flow process. The log file
includes a history of the runtime output for the process.
Lesson checkpoint
This lesson explained how to run and monitor a process in a data warehouse
application.
You learned how to:
v View the processes and activities that make up an application
v Schedule a process to run on a fixed schedule
v View the statistics and log entries for the first run of the scheduled process
Learning objectives
After completing the lessons in this module, you will know how to perform the
following tasks:
v Import OLAP metadata
v Add a hierarchy to a dimension
v Create a cube
v Deploy OLAP metadata to a database
Time required
This module should take approximately 100 minutes to complete.
You can skip the previous modules and start the tutorial here by completing a few
short steps. If you already completed the previous modules, do not complete these
steps; go to Lesson 1.
If you did not complete the earlier modules and want to start here, you need to
complete these steps.
To start the tutorial here:
1. Create the appropriate version of the DWESAMP database by opening a DB2
Command Window and running the following script:
Windows
C:\Program Files\IBM\dwe\samples\data\setupolapandmining.bat
Linux
/opt/IBM/dwe/samples/data/setupolapandmining.sh
For information about what the script does, see the Readme.txt or
readme_linux.txt file in the data directory. For general tutorial setup
information, see:
v Running the tutorial in a Windows client-server environment on page 7
v Running the tutorial in a Linux client-server environment on page 8
v Running the tutorial in a mixed client-server environment on page 8
2. Create a data design project called Tutorial - Data Model.
a. In the Design Studio, click File > New > Project.
b. In the New Project wizard, expand the Data Warehousing folder, select
Data Design Project (OLAP), and click Next.
c. In the New Data Design Project wizard, type Tutorial - Data Model for the project name, and click Finish.
The Design Studio displays your Tutorial - Data Model project icon in the Data
Project Explorer.
3. Create a physical data model with reverse engineering.
a. Use the New Physical Data Model wizard to create the physical data model.
Right-click the Data Models folder in the Data Project Explorer and click New > Physical Data Model.
b. On the Model File page, specify the following selections:
1) Change the File name field to DWESampleTutorial.
2) Check that the Version field is set to V9.5.
3) Select Create from reverse engineering and click Next.
c. On the Source page, check that the Database option is selected and click
Next.
d. On the Select Connection page, select Use an existing connection and select
DWESAMP from the Existing connections list. Click Next.
e. On the User Information page, type your database username and password.
Click Next.
f. On the Schema page, select the check boxes for the DWH and MARTS
schemas. Click Next.
g. On the Database Elements page, click Next.
h. On the Options page, click Finish.
The DWH and MARTS schemas are now in the physical data model in your
data design project.
Lesson checkpoint
You completed the prerequisite steps for starting the tutorial here. You can now
continue with Lesson 1.
Figure 14. Facts object. How a facts object and measures relate to relational data
Dimensions are connected to the facts object in a cube model in the same way that the dimension tables are connected to the fact table in a star schema. Columns of data from
relational tables are represented by attributes that are organized to make a
dimension.
Figure 15 on page 47 shows how dimensions are built from relational tables.
Hierarchies store information about how the levels within a dimension are related to each other and how they are structured. A hierarchy provides a way to calculate and
navigate across the dimension. Each dimension has a corresponding hierarchy that
contains levels that correspond to one or more columns within a table. In a cube
model, each dimension can have multiple hierarchies.
Figure 15. Dimension. How dimensions are built from relational tables
All of the dimensions are connected to a facts object in a cube model that is based
on a star schema or snowflake schema. Joins can connect tables to create a facts
object or a dimension. In a cube model, joins can connect facts objects to
dimensions. The dimensions reference their corresponding hierarchies, levels,
attributes, and related joins. Facts objects reference their measures, attributes, and
related joins. Figure 16 on page 48 shows how the metadata objects are related to
each other in a cube model and map to a relational snowflake schema.
Figure 16. Cube model. How metadata objects fit together and map to a relational snowflake
schema
To create a cube model that is based on your relational schema, you can use the
Quick Start wizard, which creates the metadata objects that the wizard can
logically infer from the schema. You specify the fact table, and the wizard detects
the corresponding dimensions, joins, and attributes. After you complete the Quick
Start wizard, you need to add calculated measures, hierarchies, and levels to the
cube model so that the cube model is complete and can be optimized.
You can also import existing metadata. You might have existing metadata that was
previously created using the Design Studio, DB2 Cube Views, or another OLAP
tool that provides a bridge to the OLAP metadata in DB2 Warehouse (or DB2 Cube
Views at the Version 8.1, FixPak 10 level).
JK Superstore has metadata already defined in another OLAP tool that you can
import into DB2 Warehouse and optimize.
To import a cube model and its corresponding metadata:
1. Set the preferences for the metadata database connection:
a. Click Window > Preferences. The Preferences dialog opens.
b. In the Preferences dialog, click Data Warehousing > Repository.
c. Specify the following settings:
Windows
C:\Program Files\IBM\dwe\samples\OLAP\partialSample.xml
Linux
/opt/IBM/dwe/samples/OLAP/partialSample.xml
d. In the Import into list, select the DWESAMP (Tutorial - Data Model/DWESampleTutorial.dbm) project database. Click Next.
e. On the Import OLAP Objects page, make sure that Replace existing objects
is selected, and click Finish.
4. Browse the OLAP metadata that you imported into your project.
a. In the Data Project Explorer tree, expand Tutorial - Data Model > Data Models > DWESampleTutorial.dbm > DWESAMP > MARTS > OLAP Objects > Cube Models to view the Purchase Profile Analysis cube model and related metadata that you imported.
b. Expand the Purchase Profile Analysis cube model to view the Purchase
Profile Analysis facts object and the Product, Store, and Time dimensions.
c. Select the Purchase Profile Analysis facts object and view the Measures page
of the Properties view. The measures in the following table were imported
for your JK Superstore analytical needs.
Table 4. Measures that you imported
v Number of Items
v Profit Amount
v Sales Amount
d. In the Data Project Explorer view, expand the Time dimension information
folder. You can see the following two objects:
PRCHS_PRFL_ANLYSIS-TIME
The dimension-to-facts join
Time
Lesson checkpoint
In this lesson, you imported metadata from an XML file.
You learned how to:
v Import your OLAP metadata
v Navigate your OLAP metadata in the Data Project Explorer
Lesson checkpoint
In this lesson, you updated the Time dimension by adding a second hierarchy that
defines time in terms of the calendar year. The Calendar Year Hierarchy will be important for JK Superstore's reports.
You learned about dimensions, hierarchies, and levels, and you learned how to
create hierarchies and levels.
b. In the Available Dimensions window, click Select All and click OK.
c. On the General page of the Properties view for each cube dimension, type the label for the dimension in the Label field. The value for the label is the name value without the phrase (Price Analysis). For example, if the name value is Product (Price Analysis), then the label value is Product. The Alphablox report will fail if the label value is incorrect.
You can expand the Price Analysis cube in the Cubes folder to see the cube
facts and the cube dimensions.
4. Specify a cube hierarchy for the cube dimension.
a. On the Cube Hierarchy page of the Properties view, click the push button to specify the cube hierarchy.
b. Select the Calendar Year Hierarchy for the Time cube dimension.
5. Ensure that all of the levels in each cube hierarchy are included by opening the
Levels page of the Properties view for the cube hierarchy. To include a level,
select the check box for the level.
6. Add the measures that you want to include to the cube facts object:
a. Select the cube facts object and open the Measures page of the Properties
view.
b. Click the Add Measure icon in the toolbar to select measures from the cube model's facts object to include in the cube facts object. Select the following measures:
following measures:
v Number Of Items
v Product Book Price Amount
v Sales Amount
Lesson checkpoint
In this lesson, you created the Price Analysis cube that can be used for Alphablox
analytics applications for JK Superstore.
You learned about cubes and you learned how to:
v Create a cube
v Add cube dimensions to a cube including cube hierarchies and cube levels
v Add measures to a cube facts object
Lesson checkpoint
In this lesson, you successfully validated and deployed your OLAP metadata into
the DWESAMP database.
You learned how to use the Deploy OLAP Objects wizard to deploy your OLAP
metadata to the sample database.
In this lesson, you use the Optimization Advisor wizard to create recommendations for summary tables (MQTs) to ensure optimal performance. You do not yet have a workload, but you can create MQTs based on the OLAP model that you completed in the previous lessons.
To optimize the cube model, use the Optimization Advisor wizard in the Design
Studio:
1. In the Database Explorer, expand the tree view for the DWESAMP database by clicking Schemas > MARTS > OLAP Objects > Cube Models. In the Cube
Models folder, right-click the Purchase Profile Analysis cube model and click
Optimization Advisor. The cube model is validated against the metadata rules.
If any part of the cube model is not valid, you will receive an error and you
will need to modify the cube model so that it is valid before you can optimize
it.
2. On the Target Queries page of the Optimization Advisor wizard, specify the target queries for which you want to optimize the Price Analysis cube. The wizard uses the target queries to improve the optimization results.
a. Click the push button to open the Optimization Slices window.
b. On the Optimization Slices window, create an optimization slice for the
Price Analysis cube.
1) Click the Add slice icon in the toolbar to add a new optimization slice to the cube.
2) For the new slice that appears in the table, specify a level for each
hierarchy. Click the level in the cube dimension column and select the
appropriate level from the list:
Time cube dimension
Select the Calendar Year level.
Product cube dimension
Select any level.
Store cube dimension
Select any level.
3) Click OK.
c. Click OK and then click Next.
3. On the Summary Tables page, specify that you want deferred update summary
tables. Specify the table space in which to store the summary tables and
summary table indexes and click Next. You can accept the default
USERSPACE1 table space settings.
4. On the Limitations page, select Do not specify a disk space limit and Do not
specify a time limit to provide unlimited time and disk space for the
Optimization Advisor. Specify that you want to allow data sampling. The more space, information, and time that you provide, the better your performance results will be. Click Start Advisor to start the Optimization
Advisor. After the Optimization Advisor completes its recommendations, click
Next.
Note: The Optimization Advisor might take several minutes to create its
recommendations.
5. On the SQL Scripts page, type unique file names for the SQL scripts that the wizard generates.
Lesson checkpoint
In this lesson, you used the Optimization Advisor wizard in the Design Studio to
create summary table recommendations. The recommended summary tables can
dramatically improve the performance of OLAP-style queries from your Alphablox
analytics applications and other applications that exchange OLAP metadata with
DB2 Warehouse.
You learned about summary tables, and you learned how to create summary table
recommendations.
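For context, the summary tables that the Optimization Advisor recommends are DB2 materialized query tables (MQTs) with deferred refresh. The following sketch shows the general shape of such a table; it is illustrative only, and the table and column names used here (MQT_SALES_BY_TIME, TIME_ID, SALES_AMT) are assumptions, not output from the wizard.

-- Illustrative deferred-refresh MQT; not the advisor's actual output
CREATE TABLE MARTS.MQT_SALES_BY_TIME AS (
  SELECT F.TIME_ID, SUM(F.SALES_AMT) AS SALES_AMOUNT, COUNT(*) AS ROW_COUNT
  FROM MARTS.PRCHS_PRFL_ANLYSIS F
  GROUP BY F.TIME_ID
) DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate the MQT after creation
REFRESH TABLE MARTS.MQT_SALES_BY_TIME;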
In this lesson, you will learn how to use the Administration Console to connect to
a database that is a new data source for WebSphere Application Server. You will
also learn how to verify that the database is enabled for OLAP and deploy your
recommended MQTs.
If global security is configured on your application server, you can do certain
console tasks only if you log in as a user with the appropriate role-based privileges.
The console supports three different roles: administrator, manager, and operator.
To deploy your recommended summary tables, use the Administration Console:
1. Ensure that the WebSphere Application Server software is running on the application server computer.
Windows
C:\dwetutorial\olap\createmqts.sql
Linux
/tmp/dwetutorial/olap/createmqts.sql
c. Click Run Script. The SQL Script Results page displays.
d. On the SQL Script Results page, download the execution log for the
summary table scripts. Open the execution log to ensure that the summary
tables were created successfully.
e. Click Finish.
Lesson checkpoint
In this lesson, you created the recommended summary tables in the DWESAMP database. The recommended summary tables contain pre-aggregated data that helps business reports run more quickly and can be used by OLAP-style queries that are issued by users of Alphablox multidimensional reports.
You learned how to:
v Test the database connection
v Deploy recommended MQTs to a database
Lesson checkpoint
In this lesson, you started the cube server, added a cube to the cube server, and
started the cube. In Module 5, you learn to use Alphablox Blox Builder to create a
reporting application that retrieves and displays data from your cube.
You learned how to:
v Start the cube server in the Administration Console
v Add a cube to the cube server
v Start the cube
Learning objectives
After you complete the lessons in this module, you will understand basic concepts
about Alphablox and Blox Builder and know how to:
v Start Alphablox
v Create an IBM Cubing Services Adapter DataSource
v Create a project
v Create an application
v Create a report
v Create a query
v Customize a report
v Preview your report
v Add your reports to an application's navigation
v Preview the application
v Deploy the application
Time required
This module should take approximately 90 minutes to complete.
Prerequisites
Ensure that the following prerequisites are met:
v Module 4 has been completed successfully
v You have the Blox Builder plug-ins installed in the Design Studio
c. [Apache Tomcat 3.2.4 with Alphablox 8.4] On Windows, select Start > All
Programs > Alphablox > Startup Alphablox.
3. Log into the Alphablox home page as the admin user by entering the following URL in a browser window: http://<hostname:portnumber>/AlphabloxAdmin/home/, where <hostname:portnumber> represents the name of the server and the port number on which Alphablox runs.
4. Create an IBM Cubing Services Adapter DataSource:
Note: Your queries will need a DataSource defined under Alphablox to connect
to a cube.
a. Click the Administration tab and then click the Data Sources link.
b. Click the Create button.
c. From the Adapter menu, select the adapter named IBM Cubing Services
Adapter.
d. Type ACS_External in the Data Source Name text box.
e. Optional: Enter a description in the Description text box.
f. Ensure that your Cubing Services Server name points to the host name or IP
address of your IBM cube server. Your IBM cube server should be running
on the WebSphere server.
g. Ensure that you include the port number for the IBM cube server, as
discussed in the previous module. See Module 4: Lesson 7: Starting the cube
server for more information.
h. Enter your DB2 user name and password.
i. Specify a number in the Maximum Rows and the Maximum Columns text
boxes. The values limit the number of rows or columns returned for queries
entered through this data source. The default values are 1000.
j. Click the Save button to save the data source.
Lesson checkpoint
You learned how to:
v Start Alphablox
v Create an IBM Cubing Services Adapter DataSource
b. Expand Blox Builder and select Blox Builder Project. The Blox Builder
Project window displays.
c. Name your project BloxBuilderProject in the Project name field. You can
check the Use default location box to store your project in your workspace,
or uncheck the option to type or browse to a location to save your project.
Click Finish. A window displays asking if you want to change to the Blox
Builder perspective. Alphablox is the specialization of the Blox Builder
perspective. Click Yes. Your BloxBuilderProject displays in the Blox
Builder Project Explorer.
2. Create an application:
Note: Your reports will be displayed in the navigation of an application.
a. In the Blox Builder Project Explorer, expand BloxBuilderProject.
b. Right-click on the Applications folder and select New > Application. The
New Application wizard displays.
c. Type priceAnalysis for the name of your application in the Application
name field. The name of your application reflects the name of the folder
that contains your application when you export it to the Blox Builder server.
For the display name type Price Analysis. Click Finish. The new
priceAnalysis application is displayed under the Applications folder in the
Blox Builder Project Explorer.
Lesson checkpoint
You learned how to:
v Create a project
v Create an application
e. In the Properties view, click the Data Datasource tab and type ACS_External
for the DataSourceName.
f. Click the button that is second from the right. The Alphablox Server Configuration window displays.
g. Provide the following:
v A name for the server configuration
v IP address of the Alphablox server or DNS name of the Alphablox server
v 9080 as the port number for Alphablox
v The username and password
Click Save to save this configuration.
h. Click Test Server Configuration. A window will display informing you if
your connection to the server was successful. Click Save Configuration.
Click Next.
i. In the Properties tab, select ACS_External as your DataSourceName, enter
your user name and password, and click Finish. You might be prompted to
log into Alphablox; after you log in, the Query Designer page opens.
j. Select the name of your cube from the drop-down list and click Run Default
Query.
k. Click Apply Query Update.
l. Click Preview Query to preview your query. The Alphablox Server
Configuration window displays. You should already have your server
configuration information filled out.
m. Click Finish to display the Query Tools page. Click Close Window to close
the Query Tools page.
n. Save your query.
2. Create a Stores By Time query:
a. In the Blox Builder Project Explorer, expand BloxBuilderProject.
b. Right-click the Queries folder and select New > Query. The New Query
wizard displays.
c. Name your query Stores By Time. Click Finish. Your new query object
displays under the Queries folder in the Blox Builder Project Explorer. The
Query editor displays the query object.
d. In the Properties view, click the Data Datasource tab and type ACS_External
for the DataSourceName.
e. Click the button that is second to the right. The Alphablox Server
Configuration window displays. You should already have your server
configuration information filled out. Click Next.
f. In the Properties tab, type ACS_External for the DataSourceName and click
Finish. You might be prompted to log into Alphablox; after you log in, the
Query Designer page opens.
g. Select Price Analysis from the Cubes drop-down list and click Run Default
Query.
h. A single cell of data returned by the default query appears in the grid.
Perform the following actions in the Grid user interface to generate the
query:
v In the DataLayout panel, drag the Store dimension to the Row axis.
v Right-click on the All Stores member on the Row axis and choose
Expand All to see all descendants of stores.
v Drag the Time dimension to the Column axis.
The generated query looks similar to the following MDX:
SELECT
  DISTINCT( {[Price Analysis].[Time].[All Time].[2003].[1]} ) ON AXIS(0),
  DISTINCT( Distinct(Hierarchize(
    {[Price Analysis].[Store].[All Store],
     Descendants([Price Analysis].[Store].[All Store],
       [Price Analysis].[Store].[All Store].level, AFTER)})) ) ON AXIS(1)
FROM [Price Analysis]
WHERE
(
  [Price Analysis].[Measures].[Product Book Price Amount],
  [Price Analysis].[Product].[All Product]
)
Lesson checkpoint
You learned how to:
v Create three separate queries: a simple basic query, a query that shows Stores By
Time, and a query that shows Stores By Quarter.
the query string and the name of the data source that the DataBlox connects to. In
the next lesson, we will show how these component properties can be driven by
property references instead of hard-coded values as we are doing here.
When you add a report to an application, you can override the values of a report's
properties so that the report acts like a template. For example, you can create a
report that displays the sales data for a certain time period. A custom report
property contains the value that determines the time period. In an application, you
can set the report property to different values to display reports for different time
periods.
You use the DataBlox component to access the query. You can also override
properties that you defined in the query by setting properties in the DataBlox
component.
To create an Alphablox report, you will use a PresentBlox and a DataBlox:
1. Create a report:
a. In the Blox Builder Project Explorer, expand BloxBuilderProject.
b. Right-click the Reports folder and select New > Report. The New Blox
Builder Report wizard displays with fields that you can customize.
c. Type PresentAndData for the name of your report in the Report name field.
The Blox Builder Report overview displays on the canvas. There are three
tabs, Overview, Model, and Layout, that display at the bottom of the canvas
view. Click Finish. The new PresentAndData report is displayed under the
Reports folder in the Blox Builder Project Explorer.
Note: Initially, the folder will contain the report, internal files used by the
interface, and the generated XML and HTML files used by the server to
display the report. By default, the internal files will be hidden from view.
For more complex reports, you can add your own images, localized resource
files, customized HTML layouts and other assets associated with the report.
Files added by the developer will always be visible in the folder.
2. Add a PresentBlox component:
Note: The PresentBlox component combines several Blox in one. It provides
you with simultaneous chart and grid views of the same data in the same
window space. The PresentBlox component has a graphical user interface that
can nest ChartBlox, GridBlox, PageBlox, ToolbarBlox, and DataLayoutBlox
within a single presentation. Application assemblers use PresentBlox properties
to tailor how these Blox will appear. Blox properties can be set through either
Live Layout, which uses the Blox's interface for setting properties on the Blox,
or by setting them manually. Because we are setting only two properties, we
will set them manually.
a. Click the Model tab. A categorization of the reusable Blox components
displays in a palette view.
b. Drag and drop a PresentBlox component from the palette onto the canvas.
In the Properties view, located below the canvas, click the Present property
tab and provide the following values:
Divider location
Type 0.55.
Divider location provides a line between the chart and the grid in a
PresentBlox. A valid value is anything from 0 - 1, where 0.5 will
divide the chart and grid down the middle.
splitPaneOrientation
Select HORIZONTAL from the list.
splitPaneOrientation controls whether you want the chart and grid
on top of each other, or if you want them side by side.
3. Add a DataBlox component:
Note: A DataBlox component offers the following functionality:
v Provides a representation of a data set (in grid form), either relational or
multidimensional, for the assembler to access
v Enables application scripting (such as executing a query)
v Serves as a data source for other Blox (such as ChartBlox or GridBlox)
a. Drag and drop a DataBlox component from the palette onto the canvas.
4. Select Connection on the palette. Connect the DataBlox to the PresentBlox
component. Select and drag your cursor from the DataBlox port inside the
DataBlox component to the DataBlox port inside the PresentBlox component.
Connecting the two components through the DataBlox port ensures that the
components can communicate with each other.
5. Click the Save button to save your changes to the report. Ensure that you check
the Problems tab for any errors. You can click on a problem in the Problems
tab to view detailed information.
Lesson checkpoint
You learned how to:
v Create a report
Because queries live outside of applications and reports, if a query contains
any property references, you must supply their values at run time when you
preview the query from the query designer.
To create property references and parameterize your queries and reports:
1. Create the currentQuarter property reference based on the quarter of the
current day:
Note: You will need two property references: one that you will use in the
query for the current quarter of the year and another one that you will use in
the report. The property reference that you will create for the report is also the
query ID used in the report.
a. Double-click the priceAnalysis application.
b. Click the Custom tab in the Properties view.
c. Click the Add button to create the property reference. The Property Value
Editor opens.
d. For the Name field, type currentQuarter.
e. For the Type field, select Integer.
f. Click the icon button to the right of the Value field. The Property Value
Editor opens.
1) Click the Property Reference wizard button to bring up the Property
Reference wizard.
2) In the Scope tab, select System and click Next.
3) In the Property Name tab, select the existing dateTime property
reference and click Next.
4) In the Default Property Reference tab, leave the setting at Do not add a
default value to the reference and click Next.
5) In the Expression tab, select the datePart expression from the list.
6) In the Arguments table, click within the Value section and select
QUARTER. Click Finish.
7) In the Property Value Editor dialog, you should now see
${system:dateTime}.datePart(QUARTER). Click OK. The same value
appears in the Value field on the Create Custom Property dialog.
8) In the Scope field, type application. Click OK.
9) Save your application.
2. Create the queryId property reference and set its value to Stores By Time:
a. Double-click the PresentAndData report. Click the Custom tab in the
Properties view.
b. Click the Add button to create the property reference.
c. For the Name field, type queryId.
d. For the Type field, select String.
e.
f.
g.
h.
f. In the Name tab, click the list and select the queryId report property. Click
Finish.
g. In the Property Value Editor dialog, you should see the following value:
${report:queryId}. Click OK. The same value should also appear in the
QueryId field in your Properties view.
h. Save your report.
7. Preview the report:
a. In the Blox Builder Project Explorer view, right-click the PresentAndData report
select Preview Report. The Preview Report wizard opens.
b. In the Name Server Configuration field, your previously saved named
server configuration appears. Click Finish. Connecting to a server enables
you to preview, trace, deploy, and import your report. By default, your
report displays with the data from the Stores By Time query because you
set the value of the queryId property reference to Stores By Time earlier in
this lesson.
You have customized your query and report.
Lesson checkpoint
You learned how to:
v Create property references and use expressions
v Customize a query by parameterizing it
v Customize a report by parameterizing it
Lesson checkpoint
You learned how to:
v Export your application
v Create and preview report links
v Add your report to your application
v Display your application in a Web browser
Data mining discovers business insights in your data. You can interactively create
and visualize a mining model in the Design Studio to gain valuable insights about
the data in your organization's warehouse. You can also use the Design Studio to
generate SQL code to compute a mining model or to deploy the model's related
scoring function. This SQL can be pasted into Alphablox pages or any BI
application to provide embedded analytics.
The mining flow in this module is similar to the data flows that are described in
Module 2: Designing applications to build a data warehouse. Mining flows use the
same editor and some of the same SQL operators as SQW flows. Preparing data for
mining by using one of the operators in the Preprocessing palette is a case of SQL
Warehousing. Conversely, certain mining operators can be used as part of a data
flow for warehouse building.
Learning objectives
After you complete the lessons in this module, you will be able to:
v Create a mining project
v Create a mining flow
v Define mining steps for a mining flow:
– Add preprocessing operators to bring the data into a form suitable for data
mining
– Add a mining operator
– Add a visualizer operator
– Add an operator to extract the model information in tabular form
v View the mining model results
Time required
This module should take approximately 75 minutes to complete.
If you already completed the previous modules, do not complete these steps; go to
Lesson 1. If you did not complete the earlier modules and want to start here, you
need to complete these steps.
To start the tutorial here:
1. Complete the initial setup instructions for your platform, as directed in the
Introduction to the DB2 Warehouse Tutorial on page 1. Create the
DWESAMP database but do not run the setupdwesamp script.
2. Create and load the tables in the DWESAMP database by running the following
script:
Windows
C:\Program Files\IBM\dwe\samples\data\setupolapandmining.bat
Linux
/opt/IBM/dwe/samples/data/setupolapandmining.sh
3. Connect to the locally cataloged DWESAMP database.
Lesson checkpoint
You completed the prerequisite steps for starting the tutorial here. You can now
continue with Lesson 1.
Lesson checkpoint
In this lesson, you created a data warehouse project.
You learned how to create an empty business intelligence project, which is a
prerequisite for creating a mining flow and subsequently a data mining model.
Lesson checkpoint
In this lesson, you created an empty mining flow that will be used to define the
mining model steps.
You learned how to:
v Create a mining flow
v Connect to an existing database
as association rules. With the associations function, you can perform market basket
analysis to explore product affinities to understand which products tend to be
bought by the same customers. In the context of market basket analysis, an
example association rule can have the following form:
If product A is purchased, then product B is likely to also be purchased by
the same customer.
In addition to the rule, the associations mining also calculates some statistics about
the rule. In market basket analysis, the following three statistical measures are
usually used to define the rule:
Support
The support of an association rule measures the fraction of baskets for
which the rule is true. For example, if product A and product B are found
in 10% of the baskets, then the support value is 10% (as a percentage
value) or 0.1 (as an absolute value). The percentage value is calculated
from among all the groups that were considered.
Confidence
The confidence in an association rule is a percentage value that shows how
frequently the rule head occurs among all the groups that contain the rule
body. The higher the value, the more often this set of items is associated
together. For example, if product B is present in 50% of the baskets that
contain product A, then the confidence value is 50% (as a percentage
value) or 0.5 (as an absolute value). Expressed another way, if product A is
in a particular basket, then product B will be found in the same basket on
50% of occasions.
Lift
The lift value for the association is the ratio of the rule confidence to the
expected confidence of finding the rule in any basket. For example, if
product B is found in only 5% of all baskets, then the Lift for the rule
would have a value of 10.0. The lift value of 10 means that the association
of A and B is occurring ten times more often than if B were selected by
chance. Lift is therefore a measure of how much the rule body improves the
ability to predict the rule head.
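To make these three measures concrete, here is a minimal SQL sketch that
computes them for a single rule. It is an illustration only: the BASKETS table
(one row per product per basket) and the products 'A' and 'B' are assumptions,
not objects in the DWESAMP database; the associations mining function computes
these statistics for all candidate rules at once.

-- Hypothetical sketch: BASKETS(BASKET_ID, PD_ID) and products 'A' and 'B'
-- are assumptions.
WITH
  total   AS (SELECT COUNT(DISTINCT BASKET_ID) AS N FROM BASKETS),
  with_a  AS (SELECT DISTINCT BASKET_ID FROM BASKETS WHERE PD_ID = 'A'),
  with_b  AS (SELECT DISTINCT BASKET_ID FROM BASKETS WHERE PD_ID = 'B'),
  with_ab AS (SELECT BASKET_ID FROM with_a
              INTERSECT
              SELECT BASKET_ID FROM with_b)
SELECT
  -- support: fraction of all baskets that contain both A and B
  DOUBLE((SELECT COUNT(*) FROM with_ab)) / t.N AS support,
  -- confidence: fraction of baskets with A that also contain B
  DOUBLE((SELECT COUNT(*) FROM with_ab))
      / (SELECT COUNT(*) FROM with_a) AS confidence,
  -- lift: confidence divided by the overall frequency of B
  (DOUBLE((SELECT COUNT(*) FROM with_ab)) / (SELECT COUNT(*) FROM with_a))
      / (DOUBLE((SELECT COUNT(*) FROM with_b)) / t.N) AS lift
FROM total t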
You can also use taxonomies with the Associations mining function. You can make
the associations that are found among items more meaningful if you group the
items into subcategories, and then group these subcategories into categories. The
result is a hierarchy of categories with the items on the lowest level. This hierarchy
is called a taxonomy.
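As a minimal sketch of that structure, the two taxonomy levels can be written
as plain child/parent pairs. The column names follow the MARTS.PRODUCT table
that is used later in this lesson; the CHAR cast is an assumption made so that
the numeric item IDs and the category names fit in one column:

-- Child/parent pairs for the two taxonomy levels:
-- product -> subdepartment, and subdepartment -> department.
SELECT DISTINCT CHAR(PD_ID) AS CHILD, PD_SUB_DEPT_NM AS PARENT
FROM   MARTS.PRODUCT
UNION
SELECT DISTINCT PD_SUB_DEPT_NM, PD_DEPT_NM
FROM   MARTS.PRODUCT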
In this tutorial, the sample data contains a retailer's products, organized by
departments, in addition to purchases that are made by customers. The output is a
table that contains the rules in the associations model, which can also be viewed
by an Alphablox report. To perform the associations mining, you must use both the
transaction level data and the product hierarchy data to calculate the required
association rules. In addition, the product hierarchy data is used by the mining
tool to automatically determine associations between individual products, product
subgroups, product subgroups and products, product groups and subgroups, and
so on. Associations at all levels in the product taxonomy are derived.
Tip: The mining steps for the mining flow in this tutorial use the same mining
preprocessing operators as those in SQW.
Click OK.
The following figure shows the SQL Condition Builder window.
Figure 18. Example of the SQL Condition Builder window and SQL text.
f. In the Properties view, click the Select List tab. The columns that are
required for the tutorial are:
v PD_ID
v CNTPR_ID
Remove all the unnecessary columns by selecting the column and clicking
the Delete button. Do not delete the columns listed above that are
required for the tutorial.
g. Verify your partial flow:
1) Right-click the table join operator and select Run to this step.
2) In the Partial Execution of the Mining Flow window, select Execute
generated code and Show sample results. Samples are displayed in
the Execution Status view.
3) Click Finish.
h. Before proceeding to the next step, save your work.
4. Add a table source operator to the mining editor canvas.
a. From the Sources and Targets palette, drag a third table source operator to
the canvas to prepare the taxonomy and name mapping information.
b. In the Select Database Table window, select the PRODUCT table from the
MARTS schema, and click Finish.
5. Add a select list operator to transform columns in a selection list. Add the
operator to the mining editor canvas, connect to a table source operator, and
define the select list parameters.
a. From the Preprocessing palette, drag a select list operator to the canvas.
b. Connect the output port of the PRODUCT table source operator to the
select list operator input port.
c. Right-click the select list operator, and click Show Properties View.
d. In the Properties view, do the following actions:
1) Click the Select List tab. The columns required for the tutorial are:
v PD_ID
v NM
v PD_DEPT_NM
v PD_SUB_DEPT_NM
2) Remove all the unnecessary columns by selecting the column and
clicking the Delete button. Do not delete the columns listed
above that are required for the tutorial.
3) For each of the following columns, select a column and click the
ellipsis (...) push button to modify the expressions.
Type the following expressions (in this example, INPUT_012 is the
internal input table name). Include spaces between the single quotation
marks:
v For NM, type rtrim( INPUT_012.NM )
v For PD_DEPT_NM, type 'Dept: ' || rtrim(INPUT_012.PD_DEPT_NM)
v For PD_SUB_DEPT_NM, type 'Subdept: ' ||
rtrim(INPUT_012.PD_SUB_DEPT_NM) || ' in ' ||
rtrim(INPUT_012.PD_DEPT_NM)
6. Add a distinct operator to remove duplicate rows. This operation is needed to
create unique subdepartment-department pairs for the taxonomy. Place the
operator on the mining editor canvas and connect to the select list operator.
a. From the Preprocessing palette, drag a distinct operator to the canvas.
b. Connect the output port of the select list operator to the distinct operator
input port.
c. Right-click the distinct operator, and click Show Properties View.
d. Click the Column Select tab, and then specify the PD_DEPT_NM and
PD_SUB_DEPT_NM columns as selected. To specify these columns, select
NM and PD_ID in the Selected columns list and click the left arrow
button to move the selections to the Available columns list.
You completed the preprocessing steps for your mining model. The following
figure shows the preprocessing steps.
Figure 19. An example of the partial mining flow that includes the preprocessing steps.
5) Type 0.1 as the percent value for the Minimum Support field to
specify that the support of each generated rule is at least 0.1 percent.
g. On the Name Maps page, select PD_ID in the Item ID Column field and
NM in the Item Name Column field.
h. On the Taxonomy page, select PD_ID in the Child Column field and
PD_SUB_DEPT_NM in the Parent Column field for the category map. For
the Category1 map, select PD_SUB_DEPT_NM in the Child Column field
and PD_DEPT_NM in the Parent Column field.
i. On the Column Properties page, select Names for the Name Mapping field
and Yes for the Taxonomy field.
j. On the Item Format page, select Default.
8. Add a visualizer and an associations extractor operator to the canvas and
connect the operator ports.
a. From the Sources and Targets palette, drag a visualizer operator to the
canvas to the right of the associations operator.
b. Connect the associations operator model port to the visualizer operator
model port.
c. From the Mining Operators palette, drag an associations extractor operator
(which extracts information from an associations rule model) to the canvas.
d. Connect the associations operator model output port to the associations
extractor operator model input port.
9. Create a target table that is suitable for the rules output of the associations
extractor operator port.
a. Right-click the rule output port of the associations extractor operator and
select Create Suitable Table from the menu. The Required Table
Information window opens.
b. Type RULES in the Table name field.
c. In the Table schema field, select MARTS as the schema in which the table is
created.
d. Select the tablespace in which the table is created.
e. Click Next. The Table Details page opens. Use the default settings on this
page.
f. Click Finish.
g. Save the mining flow.
Tip: You can also create the table manually in a DB2 command window by
connecting to the DWESAMP database and by copying and running the
CREATE TABLE MARTS.RULES statement from the following script:
Windows
C:\Program Files\IBM\dwe\samples\data\mining\mba.sql
Linux
/opt/IBM/dwe/samples/data/mining/mba.sql
If you create the table manually, you also need to drag a target table operator
onto the canvas and connect the rule port of the associations extractor
operator to the input port of the target table operator by selecting Connect by
name from the Select Column Connections dialog.
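For orientation only, a rules table generally has the shape of the following
sketch; the column names here are assumptions, so for real runs use the actual
CREATE TABLE MARTS.RULES statement from mba.sql or the Create Suitable Table
wizard:

-- Hypothetical sketch of a rules table; the authoritative DDL is in mba.sql.
CREATE TABLE MARTS.RULES (
  RULEID     INTEGER,        -- identifier of the rule
  BODY       VARCHAR(256),   -- the "if" part of the rule
  HEAD       VARCHAR(256),   -- the "then" part of the rule
  SUPPORT    DOUBLE,         -- fraction of baskets for which the rule holds
  CONFIDENCE DOUBLE,         -- frequency of the head among baskets with the body
  LIFT       DOUBLE          -- confidence relative to the expected confidence
)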
10. Validate the flow that you created. Click the Validate this mining flow icon
on the toolbar. A red circle with a white cross displays in the top left
corner of an operator if anything is wrong with it. Double-click that symbol
to open a diagnostic window.
The completed mining flow steps are displayed in the mining editor.
Lesson checkpoint
You created the mining steps for the model that will analyze product associations.
You learned how to:
v Add table source and preprocessing operators to prepare the data
v Add an associations mining operator
v Add a visualizer operator
v Add an associations extractor operator and create a suitable target table
v Validate a mining flow
a. Click Item Sets to show a list of frequently sold products and product sets.
Click the entry [Dept: ELECTRONICS]. You can see in the Support column
that 5.4303% of all customers bought from this department.
b. If you want to find out which customers buy from the electronics
department more frequently than others, click the Fan In icon on the
toolbar, and then select the Rules tab. You can see rules of this type: If a
customer buys A, then the customer is likely to also buy from the
electronics subdepartment, where A can be a product or another department
or subdepartment. The Lift column indicates how much more frequently
this happens compared with all customers. The Absolute Support column
indicates the number of customers involved. A lift of 3.63 means that a
customer who buys from the photography department is nearly four times
more likely to purchase from the electronics department.
c. Select the Graph tab to see the rules in graphical form.
d. Close the associations model. When prompted to save the mining model
results, do not save the results.
In the Execution Status view, you see the status and action of the execution
process. In the Database Explorer, you see that the mining model that was
displayed is stored in the database.
The model that you ran used the associations function. Like the associations
function, the sequential patterns and clustering functions do not need a scoring
function against the model. That is, the model itself can be the end result. If you
want to use a model to make predictions, you need to test the model's quality
by using the Tester operator.
Lesson checkpoint
In this lesson, you ran and viewed the mining model that analyzes product and
customer purchase combination associations.
You learned how to:
v Run your mining flow
v View the results of the associations model
After you enter values for these variables, the association rules for the stores
of the specified store type and year are computed by using a modified version of
the mining flow presented in Module 6: Creating a mining model on page 71.
The visualization of the rules model is then shown.
This module consists of the following lessons:
v Creating the Miningblox Sample project
v Creating the Miningblox application
v Preparing the application for deployment
v Deploying the data warehouse application and creating the Alphablox data source
v Deploying and running the Web application
v Customizing the application JSP pages
Learning objectives
After completing the lessons in this module, you can perform the following tasks:
v Create the control flows that contain the operations executed by the Web
applications
v Create deployment packages for Miningblox applications (data warehouse and
Web applications)
v Deploy and run the different parts of this application
v Customize your application by modifying the JSP pages
v Use the DB2 Warehouse Administration Console
Time required
This module takes approximately 60 minutes to complete.
Prerequisites
You should have completed Module 6: Creating a mining model on page 71.
For this Miningblox Sample application the process control flow contains the
mining flow that computes the association rule model. The cleanup control flow
contains a mining flow that deletes the rule model from the rule model table.
For this lesson, it is assumed that you are familiar with the design of control flows
and mining flows, otherwise, refer to the corresponding module of this tutorial.
To set up the project and define the flows with variables, complete the following
steps:
1. To create a new data warehouse project, select File > New > Data Warehouse
Project. Specify the new project name: Miningblox Sample.
2. To create variables and variable groups in the Design Studio, select Data
Warehousing > Manage Variables, choose the correct project, and click Next.
Note: It is recommended that you define two groups of variables: user
variables and runtime variables. In this example you create the following
groups of variables:
v The inputParams variable group for the variables to be entered by the user.
v The runtimeParams variable group, whose variables are automatically
instantiated by the application when it runs.
3. To create the variables groups:
a. Click New on the left part of the wizard.
b. Name the group inputParams and click OK.
c. Repeat steps 3a and 3b to create a second group that is called
runtimeParams.
4. To create the variables:
a. Select the inputParams group and click New on the right part of the
wizard.
b. Type the following information:
v Name: year
v Type: Integer
v Current value: 2002
v Final phase for value change: EXECUTION_INSTANCE
c. Repeat steps 4a and 4b to create a second variable. Type the following
information:
v Name: storeType
v Type: String
v Current value: all
v Final phase for value change: EXECUTION_INSTANCE
d. Select the runtimeParams group and create the following variable:
v Name: modelName
v Type: String
v Current value: MBA_RULES
v Final phase for value change: EXECUTION_INSTANCE
The variables that you have just created can now be used in mining flows.
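As a sketch of what that looks like, a filter condition in the modified mining
flow might reference the variables as follows. The table and column names are
assumptions; SQW substitutes the current variable values for the ${...}
references at run time:

-- Hypothetical sketch: MARTS.SALES_TX and its columns are assumptions.
SELECT *
FROM   MARTS.SALES_TX
WHERE  YEAR(TX_DATE) = ${inputParams/year}
  AND  ('${inputParams/storeType}' = 'all'
        OR STORE_TYPE = '${inputParams/storeType}')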
5. To create the mining flow for the process control flow:
a. Copy the DB2 Warehouse Tutorial mining flow that you created in Lesson
3: Defining mining steps for mining flows on page 74 to the Miningblox
Sample project. Rename it to Tutorial Miningblox Flow.
Note: To avoid cut-and-paste errors, make sure that the text you copy
from the PDF format only contains straight single quotes (such as 'all')
instead of apostrophes.
f. To define the variable for the Associations operator: on the Model Name tab
of the Properties view, click the icon, choose Use Variable, and click the ...
button to select ${runtimeParams/modelName}.
Figure 21. DB2 Warehouse Tutorial mining flow modified for Miningblox
6. Run the flow. To check if a model has been created successfully, open the
Database Explorer, click the DWESAMP database, and then select
DWESAMP > Data Mining > Model > Rules. Here you should find the model
named MBA_RULES.
7. To design the cleanup mining process:
a. Create a new mining flow that is called CleanUpFlow.
b. Drag a Custom SQL operator onto the canvas. Type the following
statement as SQL CODE:
delete from IDMMX.RULEMODELS where MODELNAME = '${runtimeParams/modelName}'
c. Drag a Source Table operator onto the canvas and link it to the Custom
SQL operator. The Custom SQL operator must be linked to a source table,
but the table is not important in this case because the SQL custom code
does not refer to it.
8. Run the flow and check that the model named MBA_RULES that you
viewed in step 6 on page 85 has been erased.
9. To define the control flow for the mining process:
a. Create a new control flow that is called Mining Process.
b. Drag a Mining Flow operator to the canvas and link it to the Start
operator.
c. On the Properties view, choose Tutorial Miningblox Flow.
d. Drag and drop one End operator and one Fail operator onto the canvas,
then link the End operator to the success output port and the Fail operator
to the fail output port of the mining operator.
10. Repeat step 9 to construct a control flow that is called CleanUp process for
the cleanup mining flow.
11. Execute the two control flows and check that a rule model is created or
deleted. To browse the latest changes after the execution of a control flow,
refresh the Rules folder in the Database Explorer by right-clicking the
Rules folder and selecting Refresh.
Lesson checkpoint
In this lesson, you created the project and flows, on which your application is
based. You also created variables and applied them in the flows so that they can be
initialized and used by the application.
4. On the Control Flow Selection page, click >> to import all of your flows into
the application, and click Next.
5. On the Resource Profile Management, Variable Management, DDL File
Selection, and Saving Application Profile pages you can leave the default
settings, and click Next.
6. Click Next on the Code Generation page which again does not need any
special input, then the Package Generation page is displayed, where you can
specify where to save the data warehouse application .zip file.
Type C:\temp on Windows or /tmp on Linux, and click Next. An
MBA_RULES.zip file is created at the selected location. The pages for
configuring the Miningblox Web application are displayed.
7. On the Miningblox application details page, add DWESAMP as the Alphablox
data source, select Mining Process as the Work Control Flow, and make sure
that ${runtimeParams/modelName} is the Variable for model name. The
remaining fields are filled out correctly because the wizard searches for
keywords, such as model, in the flow and variable names.
8. On the Result Page Template Selection page, select a Miningblox tag to specify
the way in which the mining model is displayed. Select the
AssociationModelVizualiser, and click Next.
9. On the Result Page Editor page, you can see the code of the resulting JSP
page. This allows you to make changes to the tag attributes but you do not
need to do anything here.
10. On the Web Application Generation page, specify again C:\temp on Windows
or /tmp on Linux for the ear file directory, and click Finish.
You now have a data warehouse application .zip file and a Miningblox Web
application EAR file.
Before you can start the application, you must first complete the following tasks,
which are also described in this tutorial.
1. Create the Alphablox data source if it does not yet exist.
2. Deploy the data warehouse application to the DB2 Warehouse Administration
Console.
3. Deploy the Web application to the WebSphere Application Server and start it.
Lesson checkpoint
In this lesson, you used the wizard to prepare your application for deployment.
You learned how to:
v Use the Miningblox Application Deployment Preparation wizard.
v Create a data warehouse application .zip file.
v Create a Web application EAR file.
Lesson checkpoint
In this lesson, you learned how to create an Alphablox data source.
Lesson checkpoint
In this lesson, you created a data source for your data warehouse application and
deployed it using the DB2 Warehouse Administration Console.
You learned how to:
v Create a data source for a data warehouse application.
v Use the DB2 Warehouse Administration Console to deploy an application.
You can accept the default values and click Run. While the task is running, a
progress window is displayed.
When the task has completed successfully, your association model is displayed in
an association visualizer, which is similar to the visualizer in the Design Studio. If
you are only interested in business goals, you can now easily obtain an association
rule model for the year and the store you selected. You do not need any mining
skills for this task.
In the first column you can see the rules created by the model. The next three
columns (Support, Confidence, Lift) define the quality of each rule.
You have created a mining application that can be used from any Web browser. If
you want to change the input page, you can customize the application's JSP pages.
Lesson checkpoint
In this lesson, you deployed the MBA_RULES Web application on the WebSphere
Application Server and used it in your Web browser.
You learned how to:
v Choose the correct parameters for deploying a Web application.
v Use your mining application.
usability of this page, you add list fields, which automatically contain the
values that are available.
processTask.jsp page
The processTask page is displayed when you click Run on the
inputForm.jsp page. You only have to adapt it to match the changes made in
the inputForm page.
To customize and use the application, complete the following steps:
1. Open the inputForm.jsp page with a text editor and make the following
changes.
a. Between <DIV class="mbbody"> and <FORM>, add the following code snippet:
<blox:data id="storeTypeDataBlox"
query="SELECT STORETYPE FROM MARTS.STORETYPES"
dataSourceName="DWESAMP"/>
<iminer:memberSelectRDB
id="inputParams_storeType"
valueColumn="STORETYPE"
minimumWidth="200" visible="false"
multiple="true" size="4"
dataBloxRef="storeTypeDataBlox"/>
<iminer:select
id="inputParams_year" size="1"
minimumWidth="200" visible="false">
<bloxform:option label="2002" value="2002" selected="true"/>
<bloxform:option label="2003" value="2003"/>
<bloxform:option label="2004" value="2004"/>
</iminer:select>
This code snippet adds two selection blox: one for the store type, which
uses an SQL query to find the store types; and one for the year, for which
values are manually entered.
b. Replace the input form area (from <FORM> to </FORM>) with the following
lines:
<FORM action="processTask.jsp" name="ParameterForm" method="post">
<table width="615" border="0" cellpadding="6" cellspacing="2">
<tr>
<td valign="top"><strong>Storetype:</strong></td>
<td><blox:display bloxRef="inputParams_storeType"/></td>
</tr>
<tr>
<td valign="top"><strong>Year:</strong></td>
<td><blox:display bloxRef="inputParams_year"/></td>
</tr>
<tr>
<td valign="top"><strong>Task Name:</strong></td>
<td><INPUT name="taskName" value="Sample Task" type="text" size="30"
maxlength="50"></INPUT></td>
</tr>
<tr>
<td valign="top"></td>
<td><input type="submit" name="run" value="Run"></input></td>
</tr>
</table>
</FORM>
This code snippet adds two display blox that refer to the selection blox.
Save your changes.
2. Open the processTask.jsp page in a text editor:
a. Under // Get parameters from the input form, replace the following lines:
String taskVis = "application";
String inputParams_storeType = request.getParameter("inputParams_storeType");
Note: To avoid cut-and-paste errors, make sure that the text you copy from
the PDF format only contains straight single quotes (such as 'all') instead of
apostrophes.
3. Save your changes and open your applications main page again to see the
changes.
Lesson checkpoint
You have customized the application's JSP pages to create a user interface that is
easy to use.
You learned how to:
v Use JSP tags to customize the user interface of the application.
data, which can be analyzed. To achieve this goal, you can use the Dictionary
Lookup operator. This operator uses the user-defined dictionary that contains a list
of IT skills to create annotations. It finds each occurrence of terms that have
previously been defined in the dictionary and marks their positions. You obtain a
table, which shows the terms (skills) found in the dictionary for each job offer.
From this table you first design a star schema, and then build an OLAP cube.
Finally you use this cube in an Alphablox-based report to show the results of the
analysis.
This module consists of the following lessons:
v Understanding the data used in this module
v Using the Text Analysis tools
v Building a star schema
v Defining an OLAP model for the star schema and deploying it to Cubing
Services
v Creating an Alphablox report
Learning objectives
After completing the lessons in this module you understand the concepts and
know how to:
v Use the text analysis tools for analyzing unstructured data
v Use and create a star schema
v Define and deploy a cube model
v Use the Blox Builder function to build an Alphablox report
Time required
This module takes approximately 90 minutes to complete.
If you have already completed the previous modules, skip this section and
continue with Lesson 1. If you have not completed the previous modules, but want
to start here, you must complete the following steps.
To start the tutorial here:
Open the DB2 Command window and run the following script to create the
appropriate version of the DWESAMP database:
Windows
C:\Program Files\IBM\dwe\samples\data\setupolapandmining.bat
Linux
/opt/IBM/dwe/samples/data/setupolapandmining.sh
For more information on this script refer to the Readme.txt or readme_linux.txt file
in the data directory. For general information on the tutorial setup, refer to the
following sections:
v Running the tutorial in a Windows client-server environment on page 7
JOB_DESC
The text form of the job description.
To explore the JOBS table and create your project:
1. To see the sample contents of this table:
a. In the Design Studio, expand the tree in the Database Explorer and
navigate to database DWESAMP, schema TXTANL, table JOBS.
b. Right-click the JOBS table and select Distribution and statistics > Data
Exploration.
In the Sample Content tab you can view the sample data. On the bottom of this
view, you can see the full content of the text column JOB_DESC.
2. To create a new project and the mining flow, select File > New > Data
Warehouse Project. In the creation wizard, name your project Unstructured
Data Tutorial and click Finish.
Note: The Design Studio provides a sample project that already contains most
of the resources (dictionaries, taxonomies, flows) created in this module. If you
want to save some time, create the Text Analysis Sample project as follows:
Select File > New > Examples > Data Warehousing Examples > Text Analysis
Sample.
You can now start to analyze the unstructured data in the JOB_DESC column of
the JOBS table.
Lesson checkpoint
You are more familiar with the data that is processed in this module. You have
viewed the JOBS table and created your data warehouse project.
After completing this lesson, you learned how to:
v Explore the sample contents of a table
v Create a data warehouse project
If you are using the Text Analysis Sample project, it already contains the it_skills
dictionary, the Skills taxonomy taxonomy, and the Jobs Dictionary Analysis
flow described in this lesson.
To define the Text Analysis tools and construct the flow:
1. Create the it_skills dictionary:
a. In the Data Project Explorer view, expand the project tree: Unstructured
Data Tutorial > Text Analysis > Dictionaries.
b. Right-click Dictionaries and choose New > Dictionary.
Figure 30. Dictionary Entries for the base forms and their variants (left part of the Dictionary
editor)
v In the Entry details section on the right side of the Dictionary editor,
you can edit a dictionary entry or create new entries. Here you can add
or change the variants of the entry.
Figure 31. Entry details of the dictionary (right part of the Dictionary editor)
v The Inflections section in the lower part on the right side of the
Dictionary editor shows the inflections of an entry, which are
automatically detected during lookup. For example, if you enter the
singular form "database" in the dictionary, the plural form "databases"
is also found automatically. The inflections are shown for the chosen
language.
v Select English because the texts in the JOBS table are in English.
Note: For the selected language you need not enter case-sensitive
variants for common words, which are also referred to as in-vocabulary
words. Terms are marked as in-vocabulary or out-of-vocabulary in the
inflections table. For example, if you enter "database" into your
dictionary, the operator automatically detects "Database" or "DataBase".
However, for acronyms or special terms, which are marked as
out-of-vocabulary, the detection depends on the case in which the word
is entered into the dictionary. It is recommended that you enter the
words in lowercase. For example, if you enter "j2ee", the dictionary finds
"J2EE"; if you enter the term in uppercase, the lowercase occurrence is
not detected.
2. Define the dictionary. Enter the following list of terms and their variants. The
first word of a list item is the base form of the entry and the following words
are the variants of this entry.
v C# , c#, C #, c #
v C/C++, C, C++, c++, c ++, C ++
v
v
v
v
v
v
v
b.
c.
d.
e.
The obtained table links informative data about the offers (which company, when,
which offer) with the required analysis data (the IT skills requested in these offers).
You can also explore relationships among TIME, COMPANY, and SKILLS using the
Multivariate Analysis:
1. Right-click the IT_SKILLS_ASKED target table in the flow editor and select
Distribution and statistics > Multivariate.
2. In the Input Data Selection dialog, accept the default values and click OK.
3. In the Multivariate Distribution viewer, hold the Ctrl key and select the
columns TIME, COMPANY_NAME, and SKILLS_CAT in the table.
You can now use this table for reporting, for example, to:
v Find out which skills are required on the market, and get an idea of the
development activities of the other companies.
v Study the trend for a defined period of time.
v Identify which companies share the same interests.
In the next lesson you are going to build an example with a cube model.
Lesson checkpoint
You obtained a table with the skills requested by each offer. This table is the start
of an OLAP analysis.
You learned how to:
v Define a dictionary
v Define a taxonomy and export it as a table
v Define the parameters of the Dictionary Lookup operator
A star schema is a simple database schema, which is optimized for OLAP queries.
It consists of a fact table and different dimension tables:
v The fact table represents measured values. It is at the center of the schema.
v The dimension tables represent the data description. These tables form the star
branches.
The fact table (the center) contains one row for each skill mentioned for each job
offering. Column NB_SKILLS is a weight that is assigned to the fact row. Each job
offering should count only once; that is why the NB_SKILLS value is computed as
1 / (number of skills) for each offering. If a job offering mentions three skills, each
skill fact row gets a weight of 1/3. This allocation scheme is called uniform
allocation. The other columns in the fact table are foreign keys, which refer to the
three dimension tables. The ID column allows you to link the fact row back to the
job offering that contained this skill reference.
The SKILL and the TIME dimension tables contain several columns that define
hierarchies in the cube. You can use these hierarchies to drill down from summary
to detail level.
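The payoff of this layout is that typical OLAP questions reduce to
fact-to-dimension joins. The following query is a hypothetical sketch: the
STAR_* table names follow the sample's naming, but the join keys and the
YEAR and SKILL_CAT columns are assumptions:

-- Hypothetical sketch: number of job offerings per year and skill category.
SELECT t.YEAR,
       s.SKILL_CAT,
       SUM(f.NB_SKILLS) AS JOB_OFFERINGS   -- uniform-allocation weights sum
FROM   OLAPANL.STAR_FACT_TABLE f
JOIN   OLAPANL.STAR_TIME  t ON t.TIME_ID  = f.TIME_ID
JOIN   OLAPANL.STAR_SKILL s ON s.SKILL_ID = f.SKILL_ID
GROUP  BY t.YEAR, s.SKILL_CAT
ORDER  BY t.YEAR, JOB_OFFERINGS DESC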
Create the text analysis sample project and explore the star schema diagram. The
Design Studio provides a sample project that contains most of the resources
(physical database model, data flows, control flows, and cube model) used in this
module:
1. In the Design Studio, select File > New > Examples > Data Warehousing
Examples > Text Analysis Sample.
2. Click Next, accept the default values, then click Finish.
3. In the Data Project Explorer you can now expand the TextAnalysisSample
project.
4. In the Data Models folder, double-click the StarSchemaModel database model.
5. Expand the OLAPANL schema and open the OLAPANL diagram to explore the
physical tables defined for the star schema.
Explore and execute the transformation flows to populate the star schema tables.
The sample project contains several data flows referenced from the control flow
PopulateStarSchemaForJobs.
CleanupStarSchema
Delete all rows from the target tables.
FillCompanyDimensionTable
Create a company dimension entry for each distinct company in
IT_SKILLS_ASKED.
FillTimeDimensionTable
Create a time dimension entry with levels for year, quarter, month, and
day for each distinct time value in IT_SKILLS_ASKED.
FillSkillDimensionTable
Fill the skill dimension using the SKILL_TAXONOMY table prepared in
the previous lesson.
PopulateFactTable
Create a fact row for each row in IT_SKILLS_ASKED, check the
dimension-table foreign keys in the dimension tables, and compute the
NB_SKILLS weight for each fact row.
The dimension table flows use a DB2 sequence object to generate unique IDs,
which are used as primary keys for the dimension tables.
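The following sketch shows that pattern; the sequence name and the dimension
table columns are assumptions:

-- Hypothetical sketch of the surrogate-key pattern used by the dimension flows.
CREATE SEQUENCE OLAPANL.DIM_ID_SEQ AS INTEGER START WITH 1 INCREMENT BY 1;

-- Hand out one new key per distinct company found in the analysis results.
INSERT INTO OLAPANL.STAR_COMPANY (COMPANY_ID, COMPANY_NAME)
SELECT NEXT VALUE FOR OLAPANL.DIM_ID_SEQ, COMPANY_NAME
FROM   (SELECT COMPANY_NAME
        FROM   IT_SKILLS_ASKED
        GROUP  BY COMPANY_NAME) AS NEW_COMPANIES;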
To explore a data flow:
1. Expand the Data Flows subfolder of your TextAnalysisSample project and
double-click a data flow.
2. In the flow editor, select an operator to explore the operator's properties in the
Properties view.
In the FillSkillDimensionTable data flow, the SKILL_TAXONOMY table created in
the preceding lesson is joined with the distinct values for column SKILL_DETAILS
from the IT_SKILLS_ASKED table. This enables you to drill down in the skill
dimension of the cube, not only to the SKILL_CAT/TERM_NAME level, but also
to the individual occurrences/variants of the skills mentioned in the text. If you
are just interested in SKILL_CAT level analysis, you can directly use the
SKILL_TAXONOMY table as a dimension table.
The PopulateFactTable data flow creates a fact row for each row in
IT_SKILLS_ASKED, looks up the dimension table foreign keys in the dimension
tables (using a join) and computes the NB_SKILLS weight for each fact row as
follows: The group by operator counts the number of rows in IT_SKILLS_ASKED
for each ID value (for each job offering). The select list property of the Table Join
operator computes the NB_SKILLS weight using the following expression:
DOUBLE(1) / DOUBLE(IN4_07.SKILLS_PER_OFFER)
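Spelled out as standalone SQL, the two operators are together equivalent to the
following sketch (the schema of IT_SKILLS_ASKED and the carried-along
SKILL_DETAILS column are assumptions; IN4_07 is a generated internal name):

-- Sketch of the PopulateFactTable weight computation: count the skills per
-- job offering (the group by operator), then join the counts back and
-- compute the uniform-allocation weight (the Table Join select list).
SELECT f.ID,
       f.SKILL_DETAILS,
       DOUBLE(1) / DOUBLE(c.SKILLS_PER_OFFER) AS NB_SKILLS
FROM   IT_SKILLS_ASKED f
JOIN  (SELECT ID, COUNT(*) AS SKILLS_PER_OFFER
       FROM   IT_SKILLS_ASKED
       GROUP  BY ID) c
  ON   c.ID = f.ID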
Lesson checkpoint
You explored and populated the star schema for your text analysis results.
You learned:
v What a star schema is
v How to create data flows that transform your text analysis results into a
multidimensional data mart
To validate and deploy the cube model and the Jobs Analysis cube to the
DWESAMP database and the cubing services repository:
1. In the Data Project Explorer, right-click the OLAPANL schema and select
Analyze Model.
2. Accept the default parameters and click Finish. In the Console view, check
for any errors and correct them.
3. When there are no errors left, you can deploy your metadata to the database:
right-click the STAR_FACT_TABLE cube model in the Data Project Explorer,
and select Deploy to database. The Deploy OLAP Objects dialog opens.
4. Select DWESAMP as the target database and click Finish.
The OLAP metadata objects are now available in the metadata repository. To
invoke multidimensional queries you must assign the cube to a running cube
server and start the cube. You have learned how to define and start a cube server
in Module 4: Designing OLAP metadata on page 43. In this lesson it is assumed
that the cube server is up and running.
To define and start the Jobs Analysis cube on a cube server using the
Administration Console:
1. From the Administration Console, select Cubing Services > Manage Cube
Servers. A list of cube servers is displayed.
2. Verify that a cube server, for example DWEREPOS, is started; if not, start the
server.
3. Select the started cube server's link from the list of cube server names. The
cube server Properties page opens.
4. Define the database mapping for the Jobs Analysis cube.
v Select Cubing Services > Manage OLAP Metadata
v Click STAR_FACT_TABLE
Lesson checkpoint
You explored the cube model for the star schema and deployed the cube model
and cube to the database and a cubing service.
You learned how to:
v Define and modify OLAP metadata
v Deploy OLAP metadata to a database
v Assign and start a cube on a cube server
Figure 38. The Alphablox report for the Jobs Analysis cube
Alphablox needs data source definitions to access the data in the cubes you
defined. In Module 5: Creating Alphablox reports based on IBM cubes on page
58 you have created Alphablox data sources to access your multidimensional cubes
from Alphablox reports. In this lesson it is assumed that Alphablox has already
been configured to access your defined cubes.
If you cannot access the defined cubes, define a Cubing Services Adapter data
source on the Alphablox Administrative pages:
1. Log into the DB2 Alphablox Administrative Pages as a user with
administrator rights. From a Web browser, go to
http://serverLocation:9080/AlphabloxAdmin/home.
2. Click the Administration tab, click the Data Sources link, and then click
Create.
3. Enter the following information:
v Data Source Name: CSADAPTER
v Adapter: IBM Cubing Services Adapter
v Cubing Services Server Name: host name of the machine where you started
the cube server
v Port: port number of your cube server as defined in the Administration
Console
v Userid and password
Note: To test your reports the Blox Builder needs a valid Alphablox server
configuration. You can verify the Blox Builder settings in the Design Studio under
Window > Preferences > Blox Builder > Alphablox Server Configuration. For
details, refer to Module 5: Creating Alphablox reports based on IBM cubes on
page 58.
To
1.
2.
3.
Text
Height: 50
Width: 500
5. Specify the properties of the DataBlox component. This component defines the
data source to be used in the report and the query to be executed.
a. Select the DataBlox component.
b. In the Properties view specify:
v On the Data Source tab, specify Data Datasource: CSADAPTER (use the
same cubing services adapter name that you specified in the data sources
section in the Alphablox administrative pages)
v On the Data Query tab, specify the following MDX Query:
SELECT
DISTINCT( {{[Jobs Analysis cube].[STAR_TIME].[All years],
AddCalculatedMembers([Jobs Analysis cube].[STAR_TIME].[All years].Children)}} )
ON AXIS(0)
, DISTINCT( {{[Jobs Analysis cube].[STAR_SKILL].[All Skills],
AddCalculatedMembers([Jobs Analysis cube].[STAR_SKILL].[All Skills].Children)}} )
ON AXIS(1)
FROM [Jobs Analysis cube]
WHERE
(
[Jobs Analysis cube].[Measures].[NB_SKILLS (STAR_FACT_TABLE)],
[Jobs Analysis cube].[STAR_COMPANY].[All companies]
)
Note: If you are not familiar with the MDX query language, you can
easily create your own MDX expressions in the Alphablox query builder.
6. Specify the properties of the PresentBlox component. This component defines
the visual appearance of the data and contains a tabular grid showing the data
cells and a chart.
a. Select the PresentBlox component.
b. In the Properties view specify:
v On the Grid Data Display tab, the DefaultCellFormat: ###,##0
v On the Present Property tab, the DataLayoutAvailable,
DataLayoutVisible, PageAvailable, and PageVisible: false
v On the Chart Property tab, the O1AxisTitle: Year
v On the Chart Property tab, the Y1AxisTitle: Number Job Offerings
v
Lesson checkpoint
You created a new Alphablox report to visualize your text analysis results.
You learned how to:
v Define a Cubing Services Adapter in Alphablox
v Create a report using the Design Studio Blox Builder tool
v Define and test the report model and layout
Summary
You have completed the design and deployment of a business intelligence solution
that expands the capabilities of a DB2 data warehouse for the JK Superstore retail
stores.
Lessons learned
By completing this tutorial, you learned how to:
v Design and update a physical data model for a warehouse database
v Design applications for building warehouse and mart tables by using SQL-based
data flows and control flows
v Deploy and run warehouse building applications in the Administration Console
v Design a complete cube model and deploy performance-enhancing MQTs
v Design a Blox Builder report that is based on OLAP metadata
v Design a mining model that analyzes purchasing trends
v Create a mining web application with Miningblox
v Analyze unstructured data by using text analysis tools, an OLAP cube, and an
Alphablox report
Glossary
General terms
Administration Console
DB2 Warehouse Web client, which provides a browser-based
administration environment for data warehouse applications. The console
is hosted by WebSphere Application Server and recognizes data sources
and system resources that are already defined in the WebSphere
environment. You can use the console to deploy, schedule, and monitor
jobs that load data and run mining analyses. You can also start, stop, and
maintain cube servers and cubes.
BI Perspective
Default perspective in the Design Studio. A perspective defines the initial
set and layout of views and provides a set of functions for accomplishing a
specific type of task, or working with a specific type of resource. The BI
perspective includes functions that are tailored for building information
warehouses and enabling warehouse-based analytics such as OLAP and
data mining.
Data Project Explorer
A tree view that shows the files and metadata associated with your data
warehouse projects. You can create and manage data warehouse projects
and objects within a project such as mining flows, data flows, control
flows, and warehouse applications.
Database Explorer
A tree view where you create new database connections and connect to
existing databases; explore database schemas; invoke data exploration
functions such as sample content, value distributions (univariate, bivariate,
multivariate); and explore data mining models.
Design Studio
An integrated graphical development environment for designing various
components of a data warehouse application, including physical data
models, SQL-based data flows and mining flows, control flows, and OLAP
cube models.
editor
A visual component that you typically use to edit or browse a resource.
Modifications that you make in an editor are usually not automatically
saved. You must explicitly save your changes. Multiple instances of an
editor can exist within a project.
project
A Design Studio container where you build the objects in your data
warehouse application. Depending on the type of project that you are
working in, you can create different types of objects, such as physical data
models and OLAP objects in a data design project or data flows, control
flows, and mining flows in a data warehouse project.
Properties views
Tabbed pages that allow you to specify the detailed behavior of each
operator in a data flow. Within these pages, for example, you can define
which tables or files your source and target operators represent, and how
each operator will change the data set.
Text analytics
The automatic extraction of structured information from unstructured
textual documents. This structured information can be stored in relational
database tables and used for further analysis by reporting or advanced
analytical tools. Typical subtasks of information extraction are named entity
recognition, terminology extraction, entity resolution, and relationship
extraction. Named entity recognition finds names for people or
organizations in a text, and entity resolution finds out which names,
nouns, or pronouns refer to the same entity.
view (in Design Studio)
A visual component that you typically use to navigate a hierarchy of
information, open an editor, or display properties for the active editor.
Modifications that you make in a view are saved immediately. Normally,
only one instance of a particular type of view can exist within the Design
Studio.
WebSphere Application Server
DB2 Warehouse runtime environment that hosts the Administration
Console and provides the infrastructure for deploying, scheduling, and
monitoring data warehouse applications.
WebSphere DataStage
An enterprise ETL system that is part of the IBM WebSphere Data
Integration Suite.
Cubing Services
calculated member
A data member whose value is dynamically generated from calculations
performed against members that exist in your result set.
cube
A subset of a cube model that is organized for a specific analysis and
that defines the dimensions, hierarchies, and measures that are
available to OLAP queries.
cube server
A high-performance, scalable cubing engine that is designed to support
queries from many users against many different OLAP cubes. The cube
server enables fast multidimensional access to relational data that is
referenced by the OLAP cubes that are defined in the Cubing Services
metadata database.
dimension
A data category, such as time, accounts, products, or markets. In a
multidimensional database outline, the dimensions represent the highest
consolidation level.
hierarchy
A defined relationship among a set of attributes that are grouped by levels
in the dimension of a cube model. These relationships between levels are
usually defined with a functional dependency to improve optimization.
Multiple hierarchies can be defined for a dimension of a cube model.
level
A set of related attributes that work together as one logical step in
the ordering of a hierarchy. For example, a Time hierarchy might contain
the levels Year, Quarter, and Month.
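For illustration only, a Time dimension with a Year > Quarter > Month
hierarchy is often backed by a denormalized dimension table in which each
level maps to a column and each lower level functionally determines the
level above it; the TIME_DIM table and its columns are hypothetical:

   -- Hypothetical Time dimension table; MONTH determines QUARTER, and
   -- QUARTER determines YEAR, because each value encodes the year
   CREATE TABLE TIME_DIM (
       TIME_ID INTEGER  NOT NULL PRIMARY KEY,
       MONTH   CHAR(7)  NOT NULL,  -- lowest level, e.g. '2007-01'
       QUARTER CHAR(6)  NOT NULL,  -- e.g. '2007Q1'
       YEAR    SMALLINT NOT NULL   -- highest level, e.g. 2007
   );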
Mining
associations function
A mining algorithm that finds associations, patterns, or rules hidden in
data.
confidence
For an association rule, the probability that the rule head occurs in a
transaction given that the rule body occurs. For a predictive model, the
anticipated range of an output variable given a set of input variable
values.
lift
For an association rule, the ratio of the rule's confidence to the
expected confidence, that is, to the support of the rule head. A lift
greater than 1 indicates that the rule body and rule head occur together
more often than would be expected if they were independent.
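As a sketch of how these measures relate, the support, confidence, and
lift of the single rule bread => butter can be computed directly in SQL.
The BASKETS table, with one row per (transaction, item) pair, is
hypothetical; the DB2 Warehouse mining functions are not invoked this
way, but the arithmetic is the same:

   WITH totals AS (SELECT COUNT(DISTINCT TRANS_ID) AS N FROM BASKETS),
        body   AS (SELECT COUNT(DISTINCT TRANS_ID) AS N FROM BASKETS
                    WHERE ITEM = 'bread'),
        head   AS (SELECT COUNT(DISTINCT TRANS_ID) AS N FROM BASKETS
                    WHERE ITEM = 'butter'),
        pair   AS (SELECT COUNT(*) AS N FROM
                    (SELECT TRANS_ID FROM BASKETS WHERE ITEM = 'bread'
                     INTERSECT
                     SELECT TRANS_ID FROM BASKETS WHERE ITEM = 'butter') T)
   SELECT DECIMAL(pair.N, 10, 4) / totals.N   AS SUPPORT,     -- P(body and head)
          DECIMAL(pair.N, 10, 4) / body.N     AS CONFIDENCE,  -- P(head | body)
          (DECIMAL(pair.N, 10, 4) / body.N) /
          (DECIMAL(head.N, 10, 4) / totals.N) AS LIFT         -- confidence / P(head)
     FROM totals, body, head, pair;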
Alphablox
Application
A Blox Builder application contains sets of reports that you view in the
browser. The analytic application that you create in Blox Builder is
different from Alphablox applications and J2EE applications.
Blox Builder
The Eclipse-based Alphablox tool that can be used by report developers
and Java developers to create reports based on data retrieved from
relational or multidimensional databases.
Blox component
An Alphablox software component that uses Web and Java technologies to
build analytic applications. A Blox component contains a Blox object, such
as a DataBlox or a PresentBlox.
Components
Components are the parts that make up a report. Components define the
data and visual aspects of a report. For example, a report can contain a
PresentBlox component, which displays a grid and a chart, and a checkbox
component, which displays a checkbox.
Layout
A report's layout defines how the report's components appear in the
browser. The layout defines each component's size and position when the
report is displayed in the browser. Like the report display name and
description, a report can have a different layout for each locale.
Page member
The member of a dimension that is being displayed; a filter that restricts
the data to the specified member.
Properties
You can create custom report properties, which you use to share
information between the report's components. For example, your report
contains a PresentBlox component and a checkbox component. You want
the PresentBlox to display a toolbar when you check the checkbox. You can
design the report so that when you click the checkbox, the checkbox
component sets a report property to true. The PresentBlox accesses the
value of the report property to determine whether to display a toolbar.
Property Expression wizard
A wizard that adds a property expression to the query string.
Property Reference wizard
A wizard that adds a property reference to the query string.
Query
A query contains the query string, data, and connection information,
such as the name of the data source. A query definition defines the
query's unique ID, display name, and description. A query can then be
accessed from any report, and you can override the properties of a query
in the report.
Query Designer
Generates a query from a visual layout. Query Designer is a version of
Query Builder. The Query Designer displays a PresentBlox with the data.
You can modify what data is displayed in the PresentBlox, and Query
Designer replaces the current query text with a query that is generated
from the displayed data.
Report
You can create, preview, and deploy reports. After you create your report,
you can create an analytic application that contains the report. A report
may contain multiple visual and non-visual components. A DataBlox is a
non-visual component, whereas a PresentBlox, text, and buttons are visual
components. You can create a simple report containing only a DataBlox
and PresentBlox.
Report catalog
A Blox Builder application displays a navigation tree of available reports
that you can choose to display in the report viewer. The report catalog
defines which reports appear in the navigation tree. In a report catalog,
you can override a report's display name, description, and the values of
the report properties. For example, you create a report that displays the
sales data for a certain time period. A custom report property contains the
value that determines the time period. In an application, you can set the
report property to different values to display reports for different time
periods.
Report link
Each time a report is added to an application's navigation, you can
override its display name, description, and report properties to create
different reports (each called a report link) from a single report
template. You can add report links to your Blox Builder application and
specify the report's name and description that appear in the
application; your application must be open in the Application Editor.
Test Query
Displays the query on the default server.
Traffic lighting
Highlighting of data cells based on specified criteria, typically a range of
values. Named after the frequent use of red, yellow, and green colors to
highlight the status of a displayed value.
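A traffic-lighting rule is essentially a banded classification of
values. For illustration only (the SALES table and the thresholds are
hypothetical), the same banding can be expressed in SQL with a CASE
expression:

   -- Classify each revenue value into a red/yellow/green band
   SELECT PRODUCT,
          REVENUE,
          CASE WHEN REVENUE <  50000 THEN 'RED'     -- below target
               WHEN REVENUE < 100000 THEN 'YELLOW'  -- near target
               ELSE 'GREEN'                         -- on or above target
          END AS STATUS
     FROM SALES;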
Notices
IBM may not offer the products, services, or features discussed in this document in
all countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not give you
any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM
Intellectual Property Department in your country/region or send inquiries, in
writing, to:
IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106, Japan
The following paragraph does not apply to the United Kingdom or any other
country/region where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS
FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions; therefore, this statement may not apply
to you.
This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product, and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information that has been exchanged, should contact:
Such information may be available, subject to appropriate terms and conditions,
including in some cases payment of a fee.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement, or any equivalent agreement
between us.
Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments may
vary significantly. Some measurements may have been made on development-level
systems, and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been
estimated through extrapolation. Actual results may vary. Users of this document
should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements, or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility, or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
This information may contain examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious, and any similarity to the names and addresses used by an actual
business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information may contain sample application programs, in source language,
which illustrate programming techniques on various operating platforms. You may
copy, modify, and distribute these sample programs in any form without payment
to IBM for the purposes of developing, using, marketing, or distributing
application programs conforming to the application programming interface for the
operating platform for which the sample programs are written. These examples
have not been thoroughly tested under all conditions. IBM, therefore, cannot
guarantee or imply reliability, serviceability, or function of these programs.
Each copy or any portion of these sample programs or any derivative work must
include a copyright notice as follows:
Copyright IBM Corp. 2004, 2005. All rights reserved.
Trademarks
The following terms are trademarks of International Business Machines
Corporation in the United States, other countries, or both, and have been
used in at least one of the documents in the DB2 UDB documentation
library:
AIX
DB2
DB2 Connect
DB2 Universal Database
IBM
Office Connect
Redbooks
Contacting IBM
If you have a technical problem, please review and carry out the actions suggested
by the product documentation before contacting DB2 Data Warehouse Edition
Customer Support. This guide suggests information that you can gather to help
DB2 Data Warehouse Edition Customer Support to serve you better.
For information or to order any of the DB2 Data Warehouse Edition products,
contact an IBM representative at a local branch office or contact any authorized
IBM software remarketer.
If you live in the U.S.A., you can call one of the following numbers:
v 1-800-IBM-SERV (1-800-426-7378) for customer service
v 1-888-426-4343 to learn about available service options
v 1-800-IBM-4YOU (426-4968) for DB2 marketing and sales
Note: In some countries, IBM-authorized dealers should contact their dealer
support structure instead of the IBM Support Center.
Product Information
Information regarding DB2 Data Warehouse Edition is available by telephone or on
the World Wide Web at http://www.ibm.com/software/data/db2/dwe.
This site contains the latest information on the technical library, ordering books,
product downloads, newsgroups, FixPaks, news, and links to Web resources.
If you live in the U.S.A., you can call one of the following numbers:
v 1-800-IBM-CALL (1-800-426-2255) to order products or to obtain general
information.
v 1-800-879-2755 to order publications.
http://www.ibm.com/software/data/db2/udb/dwe/
Provides links to information about DB2 Data Warehouse Edition.
http://www.ibm.com/software/data/db2/9
The DB2 Web pages provide current information about news, product
descriptions, education schedules, and more.
http://www.elink.ibmlink.ibm.com/
Click Publications to open the International Publications ordering Web site
that provides information about how to order books.
http://www.ibm.com/education/certify/
The Professional Certification Program from the IBM Web site provides
certification test information for a variety of IBM products.
Accessible documentation
Documentation is provided in XHTML format, which is viewable in most Web
browsers.