You are on page 1of 9

DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

DB2 Basics: An introduction to materialized query


tables
Level: Introductory

Roman Melnyk (roman_b_melnyk@hotmail.com), DB2 Information Development, IBM Canada Ltd.


08 Sep 2005

The definition of a materialized query table (MQT) is based upon the result of a query. MQTs can
significantly improve the performance of queries. This article introduces you to MQTs, summary
tables, and staging tables, and shows you, by way of working examples, how to get up and running
with materialized query tables.

A materialized query table (MQT) is a table whose definition is based upon the result of a query. The data that is
contained in an MQT is derived from one or more tables on which the materialized query table definition is
based. Summary tables (or automatic summary tables, ASTs), which are familiar to IBM® DB2® Universal
Database™ (UDB) for Linux, UNIX®, and Windows® (DB2 UDB) users, are considered to be a specialized type
of MQT. The fullselect that is part of the definition of a summary table contains a GROUP BY clause
summarizing data from the tables that are referenced in the fullselect.

You can think of an MQT as a kind of materialized view. Both views and MQTs are defined on the basis of a
query. The query on which a view is based is run whenever the view is referenced; however, an MQT actually
stores the query results as data, and you can work with the data that is in the MQT instead of the data that is in
the underlying tables.

Materialized query tables can significantly improve the performance of queries, especially complex queries. If the
optimizer determines that a query or part of a query could be resolved using an MQT, the query might be
rewritten to take advantage of the MQT.

An MQT can be defined at table creation time as maintained by the system or maintained by the user. The
following sections introduce you to these two types of MQTs, as well as summary tables and staging tables. The
examples that follow require a connection to the SAMPLE database; if you don’t have the SAMPLE database
created on your system, you can create it by entering the db2sampl command from any command prompt.

Maintained by system MQTs


The data in this type of materialized query table is maintained by the system. When you create this type of MQT,
you can specify whether the table data will be a REFRESH IMMEDIATE or REFRESH DEFERRED. The
REFRESH keyword lets you specify how the data is to be maintained. DEFERRED means that the data in the
table can be refreshed at any time using the REFRESH TABLE statement. Neither REFRESH DEFERRED nor
REFRESH IMMEDIATE system-maintained MQTs allow insert, update, or delete operations to be executed
against them. However, REFRESH IMMEDIATE system-maintained MQTs are updated with changes made to
the underlying tables as a result of insert, update, or delete operations.

Listing 1 shows an example of creating a REFRESH IMMEDIATE system-maintained MQT. The table, which is
named EMP, is based on the underlying tables EMPLOYEE and DEPARTMENT in the SAMPLE database.
Because REFRESH IMMEDIATE MQTs require that at least one unique key from each table referenced in the
query appear in the select list, we first define a unique constraint on the EMPNO column in the EMPLOYEE
table and on the DEPTNO column in the DEPARTMENT table. The DATA INITIALLY DEFERRED clause
simply means that data will not be inserted into the table as part of the CREATE TABLE statement. After being
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 1 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

simply means that data will not be inserted into the table as part of the CREATE TABLE statement. After being
created, the MQT is in check pending state (see Demystifying table and table space states), and cannot be queried
until the SET INTEGRITY statement has been executed against it. The IMMEDIATE CHECKED clause specifies
that the data is to be checked against the MQT's defining query and refreshed; the NOT INCREMENTAL clause
specifies that integrity checking is to be done on the whole table. A query executed against the EMP materialized
query table shows that it is now fully populated with data.

Listing 1. Creating an MQT that is to be maintained by the system

connect to sample
...

alter table employee add unique (empno)


alter table department add unique (deptno)

create table emp as (select e.empno, e.firstnme, e.lastname, e.phoneno, d.deptno,


substr(d.deptname, 1, 12) as department, d.mgrno from employee e, department d
where e.workdept = d.deptno)
data initially deferred refresh immediate

set integrity for emp immediate checked not incremental

select * from emp

EMPNO FIRSTNME LASTNAME PHONENO DEPTNO DEPARTMENT MGRNO


------ ------------ --------------- ------- ------ ------------ ------
000010 CHRISTINE HAAS 3978 A00 SPIFFY COMPU 000010
000020 MICHAEL THOMPSON 3476 B01 PLANNING 000020
000030 SALLY KWAN 4738 C01 INFORMATION 000030
000050 JOHN GEYER 6789 E01 SUPPORT SERV 000050
000060 IRVING STERN 6423 D11 MANUFACTURIN 000060
000070 EVA PULASKI 7831 D21 ADMINISTRATI 000070
000090 EILEEN HENDERSON 5498 E11 OPERATIONS 000090
000100 THEODORE SPENSER 0972 E21 SOFTWARE SUP 000100
000110 VINCENZO LUCCHESSI 3490 A00 SPIFFY COMPU 000010
000120 SEAN O'CONNELL 2167 A00 SPIFFY COMPU 000010
000130 DOLORES QUINTANA 4578 C01 INFORMATION 000030
...
000340 JASON GOUNOT 5698 E21 SOFTWARE SUP 000100

32 record(s) selected.

connect reset

Maintained by user MQTs


The data in this type of materialized query table is maintained by the user. Only a REFRESH DEFERRED
materialized query table can be defined as MAINTAINED BY USER. The REFRESH TABLE statement (used
for system-maintained MQTs) cannot be issued against user-maintained MQTs. User-maintained MQTs do allow
insert, update, or delete operations to be executed against them.

Listing 2 shows an example of creating a REFRESH DEFERRED user-maintained MQT. The table, which is
named ONTARIO_1995_SALES_TEAM, is based on the underlying tables EMPLOYEE and SALES in the
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 2 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

named ONTARIO_1995_SALES_TEAM, is based on the underlying tables EMPLOYEE and SALES in the
SAMPLE database. Again, the DATA INITIALLY DEFERRED clause means that data will not be inserted into
the table as part of the CREATE TABLE statement. After being created, the MQT is in check pending state (see
Demystifying table and table space states), and cannot be queried until the SET INTEGRITY statement has been
executed against it. The MATERIALIZED QUERY IMMEDIATE UNCHECKED clause specifies that the table
is to have integrity checking turned on, but is to be taken out of check pending state without being checked for
integrity violations.

Next, to populate the MQT with some data, we will import data that had been exported from the EMPLOYEE and
SALES tables. The exporting query matches the defining query for the MQT. Then we will insert another record
into the ONTARIO_1995_SALES_TEAM table.
A query executed against the ONTARIO_1995_SALES_TEAM materialized query table shows that it is now
fully populated with the imported and inserted data, demonstrating that user-maintained MQTs can indeed be
modified directly.

Listing 2. Creating an MQT that is to be maintained by the user

connect to sample
...

create table ontario_1995_sales_team as (select distinct e.empno, e.firstnme,


e.lastname, e.workdept, e.phoneno, 'Ontario' as region,
year(s.sales_date) as year from employee e, sales s
where e.lastname = s.sales_person and year(s.sales_date) = 1995
and left(s.region, 3) = 'Ont')
data initially deferred refresh deferred maintained by user

set integrity for ontario_1995_sales_team materialized query immediate


unchecked

export to ontario_1995_sales_team.del of del


select distinct e.empno, e.firstnme, e.lastname, e.workdept, e.phoneno,
'Ontario' as region, year(s.sales_date) as year from employee e,
sales s
where e.lastname = s.sales_person and year(s.sales_date) = 1995
and left(s.region, 3) = 'Ont'
...
Number of rows exported: 2

import from ontario_1995_sales_team.del of del insert into


ontario_1995_sales_team
...
Number of rows committed = 2

insert into ontario_1995_sales_team


values ('006900', 'RUSS', 'DYERS', 'D44', '1234', 'Ontario', 1995)

select * from ontario_1995_sales_team

EMPNO FIRSTNME LASTNAME WORKDEPT PHONENO REGION YEAR


------ ------------ --------------- -------- ------- ------- -----------
000110 VINCENZO LUCCHESSI A00 3490 Ontario 1995
000330 WING LEE E21 2103 Ontario 1995
006900 RUSS DYERS D44 1234 Ontario 1995

3 record(s) selected.

connect reset
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 3 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

connect reset

Summary tables
You will recall that a summary table is a specialized type of MQT whose fullselect contains a GROUP BY clause
summarizing data from the tables that are referenced in the fullselect. Listing 3 shows a simple example of
creating a summary table. The table, which is named SALES_SUMMARY, is based on the underlying table
SALES in the SAMPLE database. Once again, the DATA INITIALLY DEFERRED clause means that data will
not be inserted into the table as part of the CREATE TABLE statement. The REFRESH DEFERRED clause
means that the data in the table can be refreshed at any time using the REFRESH TABLE statement. A query
against this MQT right after it was created, but before the REFRESH TABLE statement was issued, returns an
error. After the REFRESH TABLE statement executes, the query runs successfully.

A subsequent insert operation into the SALES table, followed by a summary table refresh and a query against the
summary table, shows that the change to the underlying table is reflected in the summary table: salesperson Lee's
total sales in the Ontario-South region have increased by 100. Similar behavior can be observed in response to
update or delete operations against the underlying SALES table.

Listing 3. Creating a summary table

connect to sample
...

create table sales_summary as (select sales_person, region, sum(sales)


as total_sales
from sales group by sales_person, region)
data initially deferred refresh deferred

select * from sales_summary

SALES_PERSON REGION TOTAL_SALES


--------------- --------------- -----------
SQL0668N Operation not allowed for reason code "1" on table
"MELNYK.SALES_SUMMARY". SQLSTATE=57016

refresh table sales_summary

select * from sales_summary

SALES_PERSON REGION TOTAL_SALES


--------------- --------------- -----------
GOUNOT Manitoba 15
GOUNOT Ontario-North 1
GOUNOT Ontario-South 10
GOUNOT Quebec 24
LEE Manitoba 23
LEE Ontario-North 8
LEE Ontario-South 34
LEE Quebec 26
LUCCHESSI Manitoba 3
LUCCHESSI Ontario-South 8
LUCCHESSI Quebec 3

11 record(s) selected.
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 4 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

11 record(s) selected.

insert into sales values ('06/28/2005', 'LEE', 'Ontario-South', 100)

refresh table sales_summary

select * from sales_summary

SALES_PERSON REGION TOTAL_SALES


--------------- --------------- -----------
...
LEE Ontario-North 8
LEE Ontario-South 134
LEE Quebec 26
...

11 record(s) selected.

update sales set sales = 50 where sales_date = '06/28/2005' and


sales_person = 'LEE'
and region = 'Ontario-South'

refresh table sales_summary

select * from sales_summary

SALES_PERSON REGION TOTAL_SALES


--------------- --------------- -----------
...
LEE Ontario-North 8
LEE Ontario-South 84
LEE Quebec 26
...

11 record(s) selected.

delete from sales where sales_date = '06/28/2005' and sales_person = 'LEE'


and region = 'Ontario-South'

refresh table sales_summary

select * from sales_summary

SALES_PERSON REGION TOTAL_SALES


--------------- --------------- -----------
...
LEE Ontario-North 8
LEE Ontario-South 34
LEE Quebec 26
...

11 record(s) selected.

connect reset

http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 5 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

Staging tables
You can incrementally refresh a REFRESH DEFERRED MQT if it has a staging table associated with it. The
staging table collects changes that need to be applied to synchronize the MQT with its underlying tables. You can
create a staging table using the CREATE TABLE statement; then, when the underlying tables of the MQT are
modified, the changes are propagated and immediately appended to the staging table. The idea is to use the
staging table to incrementally refresh the MQT, rather than regenerate the MQT from scratch. Incremental
maintenance provides significant performance improvement. The staging table is pruned when the refresh
operation is complete.

After it is created, a staging table is in a pending (inconsistent) state; it must be brought out of this state before it
can start collecting changes to its underlying tables. You can accomplish this by using the SET INTEGRITY
statement.

Listing 4 shows an example of using a staging table with a summary table. The summary table, which is named
EMP_SUMMARY, is based on the underlying table EMPLOYEE in the SAMPLE database. You'll recall that the
DATA INITIALLY DEFERRED clause means that data will not be inserted into the table as part of the CREATE
TABLE statement. The REFRESH DEFERRED clause means that the data in the table can be refreshed at any
time using the REFRESH TABLE statement. The staging table, which is named EMP_SUMMARY_S, is
associated with the summary table EMP_SUMMARY. The PROPAGATE IMMEDIATE clause specifies that any
changes made to the underlying table as part of an insert, update, or delete operation are cascaded to the staging
table. SET INTEGRITY statements are issued against both tables to take them out of their pending states.

Not unexpectedly, a query against the summary table at this point returns no data. The REFRESH TABLE
statement returns a warning, a reminder that the "integrity of non-incremental data remains unverified." This, too,
is not unexpected. Another query against the summary table returns no data as well. However, after we insert a
new row of data into the underlying EMPLOYEE table, a query against the staging table EMP_SUMMARY_S
returns one row, corresponding to the data that was just inserted. The staging table has the same three columns
that its underlying summary table has, plus two additional columns that are used by the system:
GLOBALTRANSID (the global transaction ID for each propagated row) and GLOBALTRANSTIME (the
timestamp of the transaction). Another query against the summary table returns no data, but after the REFRESH
TABLE statement executes this time, the query runs successfully.

Listing 4. Using a staging table with a summary table

connect to sample
...

create table emp_summary as (select workdept, job, count(*) as count


from employee group by workdept, job)
data initially deferred refresh deferred

create table emp_summary_s for emp_summary propagate immediate

set integrity for emp_summary materialized query immediate unchecked

set integrity for emp_summary_s staging immediate unchecked

select * from emp_summary

WORKDEPT JOB COUNT


-------- -------- -----------

0 record(s) selected.
http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 6 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

0 record(s) selected.

refresh table emp_summary


SQL1594W Integrity of non-incremental data remains unverified by the
database manager. SQLSTATE=01636

select * from emp_summary

WORKDEPT JOB COUNT


-------- -------- -----------

0 record(s) selected.

insert into employee


values ('006900', 'RUSS', 'L', 'DYERS', 'D44', '1234', '1960-05-05',
'FIELDREP', 5, 'M', '1940-04-02', 10000, 100, 1000)

select * from emp_summary_s

WORKDEPT JOB COUNT GLOBALTRANSID GLOBALTRANSTIME


-------- -------- ----------- -------------------... -----------------------------...
D44 FIELDREP 1 x'00000000000000CD' x'20050822201344536158000000'

1 record(s) selected.

select * from emp_summary

WORKDEPT JOB COUNT


-------- -------- -----------

0 record(s) selected.

refresh table emp_summary


SQL1594W Integrity of non-incremental data remains unverified by the database
manager. SQLSTATE=01636

select * from emp_summary

WORKDEPT JOB COUNT


-------- -------- -----------
D44 FIELDREP 1

1 record(s) selected.

connect reset

Summary
The SYSCAT.TABDEP system catalog view contains a row for every dependency that a materialized query table
has on some other object. You can query this view to obtain a dependency summary for the MQTs that we have
created (Listing 5). MQTs have a DTYPE value of 'S.' The TABNAME column lists the names of the MQTs, and
the BNAME column lists the names of the database objects on which the corresponding MQTs depend. The
BTYPE column identifies the object type: 'T' for table, 'I' for index, and 'F' for function instance.

http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 7 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

Listing 5. Querying the SYSCAT.TABDEP system catalog view to see MQT dependencies on other
database objects

connect to sample
...

select substr(tabname,1,24) as tabname, dtype, substr(bname,1,24) as bname, btype


from syscat.tabdep where tabschema = 'MELNYK' and dtype = 'S'

TABNAME DTYPE BNAME BTYPE


------------------------ ----- ------------------------ -----
EMP S DEPARTMENT T
EMP S EMPLOYEE T
EMP S SQL050829104058970 I
EMP S SQL050829104058800 I
EMP_SUMMARY S EMPLOYEE T
ONTARIO_1995_SALES_TEAM S LEFT1 F
ONTARIO_1995_SALES_TEAM S SALES T
ONTARIO_1995_SALES_TEAM S EMPLOYEE T
SALES_SUMMARY S SALES T

9 record(s) selected.

connect reset

We have seen that a materialized query table, whose definition is based upon the result of a query, can be thought
of as a kind of materialized view. MQTs are important because they can significantly decrease the response time
for complex queries. This article has introduced you to the basic concepts around maintained by system MQTs
and maintained by user MQTs, as well as summary tables and staging tables, and these concepts were illustrated
by working examples that you can run yourself. To learn more about materialized query tables, or for more
detailed information about any of the topics covered in this article, see the DB2 Information Center.

Resources
DB2 Universal Database for Linux, UNIX and Windows Support is the ideal place to locate resources such
as the Version 8.2 Information Center and PDF product manuals.

For the latest DB2 information online, including more detailed information about materialized query tables,
visit the DB2 Information Center.

Learn about Demystifying table and table space states in DB2 UDB.

Refer to the IBM DB2 Universal Database SQL Reference, Volume 1 and IBM DB2 Universal Database
SQL Reference, Volume 2 for detailed SQL documentation.

About the author

http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 8 of 9
DB2 Basics: An introduction to materialized query tables 01/02/2006 07:09 PM

Roman B. Melnyk , Ph.D., is a senior member of the DB2 Information Development team, specializing
in database administration, DB2 utilities, and SQL. During more than nine years at IBM, Roman has
written numerous DB2 books, articles, and other related materials. Roman coauthored DB2 Version 8:
The Official Guide (Prentice Hall Professional Technical Reference, 2003), DB2: The Complete
Reference (Osborne/McGraw-Hill, 2001), DB2 Fundamentals Certification for Dummies (Hungry
Minds, 2001), and DB2 for Dummies (IDG Books, 2000).

http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0509melnyk/ Page 9 of 9