Retek Batch Development Standards

Training Manual

RTK-BDS-00

07/10/00

Retek 9.0
Copyright Notice
Copyright © 2000 Retek Inc. All Rights Reserved.


No part of this publication may be copied without the express written permission of Retek Inc., 801 Nicollet Mall,
Suite 1100, Minneapolis, MN 55402

Trademarks
Retek 9.0 is a trademark of Retek Inc.
© 2000 Retek Inc. and its subsidiaries. All rights reserved. Retek and Active Retail Intelligence are registered
trademarks of Retek Inc. Windows is a registered trademark of Microsoft Corporation in the U.S. and/or other
countries. Microstrategy is a trademark of Microstrategy, Inc. All other trademarks and registered trademarks are
the property of their respective holders.
All other product names mentioned are trademarks or registered trademarks of their respective owners and should be
treated as such.

Retek Confidential Information


Access to this information and documentation is permitted only for its authorized business purpose by authorized
personnel subject to confidentiality and nondisclosure provisions.

Printed in the United States of America.

Contents

Module 1: Course Overview
Lesson 1: Course Overview

Module 2: Batch Defined
Lesson 1: What Is Batch and Why Does It Exist?
Lesson 2: The Batch Cycle and the Batch Window
Lesson 3: Phases and Functionality

Module 3: Programming Environment
Lesson 1: Connecting to the Development Server
Lesson 2: Basic UNIX Commands
Lesson 3: Text Editors - vi and Xemacs
Lesson 4: UNIX Concepts
Lesson 5: Retek Environment Features
Lesson 6: UNIX Advanced Commands
Lesson 7: Topics to Explore on Your Own

Module 4: Pro*C Programming
Lesson 1: Getting Started with Pro*C
Lesson 2: C Variable Types vs. Oracle Variable Types
Discussion: When does this occur?
Lesson 3: Executing SQL Statements With Pro*C
Lesson 4: NULL Values in Pro*C
Lesson 5: PL/SQL Blocks in C
Lesson 6: Signals Returned by SQL Statements
Exercise 1

Module 5: Retek Batch Programming
Lesson 1: Program Structure
Lesson 2: Variables
Lesson 3: Formatting and Style

Module 6: Error Handling & Debugging
Lesson 1: Logging Messages to the Daily Log File
Lesson 2: Writing Messages to the Program Error File
Lesson 3: Error Handling After a SQL Statement
Lesson 4: Error Handling After a PL/SQL Block
Lesson 5: Fatal vs. Nonfatal Errors
Lesson 6: Debugging Tools
Exercise 2

Module 7: Array Processing
Lesson 1: Introduction to Array Processing
Lesson 2: Arrayed Fetches
Lesson 3: Other Arrayed SQL Statements
Lesson 4: Dynamically Allocating Arrays
Exercise 3

Module 8: Table Based Restart/Recovery
Lesson 1: Restart/Recovery Purpose
Lesson 2: General Design
Lesson 3: Implementation Details
Lesson 4: New Restart Library Description
Exercise 4

Module 9: Program Multithreading
Lesson 1: Program Multithreading Purpose
Lesson 2: General Design
Lesson 3: Implementation Details
Lesson 4: R/R & Multithreading Data Model
Lesson 5: Starting R/R Programs, Restart and Refresh
Exercise 5

Module 10: File Based System Interfaces
Lesson 1: File Interface Introduction
Lesson 2: Interface Library Routines
Lesson 3: Restart/Recovery for Interface Programs

Module 11: Course Review and Evaluation
Lesson 1: Review Topics, Q & A
Lesson 2: Resources
Lesson 3: Course Evaluation

Appendix A: The C Programming Language
Lesson 1: Data Types, Control Structures, Strings
Lesson 2: Operators
Lesson 3: Functions
Lesson 4: Preprocessor Macros
Lesson 5: Libraries and Header Files

Appendix B: Programming Best Practices
Lesson 1: Coding in Context
Lesson 2: Approaching a Programming Task
Lesson 3: Functions
Lesson 4: Quality

Appendix C: Solutions to Exercises
Exercise 1
Exercise 2
Exercise 2
Exercise 3
Exercise 3 Part 2
Exercise 4

Module 1: Course Overview

This module includes the following lesson:


Lesson 1: Course Overview

Lesson 1: Course Overview


Welcome to the Retek Batch Development Standards course. This course
is intended to provide a basic understanding of the technology and tools
used in the Retek batch development environment. The concepts taught in
this course allow you to critically examine and modify existing batch
source code and develop new batch modules from detailed designs.
An important component of being effective in these tasks is to understand
how to use the tools available to you in your current programming
environment to help you create, test, and debug batch programs written in
Pro*C. These tools include your text editor, debugger, the make utility, and
the UNIX operating system itself. This week you will receive guided
practice with all of these tools as you write and test your own batch
programs.

Course Objectives
After completing this course, you will be able to:
• Effectively use Retek's batch programming languages and tools.
• Identify how and why Retek uses standards in batch coding.
• Identify how Retek uses Pro*C and take advantage of Pro*C functions
  that enhance performance and overall efficiency.
• Manage error handling.
• Improve performance using array processing.
• Understand how to dynamically size arrays.
• Use interface download programs to write data to a file that can be
  used by other programs.
• Use interface upload programs to transfer data from a file to the Retek
  database.
• Understand how both table based and file based restart/recovery
  functions are able to restart programs after fatal processing errors.
• Understand how multithreading improves the actual processing of
  data.
• Understand the layout of programs in the batch schedule.

Prerequisites
Participants in this course are expected to meet two prerequisites:
1. A basic knowledge of Pro*C and C programming, together with an
understanding of basic SQL, PL/SQL, and database concepts.
2. Experience using a command line operating system interface (e.g.,
UNIX, DOS, VMS).

Lesson Structure
The following is a list of the sections that make up a lesson and a
description of each section's purpose.

Module Overview
The Overview section provides a brief description of the topic and its
context within the class.

Objectives
The Objectives section lists lesson goals and expectations.

Lessons
The subject matter is broken down into discrete lessons to aid
comprehension.

Exercises
The Exercises section provides coding exercises relevant to each lesson.

Key Information
Key Information lists key fields or provides tables of information about the
topic.

Self-Evaluation
The Self-Evaluation chart may be used to assess your understanding and
to note areas you may wish to review or research.

Summary
This section provides a synopsis of the lesson.

What's Next
This section briefly describes the next lesson or module.

Suggestions for a Successful Experience


This training is for you and its success depends on you! To maximize your
course experience, please follow these guidelines.
• Enter into discussions enthusiastically.
• Give freely of your experiences.
• Confine discussions to the topic.
• Avoid private discussions during lecture periods. Only one person
  should talk at a time.
• Say what you think.
• Listen alertly during discussions.
• If you do not understand, ask for further clarification.
• Appreciate other people's points of view.
• Provide feedback from your experience so we can continually improve
  this course.
• Be prompt and regular in attendance, both at the beginning of each day
  and after breaks.

Summary
In this module, you learned the overall training objectives of this course
and reviewed the components included in this manual.

What's Next?
In the next module, you will be introduced to the basic vocabulary and
concepts of batch programs and how they contribute to Retek's product
offerings.

Module 2: Batch Defined

This module includes the following lessons:


Lesson 1: What Is Batch and Why Does It Exist?
Lesson 2: The Batch Cycle and the Batch Window
Lesson 3: Phases and Functionality

Module Overview
Batch programs are only one component of Retek's client/server
application architecture. This module describes the batch programming
context by defining some of the basic vocabulary used by batch
programmers and through descriptions of actual programs.

Objectives
After completing this module, you will be able to:
• Describe the purpose of batch programs in the context of Retek's
  online modules and external systems.
• Identify four general types of Retek batch programs.

Lesson 1: What Is Batch and Why Does It Exist?

What Is Batch?
The following definition was found at the Free Online Dictionary of
Computing (http://wombat.doc.ic.ac.uk/foldoc).
batch processing: A system that takes a set (a "batch") of commands or
jobs, executes them and returns the results, all without human
intervention. This contrasts with an interactive system where the user's
commands and the computer's responses are interleaved during a single
run.
A batch system typically takes its commands from a disk file (or a set of
punched cards or magnetic tape in the old days) and returns the results to a
file (or prints them). Often there is a queue of jobs that the system
processes as resources become available.
Batch programming creates the executable files containing the commands
for processing. Batch programming at Retek is often contrasted with the
programming of Retek's online product components, which uses tools such
as Oracle Forms Designer.

Why Does Batch Exist?


Much of the functionality provided by Retek's products is available
through online forms. So why are batch programs necessary? There are a
variety of valid answers to this question. Perhaps the easiest is speed.
Very large quantities of data can be processed by batch programs, far
more than would be reasonable during online transactions.
Another answer to the question "Why does batch exist?" can be found by
answering "What do these programs do?" In general, a Retek batch
program serves one of two purposes:
1. Interfacing with external systems
2. Internal maintenance

The "interfacing with external systems" category can be subdivided into
upload and download. Upload programs bring data from external systems
into Retek's database. Download programs extract data from Retek and
format it so it can be used by external systems. For example, RMS
interfaces with a retailer's suppliers to manage incoming shipments. The
batch program ediupasn, an upload program, reads in Advance Shipping
Notices (ASN). These tell buyers, warehouse workers, store managers,
and stockroom clerks what merchandise to expect, and when, before the
goods physically arrive.
The internal maintenance category of batch programs can likewise be
subdivided into system maintenance and functional maintenance. System
maintenance programs do things like update the system date, prime tables
so they are prepared for other programs to run, or delete old data.
Functional maintenance programs perform tasks like processing all the
price changes to take effect tomorrow, or building a history of all sales last
week by department.
In some cases, a functional maintenance batch program may perform a
task similar to one that can be accomplished through an online dialog.
However, while the form might process a dozen records per transaction
and a transaction might take several minutes for a person to enter, the
batch program might be capable of processing tens of thousands of records
in less time. Replenishment attribute update (single item versus an item
list) is an example of this situation.
In contrast, there are other tasks that are performed only by batch
programs. Examples include the numerous daily, weekly, monthly, and
quarterly upkeep chores, such as Sales History Rollup (hstbld). As the
design says, this weekly program "extracts sales history information for
each SKU from the rag_skus_st_hist and win_store_hist tables. The history
information is rolled up to the subclass, class, and dept level to be
written to: dept_sales_hist, class_sales_hist, and subclass_sales_hist.
For each SKU, the data include sales qty, value, gross profit, and sales
rate."
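
The rollup idea can be sketched in plain C. This is only an illustration: the struct, field names, and function name below are hypothetical, the data is held in memory rather than read from the tables named above, and the actual hstbld program may well do this work in SQL instead.

```c
#include <stddef.h>

/* Hypothetical in-memory SKU history record (not a Retek structure). */
struct sku_sales {
    int    dept;         /* department the SKU belongs to */
    double sales_qty;    /* units sold                     */
    double sales_value;  /* retail value of those units    */
};

/* Roll SKU-level rows up to one department: sum quantity and value
 * over every row whose dept matches. */
void rollup_dept(const struct sku_sales *rows, size_t n, int dept,
                 double *total_qty, double *total_value)
{
    size_t i;

    *total_qty = 0.0;
    *total_value = 0.0;
    for (i = 0; i < n; i++) {
        if (rows[i].dept == dept) {
            *total_qty   += rows[i].sales_qty;
            *total_value += rows[i].sales_value;
        }
    }
}
```

The same loop, keyed on class or subclass instead of department, would give the other two levels of the rollup.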
As you might imagine from this example, processing every SKU from
every store in a large retail organization might easily include millions of
records. This, of course, is not a reasonable task for an online transaction.
Not only would the user wait a long time for the transaction to finish
processing, but such a large portion of the total system resources might be
consumed by this one request that online performance would deteriorate
unacceptably for other users.

Lesson 2: The Batch Cycle and the Batch Window

It may be apparent at this point that running batch programs during
business hours may not be reasonable. So when and how are batch
programs run?
Typically, there is a window of opportunity during each day (or, more
likely, night) when online systems are not being used. It is this time frame
that is referred to as the batch window.
Imagine a retail organization with stores throughout the continental U.S.
Such an organization might require its online systems to be available from
8 AM EST, when its New York City offices open, until 9 PM PST, when its
west coast stores close. Because 9 PM PST is midnight EST, this leaves the
8 hours between midnight and 8 AM EST for processing all batch jobs.
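
The time zone arithmetic can be checked with a small C function. This is a minimal sketch: the function name and the convention of passing UTC offsets (EST = -5, PST = -8) are assumptions made for illustration and are not part of any Retek library.

```c
/* Hypothetical helper: length of the batch window in whole hours.
 * Hours are local times on a 24-hour clock; offsets are hours from UTC. */
int batch_window_hours(int open_hour, int open_utc_offset,
                       int close_hour, int close_utc_offset)
{
    /* Convert both local times to a UTC hour in the range 0..23. */
    int open_utc  = ((open_hour  - open_utc_offset)  % 24 + 24) % 24;
    int close_utc = ((close_hour - close_utc_offset) % 24 + 24) % 24;

    /* The window runs from the evening close to the next morning's open. */
    return ((open_utc - close_utc) % 24 + 24) % 24;
}
```

Calling batch_window_hours(8, -5, 21, -8) for the example above gives 8: the stores close at 05:00 UTC and the offices open at 13:00 UTC.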
What needs to happen during these hours? On the next page is an example
Retek Enterprise Release 3.0 Batch schedule to give some idea. It is a
diagrammatic representation of all batch programs and how they might be
sequenced.
Notice that there are several phases. Each phase must be completed
before the next can begin. Furthermore, notice that within a phase some
programs are listed in a horizontal sequence separated by vertical bars.
This demonstrates that programs within each phase may need to be run in
a particular order.

Lesson 3: Phases and Functionality


Keep in mind that the previous diagram is meant only to be a suggested
starting point for a client installation. Many programs listed are
product-specific and not all customers install all of Retek's products.
Furthermore, the timing and order within and between phases may be more
malleable than is suggested by this diagram. As a result, each client
installation is likely to create its own modified batch cycle.
Nevertheless, if you accept for the moment this hypothetical batch cycle,
here are some of its highlights.

Phases
A brief description of the phases:
• Phase 0: A maintenance pass.
• Phase 1: Primes tables for interfacing with external systems.
• Phase 2: Processes external interfaces.
• Phase 3: Processes replenishment, ordering, and stock ledger.
• Phase 4: Outputs external interface files and rebuilds changed system
  info.
• Date Set: Increments system date.
• Ad Hoc: Runs as required. May not have phase restrictions.
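
The rule that each phase must complete before the next can begin can be sketched in C as follows. The callback type, the function names, and the two stand-in phases are hypothetical; a real installation would normally enforce this ordering with a job scheduler or shell scripts rather than a C driver.

```c
#include <stddef.h>

/* Hypothetical phase callback: returns 0 on success, nonzero on failure. */
typedef int (*phase_fn)(void);

/* Stand-in phases used only for illustration. */
int phase_ok(void)   { return 0; }
int phase_fail(void) { return 1; }

/* Run the phases strictly in order and return how many completed.
 * A failed phase stops the cycle, so no later phase starts. */
size_t run_batch_cycle(const phase_fn phases[], size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (phases[i]() != 0) {
            break;
        }
    }
    return i;
}
```

A return value smaller than count tells the operator which phase has to be fixed and rerun before the cycle can continue.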

Some Selected Functionality Descriptions


The following descriptions are intended to give some sense of the
interdependencies among programs in the batch cycle. These are not to be
construed as a complete description, or even the most important aspects of
the batch cycle.

Daily Purge
The Daily Record Deletion (dlyprg) module is executed nightly to delete
all of the records in the system that have been marked for deletion by the
online system during the day. Before deletion, all relations are checked to
ensure that the record can be deleted. For example, if a staple SKU has
been marked for deletion, this module will check that the SKU has not
been placed on order later in the day. If relations are found to exist, they
are noted on a report that will itemize any problems found when running
the module, and the record will not be deleted that night.
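
The check-before-delete rule can be sketched in C. The struct, its fields, and the function name are hypothetical and stand in for whatever relation checks dlyprg actually performs against the database.

```c
#include <stdio.h>

/* Hypothetical record that the online system has marked for deletion. */
struct purge_candidate {
    const char *sku;
    int open_order_count;   /* relations that still reference the SKU */
};

/* Return 1 if the record may be deleted tonight, 0 if it must be kept.
 * Records that cannot be deleted are itemized on the report stream. */
int can_purge(const struct purge_candidate *c, FILE *report)
{
    if (c->open_order_count > 0) {
        fprintf(report, "SKU %s not deleted: %d open order(s)\n",
                c->sku, c->open_order_count);
        return 0;
    }
    return 1;
}
```

A record that fails the check stays marked for deletion, so a later nightly run can try again once the relation no longer exists.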

Sales Posting
The Sales Posting batch component processes sales data brought in from an
external system and maintains sales history within the Retek system. The
POS Upload (posupld, phase 2) module is run daily at the beginning of
batch processing. Its purpose is to process sales and return details
from an external point-of-sale system. Sales history is then collected for
each SKU in the system by the Sales History Rollup by Department, Class,
and Subclass (hstbld, phase 3) module. The collected history
information is stored at the subclass, class, and department level. For each
SKU, the data to be saved includes sales quantity, value, gross profit, and
sales rate. The hstbld module should be run with the parameter "Weekly".
After sales history is built, the Pre-Post module is run in order to maintain
the rebuild tables.

Clearance
The Clearance function is designed to provide an orderly and efficient
framework for the authorization and control of clearance markdowns. In
order to do this, RMS uses the concept of events, similar to the events used
in Pricing. Clearance events allow multiple markdowns of the same
item/zone. Also, the Clearance price for the item is not permanent: the
retail price will be reset to the regular retail after the reset date specified
for the event has passed. Therefore, the batch processes that run in the
Clearance function of RMS have two distinct parts: processing
markdown prices and processing reset prices.

The Clearance Pricing Download (pccdnld) module is run in Phase 1 of
the batch schedule to send clearance markdown retail prices to the
point-of-sale system. Clearance markdowns that are to take effect within a
predetermined number of days are written out to a table that is used
as an interface point with the point of sale, and the clearance detail records
are updated with the current date as the downloaded date. The number of
days before taking effect that clearance markdowns are downloaded is
maintained as a system option.
The Retek Clearance Pricing Extract (pccext) module is run in Phase 3 of
the batch schedule to extract the new clearance event retail pricing from
the clearance event tables and update the retail information in RMS.
Clearance markdowns that are to take place on the following business day
are selected, and the appropriate SKU/store tables have their retail prices
updated to the current clearance markdown prices and their clearance
indicators updated to "Yes".
The Reset Pricing Download (pccrdnld) module is also run in Phase 1 of
the batch schedule to send clearance reset pricing information to the point
of sale. When a clearance event is within a predetermined number of
days of its reset date, this program gathers the current clearance pricing
information and the pricing information to which an item will be reset.
This information is written out to a table that is used as an
interface point with the point of sale. Clearance records are updated to
indicate that they have had reset prices downloaded to the point of sale by
setting the reset downloaded date to the current date.
The Retek Clearance Reset Price Extract (pccrext) also runs in Phase 3 of
the batch schedule to update the SKU/store tables with the regular retail
prices and to change the clearance indicator back to "No". The
item/locations are selected from the clearance tables the night before the
pricing reset is to take place.
The Clearance Event Deletions (pccprg) module removes completed,
canceled, and rejected clearance events from the system. Before a
clearance event can be deleted, a pre-determined number of months must
have passed since the reset date of the clearance. The pre-determined
number of months is based on the value entered for the system option
Clearance Retention Months.
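
The retention test can be sketched in C. This is an illustration only: the function names are hypothetical, dates are reduced to year and month (the day of the month is ignored), and the check on event status (completed, canceled, or rejected) is left out.

```c
/* Whole months between two year/month pairs. Ignoring the day of the
 * month is a simplification made for this sketch. */
int months_between(int from_year, int from_month, int to_year, int to_month)
{
    return (to_year - from_year) * 12 + (to_month - from_month);
}

/* Return 1 if enough months have passed since the reset date for the
 * clearance event to be deleted, 0 otherwise. retention_months stands
 * in for the Clearance Retention Months system option. */
int clearance_purgeable(int reset_year, int reset_month,
                        int today_year, int today_month,
                        int retention_months)
{
    return months_between(reset_year, reset_month,
                          today_year, today_month) >= retention_months;
}
```

With a retention of 6 months, an event reset in January 2000 would be kept through June 2000 and become eligible for deletion in July 2000.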

Exercises
Having read the example descriptions of batch functionality above, how
would you answer the following questions?
1. What are the relationships among online, batch and external
systems with respect to the functionality described?

2. What contribution does batch make to the relationship?

3. When during the batch cycle do the described programs run?

4. Can you explain the relative position in the batch cycle of any one
program?

Key Information
Definitions: batch, batch programming, batch cycle, batch window
Four general types of batch programs:
• Upload
• Download
• Functional maintenance
• System maintenance

What's Next
Before you can begin creating batch programs, you must be familiar with
Retek's batch programming tools and environment. The next module will
introduce you to the UNIX operating system and discuss its suitability as
an environment for C programming.

Module 3: Programming Environment

This module includes the following lessons:


Lesson 1: Connecting to the Development Server
Lesson 2: Basic UNIX Commands
Lesson 3: Text Editors - vi and Xemacs
Lesson 4: UNIX Concepts
Lesson 5: Retek Environment Features
Lesson 6: Advanced UNIX Commands
Lesson 7: Topics to Explore on Your Own

Module Overview
To date, Retek's core products have been run exclusively on UNIX
servers. The development environment in Minneapolis is designed with
this in mind, and batch programming is done using a UNIX server, file
system, and toolset. This module introduces many of the concepts and
tools needed to work effectively in this environment.

Objectives
After completing this module, you will be able to:
• Connect to the development server.
• Navigate the file system from a command line interface.
• Create text files using vi or xemacs.
• Know where to find more information about the UNIX environment.

Lesson 1: Connecting to the Development Server

Your First Connection
You will be using your Windows-based computer to connect to the
development server. A client application is required for this connection.
One choice is telnet, a commonly found application for establishing a
login session on a remote computer.
If you are using a Win/NT-based computer, you can select Run from the
Start menu, type "telnet server_name", and click OK. The actual server
name will be provided by your instructor. Try this now. If all goes well, a
telnet client window will appear and you will be prompted for your UNIX
username and password.
Your username and password will be provided by the instructor or the IS
department.
Note: This username and password are different from those you use to
connect to the database.
Once connected, take time to change your password. Use the command
passwd as shown. Also note that the prompt shown, "UNIX>", can be set
to anything you like. At Retek, the prompt is set up to correlate to the
environment you are working in. For example, "rmsdev9.0 >" is the
default for people working in development (that's you) and on the 9.0
project.
UNIX> passwd
passwd: Changing password for daviesr
Enter login(NIS) password:
New password:
Re-enter new password:
NIS passwd/attributes changed on mspbac01
UNIX>

Choose a password that is easy to remember and type, but not easy for
someone else to guess. It is recommended that your password contain a
combination of alpha and numeric characters. After you have successfully
changed your password, end this telnet session.

Exceed
There are a variety of client applications available to provide access to the
Retek UNIX servers. Some are easier or faster to use than others. The
client most commonly used at Retek and the one that is supported by the
IS department is Exceed. Exceed should already be loaded and configured
on your machine. You should find an entry in the programs submenu of
your start menu.
Exceed is actually a suite of applications that perform a variety of
communications and networking tasks. Notice, for example, there is a
telnet client available. You can try connecting to mspdev01 just as you
did with the Windows telnet client. You may want to explore the various
applications and configurations offered by Exceed on your own time.
For now, launch the application Exceed found within the Exceed menu.
You will be prompted to log in. Use your username, and the new
password you just created.
Logging in to the Exceed X Windows server takes a few minutes in its
initial configuration.
Exceed emulates the Solaris Open Windows desktop environment, with X
Windows applications running on it. In its original configuration, Exceed
will appear as a full screen desktop. The file manager window, help
window, and toolbar are analogous in many ways to the Windows
Explorer, help, and Start menu on your Windows desktop.
If you prefer, Exceed can be reconfigured to eliminate the desktop look
and feel and simply present windows floating on your Windows desktop.
To do this, select the tools/configuration menu found by right clicking on
the Exceed button on the Windows taskbar. Double click on the Window
Mode option and select the "multiple" radio button. Hint: you may also
want to explore Exceed's passive mode.


Lesson 2: Basic UNIX Commands


If a console session was not created for you when you started Exceed, start
one now by right clicking on the Open Windows desktop and selecting
programs/console.
Despite Exceed offering lots of user-friendly GUI functionality, you are
going to use a command line interface to the operating system.
Experience with this type of interface is a prerequisite for this course.
(If you are an experienced UNIX user, this section will provide either an
easy review, or some free time to read man pages and explore the Retek
UNIX environment.)
ls ( -a -l )      List files in a directory
pwd               Where am I?
mkdir             Make a new directory
rm ( -i )         Remove file(s)
rmdir             Remove directory
mv                Move file(s)
cp                Copy file(s)
cd                Change directory (no arg, relative or absolute path)
cat               Concatenate (a file to stdout, for example)
cmp               Compare two files
diff ( -c -w )    Compare two files
passwd            Change password
cal               Calendar
find . -name      Search recursively for a file
tail              Display last 10 lines of a text file
more              Display a text file one page at a time (ctrl-f,b,c)
echo              Repeat text (its arguments to stdout, for example)
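As an informal illustration, the short session below strings several of these commands together. It is only a sketch: the scratch directory and the file names (poetry, notes.txt, and so on) are invented for the example, and mktemp is assumed to be available on your machine.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

mkdir poetry                      # make a new directory
echo "first line" > notes.txt     # create a small text file
cp notes.txt poetry/copy.txt      # copy it into the new directory
mv notes.txt original.txt         # rename (move) the original
ls -a -l                          # list everything, long format
cat poetry/copy.txt               # print the copy to stdout
cmp original.txt poetry/copy.txt && echo "files are identical"
find . -name "copy.txt"           # search recursively from here
rm poetry/copy.txt                # remove the file ...
rmdir poetry                      # ... then the now-empty directory
pwd                               # confirm where we are
```

Note that rmdir only succeeds on an empty directory, which is why the file is removed first.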


Lesson 3: Text Editors - Vi and Xemacs


Vi is a text editor found on nearly every UNIX machine. For that fact
alone it is worth learning. Learning vi can be frustrating at first because
every operation requires learning a keystroke sequence. However, once a
programmer is comfortable with it, vi is a powerful, productive tool.

Vi Cheat Sheet
Two secrets to success
1. Know what mode you are in.
2. Don't forget caps lock!
Three modes
1. Insert
2. Edit
3. Command line
Changing modes:
Esc           Get out of insert mode. It never hurts to escape a few times.
I, i, a, A    Enter insert mode: beginning of line, before cursor,
              after cursor, end of line
:             Enter command line mode


Edit mode:
h, j, k, l         Back, down, up, forward
ctrl-f, ctrl-b     Page down, page up
x                  Delete one character
dw, dd, D          Delete word, line, to end of line
cw                 Change word
yy                 Yank (copies one or more lines; 6yy would copy six lines)
p                  Paste; do right after copy or delete
u                  Undo
/                  Search (forward)
n, N               Repeat search forward, backward

Command line:
w                  Save (write to file); don't quit
q!                 Quit without saving
x                  Quit and save changes
<a line number>    Go to line

Xemacs
Emacs and xemacs are not found as widely in the UNIX universe. Still
they are in wide enough distribution that knowing this family of text
editors is a highly transferable skill. For most people, learning xemacs is a
much more pleasant experience than learning vi because it supports many
familiar mouse-driven operations (cut & paste, menus, buttons, etc.).
Don't be fooled by how easy it is to begin using, however. Xemacs is a
very powerful and programmable environment. If the xemacs GUI
is relied upon entirely, then it is certainly less efficient to use than vi. But
as one learns keystroke commands and programming for xemacs,
productivity increases dramatically.
Key features of xemacs for working at Retek include integrated help text,
C mode, auto-tabbing, syntax highlighting, multiple frames, panes and
buffers, and parenthesis matching.


Two frequently used terms in emacs documentation are control- and
meta-, which the documentation writes as C- and M-.
C- denotes that the control key and some other key are pressed
simultaneously (often followed by another keystroke). For example, C-x
C-s (control-x, control-s) saves changes in the current file to disk.
M- requires first hitting the meta key (mapped to the esc key on your
keyboard), then some other sequence of keys. For example, M-x % (meta,
x, % one at a time) begins a search and replace operation.

Some Xemacs Keystroke Commands


Navigation:
Right, left, up, down arrows,     As advertised
and page up, page down
C-n, p, b, f, a, e                Move cursor next line, previous line, back,
                                  forward, beginning of current line, end of
                                  current line
C-right, left, up, down arrows    Move cursor forward, backward one word;
                                  up, down paragraph
C-page up, C-page down            Move cursor to beginning or end of buffer
F9 <number>                       Go to line <number>

Cut, copy, paste, delete:
C-spacebar                        Begin highlighting text. Move cursor using
                                  any method above
M-w                               Copy
C-y                               Paste
C-d                               Delete


Keyboard macros:
C-x ( , C-x )             Begin and end defining keyboard macro
C-x e                     Run keyboard macro
C-<number> <command>      Repeat <command> <number> times. This works
                          for any command, but is particularly useful
                          with macros


Lesson 4: UNIX Concepts


There is an expression popular among Retek batch programmers. It
describes UNIX as a set of "small, sharp tools." The longer one works in
UNIX, the more insightful this quip becomes. More importantly, the
longer one works in UNIX while failing to see the truth in this
observation, the longer one works inefficiently. It is the goal of this lesson
to lay a foundational understanding on which you can build proficiency
quickly.
If you are new to UNIX, you may have wondered about the usefulness of
commands like cat or echo presented in the previous lesson. You may
have asked yourself when exploring cat, "How useful can it be to type
some text and have it repeated on the screen after you hit enter?" The
answer to this question and the key to working well with this set of small
sharp tools is in understanding some of the simple yet powerful design
features of UNIX.
As Kernighan and Pike put it, "What makes [UNIX] effective is ... the
idea that the power of a system comes more from the relationships among
programs than from the programs themselves. Many UNIX programs do
quite trivial tasks in isolation, but, combined with other programs, become
general and useful tools." (The UNIX Programming Environment, Brian
W. Kernighan and Rob Pike, 1984, Prentice Hall, p. viii)
So, what is UNIX?

What Is an Operating System?


In the narrowest sense, an operating system is software that controls a
system's resources and processes. It protects system-critical memory from
being mistakenly written over, and provides access to physical file storage
systems from within a program. In this sense, an operating system has
two primary responsibilities: managing software and managing hardware.
In UNIX, these functions are performed by the kernel.
In a broader sense, an operating system can include programs like
compilers, editors, command interpreters and programs to perform tasks
like copying files, creating new directories, printing, etc. Similarly, an
operating system may provide a set of libraries and applications for use by
developers. These afford applications portability by providing a common
interface across machines.


More broadly still, an operating system may be thought to include
user-written programs to perform often repeated tasks. An extensible operating
system is one that allows a user to easily create programs, thus creating an
efficient, customized working environment.

The Layers of UNIX


[Figure: the layers of UNIX, from outermost to innermost]
Other applications
User interface: standard applications (for example ls, ksh) and
standard libraries (for example fopen( ), printf( ))
Kernel: process control, file system
Hardware

Shell
So, if the operating system is all these layers and the user interface
includes programs and libraries, what does a user interact with when typing
commands in a command line interface? The short answer is a shell. Put
simply, a shell is a program that interprets commands and passes them on
to the kernel for execution.
There are a variety of shells to choose from. Each has slightly different
features and its own zealous followers. The Retek programming
environment is set up to use the Korn shell, also known as the k-shell, or
ksh. If you have no experience with another shell, then learn the k-shell.
If you are accustomed to using some other shell, feel free to use it, but
realize that Retek's scripts are written for ksh.


Shell Variables
Like all good programs a shell has variables. Use echo to view the current
value of a shell variable. The dollar sign, $, is used to signify a variable
name. Try:
echo $ORACLE_HOME

A useful feature of a shell is that the user can create new variables and assign
them values. This often serves the purpose of holding a path to a directory
to be repeatedly revisited during a session.
For example, one day you are writing a C program that uses
communication protocols. This is not a familiar job for you, so you need
to refer often to the library header files in the directory
/usr/include/protocols. You could type
cd /usr/include/protocols every time you want to visit the directory,
but this would quickly grow tiresome. Knowing about shell variables, you
decide to create a one-letter variable, n, to hold this directory
(n=/usr/include/protocols). For
the remainder of the session you can access the protocols directory by
simply typing cd $n.
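The scenario above might look like the following at the prompt. This is a sketch only: a temporary directory stands in for /usr/include/protocols, which may not exist on every machine, and mktemp is assumed to be available.

```shell
# Stand-in for /usr/include/protocols (an assumption made for portability).
dir=$(mktemp -d)

n=$dir          # assign: note there are no spaces around the equals sign
echo "$n"       # view the value
cd "$n"         # use the value in place of the long path
pwd
```

The variable lasts only as long as the shell session in which it was created.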

Command History
Another convenience provided by your shell is that it remembers the
commands you type. When you are performing a task, or sequence of
tasks over and over, you can use this command history to avoid retyping
the command(s). Your system should be configured to use the esc and the
vi editor commands to allow you to scroll through and edit your command
history.

Filename Completion
Another keystroke-saving feature is filename completion. This should be
mapped to esc-\ on your system. Use it whenever you are typing
filenames and paths.


Pattern Matching
Shell patterns (a simple relative of regular expressions) use metacharacters
to specify patterns in text. In the context of the shell they are used either
to specify directory or filenames when the exact name is unknown, or to specify
multiple files or directories with similar names.
For example, the * character stands for any sequence of zero or more
characters. If a directory contains hundreds of files and you want to view
information about only those that begin with rs and end with .pc, you could
either list all the files and then scroll up and down to find those you are
interested in, or you could issue the command
ls rs*.pc

to list only the subset of interest. Here is a list of popular shell
metacharacters.
Metacharacter    Meaning
*                Zero or more instances of any character
?                Matches any single character
[abc123]         Matches any one character listed.
                 Ranges may be listed as [a-c1-3]
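To see the three metacharacters side by side, the sketch below creates a handful of empty files and lists them with different patterns. The file names are made up for the illustration.

```shell
work=$(mktemp -d)
cd "$work"
touch rsload.pc rspost.pc rsload.c salstage.pc rs1.pc rs2.pc rsA.pc

ls rs*.pc        # every name that begins with rs and ends with .pc
ls rs?.pc        # rs, exactly one more character, then .pc
ls rs[1-3].pc    # rs, one character from the range 1-3, then .pc

star=$(ls rs*.pc | wc -l | tr -d ' ')
one=$(ls rs?.pc | wc -l | tr -d ' ')
range=$(ls rs[1-3].pc | wc -l | tr -d ' ')
echo "$star $one $range"
```

The shell expands each pattern into the list of matching names before ls runs, so ls itself never sees the metacharacters.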

Subshells
Interestingly, when a command is issued at the command line it is not run
in the current shell. Rather, a new shell program is started, the new shell
runs the command, and then the new shell is discarded leaving the user
back at the original shell when control returns. The new shell is a child
process of the original shell and inherits many of its parent's properties.
Most of the time this behavior is transparent to the user and can be
ignored. However, occasionally it is necessary to run a program in the
current shell. This is accomplished by preceding a command with a
period-space.
. /home/daviesr/.profile

This example shows the .profile executable being run in the current shell.
Running a command in the current shell may be desirable if it establishes
values for shell variables. If such a file were run in a subshell, the
variables would only have the lifespan of the subshell.
In order for a subshell to inherit the value of its parent shell's variables, the
export keyword must be used in the variable's declaration. Consider these
three examples.


Setup:
UNIX> echo 'echo $hi' > echotest
UNIX> chmod 755 echotest
UNIX> hi=hello
1. UNIX> echotest
UNIX>
2. UNIX> . echotest
hello
UNIX>
3. UNIX> export hi
UNIX> echotest
hello
UNIX>

Can you explain what is happening in each example?


Answer: Setup creates an executable shell script that prints to stdout the
value of a shell variable hi, then declares the variable and gives it a
value. Example 1 demonstrates that issuing the command echotest runs
the command in a new shell, which does not have a variable hi. Example
2 uses the . operator to run the command in the current shell, and the
value of the variable is displayed. Example 3 shows that the export command
forces subshells to inherit the parent shell's variable.
One final point about subshells. The parent shell usually waits for the
subshell to finish executing before continuing. If a process will take a
long time, the & character can follow a command to prevent the
parent from waiting.
UNIX> xemacs my_letter.txt &
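A small sketch of the same idea in a script follows. Here sleep stands in for any long-running program, and the wait command (not covered above) is used to show that the parent can still choose to pause for the child later.

```shell
# sleep stands in for a long-running program such as an editor or a compile.
sleep 1 &
child=$!                 # process id of the background job
echo "parent shell continues while process $child runs"
wait "$child"            # block here until the background job finishes
status=$?
echo "background job ended with status $status"
```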


What Is a File?
In UNIX there are two answers:
- A stream of bytes
- Everything
One of the elegant design decisions made by the creators of UNIX is that
files have no internal structure. Files of all types are merely stored, read
and written as a sequence of bytes.
Furthermore, everything is considered to be a file. It is up to the programs
that use a file to interpret its bytes in an appropriate way. For example,
the kernel does not know the difference between a program's source code
text file and its executable binary file.
Below is a list of some files. Do you normally think of these objects as
files?
- A text file
- A program executable
- A keyboard
- A terminal window
- A disk drive
- A communication session
Experiment: use od -c to view the contents of a text file, a directory and
an executable.
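For the text-file part of the experiment, something like the following can be tried. It is a sketch using a throwaway file; the contents are arbitrary.

```shell
f=$(mktemp)
printf 'hi\nthere\n' > "$f"

od -c "$f"          # each byte shown as a character; newlines appear as \n
bytes=$(wc -c < "$f" | tr -d ' ')
echo "$bytes bytes"
```

The byte count includes the two newline characters, which is a reminder that a "line" is just a convention layered on the byte stream.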


Standard Files
Every process has three files defined for it, standard in, standard out and
standard error. These are commonly written stdin, stdout, and stderr.
File      Shorthand    Default
stdin     0            Keyboard
stdout    1            Terminal
stderr    2            Terminal

In light of the "small, sharp tools" philosophy of UNIX, perhaps its most
powerful feature is redirection. Redirection allows stdin, stdout and stderr to
be specified, rather than defaulted. The greater-than >, less-than < and
pipe | characters are used for this purpose. Try these examples:
1. UNIX> ps -ef
   UNIX> ps -ef > processes.txt
2. UNIX> cat
   UNIX> cat < processes.txt
3. UNIX> cat < processes.txt > junk.txt
4. UNIX> ps -ef | more

Another example shows a useful way to save compiling errors for viewing
with a text editor. This can be useful when a program is generating many
pages of compiler errors and warnings.
UNIX> hcomp8 my_program 1>comp.out 2>&1
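The sketch below shows the same redirections with a stand-in for the compiler. The function name report is hypothetical; it simply writes one line to stdout and one to stderr so the effect of each redirection is easy to see.

```shell
work=$(mktemp -d)
cd "$work"

# report is a stand-in for any program that writes to both stdout and stderr.
report() {
    echo "normal output"
    echo "an error message" >&2
}

report > out.txt 2> err.txt     # stdout and stderr to separate files
report > both.txt 2>&1          # stderr follows stdout into one file
report 2> /dev/null | wc -l     # discard stderr, pipe stdout to another command

lines=$(wc -l < both.txt | tr -d ' ')
echo "$lines lines captured"
```

Order matters: 2>&1 must come after the stdout redirection, as it does in the hcomp8 example, because it means "send stderr wherever stdout is pointing right now."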

Now that the concept of the UNIX kernel treating everything as a file is
familiar, terms like file, directory, keyboard, terminal, etc. can revert back
to their more pedestrian meanings as long as you can, when necessary,
recall how UNIX handles them.


File Hierarchy
All directories and files are organized in a tree-like hierarchy with a single
starting point (root). Any directory can have many subdirectories but only
one parent directory. For example, a user's directory structure might look
like this:
/
    /home
        /daviesr
            /bin
                /scripts
                /binary
            /letters
            /projects
                /rtk70
                /rms80
                /rss20
            /sirs

Note the root directory (by definition the only directory without a parent)
is signified by a single forward slash, /. Notice too that each directory is
shown with a / before it. This character serves to separate directory
names when listed together. For example, this user's letters directory can
be written
/home/daviesr/letters

Three additional important symbols are a single period ., a double
period .., and the tilde ~, which signify the current directory, the
current directory's parent, and the user's home directory.

Symbol    Meaning
/         Root, or separator
.         Current directory
..        Parent directory
~         Home

As a result of the UNIX file hierarchy structure any file or directory can be
stated unambiguously by naming its entire path from root. This is known
as an absolute path. For example, in the above set of directories, the
absolute path to the rtk70 project directory is:
/home/daviesr/projects/rtk70


Furthermore, using the three special symbols { / . ..} any file or directory
can be specified relative to any other. This is known as a relative path.
For example, in the directory structure shown above, the relative path
from binary to sirs is:
../../sirs

A path may also be specified relative to one's home directory. The
following path is valid regardless of the location of the current directory:
cp afile.txt ~/bin/scripts
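To try relative and absolute paths without touching a real home directory, the sketch below rebuilds part of the example tree under a temporary directory. The temporary directory stands in for /home/daviesr; that substitution is an assumption made for the illustration.

```shell
top=$(mktemp -d)                       # stands in for /home/daviesr
mkdir -p "$top/bin/scripts" "$top/bin/binary" "$top/sirs" "$top/projects/rtk70"

cd "$top/bin/binary"
cd ../../sirs                          # relative path: up two levels, then down
here=$(basename "$(pwd)")
echo "$here"

cd "$top/projects/rtk70"               # an absolute path works from anywhere
cd ../..                               # back to the top of the example tree
back=$(pwd)
```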

Search Path
If everything in UNIX is a file, what do you suppose the commands you
have been typing in are? Files, of course! There is no special object type
"command" in UNIX that has special properties and magically performs
some behavior when typed. Commands are simply programs (usually
written in C or as a shell script). A program is executed when its name is
typed at the command line.
But how does the shell know where to find the program? There are
thousands of directories in a typical file system. Is it reasonable to search
all of them until a file with the name of the command is encountered?
Probably not, but for a moment, assume that it is. What if there is more
than one file with the same name? How does the shell know which is
desired by the user?
The answer to all these questions is in your $PATH. Try
UNIX> echo $PATH

$PATH is a colon-separated list of directories. When a command is
entered, the directories are searched in order for a file with the name of
the command. The shell then tries to execute the first one it finds.
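The effect of search order can be seen with two same-named programs in different directories. This is a sketch: the command name greet and the directory names are hypothetical.

```shell
work=$(mktemp -d)
mkdir "$work/first" "$work/second"

# Two different programs with the same (hypothetical) name, greet.
printf '#!/bin/sh\necho "from first"\n'  > "$work/first/greet"
printf '#!/bin/sh\necho "from second"\n' > "$work/second/greet"
chmod 755 "$work/first/greet" "$work/second/greet"

PATH="$work/first:$work/second:$PATH"
found=$(greet)          # the copy in the earlier directory is run
echo "$found"

PATH="$work/second:$work/first:$PATH"
found2=$(greet)         # reversing the order changes which copy is run
echo "$found2"
```

This is the mechanism behind keeping a private bin directory near the front of your PATH to override a shared version of a program.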


Control Files
There are a number of files in your home directory that control aspects of
your environment. Here is a listing of some you may see in your home
directory:
-rw-r--r--   1 you  dev      15 May  7 11:41 .DISPLAY
-rw-------   1 you  dev     735 May  4 17:48 .Xauthority
-rw-r--r--   1 you  dev     300 Aug     1998 .Xdefaults
-rw-r--r--   1 you  dev     186 Jan 16  1998 .ab_library
-rwxr-xr-x   1 you  dev     198 Jan 27 14:04 .dbxrc
-rw-r--r--   1 you  dev    1533 Apr 22 15:20 .default.wst
drwxr-xr-x  12 you  dev     512 May  4 17:50 .dt
-rwxr-xr-x   1 you  dev    5400 Oct 28  1997 .dtprofile
-rwxr-xr-x   1 you  dev    2338 Jun 18  1998 .emacs
-rw-r--r--   1 you  dev      59 Apr 22 15:20 .eserve-options
-rwxr-xr-x   1 you  dev    1530 Apr 19 14:26 .profile
-rw-------   1 you  dev      63 Apr  6 19:03 .rms_user
-rw-------   1 you  dev    1452 May  7 12:21 .sh_history
-rw-r--r--   1 you  dev     759 Apr 22 15:20 .workshop-options
-rw-r--r--   1 you  dev    1807 Apr 22 15:20 .workshoprc
-rw-r--r--   1 you  dev   15023 Jun 10  1998 .xemacs-options


As you might guess from their names, several control or customize a
particular application. Among these are .Xdefaults, .emacs,
.xemacs-options, .workshoprc, .workshop-options, .dtprofile, and .dbxrc.
Recall the command history feature provided by your shell? Browsing the
file .sh_history will give you an idea of how this feature works.
Whenever you start a login shell, your .profile script will be run. It
establishes a variety of aspects of your working environment:
- Environment variables, notably PRINTER, RETEK_HOME,
DISPLAY and PATH
- Optionally, alias, stty and other shell variable
declaration commands
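To give a feel for what such a file contains, here is a hypothetical fragment in the style of a .profile. Every value is a placeholder invented for the illustration; your actual .profile is set up for you and will differ.

```shell
# Hypothetical fragment in the style of a .profile; every value is a placeholder.
PRINTER=printer01                 # assumed printer name
RETEK_HOME=$HOME/retek            # assumed location
PATH=$PATH:$HOME/bin              # add a private bin directory to the search path
export PRINTER RETEK_HOME PATH    # make the values visible to child processes

alias ll='ls -l'                  # optional convenience alias
```

Because .profile sets variables that later commands depend on, it is the kind of file that is run in the current shell with the period-space syntax rather than in a subshell.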


Lesson 5: Retek Environment Features


Directory Structure
Each Retek project has a directory structure set up for it. The structure is
the same for every project.
Development Directories
$MMHOME/
    error/          Output directory for batch error files
    log/            Log of batch activity
    oracle/
        sqlplus/    Scripts used for installation of product
        lib/
            bin/    Library object files and archives
            lst/    Library object files
            src/    Pro*C library source and header files
        proc/
            bin/    Executable Pro*C programs
            etc/    Miscellaneous schedule documentation
            lst/    Pro*C object files
            src/    Pro*C source files

[Figure: the original diagram also marks the shortcut variables $h, $s, $l,
and $c against directories in this tree.]

*** dev, tst, and prd all have the identical structure

For ease of navigation several environment variables are set up for you by
.profile. Of them, $h, $s, $l, and $c are shown.

SQL*PLUS
With the assistance of some shell variables, sql*plus sessions can be
started from within the UNIX environment with the command sp. If you
fail to connect directly with this command, or if you connect to the wrong
schema, you may want to change the values of the appropriate shell
variables.
The variables used to create your connect string are:
$MMUSER/$PASSWORD@$ORACLE_SID.
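The way these three variables combine can be checked at the prompt. The sketch below uses placeholder values; how the sp command itself is implemented is not shown in this manual, so the final comment is only a guess at its general shape.

```shell
# Placeholder values for illustration; real ones come from your environment.
MMUSER=rmsdev
PASSWORD=secret
ORACLE_SID=dev90

connect="$MMUSER/$PASSWORD@$ORACLE_SID"
echo "$connect"
# A wrapper in the spirit of sp could then run:  sqlplus "$connect"
```

If sp connects to the wrong schema, echoing each variable in turn is a quick way to find the one that needs changing.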


Lesson 6: UNIX Advanced Commands


Use the xman application to read about each of the commands listed.
Also, experiment with using them. Explore their arguments, and use
redirection to combine multiple commands.
grep        Search for patterns in text files
chmod       Change permissions on a file or directory (recall ls -l).
set         List all shell variables and their values
man         Display UNIX manual for programs, libraries, concepts
which       Finds the executable run when a command is entered
script      Create a text file log of terminal session
dos2unix    Converts DOS text file to UNIX. Replaces control characters.
unix2dos    Converts UNIX text file to DOS.
alias       Creates a shorthand name for a command.
ps          List formatted data about processes being managed by the kernel.

Two useful control sequences used in the shell:
ctrl-d      End of file. Used to end input from keyboard, and to end a
            login shell.
ctrl-c      Stops processing the latest command, returns control to the
            shell.
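A few of these commands are sketched below, including one use of a pipe to combine them. The file vars.txt and its contents are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
printf 'ORACLE_HOME=/u01\nPRINTER=lp1\nORACLE_SID=dev\n' > vars.txt

grep ORACLE vars.txt                  # print only the lines containing ORACLE
matches=$(grep -c ORACLE vars.txt)    # count the matching lines instead

chmod 600 vars.txt                    # owner may read and write; nobody else
ls -l vars.txt                        # check the permission column
mode=$(ls -l vars.txt | cut -c1-10)

set | grep '^matches='                # set lists variables; grep narrows the list
```

The last line is the pattern worth remembering: one command produces a long listing, and grep filters it down to the part of interest.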


Lesson 7: Topics to Explore on Your Own


Some Powerful Utilities
Lint    Analyzes C source files for syntax and structure errors.
Sed     The sed utility is a stream editor that reads one or more text
        files, makes editing changes according to a script of editing
        commands, and writes the results to standard output.
Awk     The awk utility is similar to sed but allows for more
        sophisticated programming of the action to take place when a
        pattern is matched in the input file.
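As a first taste of sed and awk, the sketch below runs each against a three-line file. The file sales.txt and its layout are invented for the illustration.

```shell
work=$(mktemp -d)
cd "$work"
printf 'store 10 sales 250\nstore 20 sales 100\nstore 30 sales 400\n' > sales.txt

# sed: apply an editing command to every line and write the result to stdout.
sed 's/store/location/' sales.txt
first=$(sed 's/store/location/' sales.txt | head -1)

# awk: run an action only on lines where a condition holds.
# Fields are split on whitespace; $4 is the sales figure, $2 the store number.
awk '$4 > 200 { print $2 }' sales.txt

# awk can also accumulate values across lines and report at the end.
total=$(awk '{ sum += $4 } END { print sum }' sales.txt)
echo "$total"
```

Neither command changes sales.txt; both write to standard output, so redirection is used when the result should be kept.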

Shell Programming
It is sometimes convenient to create a new command out of a sequence of
existing commands, or even one long command. If you find yourself
typing the same command(s) repeatedly, you may want to consider
creating a new command.
Shell programs are text files containing one or more commands to be
interpreted and processed by the shell just as if they had been entered at a
command line. Control structures like for, while, and if are available.
Many shell programs are written by individual programmers for their own
use. Others (usually more complex) are written for a project team.
Examples of the latter might include change environment or compilation
scripts.
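A minimal sketch of a shell program follows, using the for and if control structures. The function name count_ext is hypothetical; it reports how many files in a directory end with a given extension.

```shell
#!/bin/sh
# count_ext: hypothetical helper that counts the files in a directory
# ending with a given extension.  Usage: count_ext <directory> <extension>
count_ext() {
    dir=$1
    ext=$2
    count=0
    for f in "$dir"/*."$ext"; do
        if [ -f "$f" ]; then        # skips the unexpanded pattern when nothing matches
            count=$((count + 1))
        fi
    done
    echo "$count"
}

work=$(mktemp -d)
touch "$work/a.pc" "$work/b.pc" "$work/c.h"
count_ext "$work" pc
```

Saved in a file, made executable with chmod, and placed in a directory on your PATH, a script like this becomes a new command.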


Exercises
Basic commands:
1. Print to stdout your current position in the file hierarchy.
2. Change directories to the top of the hierarchy one level at a time.
3. Go home.
4. Go back to root with one command.
5. Locate a poem about winters in Wyoming somewhere in the
instructor's files.
6. Create a directory to hold poetry.
7. Copy the second poem in the same directory as the one found in #5
into your new directory.
Text Editors:
Choose a text editor (or practice both) and play with your copy of the poetry
file (#7 above). Search for text, move paragraphs, insert text, append text,
replace all occurrences of existing text, etc.
Advanced commands:
1. Make sure your .rms_user has the correct permissions. If not,
change them.
2. Print to stdout all environment variables that contain ORACLE.
3. List all the processes running with your user as the process owner.
Can you explain each?
4. Find the version of refresh you are configured to use. Create your
own bin directory. Copy refresh into your bin directory. (Hint:
`<command>` can be used to run one command inside another).
Change your environment so your local version of refresh is used.

What's Next
You now are familiar with the operating environment and at least one text
editor used in Retek batch programming. Next, you will explore some
important Pro*C programming extensions to the C Programming
language.


Module 4: Pro*C Programming

This module includes the following lessons:


Lesson 1: Getting Started with Pro*C
Lesson 2: C Variable Types vs. Oracle Variable Types
Lesson 3: Executing SQL Statements with Pro*C
Lesson 4: NULL Values in Pro*C
Lesson 5: PL/SQL Blocks in C
Lesson 6: Signals Returned by SQL Statements

07/10/00


Module Overview
In this module, you will learn about Pro*C, Oracle's precompiler that
allows programmers to embed SQL statements within C code. You will
learn how to embed the SQL statements, how C variables interact with
Oracle statements, and how Pro*C handles constructs that exist in Oracle
but not in C (such as NULL values). Finally, error codes returned by SQL
will be discussed.

Objectives
After completing this module, you will be able to:
- Write a C program with SQL statements embedded in it.
- Define the differences between Oracle and C variables and know the
correct method to overcome these differences.
- Handle Oracle NULL values in a C program three different ways.
- Embed a PL/SQL block in a C program.
- Correctly interpret signals that SQL calls return.


Lesson 1: Getting Started with Pro*C


What is Pro*C?
Oracle Pro*C is used to provide Oracle SQL functionality within the C
programming language. Using Pro*C directives, SQL statements and
PL/SQL blocks can be embedded within C code.
Pro*C language syntax is not understood by the C compiler; the source
code with the embedded SQL statements must first be translated into C by
the Pro*C precompiler. The Pro*C precompiler runs before the normal C
preprocessor and converts all SQL statements (which are preceded by the
flag EXEC SQL to make detection easier) to C function calls that access
an Oracle database. This converted code is then compiled by a normal C
compiler into an executable.
To distinguish C programs that contain embedded SQL
from C programs that have been run through the Pro*C precompiler (and
contain only straight C), the former is given the .pc extension, and the
latter is given .c.
The Pro*C precompiler is called by the command proc. C programs are
compiled with the command cc. Retek currently compiles using
the make utility. For example, to compile my_prog.pc,
you would type:
>make -f $c/rms.mk my_prog

This generates my_prog.c, which is the straight C program, and my_prog,
which is the executable. Do NOT put .pc or .c on the end.
The -f tells make which file to use. For RMS, we will be using the rms.mk
file. ReSA will use resa.mk. This is for an EXISTING program. If you
are working with a program that does not exist in PVCS (it's brand new or
you're testing something with your own private program), you should use
newrms.mk instead (replace rms.mk above with newrms.mk, or
newresa.mk for a new ReSA program).


If you try compiling and you just get a message saying that dependencies
are up to date instead of getting the actual executable, it is because there is
a more recent version of the program that you are trying to compile in the
$c directory. (This will not normally happen--usually you compile a
program locally after you've just made changes to it). To get around this
problem, use the UNIX command "touch" to update the file date, like this:
>touch programname.pc

Then use the makefile again; it should compile fine.
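The reason touch helps is that make decides whether to rebuild by comparing file modification times. The sketch below imitates that comparison with find rather than running make, so it illustrates the timestamp logic only, not the Retek makefiles. The file names and dates are invented.

```shell
work=$(mktemp -d)
cd "$work"

# Pretend my_prog.pc was last edited in January 2000 and built a day later.
touch -t 200001010000 my_prog.pc
touch -t 200001020000 my_prog

before=$(find . -name my_prog.pc -newer my_prog)
echo "before touch: '$before'"     # empty: the source is older than the target

touch my_prog.pc                   # stamp the source with the current time
after=$(find . -name my_prog.pc -newer my_prog)
echo "after touch:  '$after'"      # the source now counts as newer
```

Once the source file carries a newer timestamp than the existing target, make no longer considers the target up to date.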


When you have finished coding a new batch program, make sure that it
gets added to the makefile in addition to being added to the restart script.
Have PVCS archives created for the program and the design (this will be
done by the tech team and can be combined into one TR).


Requirements for all Pro*C Programs


Two lines of code are required at the beginning of any program that has
SQL statements in it:
EXEC SQL INCLUDE SQLCA.H;

This line tells the Pro*C precompiler that the program contains SQL
statements that will need to be interpreted.
long SQLCODE;

SQLCODE is necessary for handling any errors within SQL statements.


Before any SQL statements can be executed, the program must actually
connect to the database. For this, you have a Retek-made function called
LOGON. LOGON takes two arguments, the argc and argv from the
command line, and returns an int. If its return value is less than 0, the
LOGON failed, and the program should exit immediately, because nothing
else can happen unless the program is logged in. For example:
if (LOGON(argc, argv) < 0)
   exit(-1); /* exit with an error flag */


Lesson 2: C Variable Types vs. Oracle Variable Types

The table below shows which C variable types can be associated with
which Oracle variable types:

Oracle Type    C Type
NUMBER         float, double, int, short, long, char*
VARCHAR2       char*
DATE           char*
ROWID          char*

Note that because DATE and ROWID types do not have any direct
counterparts in C, C strings are used to hold the data. Appropriate
conversion functions should be used when fetching or inserting values of
these types. Note that no explicit conversions are needed between
NUMBER and char* and vice versa; Oracle does these conversions
implicitly.
Character strings in C (char*) are not exactly like those in Oracle
(VARCHAR2). The difference is in how the length of a character string is
determined: C places a null character (\0) at the end of the string, while
Oracle keeps track of a separate field that contains the length. If a
VARCHAR2 is fetched directly into a C string, the C string won't have a
null terminator. Likewise, if a C string is inserted directly into a
VARCHAR2, the VARCHAR2's length field won't have a value. To
solve this problem, the program must explicitly tell the Pro*C precompiler
to create the code to correctly convert between these two types. This
is done with the following statement:

EXEC SQL VAR <C string name> IS STRING(<declared length of string>);


For example:
char ls_my_string[18];
char ls_store_string[NULL_STORE];
EXEC SQL VAR ls_my_string IS STRING(18);
EXEC SQL VAR ls_store_string IS STRING(NULL_STORE);

Note: The IS STRING statement uses parentheses around the length, not
brackets.

The IS STRING statement has been used throughout Retek batch code.
Familiarity breeds contempt, however, and it was soon discovered
that there was a way to do without IS STRING by using a switch during
the precompile process. This switch is invisible to the coder, but its
effects are not. It simply tells Oracle to treat ALL character arrays as
strings. This means you do not need to add the IS STRING statements,
with one very important exception: if the string has not been explicitly
given a length and you are fetching into the variable.
Discussion: When does this occur?


Lesson 3: Executing SQL Statements With Pro*C

Embedding a SQL Statement into a C Program
To execute a SQL statement from a C program, the SQL statement should
be written normally, except that it is preceded by the flag EXEC SQL.
This flag tells the Pro*C precompiler that a SQL statement is coming up,
and the precompiler should convert the statement into code that the C
compiler will understand.
Examples:
int change_store(int ii_new_store)
{
   if ( ii_new_store == -999 )
   {
      EXEC SQL DELETE FROM alloc_header;
   }
   else if ( ii_new_store == 1000 )
   {
      EXEC SQL UPDATE alloc_header
                  SET store = 1000;
   }
   return 0;
}


Using C Variables in SQL Statements


To use a C variable inside a SQL statement, a preceding colon (:) must be
used to tell the Pro*C precompiler that the variable is a C variable, rather
than a SQL or PL/SQL value. For example:
int change_store(int ii_new_store)
{
   if ( ii_new_store == -999 )
   {
      EXEC SQL DELETE FROM alloc_header;
   }
   else if ( ii_new_store == -1 )
   {
      EXEC SQL UPDATE alloc_header
                  SET wh = :ii_new_store;
   }
   else
   {
      EXEC SQL UPDATE alloc_header
                  SET store = :ii_new_store;
   }
   return 0;
}

Using Cursors in Pro*C


A cursor declaration in Pro*C is as follows:
EXEC SQL DECLARE <cursor name> CURSOR FOR <select statement>;

For example:
EXEC SQL DECLARE c_get_vdate CURSOR FOR
   SELECT TO_CHAR(vdate,'YYYYMMDD')
     FROM period;

Pro*C cursors are opened, fetched, and closed, just like in PL/SQL:
/* This function gets the date ii_days_ahead days after the vdate */
int get_future_date(char *os_future_date, int ii_days_ahead)
{
   EXEC SQL VAR os_future_date IS STRING(NULL_DATE);
   /* notice that C variables can be used in select */
   /* statements, too. */
   EXEC SQL DECLARE c_get_vdate CURSOR FOR
      SELECT TO_CHAR(vdate + :ii_days_ahead,'YYYYMMDD')
        FROM period;
   EXEC SQL OPEN c_get_vdate;
   EXEC SQL FETCH c_get_vdate INTO :os_future_date;
   EXEC SQL CLOSE c_get_vdate;
   return (0);
}

There are a few important things to note about cursors in Retek batch
programming. First, implicit cursors are never used. Second, cursors are
explicitly closed only when they are fetched from just once, which is most
often the case in the init() routine. When the program ends, any
remaining cursors are closed automatically and more efficiently. Lastly,
C variables used in the WHERE clause of a cursor are called bind
variables.
The Pro*C precompiler and the Oracle database can be very helpful in
handling bind variables. In most Retek batch programs, many number
fields in the Oracle database are handled as strings. This increases the
robustness of the software, as it is less vulnerable to errors of precision.
When such a string is compared with a numeric column in a WHERE
clause, Oracle converts it for you. An example:
EXEC SQL DECLARE c_item_info CURSOR FOR
SELECT dept,
system_ind
FROM desc_look
WHERE sku = :is_item;

Pro*C and Oracle will make the conversion between numbers and strings
automatically. Dates, on the other hand, should always be explicitly
converted to a date variable using the TO_DATE function.

Lesson 4: NULL Values in Pro*C


Oracle's NULL value has no corresponding value in C, so if a program
attempts to put a NULL value into a C variable directly, an error will be
raised. After all, how do you represent NULL in an int? But NULL
values are legitimate values, and have to be handled somehow. There are
three possible solutions to this problem: indicator variables, NVL, and
DECODE.

Indicator Variables
Indicator variables are simply C variables of type short that indicate
whether the variable they're attached to is NULL. They are attached to a
variable by being listed directly after said variable in a SQL statement.
For example:
{
   char ls_default_wh[NULL_WH];
   short li_wh_null_ind;
   EXEC SQL VAR ls_default_wh IS STRING(NULL_WH);
   EXEC SQL DECLARE c_store_info CURSOR FOR
      SELECT default_wh,
             store_name
        FROM store
       WHERE store = :is_store;
   ...
   /* default_wh is a nullable column on store, and so ls_default_wh */
   /* needs an indicator variable, li_wh_null_ind                    */
   EXEC SQL FETCH c_store_info INTO :ls_default_wh:li_wh_null_ind,
                                    :ls_store_name;
   ...
}

If the value fetched back into ls_default_wh in the example above is
NULL, then li_wh_null_ind, the associated indicator variable, will have a
value of -1 and the contents of ls_default_wh should not be relied on. If
the value fetched is not NULL, then li_wh_null_ind will not be -1 and
ls_default_wh will contain the fetched value. So, a simple check of
li_wh_null_ind will tell the program whether ls_default_wh contains a
NULL value.
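
The indicator check can be sketched in plain C. This is only an illustration: the FETCH is replaced by values passed in as arguments, copy_wh_or_empty is a hypothetical helper rather than a Retek library function, and the value of NULL_WH is assumed. The one fact relied on is the Pro*C convention that an indicator of -1 marks a NULL.

```c
#include <string.h>

#define NULL_WH 5   /* assumed width: 4 characters plus the terminator */

/* Hypothetical helper: copies the fetched warehouse into os_wh, or an
   empty string when the indicator says the column was NULL.
   Returns 1 if the value was NULL and 0 otherwise. */
int copy_wh_or_empty(const char *is_fetched_wh,
                     short       ii_wh_null_ind,
                     char       *os_wh)
{
   if (ii_wh_null_ind == -1)
   {
      /* do not trust the fetched buffer when the indicator is -1 */
      os_wh[0] = '\0';
      return 1;
   }
   strncpy(os_wh, is_fetched_wh, NULL_WH - 1);
   os_wh[NULL_WH - 1] = '\0';
   return 0;
}
```

The caller tests the indicator first and only then looks at the string, which is the order of checks the paragraph above describes.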

The same method can be used when inserting NULL values into the
database. The program simply must manually set the indicator variable to
-1 if the value is NULL or 0 if it isn't before inserting or updating. For
example:
{
   ...
   /* in this example, we're assuming that if ls_default_wh is */
   /* empty, it means we want to put NULL on the database      */
   if (strcmp(ls_default_wh,"") == 0)
      li_wh_null_ind = -1;
   else
      li_wh_null_ind = 0;
   EXEC SQL UPDATE store
               SET default_wh = :ls_default_wh:li_wh_null_ind
             WHERE store = :is_store;
   ...
}

NVL
Another option for dealing with variables that may be NULL is to use
Oracle's NVL function. Selecting NVL(database column, new value) will
return the database column value if it is not NULL and the new value if
the database column is NULL. The new value can be either a number or a
character string. For example:
EXEC SQL DECLARE c_info CURSOR FOR
   SELECT NVL(dept,1),
          NVL(TO_CHAR(orig_approval_date,'YYYYMMDD'),:ps_vdate)
     FROM ordhead;
...
EXEC SQL FETCH c_info INTO :ls_dept,
                           :ls_approve_date;

Here, ls_dept will be 1 if dept is NULL (and have the value of dept
otherwise), and ls_approve_date will be the same as ps_vdate if the
orig_approval_date was NULL.

DECODE
The Oracle DECODE function can also be used to change the value of
variables.
The usual use is
DECODE(<value to decode>,
       <match value>, <return value if decode value matches>,
       <return value if decode value doesn't match>)

For example:
DECODE(alloc_detail.wh,
-1, alloc_detail.store,
alloc_detail.wh)

will return alloc_detail.store to the variable being selected into if
alloc_detail.wh is -1, and alloc_detail.wh otherwise.
You can also choose multiple options:
DECODE(s.fill_priority,
       'M',1,
       'S',2,
       0)

will return 1 to the variable being selected into if s.fill_priority is 'M',
2 if it is 'S', and 0 otherwise.
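
For readers more at home in C than in SQL, the multiple-option DECODE on fill_priority behaves like a switch statement with a default branch. The function below is a hypothetical illustration of that mapping only; it is not code from a Retek program, and in practice the DECODE stays in the cursor so that the database does the work.

```c
/* Hypothetical C equivalent of
   DECODE(s.fill_priority, 'M', 1, 'S', 2, 0) */
int decode_fill_priority(char ic_fill_priority)
{
   switch (ic_fill_priority)
   {
      case 'M':
         return 1;
      case 'S':
         return 2;
      default:
         /* the final DECODE argument is the "otherwise" value */
         return 0;
   }
}
```

Each match/return pair in the DECODE corresponds to one case label, and the trailing unpaired argument corresponds to default.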
It is possible to use NULL for the value being matched. For example:
EXEC SQL DECLARE c_info CURSOR FOR
   SELECT DECODE(dept,
                 NULL,1,
                 dept),
          DECODE(orig_approval_date,
                 NULL,:ps_vdate,
                 TO_CHAR(orig_approval_date,'YYYYMMDD'))
     FROM ordhead;
...
EXEC SQL FETCH c_info INTO :ls_dept,
                           :ls_approve_date;

Lesson 5: PL/SQL Blocks in C


PL/SQL code can also be executed in C programs. It only needs to be
bracketed by the statements
EXEC SQL EXECUTE and END-EXEC;

For example:
if (strcmp(ps_vdate,"") == 0)
{
   EXEC SQL EXECUTE
      DECLARE
         L_plsql_variable DATE;
      BEGIN
         L_plsql_variable := GET_VDATE;
         :ps_vdate := TO_CHAR(L_plsql_variable,'YYYYMMDD');
      END;
   END-EXEC;
}
if (strcmp(ps_vdate,"19971225") == 0)
...

PL/SQL blocks are used in Retek Pro*C programs for only one
reason: to call stored PL/SQL functions or procedures. A large
amount of time overhead is involved in the context switch between C and
PL/SQL, which greatly reduces the efficiency of a program, so try to avoid
PL/SQL blocks unless absolutely necessary, such as when calling a
PL/SQL stored procedure.

The problems with C and Oracle's NULL value also occur in PL/SQL
blocks. To solve the problem, rather than using indicator variables (which
tend to be unwieldy), it is simpler to create a PL/SQL variable to receive
the potentially NULL value, and then check it and fill in the C variable
appropriately. For example:
long ll_dept_no;
char ls_order_no[NULL_ORDER_NO];
EXEC SQL VAR ls_order_no IS STRING(NULL_ORDER_NO);
...
/* GET_ORDER_DEPT is a function that will return */
/* the department associated with the inputted order, */
/* if one exists. If not, it will return NULL. */
EXEC SQL EXECUTE
   DECLARE
      L_dept ordhead.dept%TYPE;
   BEGIN
      /* We have to use a PL/SQL variable to hold the department */
      /* because a bind variable would fail if GET_ORDER_DEPT    */
      /* returned NULL.                                          */
      L_dept := GET_ORDER_DEPT(:ls_order_no);
      if L_dept is not NULL then
         /* This statement would fail if L_dept was NULL */
         :ll_dept_no := L_dept;
      else
         :ll_dept_no := -1;
      end if;
   END;
END-EXEC;

Lesson 6: Signals Returned by SQL Statements
SQLCODE
After executing a SQL statement or a PL/SQL block in a C program,
Oracle fills in the SQLCODE variable with a long integer. This integer
will indicate whether the SQL call was successful. If it was not
successful, SQLCODE will contain an error code. Some common values
for SQLCODE are listed below.
SQLCODE   Message Text
0         Successful completion.
-1        Unique key constraint violation.
1001      Invalid cursor.
1400      Cannot insert NULL into NOT NULL column.
1401      Inserted value too large for column.
1403      No data found.
1405      NULL fetched into a bind variable with no indicator variable.

Several of these signals are common enough that Retek has set up macros
to check for certain conditions. Here are the definitions from the header
std_err.h:
#define NO_DATA_FOUND      (SQLCODE == 1403)
#define SQL_ERROR_FOUND    (SQLCODE != 0 && SQLCODE != 1403)
#define DUP_VAL_FOUND      (SQLCODE == -1)

Notice that SQL_ERROR_FOUND does not catch SQLCODE = 1403.
This is because you rarely want to raise an error if no data was returned by
a cursor. Usually, the program will continue as normal.
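
How the three macros partition the possible SQLCODE values can be seen in a small plain-C sketch. Here SQLCODE is an ordinary variable set by hand instead of by Oracle, and classify_sqlcode is a hypothetical helper written for this illustration; only the macro definitions come from std_err.h as quoted above.

```c
long SQLCODE;   /* stand-in for the variable that Oracle fills in */

#define NO_DATA_FOUND      (SQLCODE == 1403)
#define SQL_ERROR_FOUND    (SQLCODE != 0 && SQLCODE != 1403)
#define DUP_VAL_FOUND      (SQLCODE == -1)

/* Hypothetical helper: pretends a SQL call just returned il_code and
   reports -1 for an error, 1 for "no data found", and 0 for success. */
int classify_sqlcode(long il_code)
{
   SQLCODE = il_code;
   if (SQL_ERROR_FOUND)
      return -1;
   if (NO_DATA_FOUND)
      return 1;
   return 0;
}
```

Testing SQL_ERROR_FOUND before NO_DATA_FOUND is safe because the two macros can never both be true for the same SQLCODE.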

To find the meaning of a particular SQLCODE value, Oracle has provided
a UNIX tool called oerr. Its usage is simple:
oerr ora <absolute value of SQLCODE>

For example:
> oerr ora 1
00001, 00000, "unique constraint (%s.%s) violated"
// *Cause: An update or insert statement attempted to insert a duplicate key
// *Action: Either remove the unique restriction or do not insert the key

NUM_RECORDS_PROCESSED
When a SQL statement is called, another variable that receives a value is
sqlca.sqlerrd, which is an array of integers defined by the Pro*C
precompiler. Only one element of the array is of particular interest:
sqlca.sqlerrd[2] contains the cumulative number of records processed by
the SQL statement so far. This value is very important when dealing with
array processing, and will be discussed further in Module 9. For now, it is
sufficient to know that the value exists and that a Retek macro has been
defined for it:
#define NUM_RECORDS_PROCESSED sqlca.sqlerrd[2]
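
Because the count is cumulative, the number of rows returned by the most recent call is the difference between the current value and the value seen before that call. The sketch below shows that subtraction in plain C. The sqlca structure here is a cut-down stand-in declared only for this example (the real one comes from the Pro*C include file), and rows_in_last_fetch is a hypothetical helper.

```c
/* Stand-in for the real sqlca structure, reduced to the one field
   this example needs. */
struct
{
   long sqlerrd[6];
} sqlca;

#define NUM_RECORDS_PROCESSED sqlca.sqlerrd[2]

/* Hypothetical helper: rows handled by the most recent call, given the
   cumulative total that was recorded before that call. */
long rows_in_last_fetch(long il_previous_total)
{
   return NUM_RECORDS_PROCESSED - il_previous_total;
}
```

If the cumulative count moves from 100 to 150 across a fetch, the helper reports 50 rows for that fetch.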

Exercise 1
Use the tmp_00.pc in your home directory as a starting point.
Use your initials to name each program (xxx_01.pc, where xxx = your
initials).
Run the scripts to create the trn_win_store table provided by your
instructor.
The program for exercise 1 should do the following:
• Declare a cursor and select vdate from the table trn_period
• Declare a cursor and select the sku/store columns from the table
  trn_win_store
• Print the date and the columns to the screen
For further practice:
• Select all columns from the table trn_win_store

Evaluation Criteria
Comfortable
You are able to understand key processes and concepts relating to the
listed topic.

Not Comfortable
You are unable to understand key processes and concepts relating to the
listed topic.

Suggestions for More Work


If necessary, your instructor will offer suggestions to you regarding where
to find additional information or practice exercises to further your
understanding of the listed topic.
Objective                               Comfortable   Not Comfortable   Suggestions for More Work

Write a C program with SQL
statements embedded in it.

Define the differences between
Oracle and C variables and know
the correct method to overcome
these differences.

Handle Oracle NULL values in a C
program three different ways.

Embed a PL/SQL block in a C
program.

Correctly interpret the signals
that SQL calls return.

Summary
In this module, you learned how to embed SQL statements and PL/SQL
blocks into C code with Oracle's Pro*C precompiler. You learned how to
use C variables in these statements correctly, making sure that all Oracle
values are interpreted correctly, including the NULL value, which can be
handled by indicator variables, NVL, or DECODE. Finally, you learned
how to interpret the signals that Oracle sends back to the C program after a
SQL call has been completed.

What's Next
In the next module, you will focus on Retek error handling and simple
debugging techniques.

Module 5: Retek Batch Programming

This module includes the following lessons:


Lesson 1: Program Structure
Lesson 2: Variables, Naming, Data Types, and Scope
Lesson 3: Formatting and Style

07/10/00

Module Overview
Now that the C programming language and its extension Pro*C have been
introduced, you need to take a look at how they are used at Retek. This
module introduces the structure, syntax and data handling conventions
used in Retek batch programs.

Objectives
After completing this module, you will be able to:
• Declare variables for use in batch programs using data typing and
  naming standards.
• Write the skeleton structure of a Retek batch program.
• Identify and use Retek C/Pro*C style standards.

Lesson 1: Program Structure


The basic structure of a Retek batch program consists of four functions:
main, init, process, final. The calling structure is shown below.
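[Diagram: main( ) calls init( ), process( ), and final( ); process( ) in
turn calls one or more program-specific functions, shown as a question
mark.]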
In general, from one program to the next there is very little variety in the
contents of the four base functions: init( ) does one-time initialization,
process( ) fetches the data to be processed, final( ) cleans up any loose
ends, and main( ) controls the calling of the others. What varies
between programs is the processing done by functions called from
process( ), represented in the diagram as the question mark module(s).

main()
The main() function is the starting point of any C program. In Retek batch
programs, the only things that happen in the main() function are:
• A connection is made to the database.
• init( ), process( ), and final( ) are called to perform the work of the
  program.
• Messages are written to the daily log file in order to indicate the
  beginning and the end of the program's run.

init()
The init( ) function is where one-time tasks, which must happen before
actual data processing, are performed:
• Restart/recovery is initialized and any outstanding bookmarks are
  retrieved.
• System-level variables and options are fetched from the database.
• Input and output files are opened for reading and writing.

Driving Cursor
The driving cursor is a SQL cursor that defines the data to be processed by
a given batch program. The tables queried by the driving cursor and
the conditions used to gather data from them determine much of the
behavior of a program and influence decisions regarding the use of the
restart/recovery libraries and multithreading views. The new 9.0 batch
standard is to put the driving cursor inside the process function unless the
cursor is referenced in several places within the batch program.

process()
The process( ) function is where the bulk of the work of a batch program
is controlled:
• The driving cursor is opened and fetched from.
• Supporting functions are called to perform program-specific functions.
• Restart/recovery is maintained by writing bookmarks to the database.
Ideally, significant processing does not occur in the process function
itself; rather, it occurs in functions called by process( ). Perhaps a better
name for this function would be process_control( ).

final()
The final( ) function is where loose ends in the program are tied up:
• Restart/recovery is closed down.
• Input and output files are closed.

Basic Outline of a Retek Batch Program


#include <retek_2.h>
EXEC SQL INCLUDE SQLCA.H;
long SQLCODE;

int main(int argc, char *argv[])
{
   char *function = "main";
   char ls_logmessage[255];

   strcpy(PROGRAM,"get_date");
   if (argc < 2)
   {
      fprintf(stderr,"Usage: get_date userid/password\n");
      exit(-1);
   }
   if (LOGON(argc, argv) < 0)
      exit(-1);
   if (init() < 0)
   {
      sprintf(ls_logmessage,"Aborted in init...");
      LOG_MESSAGE(ls_logmessage);
      exit(-1);
   }
   if (process() < 0)
   {
      sprintf(ls_logmessage,"Aborted in process...");
      LOG_MESSAGE(ls_logmessage);
      exit(-1);
   }
   if (final() < 0)
   {
      sprintf(ls_logmessage,"Aborted in final...");
      LOG_MESSAGE(ls_logmessage);
      exit(-1);
   }
   else
   {
      sprintf(ls_logmessage,"Program terminated OK");
      LOG_MESSAGE(ls_logmessage);
      exit(0);
   }
} /* end of main */

int init()
{
   char *function = "init";
   return(0);
} /* end of init */

int process()
{
   /* declare driving cursor */
   char *function = "process";

   /* open driving cursor */
   while ( 1 )
   {
      /* fetch driving cursor until no more data */
      /* call sub functions to process data */
   }
   return(0);
} /* end of process */

int final()
{
   char *function = "final";
   return(0);
} /* end of final */

Function Length and Modularity


The C language is designed to encourage dividing long, complex programs
into several short, simple functions. The basic structure of Retek
batch programs takes the first step by calling init( ), process( ), and final( )
to do work that might otherwise be done directly in main( ). However,
these functions only provide a very high-level outline of the flow of a
program. The process( ) function is the most misused. It is meant to act
as a controlling function for fetching records from the driving cursor and
processing them; it is not meant to contain all the actual work of the
program. As with main( ), it should look essentially like an outline,
mostly calling other functions that perform the actual work.
A single function should perform a single task. For functions engaged in
more complex activities, the task may be to call a series of lower-level
functions, essentially acting as another controlling function in the manner
of main( ) and process( ). If a function exceeds 100-200 lines (or 1-2
printed pages), or if its logic is difficult to trace and understand, it is
probably a candidate for division into smaller functions. Using small,
modular functions not only improves readability and maintainability, but
also encourages code reuse.

Return Values
Most functions in Retek batch programs return an integer. Return values
are interpreted as a code indicating whether an error occurred during
execution of the function according to the following rules:
•  0  The function completed without error, and processing should
      continue normally.
• -1  A fatal error occurred, and the calling function should also return
      -1, as should its calling function, and so on up to main( ), where the
      final error messages are logged and the program is halted.
•  1  A non-fatal error occurred (such as validation of an input record
      failed), and the calling function should either pass this error up
      another level or handle the exception.
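
A minimal sketch of these rules in plain C follows. Both functions are hypothetical and exist only to show the pattern: validate_sku produces all three return codes, and process_record passes a fatal error straight up while handling the non-fatal case itself by counting the rejected record.

```c
#include <stddef.h>

/* Hypothetical validation step:
   -1 fatal (no buffer was passed), 1 non-fatal (empty SKU), 0 OK. */
int validate_sku(const char *is_sku)
{
   if (is_sku == NULL)
      return -1;
   if (is_sku[0] == '\0')
      return 1;
   return 0;
}

/* Hypothetical controlling function: propagates fatal errors and
   handles non-fatal ones by counting the rejected record. */
int process_record(const char *is_sku, long *iol_reject_ctr)
{
   int li_rc = validate_sku(is_sku);

   if (li_rc < 0)
      return -1;              /* fatal: pass it up unchanged */
   if (li_rc > 0)
   {
      (*iol_reject_ctr)++;    /* non-fatal: handled at this level */
      return 0;
   }
   /* real processing of a valid record would go here */
   return 0;
}
```

A caller such as process( ) then only needs to test for a negative return value to know that the program must stop.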

Occasionally, smaller functions performing mathematical operations that
do not perform any database interaction or validation may have a
non-integer return type:
/* Return the VAT stripped from a VAT-inclusive retail amount. */
double remove_vat(double id_amt,
                  double id_vat_rate)
{
   return(id_amt - (id_amt / (1 + id_vat_rate)));
} /* end remove_vat */

Lesson 2: Variables
Naming
Descriptive identifiers
Variable names should not only indicate what information is stored by the
variable, but also suggest its purpose:
/* These variable names give no indication of their usage. */
char x[5];
char y[5];
char a[9];
double d = 0;
/* These variable names give some indication of their usage, */
/* but still rely on context for interpretation.             */
char ls_wh[NULL_LOC];
char ls_store[NULL_LOC];
char ls_sku[NULL_SKU];
double ld_qty = 0;
/* These variable names give a clear indication that */
/* they are being used within a transfer routine.    */
char ls_source_wh[NULL_LOC];
char ls_destination_store[NULL_LOC];
char ls_transfer_sku[NULL_SKU];
double ld_transfer_qty = 0;

Prefixes
Variable names in Retek batch programs are given prefixes that allow the
programmer or reader to identify their type and use without having to
reference their declarations.

C Variable Prefixes
C language variables are prefixed with characters to indicate scope and
type in accordance with the rules below. For example,
ll_record_count   /* a long integer local to a function */
gs_username       /* a globally declared string */
if_reject_file    /* a function input argument of type file pointer */

Scope
g    Global variables declared externally to the program, usually in a
     library or header file.
p    Variables global to the program.
l    Variables local to a function.
i    Parameters to a function passed with information to be used by the
     function.
o    Parameters to a function passed with the purpose of being modified
     and passed back.
io   Parameters to a function passed with information that will be used by
     the function, changed within the function, and passed back to the
     parent function.
Type
i    Integer or short.
l    Long integer.
d    Double (the C float type should never be used).
c    Single character.
s    Character string.
a    Array or structure of arrays.
f    File pointer.
Cursor Prefixes
Embedded SQL cursors are always prefixed with a lowercase c. No
scope prefix is needed:

EXEC SQL DECLARE c_vdate CURSOR FOR
   SELECT TO_CHAR(vdate,'YYYYMMDD')
     FROM period;

Embedded PL/SQL Variable Prefixes


Variables local to an embedded PL/SQL block are prefixed with an
uppercase L to indicate a local variable, in accordance with Retek
standards for PL/SQL:
L_err_msg VARCHAR2(255);

Capitalization
Variables
C variables are lowercase:
FILE *pf_final_file;
strcpy(ls_sku,"20002116");

Embedded PL/SQL variables have their prefixes in uppercase and the
descriptor in lowercase:
L_error_flag VARCHAR2(1);
L_found := 'N';

Macro Substitutions
Macro substitutions should always be in uppercase:
#define TRAN_DATA_RETURN_CODE 4
char ls_location[NULL_LOC];

Oracle Reserved Words


Oracle reserved words and PL/SQL built-in function names are
uppercase. All other identifiers are in lowercase:
EXEC SQL DECLARE c_test CURSOR FOR
   SELECT osk.sku,
          olo.loc,
          olo.loc_type,
          SUM(olo.qty_ordered),
          SUM(NVL(olo.qty_cancelled,0))
     FROM ordsku osk,
          ordloc olo
    WHERE osk.order_no = :ls_order_no
      AND osk.order_no = olo.order_no
      AND osk.sku = olo.sku
    GROUP BY osk.sku,
             olo.loc,
             olo.loc_type;
EXEC SQL FOR :ll_ins_ctr
   INSERT INTO ordhead (order_no,
                        order_type,
                        supplier,
                        not_before_date)
                VALUES (:la_insert.s_order_no,
                        :la_insert.s_order_type,
                        :la_insert.s_supplier,
                        TO_DATE(:la_insert.s_nbf,'YYYYMMDD'));

Data Types
Constants and Macros
Batch programs should not contain any hard-coded numbers. If a program
must use a constant value, that constant should be defined by a macro
at the top of the program (use #define). If the value of the constant
changes, it is much easier to modify one macro definition than every
place where the constant is used. Named constants also help make code
self-documenting.
LEN_* vs. NULL_*
Two macros are used to define the width of each string based on a
database column or a field in a file. One has the value of the maximum
width of the field and starts with the prefix 'LEN_'. The other adds room
for the null-terminating character and starts with the prefix 'NULL_':
#define LEN_LOC         4
#define NULL_LOC        5
#define LEN_LOC_TYPE    1
#define NULL_LOC_TYPE   2
...
int get_loc(char *os_location)
{
   char ls_location[NULL_LOC];
   char ls_loc_type[NULL_LOC_TYPE];
   ...
   zero_pad(LEN_LOC,ls_location);
   ...
}
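
The reason for keeping two macros per field shows up as soon as a string is padded to its full width: the buffer is declared with NULL_LOC, while the padding routine is told LEN_LOC. The sketch below illustrates this in plain C. The function zero_pad_sketch is a hypothetical stand-in written for this example; the behavior of the real zero_pad (left-padding with '0') is assumed, not taken from Retek source.

```c
#include <string.h>

#define LEN_LOC   4
#define NULL_LOC  5   /* LEN_LOC plus one byte for the terminator */

/* Hypothetical stand-in for zero_pad: left-pads ios_value with '0'
   until it is ii_len characters long. The buffer must hold at least
   ii_len + 1 bytes, which is what the NULL_* macro guarantees. */
void zero_pad_sketch(int ii_len, char *ios_value)
{
   int li_cur   = (int)strlen(ios_value);
   int li_shift = ii_len - li_cur;

   if (li_shift <= 0)
      return;   /* already full width */
   /* move the existing characters and the terminator to the right */
   memmove(ios_value + li_shift, ios_value, (size_t)li_cur + 1);
   memset(ios_value, '0', (size_t)li_shift);
}
```

A location stored as "12" in a char ls_location[NULL_LOC] buffer becomes "0012" after zero_pad_sketch(LEN_LOC, ls_location), using all five bytes of the buffer.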

Strings
IS STRING
Prior to version 9.0, in order for Oracle to map C strings to its internal
string types, all strings used as bind variables in a SQL statement or
PL/SQL block needed a two-part declaration. The first part is
the normal C declaration. The second is a statement identifying the string
as an Oracle STRING type. The IS STRING statement must declare the
string width, but it uses parentheses rather than square brackets:
char ls_sku[NULL_SKU];
EXEC SQL VAR ls_sku IS STRING(NULL_SKU);
char ls_location[NULL_LOC];
EXEC SQL VAR ls_location IS STRING(NULL_LOC);

If you are compiling using make or hcomp81 or later (ask if you're not
sure), you do not need the IS STRING declaration to let Oracle know
that your character array is a string. However, the vast majority of
existing batch code contains IS STRING declarations, and they are
still required when compiling earlier Retek versions. One difference with
the new compiler concerns strings passed into functions: the
string argument in the function declaration should be declared as a
character array instead of a character pointer. This lets Oracle know how
long the string is.
strcmp() vs. MATCH()
strcmp( ) is an ANSI C function used to lexicographically compare two
strings. MATCH( ) is a macro defined by Retek. The strcmp( ) function
returns zero (logical FALSE) if the two given strings are equivalent and
a non-zero value (logical TRUE) if they are different, based on the
difference between the first non-matching characters in the strings.
MATCH( ) works in almost the opposite manner. It returns 1 (TRUE) if
the strings match and 0 (FALSE) if they don't. In fact, MATCH( ) is
defined as !strcmp( ). Compare the readability of the following two
statements:
if( ! strcmp(is_ord_sku, ls_invc_sku) )   /* Note the '!' */
{
   ...

if( MATCH(is_ord_sku, ls_invc_sku) )
{
   ...

Use either strcmp or MATCH in your programs, but for the sake of
readability, pick one and use it consistently.
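
The relationship between the two can be tried out in plain C. The macro below is written from the description above (MATCH is !strcmp); its exact form in the Retek headers is an assumption, and same_sku is a hypothetical wrapper used only to exercise it.

```c
#include <string.h>

/* Assumed form of the Retek macro, based on the description
   "MATCH( ) is defined as !strcmp( )". */
#define MATCH(a, b) (!strcmp((a), (b)))

/* Hypothetical wrapper: returns 1 when the two SKUs are identical
   and 0 when they differ. */
int same_sku(const char *is_ord_sku, const char *is_invc_sku)
{
   return MATCH(is_ord_sku, is_invc_sku);
}
```

Because the logical NOT operator always yields 0 or 1, MATCH can be used directly as a yes/no flag, which is not true of the raw strcmp( ) result.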

Numbers
Integer
C has three integer types: short, int, and long. Shorts are the
smallest form of integer, and are used mostly for NULL-indicator
variables in SQL statements and for yes/no (1/0) flags.
Longs offer the largest range of any integer type and are used as
counters.
Ints are not normally used. Values from the database should not be
fetched into integer types, since most values in tables are either
floating-point numbers or identifiers.

Floating-Point
C has two types of floating-point numbers: float and double. Doubles are
used to hold true numeric values when arithmetic must be performed on
them. Floats offer only half the precision of doubles and should never be
used.

Numbers as Strings
Because C and UNIX place limitations on the maximum precision of a
number, it is possible for a numeric value in Oracle to be too long to fit
into a long or double. In order to minimize this possibility, numbers
should be held in C strings as much as possible. As much arithmetic as
possible should be kept in SQL cursors. If a number must be manipulated
in C, it should be stored as a double in order to provide the maximum
precision.
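
The precision limit is easy to demonstrate. The sketch below converts a string of digits to a double and back again and reports whether the digits survived. It assumes the usual IEEE 754 double with a 53-bit mantissa, and survives_double is a hypothetical helper written for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: returns 1 when the digits in is_number come back
   unchanged after a round trip through a double, and 0 otherwise. */
int survives_double(const char *is_number)
{
   char   ls_back[64];
   double ld_value = strtod(is_number, NULL);

   /* print the double with no decimal places and compare the digits */
   snprintf(ls_back, sizeof(ls_back), "%.0f", ld_value);
   return strcmp(ls_back, is_number) == 0;
}
```

An eight-digit identifier such as 20002116 survives the round trip, but the 16-digit value 9007199254740993 does not, because it cannot be represented exactly in a double. Held as a C string, the same value is stored digit for digit.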

Identifiers vs. Quantities


Many table columns defined as numeric in Oracle are not actually
numbers. That is, they do not represent a quantity or amount of any sort,
and no arithmetic will ever be performed on them. Most identifiers in the
system are numeric (such as item, location, and department), but not
numbers. These identifiers should always be fetched into strings rather
than numeric types. Because Oracle can convert a number automatically to
a string and vice versa, this requires no additional code.

Dates
Because C has no date type, Oracle dates are converted into strings for use
in batch programs. Date strings should be in the Oracle format
'YYYYMMDD'. If a timestamp is part of the date, the format should be
'YYYYMMDDHH24MISS'. The ordering of fields (year, month, then
day) allows date comparisons to be done in C by using a simple strcmp( ):
EXEC SQL DECLARE c_ord_dates CURSOR FOR
   SELECT TO_CHAR(not_before_date,'YYYYMMDD'),
          TO_CHAR(not_after_date,'YYYYMMDD')
     FROM ordhead
    WHERE order_no = :is_order_no
      AND not_after_date > TO_DATE(:is_yesterday,'YYYYMMDD');
...
EXEC SQL FETCH c_ord_dates INTO :ls_not_before_date,
                                :ls_not_after_date;
...
/*
 * If the not-before-date is "less than" (earlier than) the
 * not-after-date, strcmp() will return a number less than zero:
 *    ls_not_before_date == "19990722" (July 22, 1999)
 *    ls_not_after_date  == "19990903" (September 3, 1999)
 * ('7' is lexicographically less than '9', so strcmp() will
 * return a negative number such as -2.)
 *
 * However, if the not-before-date is later than the
 * not-after-date, strcmp() will return a number greater than zero:
 *    ls_not_before_date == "19990722" (July 22, 1999)
 *    ls_not_after_date  == "19980903" (September 3, 1998)
 * ('9' is lexicographically greater than '8', so strcmp()
 * will return a positive number such as 1.)
 */
if(strcmp(ls_not_before_date,ls_not_after_date) > 0)
{
   sprintf(err_data,"not-after-date is earlier than not-before-date");
   ...
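
The comparison can be isolated in a small plain-C function. The helper date_is_before is hypothetical and assumes both arguments are well-formed 'YYYYMMDD' strings. It tests only the sign of the strcmp( ) result, since the C standard does not fix the magnitude of the value returned.

```c
#include <string.h>

/* Hypothetical helper: returns 1 when is_first is strictly earlier
   than is_second, and 0 otherwise. Both strings must be in
   'YYYYMMDD' format for the comparison to be meaningful. */
int date_is_before(const char *is_first, const char *is_second)
{
   /* only the sign of the strcmp() result is portable */
   return strcmp(is_first, is_second) < 0;
}
```

The same function works unchanged for 'YYYYMMDDHH24MISS' timestamps, because the fields are still ordered from most significant to least significant.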

Scope
Global
The only global variables in Retek batch programs should be system-level
options, dates, and other static entities. These are set primarily in the
init( ) function. If you are tempted to declare a global variable, consider
whether you simply need static storage. If this is the case,
consider declaring a static local variable that can be passed to other
functions when necessary.
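
As a brief illustration of the static local alternative, the hypothetical function below keeps a counter alive between calls without exposing it to the rest of the program. The name and purpose are invented for this sketch.

```c
/* Hypothetical example: a sequence number that must persist between
   calls. Declaring it static inside the function gives it static
   storage without making it visible to other functions. */
long next_sequence(void)
{
   static long ll_sequence = 0;   /* initialized once, kept across calls */

   ll_sequence++;
   return ll_sequence;
}
```

Any function that needs the current value receives it as a parameter from the caller of next_sequence( ), so no other function can change the counter by accident.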

Local
Nearly all variables in Retek batch programs should be declared locally.
Where multiple functions need to share information, that information
should be passed between them as parameters. Groups of variables that
are related to each other and used together should be gathered together
into a single struct. Structs should be defined as a type at the top of the
program so they can be declared locally and passed as parameters. All
cursors should be defined locally within the function that opens them.
Variables that are changed by a PL/SQL function (output variables) must
be declared locally (within the PL/SQL block) and then copied into C
variables declared outside the block.

Parameters
Input
Input variables are passed from the parent function to the child to use in its
processing. These parameters should be passed by value whenever
possible; however, this will not always be possible. Character strings are
an obvious example, as they are always passed by reference. It is
important to note that parameters passed by reference will not necessarily
be changed by the child function. Input parameters have the scope
prefix 'i'.

Output
Output variables are passed into a function so that it can populate them
with a value that will later be used in some way by the parent. Output
parameters must be passed by reference so that the child function can
change their value. These parameters have the scope prefix 'o'.

Input/Output
Some variables are passed by a parent into a function with a value that is
used by the child. The child then performs actions that change the value
of the variable. This change affects the later behavior of the parent
function. These variables are referred to as input/output variables (similar
to the IN OUT parameter type in PL/SQL). These parameters must be
passed by reference and have the scope prefix 'io'.
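
The three parameter kinds can be seen side by side in one hypothetical function. The function and its names are invented for this sketch; only the prefixes follow the standard: id_ for a double passed by value as input, ol_ for a long that is filled in for the caller, and iod_ for a double that is read, changed, and handed back.

```c
/* Hypothetical example of input, output, and input/output parameters.
   Returns 0 on success and 1 (non-fatal) when the quantity is rejected. */
int add_line(double  id_qty,             /* input: passed by value       */
             long   *ol_lines_added,     /* output: set for the caller   */
             double *iod_running_total)  /* in/out: read, then updated   */
{
   if (id_qty < 0)
   {
      *ol_lines_added = 0;
      return 1;   /* the running total is left untouched */
   }
   *iod_running_total += id_qty;
   *ol_lines_added = 1;
   return 0;
}
```

The caller passes the addresses of its own variables for the last two arguments, for example add_line(2.5, &ll_lines, &ld_total), and can rely on ld_total having been increased only when the function returns 0.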

Lesson 3: Formatting and Style


White Space
C compilers pay no attention to spaces between symbols or breaks
between lines. Humans, on the other hand, find text much easier to read
when there are breaks between words and lines. When writing code, you
should space symbols for readability and break up long or complex
statements into multiple lines. Take care, however, not to spread code out
too far:
/* Overly compressed code is difficult to read. */
ld_vat_amt=id_amt-(id_amt/(1+id_vat_rate));
ld_total_qty=get_tsf_qty(id_in_qty)+ld_receipt_qty-(ld_rtv_packs*ld_pack_qty);
/* Properly spaced code is much easier to parse. */
ld_vat_amt = id_amt - (id_amt / (1 + id_vat_rate));
ld_total_qty = get_tsf_qty(id_in_qty)
             + ld_receipt_qty
             - (ld_rtv_packs * ld_pack_qty);
/* Gratuitously spaced code takes up   */
/*    t o o   m u c h   r o o m        */
/*       and                           */
/*          loses                      */
/*             the                     */
/*                reader.              */
ld_vat_amt = id_amt -
                (id_amt /
                   (1 +
                      id_vat_rate
                   )
                );
ld_total_qty =
   get_tsf_qty(id_in_qty)
      + ld_receipt_qty
         - (ld_rtv_packs *
               ld_pack_qty);

Indentation
Indentation helps you keep track of the depth of a statement within
functions and multiple levels of conditional or looping clauses. When
nesting lines of code, indent all new lines three spaces. DO NOT use tabs
to indent code:
/* Inconsistent indentation makes code very difficult to read, */
/* and can mislead the reader trying to determine the depth    */
/* of a statement.                                             */
int get_promotion(int pi_multi_prom_ind,
   char *is_store,
         char *os_promotion)
{
char *function = "get_promotion";
   char ls_promotion[NULL_PROM];
EXEC SQL DECLARE c_promotion CURSOR FOR
 SELECT promotion
      FROM promstore
   WHERE store = :is_store;
if(pi_multi_prom_ind)
   {
   EXEC SQL OPEN c_promotion;
      if(SQL_ERROR_FOUND)
   {
   sprintf(err_data,"Open c_promotion");
      strcpy(table,"promstore");
   WRITE_ERROR(SQLCODE,function,table,err_data);
         return(-1);
   }
EXEC SQL FETCH c_promotion INTO :ls_promotion;
   if(SQL_ERROR_FOUND)
      {
   sprintf(err_data,"Fetch c_promotion");
   strcpy(table,"promstore");
         WRITE_ERROR(SQLCODE,function,table,err_data);
   return(-1);
      }
         else if(NO_DATA_FOUND)
   {
      sprintf(err_data,"Can't find promotion");
   strcpy(table,"promstore");
      if(WRITE_ERROR(SQLCODE,function,table,err_data))
      return(-1);
   return(1);
   }
else
      {
   strcpy(os_promotion,ls_promotion);
      }
}
   else
{
strcpy(os_promotion,"");
   }
      return(0);
} /* end get_promotion */
/* Proper indentation is much easier to read. */
int get_promotion(int   pi_multi_prom_ind,
                  char *is_store,
                  char *os_promotion)
{
   char *function = "get_promotion";
   char  ls_promotion[NULL_PROM];

   EXEC SQL DECLARE c_promotion CURSOR FOR
      SELECT promotion
        FROM promstore
       WHERE store = :is_store;

   if(pi_multi_prom_ind)
   {
      EXEC SQL OPEN c_promotion;
      if(SQL_ERROR_FOUND)
      {
         sprintf(err_data,"Open c_promotion");
         strcpy(table,"promstore");
         WRITE_ERROR(SQLCODE,function,table,err_data);
         return(-1);
      }

      EXEC SQL FETCH c_promotion INTO :ls_promotion;
      if(SQL_ERROR_FOUND)
      {
         sprintf(err_data,"Fetch c_promotion");
         strcpy(table,"promstore");
         WRITE_ERROR(SQLCODE,function,table,err_data);
         return(-1);
      }
      else if(NO_DATA_FOUND)
      {
         sprintf(err_data,"Can't find promotion");
         strcpy(table,"promstore");
         if(WRITE_ERROR(SQLCODE,function,
                        table,err_data))
            return(-1);
         return(1);
      }
      else
      {
         strcpy(os_promotion,ls_promotion);
      }
   }
   else
   {
      strcpy(os_promotion,"");
   }

   return(0);
} /* end get_promotion */

Brackets
Brackets begin on the line following the statement triggering their use.
Indent them to the same depth as that statement. A line containing a
bracket should not contain any other statements, although comments are
acceptable:
/* Improper bracketing. */
int get_promotion() {
   ...
   if(pi_multi_prom_ind)
      {
      ...
      }
   ...
   if(SQL_ERROR_FOUND) { sprintf(err_data,"SQL error");
      strcpy(table,"promstore");
      WRITE_ERROR(SQLCODE,function,table,err_data);
      return(-1); }
   ...
   return(0);
} /* end get_promotion */

/* Proper bracketing. */
int get_promotion()
{
   ...
   if(pi_multi_prom_ind)
   {
      ...
   }
   ...
   if(SQL_ERROR_FOUND)
   {
      sprintf(err_data,"SQL error");
      strcpy(table,"promstore");
      WRITE_ERROR(SQLCODE,function,table,err_data);
      return(-1);
   }
   ...
   return(0);
} /* end get_promotion */


Comments
Source code is meant to explain a process in the clearest and most explicit
manner possible, so that it can be reliably translated into machine code for
the computer to execute. However, a machine's concept of clarity often
differs from that of a human reader. Furthermore, situations often arise in
which the actions being performed by a program may be so complex or
subtle (from both a technical and business standpoint) that they may
require additional explanation. Comments in a program exist to help the
reader of that program to understand sections of code when the function or
logic may not be obvious.

Format
Because comments are meant to help a reader gain a better understanding
of the program, it is important for them to be formatted for clarity.
Most comments should be placed on the line before the code they
describe:
/* Calculate the share of the retail amount that is VAT. */
ld_vat_amt = id_amt - id_amt / (1 + id_vat_rate);

Very short comments can be inserted at the end of the line of code they
describe:
ld_vat_excl_amt = id_amt / (1 + id_vat_rate); /* Strip out VAT. */

If a comment spans multiple lines, break it up and format it to keep it
distinct from the code. Do not use one set of comment characters to
define multiple line comments.
/* A long comment can start to bleed into the code. */
/* Determine if the current fetch has returned any rows for
processing by comparing the number of records returned by
the current fetch with the number returned by previous
fetches. */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...

/* A comment broken up into multiple lines keeps itself distinct. */
/* Determine if the current fetch has returned     */
/* any rows for processing by comparing the        */
/* number of records returned by the current fetch */
/* with the number returned by previous fetches.   */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...

/* There are different ways of formatting comments. */
/*
 * Determine if the current fetch has returned
 * any rows for processing by comparing the
 * number of records returned by the current fetch
 * with the number returned by previous fetches.
 */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...

If a comment needs special emphasis, you should format it in such a way
as to distinguish it from code and other comments:
/* Sometimes, important comments might be missed. */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...
}
else break; /* If no records were returned, processing is finished. */

/* "Embellishing" comments attracts the reader's */
/* attention to important points in the code.    */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...
}
/************************************************************/
/*** If no records were returned, processing is finished. ***/
/************************************************************/
else break;

/* There are many ways to highlight a comment. */
if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...
}
/*------------------------------------------------------*\
| If no records were returned, processing is finished. |
\*------------------------------------------------------*/
else break;

if(ll_recs_to_process = (NUM_RECS_PROCESSED -
                         ll_previous_recs_fetched))
{
   ...
}
/****
 ****
 **** IF NO RECORDS WERE RETURNED, PROCESSING IS FINISHED.
 ****
 ****
 ****/
else break;

However you decide to format comments, take care to maintain consistent
formatting throughout a program so the reader is not confused.


Content
Comments are meant to make the code more readable and clarify complex
passages. In order for comments to be effective, they must be written
clearly.
Do not include comments to reiterate the obvious:
/* Add the fetched quantity to the total quantity. */
ld_total_qty += ld_fetched_qty;

When maintaining code (fixing, customizing, enhancing), be careful to
maintain all associated comments as well. Incorrect or misleading
comments can lead to serious misunderstandings and errors. This is the
primary reason why self-documenting code is desirable. Well-named
variables, functions, and constants, as well as precise parameter passing,
all contribute greatly to self-documenting code.

Module 6: Error Handling & Debugging

This module includes the following lessons:


Lesson 1: Logging messages to the daily log file
Lesson 2: Writing messages to the program error file
Lesson 3: Error handling after a SQL statement
Lesson 4: Error handling after a PL/SQL block
Lesson 5: Fatal vs. nonfatal errors
Lesson 6: Debugging Tools


Module Overview
In this module, you will learn how to add error messaging to your Pro*C
programs, and will be introduced to some common debugging tools used
at Retek.
Most programs will write to a daily log file to record program information
and to a program error file to store errors experienced by the programs.
This module will discuss when these messages should be written and
where the messaging routines should be placed in a program.
Finally, a brief discussion of three types of debugging tools used by Retek
is included.

Objectives
After completing this module, you will be able to:
• Find the daily log file and write messages to it with LOG_MESSAGE.
• Find a program's error file and write messages to it with
  WRITE_ERROR.
• Insert proper error handling routines after SQL statements.
• Insert proper error handling routines after PL/SQL blocks that call
  batch-enabled or non-batch-enabled PL/SQL functions.
• Define the differences between fatal and non-fatal errors and the
  differences in dealing with them.
• Identify three different types of debugging tools and describe their
  uses.


Lesson 1: Logging Messages to the Daily Log File

The Daily Log File
Every batch program should write a message to the daily log file when it
starts and when it finishes. These messages are logged in the main( )
function.
The daily log file is kept in a directory off of the Retek batch home
directory at $MMHOME/log/. The name of the log file is the three-letter
abbreviation of the month, an underscore, and the day of the month with
two digits, with '.log' as an extension. If the date is January 5, the location
and the name of the log file will be:
$MMHOME/log/Jan_05.log
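
The naming rule can be sketched in plain C. The helper build_log_name is hypothetical (it is not part of the Retek libraries), and the sketch assumes the default "C" locale, in which the %b conversion of strftime yields the three-letter English month abbreviation:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: builds "Mon_DD.log" (for example,          */
/* "Jan_05.log") from a broken-down date. Returns 0 on success,    */
/* or -1 if an argument is NULL or the buffer is too small.        */
int build_log_name(const struct tm *ip_date,
                   char            *os_name,
                   size_t           il_size)
{
   if(ip_date == NULL || os_name == NULL)
      return(-1);

   /* strftime returns 0 when the result does not fit in os_name. */
   if(strftime(os_name, il_size, "%b_%d.log", ip_date) == 0)
      return(-1);

   return(0);
}
```

The full path would be formed by prefixing the result with the value of $MMHOME and "/log/".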

A message written to the log file has a date stamp, the name of the
program, and a message stating either that the program has started, or that
it has finished (successfully or not):
Mon Jan 25 18:17:26 Program: posupld: Started by rmsdev80user
Mon Jan 25 18:17:47 Program: posupld: Thread [1] Terminated OK.

LOG_MESSAGE( )
Messages are written to the log file by the LOG_MESSAGE( ) function.
This function takes a single string as a parameter and automatically adds
the timestamp and program name. If a variable needs to be written to the
log file, it should be printed into a formatted string, and the formatted
string should be sent into the LOG_MESSAGE( ) function:
sprintf(logmessage, "Thread [%d] - Terminated OK.",
pl_commit_max_ctr);
LOG_MESSAGE(logmessage);
exit(0);


Lesson 2: Writing Messages to the Program Error File

The Program Error File
In addition to the daily log file, to which each program writes a starting
and finishing message, each program also needs to write its own error
messages. Rather than clutter the daily log file with these messages, each
program writes errors out to its own daily file.
The program error file is kept in a directory off of the project's batch home
directory at $MMHOME/error/. All errors for a given program on a given
day go into a single file. The name of the program's error file is the prefix
'err.', the program name, the thread number (where applicable), and the
date as a suffix. On January 5, all errors for the second thread of the
posupld program would be placed in the following file:
$MMHOME/error/err.posupld_2.Jan_05
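
As a rough illustration, the file name can be assembled as shown below. The helper build_err_name is hypothetical. The form used when no thread number applies ("err.<program>.<date>") is an assumption, as is treating a thread number of zero or less as "no thread"; only the threaded form appears in this manual:

```c
#include <stdio.h>

/* Hypothetical helper: builds "err.<program>_<thread>.<date>".     */
/* When il_thread <= 0 (assumed to mean "no thread"), it builds     */
/* "err.<program>.<date>" instead. is_date is a suffix such as      */
/* "Jan_05". Returns 0 on success, -1 on bad input or truncation.   */
int build_err_name(const char *is_program,
                   long        il_thread,
                   const char *is_date,
                   char       *os_name,
                   size_t      il_size)
{
   int li_len;

   if(is_program == NULL || is_date == NULL || os_name == NULL)
      return(-1);

   if(il_thread > 0)
      li_len = snprintf(os_name, il_size, "err.%s_%ld.%s",
                        is_program, il_thread, is_date);
   else
      li_len = snprintf(os_name, il_size, "err.%s.%s",
                        is_program, is_date);

   /* snprintf reports the length it needed; reject truncation. */
   if(li_len < 0 || (size_t)li_len >= il_size)
      return(-1);

   return(0);
}
```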

A message written to the program error file has the program name and
thread number, a time stamp, the function where the error occurred, any
related database tables, an error code (usually the Oracle server error
number), the Oracle error message, and a program error message:
posupld_1~19981222101405~validate_promotion~promhead~S~1403~ORA-1403: No Data Found~Record 0000000433: fetch c_promotion where promotion: 5197
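
Because the fields are separated by tildes, a record can be pulled apart with a few lines of C when you need to inspect an error file. The helper below is a hypothetical sketch, not a Retek library routine. It makes no assumption about what each field means; note that the sample record has eight fields, one of which (the "S") is not described in the list above:

```c
#include <stddef.h>

/* Hypothetical helper: splits one error-file record in place by    */
/* replacing each '~' with '\0' and storing a pointer to the start  */
/* of every field in oas_fields. Returns the number of fields, or   */
/* -1 on bad arguments or if there are more than ii_max fields (in  */
/* which case the record may already be partly split).              */
int split_error_record(char  *ios_record,
                       char **oas_fields,
                       int    ii_max)
{
   int   li_count = 0;
   char *lp_cur   = ios_record;

   if(ios_record == NULL || oas_fields == NULL || ii_max < 1)
      return(-1);

   oas_fields[li_count++] = lp_cur;
   for(; *lp_cur != '\0'; lp_cur++)
   {
      if(*lp_cur == '~')
      {
         if(li_count >= ii_max)
            return(-1);
         *lp_cur = '\0';
         oas_fields[li_count++] = lp_cur + 1;
      }
   }

   return(li_count);
}
```

Applied to the sample record, the first field is "posupld_1" and the sixth is the error code "1403".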

Messages are written to the error file using the WRITE_ERROR function.


WRITE_ERROR( )
WRITE_ERROR(SQLCODE,function,table,err_data);

Parameters
Error code: A numeric value identifying the error. It should never
contain a hard-coded literal value. Instead, SQLCODE or one of the
RET_*_ERR constants should be passed in here.

Value: SQLCODE (sqlca.sqlcode)
Use: The most commonly used error code. Declared as a long at the
top of the program, preferably just after the inclusion of SQLCA.H.
SQLCODE is populated by Oracle after every SQL statement and
holds the Oracle server error number.

Value: RET_FUNCTION_ERR (#defined as 103)
Use: For generic, non-SQL-related errors.

Value: RET_FILE_ERR (#defined as 104)
Use: For file I/O errors.

Value: RET_PROC_ERR (#defined as 105)
Use: For a PL/SQL block that has failed.

Function: The name of the function in which the error occurred. Function
is a local variable declared as a string at the beginning of each function:
int get_vat_rate()
{
   char *function = "get_vat_rate";
   ...

Table: A list of the tables on which the error occurred. Table is a global
string variable. It is set whenever WRITE_ERROR() is called after a SQL
statement. It should be populated with the name of the tables accessed by
the SQL statement:
strcpy(table,"ordsku, ordloc");

If no tables are associated with the SQL statement, use an empty string, "".
Error data: A formatted string describing the error condition. This
should provide reasonable detail to users to help them figure out where the
error occurred. This should not repeat any information in the program,
function, or table parameters. Use the err_data global variable.
This string should describe the action taken, the cursor involved (if
applicable), and any associated bind variables. The error message is the
only evidence of a problem and the only guide to finding that problem, so
it should be as helpful as possible:
if (SQL_ERROR_FOUND)
{
   sprintf(err_data,"Fetch c_ordloc for order_no %s, sku %s",
           ls_order_no, ls_sku);
   ...
}

Note that the WRITE_ERROR function was modified for RMS 9.0 and all
later releases: parameters that were deemed no longer useful were
removed. This is mentioned only as a reminder for when you have to work
with code written before the 9.0 version. The old WRITE_ERROR had eight
arguments, as opposed to four, and looked like this:
WRITE_ERROR(SQLCODE,"S",PROGRAM,
            function,table,err_data,"","");


Lesson 3: Error Handling After a SQL Statement

Every embedded SQL statement should be followed by an error handling
clause. This clause should trap fatal SQL errors. If any are found, a
message should be written to the program error file, and the function
should return -1. This signals to all parent functions that the program
should be halted:
EXEC SQL DECLARE c_item_info CURSOR FOR
   SELECT dept,
          system_ind
     FROM desc_look
    WHERE sku = :is_item;
...
EXEC SQL OPEN c_item_info;
if(SQL_ERROR_FOUND)
{
   sprintf(err_data,"Open c_item_info for item %s",is_item);
   strcpy(table,"desc_look");
   WRITE_ERROR(SQLCODE,function,table,err_data);
   return(-1);
}

EXEC SQL FETCH c_item_info INTO :ls_item_dept,
                                :ls_system_ind;
if(SQL_ERROR_FOUND)
{
   sprintf(err_data,"Fetch c_item_info for item %s",is_item);
   strcpy(table,"desc_look");
   WRITE_ERROR(SQLCODE,function,table,err_data);
   return(-1);
}


NO_DATA_FOUND
Depending on the situation, a cursor that returns no rows (or an UPDATE
or DELETE that affects no rows) may be an error, or it may have some
other significance. For this reason, a NO_DATA_FOUND
(SQLCODE==1403) condition is excluded from the
SQL_ERROR_FOUND condition and must be trapped in a different way:
...
EXEC SQL FETCH c_item_info INTO :ls_item_dept,
                                :ls_system_ind;

/* If there is a problem with the FETCH statement itself,   */
/* log an error message and send the error up to the caller. */
if(SQL_ERROR_FOUND)
{
   sprintf(err_data,"Fetch c_item_info for item %s",is_item);
   strcpy(table,"desc_look");
   WRITE_ERROR(SQLCODE,function,table,err_data);
   return(-1);
}
/* If the item does not exist, simply exit the function without error. */
else if(NO_DATA_FOUND)
{
   return(0);
}
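
To make the relationship between the two conditions concrete, here is a plain-C sketch. The macro definitions are assumptions made for illustration; the real SQL_ERROR_FOUND and NO_DATA_FOUND macros come from the Retek library headers and may be written differently. SQLCODE is simulated with an ordinary variable, and check_fetch is a hypothetical helper:

```c
/* Stand-in for sqlca.sqlcode; in a Pro*C program Oracle sets it. */
static long SQLCODE = 0;

/* Assumed definitions, for illustration only. */
#define NO_DATA_FOUND   (SQLCODE == 1403)
#define SQL_ERROR_FOUND (SQLCODE != 0 && !NO_DATA_FOUND)

/* Hypothetical helper: classifies the state after a FETCH.         */
/* Returns -1 for a SQL error, 1 when no row was found, 0 when a    */
/* row was fetched. What "no row" means is left to the caller.      */
int check_fetch(long il_sqlcode)
{
   SQLCODE = il_sqlcode;

   if(SQL_ERROR_FOUND)
      return(-1);
   else if(NO_DATA_FOUND)
      return(1);

   return(0);
}
```

Because a code of 1403 never satisfies SQL_ERROR_FOUND in this sketch, the else-if branch is the only place where the "no rows" case can be handled.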


Lesson 4: Error Handling After a PL/SQL Block

The success or failure of an embedded PL/SQL block can be assessed
using SQLCODE. However, SQLCODE can only be used to evaluate
whether the block as a whole has executed properly. The success of a
stored procedure, package, or function that is called in an embedded
PL/SQL block cannot be determined from SQLCODE.
The return value of a stored procedure should be assessed inside the block
and a flag should be set if the return value indicates failure. Furthermore,
if the package fails, the error message that is returned from the function
should be copied into a bind variable for writing to the error file. Code
should be placed after the block to evaluate the flag in order to determine
if WRITE_ERROR should be called.
In addition to checking the return value of the package, any errors in the
embedded PL/SQL block itself must also be trapped:
EXEC SQL EXECUTE
   DECLARE
      L_return   BOOLEAN       := FALSE;
      L_message  VARCHAR2(255) := NULL;
   BEGIN
      L_return := PACKAGE.FUNCTION(L_message,
                                   :var1,
                                   :var2);
      if L_return = FALSE then
         :li_plsql_flag := 1;
         :ls_plsql_msg  := L_message;
      else
         :li_plsql_flag := 0;
      end if;
   END;
END-EXEC;

/* See if the PL/SQL block itself failed. */
if (SQL_ERROR_FOUND)
{
   sprintf(err_data,"Error in call to PACKAGE.FUNCTION");
   WRITE_ERROR(SQLCODE,function,"",err_data);
   return(-1);
}

/* See if there was an error inside the package. */
if (li_plsql_flag)
{
   sprintf(err_data,"Error in PACKAGE.FUNCTION - %s",
           ls_plsql_msg);
   WRITE_ERROR(RET_PROC_ERR,function,"",err_data);
   return(-1);
}


Notice the PL/SQL block does not have an exception section. Although it
would not be wrong to add an exception section here, it is considered
unnecessary in Retek batch programming because the only thing you use
a PL/SQL block for is to call stored functions. The only possible problems
encountered when calling functions are caught by the function returning
false, or by SQL_ERROR_FOUND.
As an example and a warning, here is a sample of awful error handling
from an existing batch program (now corrected!). Note that code like this
CAN slip through the cracks; that is, the people in charge of reviewing
code will sometimes miss things. Can you find what's wrong with this?
if( strncmp(is_rev_no, "-1", NULL_SA_REV_NO) == 0)
{
   EXEC SQL OPEN c_sa_tran_tender;
   if (SQL_ERROR_FOUND)
   {
      sprintf( err_data,
               "OPEN c_sa_tran_tender || c_sa_tran_tender_rev");
      strcpy( table, "sa_tran_tender || sa_tran_tender_rev");
      WRITE_ERROR(SQLCODE,function,table,err_data);
      return( FATAL);
   }
}
else
{
   EXEC SQL OPEN c_sa_tran_tender_rev;
   if (SQL_ERROR_FOUND)
   {
      sprintf( err_data,
               "OPEN c_sa_tran_tender || c_sa_tran_tender_rev");
      strcpy( table,
              "sa_tran_tender || sa_tran_tender_rev");
      WRITE_ERROR( SQLCODE,function,table,err_data);
      return( FATAL);
   }
}
*oa_sa_tran_tender = NULL;
*ol_num_sa_tran_tender = 0;
for (;;)
{
   ...

Batch-Enabled
Many packages are batch-enabled. This allows error-trapping logic
outside of the embedded PL/SQL block to provide users with more
informative and descriptive error messages. For batch-enabled packages,
an additional function call should be made to the
SQL_LIB.BATCH_MSG function. This will format an error message and
provide table information that makes it easier for users to track exceptions
in packages:
EXEC SQL EXECUTE
   DECLARE
      L_return   BOOLEAN       := FALSE;
      L_message  VARCHAR2(255) := NULL;
   BEGIN
      L_return := PACKAGE.FUNCTION(L_message,
                                   :var1,
                                   :var2);
      if L_return = FALSE then
         :li_plsql_flag := 1;
         :ls_plsql_msg  := L_message;
         --- Convert the error message to a "batch-friendly"
         --- format and populate the table
         --- variable for WRITE_ERROR().
         SQL_LIB.BATCH_MSG(:ll_sql_holder,
                           :table,
                           :ls_plsql_msg);
      else
         :li_plsql_flag := 0;
      end if;
   END;
END-EXEC;

/* See if the PL/SQL block itself failed. */
if (SQL_ERROR_FOUND)
{
   sprintf(err_data,"Error in PACKAGE.FUNCTION call");
   WRITE_ERROR(SQLCODE,function,"",err_data);
   return(-1);
}

/* See if there was an error inside the package. */
if (li_plsql_flag)
{
   sprintf(err_data,"Error in PACKAGE.FUNCTION - %s",
           ls_plsql_msg);
   WRITE_ERROR(RET_PROC_ERR,function,table,err_data);
   return(-1);
}


Lesson 5: Fatal vs. Nonfatal Errors


Most errors in Retek batch programs are severe enough to warrant a
complete halt of the program. These are referred to as fatal errors. Some
business-level or validation errors, however, may not require such drastic
measures and are called non-fatal errors. These errors require only a
certain amount of work to be rejected or rolled back before continuing
with processing. When a function encounters a fatal error, it should
always return -1, which will then be propagated back up to main( ),
bringing the program to a halt. If a non-fatal error is trapped, the function
should return a 1 (positive), and the calling function should have code
handling this alternate error code:
...
/* Set li_err to the error code returned by the function. */
li_err = validate_record(ls_item,
                         ls_system_ind);

/* If the error was fatal, propagate the error upwards. */
if(li_err < 0) return(-1);
/* If the error was non-fatal, reject the current record */
/* and continue to the next one.                         */
else if(li_err > 0)
{
   write_to_reject_file();
   continue;
}
...

int validate_record(char *is_item,
                    char *os_system_ind)
{
   ...
   EXEC SQL DECLARE c_find_item CURSOR FOR
      SELECT system_ind
        FROM desc_look
       WHERE sku = :is_item;
   ...
   /* Try to find the item in the system. */
   EXEC SQL FETCH c_find_item into :ls_system_ind;

   /* If the FETCH statement failed, register a */
   /* fatal error and halt the program.         */
   if(SQL_ERROR_FOUND)
   {
      sprintf(err_data,"Fetch c_find_item for item %s",
              is_item);
      strcpy(table,"desc_look");
      WRITE_ERROR(SQLCODE,function,table,err_data);
      return(-1);
   }
   /* If there was no SQL error, but the item wasn't found, */
   /* it was invalid. A non-fatal error should be           */
   /* registered.                                           */
   else if(NO_DATA_FOUND)
   {
      sprintf(err_data,"Item %s was not found",
              is_item);
      strcpy(table,"desc_look");
      if(WRITE_ERROR(SQLCODE,function,table,err_data) < 0)
         return(-1);
      return(1);
   }
   ...
   return(0);
} /* end validate_record */

Note: In the example above, it is possible for the WRITE_ERROR( )
function to fail. When logging a non-fatal error, its return code should be
checked and any fatal errors in the statement itself trapped and propagated.
However, when writing a fatal error, you do not need to check its return
code, since the function will return -1 regardless of whether
WRITE_ERROR( ) succeeds.
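
The return-code convention can be exercised without a database. In the following sketch every name is hypothetical: validate_qty stands in for a real validation function, with a negative quantity playing the part of a fatal error and a zero quantity the part of a non-fatal one. No error file is written here; the point is only how -1, 0, and 1 travel back to the caller:

```c
#include <stddef.h>

/* Hypothetical validation stub: -1 = fatal, 1 = non-fatal, 0 = OK. */
static int validate_qty(long il_qty)
{
   if(il_qty < 0)
      return(-1);
   if(il_qty == 0)
      return(1);
   return(0);
}

/* Processes ii_count records. Returns -1 on the first fatal error, */
/* otherwise 0, with the number of rejected records in *ol_rejected. */
int process_records(const long *ial_qty,
                    int         ii_count,
                    long       *ol_rejected)
{
   int li_ctr;
   int li_err;

   if(ial_qty == NULL || ol_rejected == NULL)
      return(-1);

   *ol_rejected = 0;
   for(li_ctr = 0; li_ctr < ii_count; li_ctr++)
   {
      li_err = validate_qty(ial_qty[li_ctr]);

      /* Fatal: propagate upwards so that main( ) can halt. */
      if(li_err < 0)
         return(-1);
      /* Non-fatal: reject this record and move on to the next. */
      else if(li_err > 0)
      {
         (*ol_rejected)++;
         continue;
      }

      /* Normal processing of a valid record would go here. */
   }

   return(0);
}
```

With the quantities 5, 0, 3, 0 the function returns 0 and reports two rejected records; with 5, -1, 0 it returns -1 as soon as the second record is seen.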


Lesson 6: Debugging Tools


A thorough discussion of debugging and debugging techniques would
easily take up a whole book (maybe more), and is drastically beyond the
scope of this class. So, this discussion will be limited to common
debugging tools used in Retek.

Output Flags
The simplest method for debugging is simply putting output statements
into your code in order to mark when the program has performed a certain
task, such as executed a certain statement, entered a function, or taken a
specific logic branch. Output statements are also used to display the
runtime values of variables.
Overall, output flags can be replaced by even the simplest of debuggers,
which routinely have methods of showing the flow of the program and
displaying the values of variables in much more convenient ways. The
only advantage that output flags have is that they can be used in any
environment with any programming language, as long as the program has
access to some sort of output stream.
So, in the absence of any actual debuggers, using output flags is a useful
way to trace the flow of your program and the data moving through it.
However, if any sort of debugger is available, there's no real reason to use
output flags.

Debuggers: Dbx and Workshop


Debuggers let you step through your code line by line, following the
logical paths the program takes, and let you see the values of any
variables in the program as well. In the simplest case, you can see exactly
what's going on in your program line by line. Debuggers also allow you to
specify conditions for stopping the program's run (e.g., stop at line 100
if x > 45), making your task of monitoring the data even simpler.
Debuggers are highly useful for understanding both the program flow and
the data flow.


There are two C debuggers available for use: dbx and Workshop. Both
debuggers allow the user to do all of the tasks mentioned above.
However, Workshop has a GUI interface while dbx does not. This means
that Workshop is more intuitive, but must be run through Exceed. If using
a telnet client, you must use dbx to debug. dbx is provided in most UNIX
environments, while Workshop is specific to Exceed. Also, dbx tends to
be a little more stable than Workshop.
To begin Workshop, type
> workshop

If you have problems with your display, make sure your PC's IP address is
in the file ~/.DISPLAY.
If you are using xemacs, Workshop can be started within xemacs.
To start dbx, type
> dbx <executable filename>

Both dbx and Workshop take an executable and a C file as input (usually,
only the executable has to be explicitly named, unless the C file is in
another directory). Neither debugger understands Pro*C.

lint
The C compiler catches syntactic errors in a program: it makes sure the
program has the correct number of opening and closing parentheses, warns
if functions don't have return values, finds typos, and so on. However,
this leaves a wide range of errors that don't get caught.
Lint is a program that catches less obvious errors in the code: it finds
variables that aren't initialized before they're used, looks for functions that
are called incorrectly (or never called at all), and warns if you may be
using = when you mean ==.
To run lint, you must have a C file (lint just won't understand a Pro*C
file). You must also tell lint the location of the library files that your
program uses (for most programs, that's $MMHOME/oracle/lib/src, also
known as $l). So, a run of lint would look like this:
> lint -I$l my_prog.c

Lint returns many more warnings besides those that actually apply to your
program; for example, it will point out all the functions in retek.h that
aren't used in your program (and there are a lot of them). So, there is a
certain amount of filtering that must be done to lint's output, but the
results that do pertain to your code almost always prove useful.

Exercise 2
Make a copy of exercise 1 and rename it for exercise 2 (cp xxx_01.pc
xxx_02.pc, where xxx = your initials).
This program should do the following:
• Calculate the gross product.
• Instead of printing information to the screen, insert the information
  into the table trn_win_store_hist.
• Update the program to include error handling and proper naming
  conventions.
• See handout for additional information.

Evaluation Criteria
Comfortable
You are able to understand key processes and concepts relating to the
listed topic.

Not Comfortable
You are unable to understand key processes and concepts relating to the
listed topic.

Suggestions for More Work


If necessary, your Instructor will offer suggestions to you regarding where
to find additional information or practice exercises to further your
understanding of the listed topic.
Objective | Comfortable | Not Comfortable | Suggestions for More Work
Find the daily log file and write messages to it with LOG_MESSAGE. | | |
Find a program's error file and write messages to it with WRITE_ERROR. | | |
Insert proper error handling routines after SQL statements. | | |
Insert proper error handling routines after PL/SQL blocks that call batch-enabled or non-batch-enabled PL/SQL functions. | | |
Define the differences between fatal and non-fatal errors and the differences in dealing with them. | | |
Identify three different types of debugging tools and describe their uses. | | |


Summary
In this module, you learned how to add error messaging to your program
and were introduced to some common debugging tools that Retek uses.
You learned to write to the daily log file with LOG_MESSAGE and to the
program error file with WRITE_ERROR. You then learned where in your
program these two functions should be called and what sort of signals
should trigger the routines that call them. Finally, a brief discussion of
three types of debugging tools used by Retek was included.

What's Next
In the next module, you will explore ways to use a powerful feature of the
C programming language to improve the performance of your batch
programs.


Module 7: Array Processing

This module includes the following lessons:


Lesson 1: Introduction to Array Processing
Lesson 2: Arrayed Fetches
Lesson 3: Other Arrayed SQL statements
Lesson 4: Dynamically Allocating Arrays


Module Overview
In this module, you will learn how to use arrays in SQL statements, such
as FETCH, INSERT, UPDATE, and DELETE, to make their contact with
the database more efficient. Adding array processing to a batch program
somewhat alters the flow of the program, and the proper modifications
will be discussed in detail.
Finally, this module will discuss dynamically sizing arrays, which is
Retek's standard practice.

Objectives
After completing this module, you will be able to:
• Describe why arrayed SQL statements are often better than
  non-arrayed statements.
• Write a FETCH statement that will fetch records into an array.
• Perform arrayed updates, inserts, and deletes.
• Understand the restrictions on arrayed SQL statements.
• Allocate memory to arrays dynamically, according to the Retek
  standard.


Lesson 1: Introduction to Array Processing


One of the biggest problems for Retek batch performance is the overhead
associated with communication between the batch server and the database
server. Every time a batch program executes a SQL statement, the
statement needs to be sent to the database and the program needs to wait
for a response. There is significant overhead associated with this
communication. This is why you want to minimize the number of SQL
statements in a batch program.
The answer to this conundrum is to use Pro*C's array processing features.
Though rather intimidating-sounding, the concept is very simple: instead
of performing one SQL call with one set of bind variables at a time, you
perform the work of many SQL calls in a single call, using bind arrays
(arrays of bind variables).
For example, instead of fetching one record from a cursor into single bind
variables, you can fetch 10,000 records from the cursor into bind arrays,
each sized to hold 10,000 elements. This arrayed fetch involves only one
round trip to the database to retrieve all 10,000 records, while the
non-arrayed version involves 10,000 separate round trips to do the same
work. Given the overhead involved in each round trip, a batch program
with the arrayed fetch will perform much better than one without.
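
The saving is easy to put into numbers. The sketch below is a simplified model, not a measurement: it assumes that each FETCH costs one round trip to the database and that the fetch loop stops only when a fetch signals that the cursor is exhausted. The function name fetch_calls is hypothetical:

```c
/* Hypothetical helper: number of FETCH calls needed to drain a     */
/* cursor of il_rows rows using an array size of il_array_size.     */
/* Returns -1 for invalid arguments.                                */
long fetch_calls(long il_rows,
                 long il_array_size)
{
   if(il_rows < 0 || il_array_size < 1)
      return(-1);

   /* Every full batch takes one fetch. One more fetch comes back   */
   /* short (or empty) and tells the program there is no more data. */
   return(il_rows / il_array_size + 1);
}
```

Under these assumptions, a cursor of 20,120 rows takes 20,121 fetches with single bind variables (an array size of 1), but only 3 fetches with an array size of 10,000.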


Lesson 2: Arrayed Fetches


Setting Up an Arrayed Fetch
As mentioned above, arrays can be used in FETCH statements. The
process for doing so is quite simple. First, instead of declaring a bind
variable, you declare a bind array instead. For example,
double ld_total_cost;
char ls_sku[NULL_SKU];
EXEC SQL VAR ls_sku IS STRING(NULL_SKU);

becomes:
double lad_total_cost[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);

Notice that you can apply IS STRING to an array of strings exactly the
same way you apply it to a single string.
After declaring the arrays, the FETCH statement needs to be changed. SQL
commands need to be told when arrays are being used, and how big those
arrays are (after all, a FETCH statement has to know how many records
it's bringing back). This is accomplished with the FOR clause:
ll_array_size = 10000;
EXEC SQL FOR :ll_array_size
   FETCH c_my_cur INTO :lad_total_cost,
                       :las_sku;

Notice that the addition of the FOR clause telling the SQL statement to
bring back 10,000 records is the only adjustment necessary.

How Arrayed Fetches Affect Program Flow


Consider the following fairly typical piece of code:
{
double ld_total_cost;
char ls_sku [NULL_SKU];
EXEC SQL VAR ls_sku IS STRING(NULL_SKU);
...
while(1)
{
EXEC SQL FETCH c_my_cur INTO :ld_total_cost,
:ls_sku;
if (NO_DATA_FOUND)
break;
/* Process ld_total_cost and ls_sku */
} /* end of while loop */
}

To add array processing to this code, you perform the steps discussed
above:
{
double lad_total_cost[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);
long ll_array_size = 10000;
...
while(1)
{
EXEC SQL FOR :ll_array_size
FETCH c_my_cur INTO :lad_total_cost,
:las_sku;
if (NO_DATA_FOUND)
break;
/* Process lad_total_cost and las_sku */
} /* end of while loop */
}

Obviously, you have to add an inner loop to process all the records
brought back, because you probably can't process them en masse:
{
double lad_total_cost[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);
long ll_array_size = 10000;
long ll_ctr;
...
while(1)
{
EXEC SQL FOR :ll_array_size
FETCH c_my_cur INTO :lad_total_cost,
:las_sku;
if (NO_DATA_FOUND)
break;
for (ll_ctr=0; ll_ctr < ll_array_size; ll_ctr++)
{
/* Process lad_total_cost[ll_ctr] and
las_sku[ll_ctr] */
}
} /* end of while loop */
}

This looks fine, but consider this scenario: the c_my_cur cursor brings
back a total of 20,120 records. Here's what happens:
• The first time through the while loop, 10,000 records are fetched and
processed.
• The second time through, 10,000 records are fetched and processed,
bringing your total to 20,000 records processed.
• The third time through, only 120 records are fetched. Because it ran
out of records before it was finished fetching, SQL returns the
NO_DATA_FOUND signal. You break out of the loop, never having
processed the last 120 records.

It's a fairly easy matter to fix this: when NO_DATA_FOUND is signaled,
set a flag (li_ndf) instead of breaking, and use (!li_ndf) as the while
condition. The old standard was to use while (1), set a flag to indicate
that this is the last time through the while loop, and break after the
processing. The new method is cleaner:
{
double lad_total_cost[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);
short li_ndf = 0;
long ll_array_size = 10000;
long ll_ctr;
...
while(!li_ndf)
{
EXEC SQL FOR :ll_array_size
FETCH c_my_cur INTO :lad_total_cost,
:las_sku;
if (NO_DATA_FOUND)
li_ndf = 1;
for (ll_ctr=0; ll_ctr < ll_array_size; ll_ctr++)
{
/*Process lad_total_cost[ll_ctr] and
las_sku[ll_ctr] */
}
} /* end of while loop */
}

But now that the NO_DATA_FOUND problem is fixed, another is
immediately apparent: when the last 120 records are fetched, the for loop
will still try to loop through 10,000 records, making for 9,880 records of
stale data left over from the previous fetch!
The solution to this problem is the NUM_RECORDS_PROCESSED
macro discussed back in Module 5. Recall that
NUM_RECORDS_PROCESSED contains the cumulative number of records
fetched so far by the cursor. So, after the first fetch, it would be
10,000; after the second, it would be 20,000; and after the third, it
would be 20,120.

To figure out how many records were brought back by the last call to
FETCH, you need only subtract the previous total number of records
retrieved from NUM_RECORDS_PROCESSED. So, you have to create a
variable to keep track of the previous total:
{
double lad_total_cost[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);
short li_ndf= 0;
long ll_array_size = 10000;
long ll_ctr;
long ll_num_records_fetched = 0;
long ll_prev_total = 0;
...
while(!li_ndf)
{
EXEC SQL FOR :ll_array_size
FETCH c_my_cur INTO :lad_total_cost,
:las_sku;
if (NO_DATA_FOUND)
li_ndf= 1;
ll_num_records_fetched = NUM_RECORDS_PROCESSED -
ll_prev_total;
ll_prev_total = NUM_RECORDS_PROCESSED;
for (ll_ctr=0;
ll_ctr < ll_num_records_fetched;
ll_ctr++)
{
/* Process lad_total_cost[ll_ctr] and
las_sku[ll_ctr] */
}
} /* end of while loop */
}

To convince yourself that this is correct, walk through it:
• After the first fetch, NUM_RECORDS_PROCESSED = 10,000,
ll_prev_total = 0, and ll_num_records_fetched = 10,000 - 0 = 10,000.
ll_prev_total is set to 10,000. You then process the 10,000 records
fetched.
• After the second fetch, NUM_RECORDS_PROCESSED = 20,000,
ll_prev_total = 10,000, and ll_num_records_fetched = 20,000 - 10,000
= 10,000. ll_prev_total is set to 20,000. You then process the
10,000 records fetched.
• After the third fetch, NUM_RECORDS_PROCESSED = 20,120,
ll_prev_total = 20,000, and ll_num_records_fetched = 20,120 - 20,000
= 120. ll_prev_total is set to 20,120. The NO_DATA_FOUND
signal was sent, so the end indicator is set. You process the 120
records fetched, and then exit the loop.
So, finally, you have the proper flow for an arrayed fetch.
An important fact to note is that if you use arrays in a FETCH statement,
all bind variables must be bind arrays: array processing is
all-or-nothing.
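
The flow can be exercised without a database. The following is a minimal
sketch in plain C, not Pro*C: mock_cursor, mock_fetch(), and
process_all() are hypothetical stand-ins invented for illustration. The
rows_delivered field plays the role of NUM_RECORDS_PROCESSED, and the
return value of mock_fetch() plays the role of NO_DATA_FOUND.

```c
#include <assert.h>

#define ARRAY_SIZE 10000L

/* Hypothetical stand-in for a cursor: it only knows how many rows it
   will return in all and how many it has delivered so far. */
typedef struct
{
    long total_rows;      /* rows the cursor would return in all */
    long rows_delivered;  /* cumulative count, like NUM_RECORDS_PROCESSED */
} mock_cursor;

/* Stand-in for EXEC SQL FOR :size FETCH. Fills up to size elements of
   out[] and returns 1 when the cursor ran out of rows during this call
   (the NO_DATA_FOUND case), 0 otherwise. */
int mock_fetch(mock_cursor *cur, double *out, long size)
{
    long i;
    long remaining = cur->total_rows - cur->rows_delivered;
    long n = remaining < size ? remaining : size;

    assert(size > 0);
    for (i = 0; i < n; i++)
        out[i] = (double)(cur->rows_delivered + i);
    cur->rows_delivered += n;
    return n < size;
}

/* The loop from the text: derive the per-fetch count from the cumulative
   count, and process the final partial batch before leaving the loop.
   Returns the number of records processed. */
long process_all(mock_cursor *cur, long *fetch_calls)
{
    static double lad_total_cost[ARRAY_SIZE];
    short li_ndf = 0;
    long ll_prev_total = 0;
    long ll_num_records_fetched;
    long ll_ctr;
    long ll_processed = 0;

    *fetch_calls = 0;
    while (!li_ndf)
    {
        if (mock_fetch(cur, lad_total_cost, ARRAY_SIZE))
            li_ndf = 1;
        (*fetch_calls)++;

        ll_num_records_fetched = cur->rows_delivered - ll_prev_total;
        ll_prev_total = cur->rows_delivered;

        for (ll_ctr = 0; ll_ctr < ll_num_records_fetched; ll_ctr++)
            ll_processed++;   /* "process" lad_total_cost[ll_ctr] */
    }
    return ll_processed;
}
```

With a mock cursor of 20,120 rows, process_all() makes three fetch calls
and processes all 20,120 records, matching the walk-through.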

Arrays and Cursors


It is important to note that arrays may not be used in the WHERE clause
of a cursor. Why not? Consider this: when arrays are used in SQL, all
arrays being used must be the same size (there's only one FOR clause,
after all). Most cursors, however, can bring back more than one row for a
single set of input. In fact, many cursors can bring back hundreds of rows
for a single set of input. It's very rare indeed that a cursor brings back
one and only one record for one set of input. So, because it is so
unlikely that the array used in the WHERE clause would have the same size
as the array in the FETCH statement, arrays are simply disallowed in a
cursor's WHERE clause.

Indicator Arrays
Just as bind variables can have associated indicator variables, bind arrays
can have associated indicator arrays. Indicator arrays are simply arrays of
shorts and act exactly the same way as indicator variables: to find out if
my_arr[i] is NULL, simply check whether the value in my_arr_ind[i] is -1.
The syntax of an indicator array is exactly the same as that of a regular
indicator variable:
{
double lad_total_cost[10000];
short lai_total_cost_ind[10000];
char las_sku[10000][NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);
...
while(!li_ndf)
{
EXEC SQL FOR :ll_array_size
FETCH c_my_cur INTO
:lad_total_cost:lai_total_cost_ind,
:las_sku;
...

Of course, an indicator array must be the same size as its associated bind
array.
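
Once a batch has been fetched, the indicator array is checked element by
element in the processing loop. The sketch below is plain C with a
hypothetical helper, sum_non_null(), invented for illustration; it assumes
the arrays were filled by an arrayed fetch like the one above and that -1
in the indicator marks a NULL value.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the non-NULL costs in one fetched batch. An indicator value of -1
   marks the matching element of the bind array as NULL, so its contents
   must be ignored. The number of NULLs seen is returned in *null_count. */
double sum_non_null(const double *lad_total_cost,
                    const short *lai_total_cost_ind,
                    long ll_num_records_fetched,
                    long *null_count)
{
    double ld_sum = 0.0;
    long ll_ctr;

    assert(null_count != NULL);
    *null_count = 0;
    for (ll_ctr = 0; ll_ctr < ll_num_records_fetched; ll_ctr++)
    {
        if (lai_total_cost_ind[ll_ctr] == -1)
        {
            (*null_count)++;
            continue;
        }
        ld_sum += lad_total_cost[ll_ctr];
    }
    return ld_sum;
}
```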

Lesson 3: Other Arrayed SQL Statements


Arrayed inserts, deletes, and updates work almost exactly the same,
following a few simple rules:
1. All bind variables in the statement must be bind arrays.
2. The bind arrays must be of the same size.
3. A FOR clause is needed directly after the EXEC SQL.
When an arrayed SQL statement is called, the effect is as if many separate
SQL calls were made, with each set of bind array members with the same
index forming the data for one of the calls. In other words, this statement:
EXEC SQL FOR :ll_array_size
   UPDATE win_store
      SET currency_code = :las_curr_code
    WHERE store = :las_store
      AND sku = :las_sku;

is functionally equivalent to:


for (i=0; i < ll_array_size; i++)
{
   EXEC SQL UPDATE win_store
      SET currency_code = :las_curr_code[i]
    WHERE store = :las_store[i]
      AND sku = :las_sku[i];
}

but the first statement does it all in one call to the database, incurring a lot
less overhead.
Here is an example of an arrayed delete, which is almost exactly the same
as an arrayed update:
EXEC SQL FOR :ll_num_deletes
DELETE FROM desc_look dl
WHERE dl.sku=:las_sku;

Finally, an arrayed insert. Note that you can mix constants and bind
arrays, but you still may not mix in bind variables.
EXEC SQL FOR :ll_num_inserts
INSERT INTO desc_look (sku,
dept,
desc_up,
system_ind,
image_name,
image_type)
VALUES (:l_insert_arrays.as_sku,
:l_insert_arrays.as_dept,
:l_insert_arrays.as_desc_up,
'S',
NULL,
NULL);

Notice that in this example, the bind arrays are members of a struct.
Oracle allows members of structs to be used as bind variables and bind
arrays with no problem whatsoever. A common Retek practice is to group
related arrays into a struct to keep them organized.
It is important to note that the restriction of no bind arrays in a SELECT
statement still holds, even when the SELECT is part of an INSERT-SELECT
statement. In addition, PL/SQL blocks cannot be arrayed.

Lesson 4: Dynamically Allocating Arrays


In all of the examples so far, the array sizes have been statically set to
10,000. However, you rarely know how big the arrays should be
when you write a program. If the program receives a large amount of
data, then you may want arrays to have more than 10,000 elements to
make more efficient use of database access. On the other hand, if the
program receives very little data, allocating arrays of 10,000 elements is
unnecessary and wasteful. For this reason, most of Retek's arrays are
dynamically allocated at runtime.
Sizing the arrays at runtime changes only two things: the array
declarations are changed, and a function is created to allocate the
necessary memory.
The change to the declarations is fairly simple: the hard-coded sizing is
removed and replaced with a pointer. So, your declarations in the
previous example would be:
{
double *lad_total_cost;
short *lai_total_cost_ind;
char (*las_sku)[NULL_SKU];
EXEC SQL VAR las_sku IS STRING(NULL_SKU);

Note that for strings, parentheses need to be around the pointer and the
variable name for Oracle to correctly understand the array of strings.
To make things a little simpler down the line, we're going to group these
arrays in a struct, which makes passing them as parameters much simpler:
typedef struct
{
double *ad_total_cost;
short *ai_total_cost_ind;
char (*as_sku)[NULL_SKU];
} array_holder;
...
{
array_holder l_fetch_arrays;
/* Note that you have to use IS STRING when you declare
the variable, not when you define the type. */
EXEC SQL VAR l_fetch_arrays.as_sku IS STRING(NULL_SKU);

The sizing function is basically a list of calloc statements, allocating
space for each array. But what are these arrays sized to, when all is said
and done? Since the primary situation where you use these dynamically
sized arrays is in programs that implement restart/recovery, you want the
size of your arrays to be the same as pl_commit_max_ctr. (See the next
module for details on pl_commit_max_ctr. For right now, suffice it to say
that the value comes from the database, so it is easily changed.)
So, a simple allocation for ints, longs, shorts, or doubles would look
like this:
<array name> =
(<type>*)calloc(pl_commit_max_ctr,sizeof(<type>));

This statement allocates a chunk of memory that contains
pl_commit_max_ctr blocks, each of the size required to store one
element of type <type>. calloc returns a pointer to that memory, which
the statement casts to a pointer of the same type as the array.
For arrays of strings, the allocation is a little more complicated. The
parentheses are more convoluted in the cast. Also, since sizeof(char) is
1, you can simply give the string length as the size of a single string.
<array name> = (char(*)[<string length>])
   calloc(pl_commit_max_ctr,<string length>);

So, the actual function allocating memory for the variables declared above
would look like this:
/* Remember that we have to pass the struct by reference
here */
int size_fetch_arrays(array_holder *o_fetch_arrays)
{
char *function = "size_fetch_arrays";
o_fetch_arrays->ad_total_cost =
(double*)calloc(pl_commit_max_ctr,sizeof(double));
o_fetch_arrays->ai_total_cost_ind =
(short*)calloc(pl_commit_max_ctr,sizeof(short));
o_fetch_arrays->as_sku =
(char(*)[NULL_SKU])calloc(pl_commit_max_ctr,NULL_SKU);
return 0;
} /* end of size_fetch_arrays */

Of course, this function has absolutely no error handling and it needs it. If
the calloc call cannot find enough memory, then it will return NULL. So,
you should trap for that error and return an error message if it occurs:
int size_fetch_arrays(array_holder *o_fetch_arrays)
{
char *function = "size_fetch_arrays";
short li_no_mem = 0;
if ((o_fetch_arrays->ad_total_cost =
(double*)calloc(pl_commit_max_ctr,sizeof(double)))
== NULL)
li_no_mem = 1;
if ((o_fetch_arrays->ai_total_cost_ind =
(short*)calloc(pl_commit_max_ctr,sizeof(short))) ==
NULL)
li_no_mem = 1;
if ((o_fetch_arrays->as_sku =
(char(*)[NULL_SKU])calloc(pl_commit_max_ctr,NULL_SKU)) ==
NULL)
li_no_mem = 1;
if (li_no_mem)
{
sprintf(err_data,"Unable to allocate memory");
WRITE_ERROR(RET_FUNCTION_ERR,function,"",err_data);
return(-1);
}
return 0;
} /* end of size_fetch_arrays */
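
One detail the function above leaves open is what happens to the arrays
that were allocated successfully when a later calloc fails, and how the
memory is released at the end of the program. The following plain C
sketch shows one way to handle both. It is not the Retek standard:
free_fetch_arrays() and size_fetch_arrays_checked() are hypothetical
names, the array size is passed in as a parameter rather than read from
pl_commit_max_ctr, and the value of NULL_SKU is assumed.

```c
#include <assert.h>
#include <stdlib.h>

#define NULL_SKU 9   /* assumed: 8-character SKU plus terminator */

typedef struct
{
    double *ad_total_cost;
    short  *ai_total_cost_ind;
    char  (*as_sku)[NULL_SKU];
} array_holder;

/* Releases whatever has been allocated and resets the pointers, so the
   function is safe to call on a partially sized struct. free(NULL) is a
   no-op in standard C. */
void free_fetch_arrays(array_holder *io_arrays)
{
    free(io_arrays->ad_total_cost);
    free(io_arrays->ai_total_cost_ind);
    free(io_arrays->as_sku);
    io_arrays->ad_total_cost = NULL;
    io_arrays->ai_total_cost_ind = NULL;
    io_arrays->as_sku = NULL;
}

/* Sizes every array to il_size elements. Returns 0 on success; on any
   failure, frees the arrays that were allocated and returns -1. */
int size_fetch_arrays_checked(array_holder *o_arrays, size_t il_size)
{
    assert(il_size > 0);   /* calloc(0, ...) may legitimately return NULL */

    o_arrays->ad_total_cost = calloc(il_size, sizeof(double));
    o_arrays->ai_total_cost_ind = calloc(il_size, sizeof(short));
    o_arrays->as_sku = calloc(il_size, sizeof(*o_arrays->as_sku));

    if (o_arrays->ad_total_cost == NULL ||
        o_arrays->ai_total_cost_ind == NULL ||
        o_arrays->as_sku == NULL)
    {
        free_fetch_arrays(o_arrays);
        return -1;
    }
    return 0;
}
```

Because calloc zero-fills the memory, every string in as_sku starts out
empty and every indicator starts out as 0.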

And finally, for completeness, here is the rest of the code from the
example, now using a struct and dynamically allocating memory for the
arrays:
{
   array_holder l_fetch_arrays;
   EXEC SQL VAR l_fetch_arrays.as_sku IS STRING(NULL_SKU);
   short li_ndf = 0;
   long ll_ctr;
   long ll_num_records_fetched = 0;
   long ll_prev_total = 0;
   ...
   /* Make a call to the sizing function, passing by reference. */
   /* Make sure to call it before referencing the arrays. */
   if (size_fetch_arrays(&l_fetch_arrays) < 0)
      return(-1);
   while(!li_ndf)
   {
      EXEC SQL FOR :pl_commit_max_ctr
         FETCH c_my_cur
          INTO :l_fetch_arrays.ad_total_cost
                  :l_fetch_arrays.ai_total_cost_ind,
               :l_fetch_arrays.as_sku;
      if (NO_DATA_FOUND)
         li_ndf = 1;
      ll_num_records_fetched =
         NUM_RECORDS_PROCESSED - ll_prev_total;
      ll_prev_total = NUM_RECORDS_PROCESSED;
      for (ll_ctr=0;
           ll_ctr < ll_num_records_fetched;
           ll_ctr++)
      {
         /* Process l_fetch_arrays.ad_total_cost[ll_ctr] and */
         /* l_fetch_arrays.as_sku[ll_ctr]                    */
      }
   } /* end of while loop */
}

Exercise 3
Make a copy of exercise 2 and rename it for exercise 3 (cp xxx_02.pc
xxx_03.pc, where xxx = your initials).
Modify the program as follows:
• Declare a struct for the arrays.
• Add array processing to the fetch and insert statements.
• Fetch 1000 rows at a time, using arrays.
• See handout for additional information.

Part 2
Make a copy of exercise 3 and rename it for the optional exercise (cp
xxx_03.pc xxx_03dym.pc, where xxx = your initials).
Modify Exercise 3 to implement dynamic array processing.
• Declare pointers for each array.
• Create a new function size_arrays() to calloc memory.
• Pass in the size of the arrays as a parameter at the command line.
• See handout for additional information.

Evaluation Criteria
Comfortable
You are able to understand key processes and concepts relating to the
listed topic.

Not Comfortable
You are unable to understand key processes and concepts relating to the
listed topic.

Suggestions for More Work


If necessary, your Instructor will offer suggestions to you regarding where
to find additional information or practice exercises to further your
understanding of the listed topic.
Objective                                    Comfortable   Not Comfortable   Suggestions for More Work
------------------------------------------   -----------   ---------------   -------------------------
Describe why arrayed SQL statements are
often better than non-arrayed statements.

Write a FETCH statement that will fetch
records into an array.

Perform arrayed updates, inserts, and
deletes.

Understand the restrictions on arrayed
SQL statements.

Allocate memory to arrays dynamically,
according to the Retek standard.

Summary
In this module, you learned how to use arrays in SQL statements using the
FOR clause. As you saw, adding array processing to a batch program
alters the flow of the program, due to the addition of a for loop which
loops through the arrays.
There are several restrictions on arrayed SQL statements, most notably
that if bind arrays are used in a statement, all bind variables must be bind
arrays, and these bind arrays must be of the same size.
Finally, you learned how to dynamically size the arrays, according to
Retek's standards.

What's Next
In the next module, you will focus on Retek's Restart/Recovery
procedures.

Module 8: Table Based Restart/Recovery

This module includes the following lessons:


Lesson 1: Restart/Recovery Purpose
Lesson 2: General Design
Lesson 3: Implementation Details

Module Overview
Retek batch programs must balance a number of conflicting requirements.
They must be robust enough to recover from all types of failure while
remaining efficient. They must process large quantities of data within the
confines of reasonably sized rollback segments. The Retek
Restart/Recovery and Multithreading API provides solutions to these
challenges.
The two facets of this API, restart/recovery and multithreading, are
conceptually distinct, yet inextricably bound in Retek's implementation.
A program may implement multithreading and not restart/recovery, for
example. Yet, as you will see, an understanding of both is necessary to
implement either in a Retek batch program.
This module and the next introduce these two distinct topics, first at a
conceptual level, then more practically.

Objectives
After completing this module, you will be able to:
• Explain the purpose of restart/recovery in batch programs.
• Understand the variables and functions provided in the API.
• Describe how a program's design affects commit logic.

Lesson 1: Restart/Recovery Purpose


Imagine a batch program that every night must process millions of
records. Assume that this processing takes two precious hours of the
batch window. Furthermore, consider that there are many reasons why
the program might fail during these two hours: power failure, a lost
communication session, an Oracle server error, and data errors, to list a
few. What should happen if, after an hour and a half of processing, this
program does indeed fail?
One solution might involve rolling back all the database changes, having
an operator fix whatever stopped the program, and then restarting it from
the beginning. This is unacceptable for two reasons. One, it would require
an enormous rollback segment. Two, repeating all that work would take far
too much time given the limited batch window.
A second solution might entail a commit after each record is processed.
This would solve the former problems; completed work would be saved,
and rollback segment size would not be an issue. However, a couple new
problems are presented. Committing a few million times would add too
much communication and database overhead to be acceptable. Also, how
would the program know which records had been processed and which
had not if it had to restart?
What is required then is a solution that allows for intermittently saving
work and restarting wherever a program leaves off for any reason. This
functionality is, in fact, what Retek restart/recovery provides.

Lesson 2: General Design


Logical Unit of Work
In order to prevent work from being lost in the event of a fatal program
error, Retek batch programs occasionally commit work. Systematic and
meaningful commits require data to be ordered so that it can be kept track
of easily. This ordering of data is done using a set of variables known as
the logical unit of work (LUW).
The driving cursor's order by clause, in every program that implements
restart/recovery, uses the LUW variables to achieve this ordering of data.
Imagine a program that processes win_store data at the store-SKU level.
The driving cursor will bring back data ordered by store, and within
each store ordered by SKU.
The choice of LUW is often, but not always, obvious given the functional
requirements of the program. Ideally, the LUW would always provide a
unique key to the data records. That is, like a primary key, specific values
for the LUW's elements would be guaranteed to specify exactly one
record. However, this is not always possible, and non-unique LUWs do
exist in Retek batch programs. A non-unique LUW adds complexity to
the commit logic of a batch program.

Restart String
Ordering data by the LUW variables provides a means, when restarting, to
get back to the exact record where the program left off, but it is not the
whole solution. The position within the cursor's records must somehow
be stored persistently whenever data is successfully processed and
committed.
To accomplish this, functions have been created that store a string
containing the LUW elements' current values on a restart table. This value
is historically known as the restart string, but is stored on
RESTART_BOOKMARK.BOOKMARK_STRING. Suppose that in a
program with a store-SKU LUW, the current record is for store 1012 and
SKU 10004356. Then the bookmark string would contain:
;1012;10004356

Now each time that work is committed to the database, the restart string
can be saved as well. When a restart is required, retrieving the bookmark
string's saved values and using them in the where clause of the driving
cursor will limit the program's fetched data to only unprocessed records.
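
As an illustration of the bookmark layout, the following plain C sketch
assembles a string in the ;1012;10004356 form from a list of LUW element
values. build_bookmark() is a hypothetical helper written for this
example; the actual Retek functions that write
RESTART_BOOKMARK.BOOKMARK_STRING are not shown in this lesson and may
work differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds a bookmark string by prefixing each LUW element with ';'.
   Returns 0 on success, or -1 if the result does not fit in the output
   buffer of il_size bytes. */
int build_bookmark(char *o_bookmark, size_t il_size,
                   const char *is_elements[], int ii_num_elements)
{
    size_t ll_used = 0;
    int li_ctr;

    assert(o_bookmark != NULL && il_size > 0);
    o_bookmark[0] = '\0';

    for (li_ctr = 0; li_ctr < ii_num_elements; li_ctr++)
    {
        size_t ll_left = il_size - ll_used;
        int li_written = snprintf(o_bookmark + ll_used, ll_left,
                                  ";%s", is_elements[li_ctr]);

        /* snprintf reports the length it wanted to write; anything that
           does not fit (including the terminator) is an error here. */
        if (li_written < 0 || (size_t)li_written >= ll_left)
            return -1;
        ll_used += (size_t)li_written;
    }
    return 0;
}
```

Called with the elements "1012" and "10004356", the helper produces
;1012;10004356, the same value as the example above.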

Application Image
Similar to the restart string, the application image is a set of strings
necessary for the program to return to its pre-abort state when restarting.
Unlike the restart string, an application image may or may not be
necessary for a particular program.
The specific details of the application image are entirely specific to each
program. Consider it a general purpose variable to hold any information
required for the program to pick up where it left off.
A common use of the application image is to hold line number counters in
programs that write output to files. Without an application image it
would, of course, still be possible to restart a download program, open a
cursor to select the correct data and open the partially completed output
file ready to append the next line. But, the next line number and other
specific data that may be necessary to begin creating the next output line
will have been lost when the program failed. These data are stored in the
application image.

Commit Frequency
In an ideal world, a commit would take place after each LUW so that no
repetition of work is required in the event of a restart. However,
committing after every LUW would compromise the efficiency of a
program. A balance must be struck between committing often enough to
avoid redoing too much work and committing occasionally so as not to
unreasonably slow down processing.
Yet where the balance is found depends on many runtime-environment
factors unknown to the programmer during development. It
must be left to the system administrator at each installation to determine
the frequency of commits for each batch program.
To allow for this flexibility, a value known as the commit_max_ctr is
stored on a database table for each batch program implementing restart
and recovery. The value can be changed before each run of a program as a
performance tuning tool. The commit max counter is fetched into a public
variable during the initialization of each run, and used in the main
processing loop to determine the frequency of actual commits.

Each time through the process loop, a function called retek_commit()
increments a counter. A commit will take place only when this internal
counter is greater than or equal to the value of commit_max_ctr.
Furthermore, if the LUW is not unique, the commit will not occur until the
current LUW is done. If a commit were permitted in the middle of a LUW
and the program were to subsequently fail, then upon restart the program
would skip the rest of the records in that unit, and data would be lost.
This is why it is desirable to keep the LUW as close to unique as possible.
A small unit of work decreases the overrun that occurs when the commit
max counter is reached in the middle of a unit. In the ideal situation, a
unique LUW ensures that a commit will take place whenever the commit
max counter is reached.
In summary, retek_commit() is called after each record is processed. It
maintains a counter variable and the previous restart string. It will
perform a commit when two conditions are true: at least commit_max_ctr
records have been processed, and the LUW has changed.
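
The two-condition rule can be sketched in plain C as follows. This is
only a model of the decision described above, not the retek_commit()
source: commit_state, commit_state_init(), and mock_commit() are
hypothetical names, the LUW is represented as a single string, and the
sketch assumes the function is called once per record with that record's
LUW value.

```c
#include <assert.h>
#include <string.h>

#define MAX_LUW_LEN 64

/* Hypothetical state kept between calls. */
typedef struct
{
    long commit_max_ctr;          /* threshold read at initialization */
    long current_ctr;             /* records seen since the last commit */
    char prev_luw[MAX_LUW_LEN];   /* LUW value of the previous record */
    long commits;                 /* commits issued so far */
} commit_state;

void commit_state_init(commit_state *st, long commit_max_ctr)
{
    assert(commit_max_ctr > 0);
    st->commit_max_ctr = commit_max_ctr;
    st->current_ctr = 0;
    st->prev_luw[0] = '\0';
    st->commits = 0;
}

/* Called once per record with that record's LUW value. Returns 1 if a
   commit was issued for the work done so far, 0 otherwise. */
int mock_commit(commit_state *st, const char *luw)
{
    int committed = 0;

    /* Commit only when enough records have accumulated AND this record
       starts a new LUW, so a unit is never split across a commit. */
    if (st->current_ctr >= st->commit_max_ctr &&
        strcmp(luw, st->prev_luw) != 0)
    {
        st->commits++;
        st->current_ctr = 0;
        committed = 1;
    }

    st->current_ctr++;
    strncpy(st->prev_luw, luw, MAX_LUW_LEN - 1);
    st->prev_luw[MAX_LUW_LEN - 1] = '\0';
    return committed;
}
```

With a threshold of 3 and records whose LUW values are A, A, B, B, B, C,
the threshold is reached in the middle of unit B, so the commit is
deferred until the record for C arrives. This is the overrun that a
non-unique LUW causes.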

Lesson 3: Implementation Details


Variable Declarations
Variables Declared in retek_2.h
int gi_error_flag;
This is the only variable declared for you, and it has global scope. All
the other restart variables that are populated in retek_init() are of
program scope and should be named beginning with a p. Within each batch
program this global variable is set (= 1) only in main(), if process()
fails, and is checked after final() has been called. An appropriate
message is logged, depending on whether gi_error_flag is nonzero.
retek_close() uses this flag to update fields on restart_control. An
illustrative example follows:
if (process() < 0)
gi_error_flag = 1;
if (final() < 0)
{
sprintf(ls_log_message, "Thread [%d] - Aborted in Final", pi_thread_val);
LOG_MESSAGE(ls_log_message);
exit(-1);
}
else
{
if (!gi_error_flag)
{
sprintf(ls_log_message, "Thread [%d] - Terminated OK", pi_thread_val);
LOG_MESSAGE(ls_log_message);
exit(0);
}
else
{
sprintf(ls_log_message,"Thread[%d]-Aborted in process", pi_thread_val);
LOG_MESSAGE(ls_log_message);
exit(-1);
}
}

Variables Declared in Each Program


All other variables needed to restart the program are the responsibility of
the coder to declare. In earlier versions of restart/recovery, many global
variables needed for restart/recovery were declared within the include
file, while many others that were equally necessary were not. The current
version changes this: variables that have no meaning to a given program
are no longer declared and passed into functions, which makes the code
much cleaner.

Initialization
All batch programs implementing restart/recovery will call retek_init().
This function is called from init() and takes the place of the older
version, restart_init(). Its purpose is to initialize all the restart
variables, including the variables which allow the program to pick up
where it left off in the case of a failure. retek_init() is designed to
handle a variable number of arguments, since each program will have a
unique set of restart variables. This is communicated through the first
two arguments to the function. The first is a pre-compiler macro telling
the function how many arguments to expect, e.g.
#define NUM_INIT_PARAMETERS 7. The second argument is named parameter
and is declared globally. It tells retek_init() what types of variables
it is expected to populate. It is defined to be of type init_parameter,
which was created for the new restart/recovery API. An example of how to
declare the parameter variable follows:
init_parameter parameter[NUM_INIT_PARAMETERS] =
{
   /* NAME ---------- TYPE ------ SUB_TYPE */
   "commit_max_ctr", "long",   "",
   "num_threads",    "int",    "",
   "thread_val",     "int",    "",
   "sku",            "string", "S",
   "supplier",       "string", "S",
   "origin_country", "string", "S",
   "start_date",     "string", "S"
};

Each row of the parameter array describes a variable that will be passed
into retek_init(). The first three are standard and should be used exactly
as they are here if the program is multi-threaded and has restart/recovery
functionality. The next four values make up the start string, which is
used in the where clause of the driving cursor. The name should describe
which variable is being returned. The types in this example are all
strings with a subtype of "S", which indicates a start string variable
(start string is synonymous with restart string). With parameter defined
as such, the call to retek_init() would look like this:
li_init_return = retek_init(NUM_INIT_PARAMETERS,
parameter,
&pl_commit_max_ctr,
&pi_num_threads,
&pi_thread_val,
ps_restart_sku,
ps_restart_supplier,
ps_restart_origin_country,
ps_restart_start_date);

Note that all the variables passed to retek_init() are defined to be of
program scope and are named as such. This last example covers the case of
a table-based program. For a file-based program, additional information is
required to restart the program. This information, known as the
application image, contains anything else besides the start string that is
needed to restart the program. Also, the file itself is handled
differently. A new type of struct, called rtk_file, has been created for
this purpose. The structure includes the file pointer, the filename, and a
flag used internally by restart/recovery. For restart/recovery functions,
only the address of this struct is required. An example illustrating these
concepts follows.
Declarations
#define NUM_INIT_PARAMETERS 7

init_parameter parameter[NUM_INIT_PARAMETERS] =
{
   "commit_max_ctr",  "long",     "",
   "restart_lc",      "string",   "S",
   "line_seq",        "string",   "I",
   "transaction_seq", "string",   "I",
   "detail_seq",      "string",   "I",
   "total_seq",       "string",   "I",
   "out_lc_download", "rtk_file", "O"
};

long pl_commit_max_ctr;
long pl_current_ctr = 0;
char ps_restart_lc[NULL_LC_REF_ID];
EXEC SQL VAR ps_restart_lc IS STRING(NULL_LC_REF_ID);
char ps_line_seq[NULL_REC_NO];
char ps_transaction_seq[NULL_REC_NO];
char ps_detail_seq[NULL_REC_NO];
char ps_total_seq[NULL_REC_NO];

/* 1 output file */
rtk_file pf_out;

inside init()
li_init_return = retek_init(NUM_INIT_PARAMETERS,
parameter,
&pl_commit_max_ctr,
ps_restart_lc,
ps_line_seq,
ps_transaction_seq,
ps_detail_seq,
ps_total_seq,
&pf_out);
if (is_new_start())
{
strcpy(ps_line_seq, "1");
}

This example shows both the declarations of the global variables and how
they are initialized. In addition to the new file type, there are two new
subtypes introduced here: "I" and "O". The first designates a string
variable as an image string variable. However, it can also be used with
the rtk_file type to indicate that a file is an input file. A subtype of
"O" designates the file as an output file.

In the case of a restart, all the variables will be populated. The start
string variables and the rtk_file structure will contain the values where
the program left off. If the program is not restarting, only the
commit_max_ctr variable is filled. This condition can be trapped by a
new function, is_new_start(). This is used to initialize variables that need
a value even for a new start, such as the line sequence variable in the
previous example.
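
The idea of driving a variable argument list from a table of type
descriptions can be sketched with the standard C <stdarg.h> facilities.
The code below is a toy model only: param_desc and mock_init() are
hypothetical, the saved values are passed in as an array of strings
rather than read from the restart tables, and nothing here reflects the
actual retek_init() source.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one variable, modeled on the three
   columns of the parameter array in the text. */
typedef struct
{
    const char *name;
    const char *type;      /* "long", "int" or "string" in this sketch */
    const char *sub_type;
} param_desc;

/* Populates one caller-supplied variable per table row. The variable
   arguments must be pointers whose types match the "type" column, in
   the same order as the rows; string buffers must be large enough to
   hold the saved value. Returns 0 on success, -1 on an unknown type. */
int mock_init(int num_params, const param_desc *params,
              const char **saved_values, ...)
{
    va_list ap;
    int i;
    int rc = 0;

    assert(params != NULL && saved_values != NULL);

    va_start(ap, saved_values);
    for (i = 0; i < num_params && rc == 0; i++)
    {
        if (strcmp(params[i].type, "long") == 0)
            *va_arg(ap, long *) = strtol(saved_values[i], NULL, 10);
        else if (strcmp(params[i].type, "int") == 0)
            *va_arg(ap, int *) = (int)strtol(saved_values[i], NULL, 10);
        else if (strcmp(params[i].type, "string") == 0)
            strcpy(va_arg(ap, char *), saved_values[i]);
        else
            rc = -1;   /* unknown type: stop reading arguments */
    }
    va_end(ap);
    return rc;
}
```

The sketch also shows why the order of the arguments must match the
order of the rows in the table: the function has no other way of knowing
which pointer belongs to which description.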

Driving Cursor
As alluded to in lesson 2, two changes are made to the driving cursor for
implementing restart/recovery:
1. An order by clause must be added. It must include the columns
corresponding to the LUW elements. (mandatory)
2. The where clause must be amended to select only those values
greater than the restart start string values. (mandatory)
Using the store-SKU example program again, the driving cursor might
look like:
EXEC SQL DECLARE c_sales_data CURSOR FOR
   SELECT sku,
          store,
          NVL(unit_cost,0),
          NVL(sales_type,'R'),
          NVL(sales_units,0),
          NVL(sales_value,0)
     FROM win_store
    WHERE (store > NVL(:ps_restart_store, -999) OR
          (store = :ps_restart_store AND sku > :ps_restart_sku))
    ORDER BY store, sku;

And the fetch might resemble:


EXEC SQL FETCH c_sales_data INTO :ls_sku,
:ls_store,
:ls_unit_cost,
:ls_sales_type,
:ls_sales_units,
:ls_sales_value;
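
The restart condition in the where clause is a lexicographic comparison
on (store, SKU): a record qualifies if its store is past the restart
store, or if it is in the restart store and its SKU is past the restart
SKU. The plain C sketch below mirrors that logic. after_restart_point()
is a hypothetical function written for illustration, and it assumes
numeric store and SKU values, with a NULL pointer standing in for a NULL
bind variable on a fresh start.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when a (store, sku) record lies after the restart point in
   ORDER BY store, sku order, and 0 otherwise. A NULL restart_store
   means a fresh start. */
int after_restart_point(long store, long sku,
                        const long *restart_store, const long *restart_sku)
{
    /* A restart store without a restart SKU is an incomplete bookmark. */
    assert(restart_store == NULL || restart_sku != NULL);

    if (restart_store == NULL)
        return store > -999;           /* NVL(:ps_restart_store, -999) */
    if (store > *restart_store)
        return 1;
    return store == *restart_store && sku > *restart_sku;
}
```

With a bookmark of store 1012 and SKU 10004356, the record for that
exact store and SKU is skipped because it was already committed, while
SKU 10004357 in the same store and every record in store 1013 are
selected.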

Committing
Now that the "what" to commit has been decided, the "how" and "when"
will be discussed. Nowhere in a batch program should there be ANY line
with EXEC SQL COMMIT or ROLLBACK. This job is handled by two
new functions: retek_commit() and retek_force_commit(). Which to
choose and how to use them depends mainly on the type of LUW chosen
for the program. retek_commit() should be used with a non-unique LUW.
In that case, retek_commit() is called after each record is processed.
It maintains a counter and commits only when the number of records
processed is greater than or equal to
RESTART_CONTROL.COMMIT_MAX_CTR and a new LUW has been reached. On the
other hand, if the LUW is unique, retek_force_commit() should be used. It
commits, as the name implies, every time it is called, and should be
called only after an array of COMMIT_MAX_CTR records has been processed.
Both of these functions take the same variable parameter list. In much the
same way as retek_init(), the first parameter for these two functions is a
pre-compiler macro, e.g. #define NUM_COMMIT_PARAMETERS 4. Next, pass in
all the strings that make up the start string and the image string, in the
same order as they were passed into retek_init().

Another related function is commit_point_reached(). This function can
be used to check if an appropriate time to commit has been reached. An
illustrative example follows:

for (ll_cur_rec = 0; ll_cur_rec < ll_num_cur_records; ll_cur_rec++)
{
   if (commit_point_reached(NUM_COMMIT_PARAMETERS,
                            pa_fetch.sku[ll_cur_rec],
                            pa_fetch.supplier[ll_cur_rec],
                            pa_fetch.origin_country_id[ll_cur_rec],
                            pa_fetch.start_date[ll_cur_rec]))
   {
      /* Array insert records */
      if (post_insert_delete_records() < 0)
         return(-1);
   }

   if (retek_commit(NUM_COMMIT_PARAMETERS,
                    pa_fetch.sku[ll_cur_rec],
                    pa_fetch.supplier[ll_cur_rec],
                    pa_fetch.origin_country_id[ll_cur_rec],
                    pa_fetch.start_date[ll_cur_rec]) < 0)
      return(-1);
}

In this example, using commit_point_reached() allows one to call an
inserting routine once before actually calling retek_commit(). The other
possibility occurs with retek_force_commit(). In this case, it is understood
that it is time to commit after completing the for-loop, making the
commit_point_reached() function unnecessary.
for (ll_cur_rec = 0; ll_cur_rec < ll_num_cur_records; ll_cur_rec++)
{
   /* process all the records */
   if (process_transaction(&pa_fetch, ll_cur_rec) < 0)
      return(-1);
}
if (retek_force_commit(NUM_COMMIT_PARAMETERS,
                       pa_fetch.sku[ll_cur_rec - 1],
                       pa_fetch.supplier[ll_cur_rec - 1],
                       pa_fetch.origin_country_id[ll_cur_rec - 1],
                       pa_fetch.start_date[ll_cur_rec - 1]) < 0)
   return(-1);

For file-based programs, an additional function, rtk_print(), is needed to
work with the new rtk_file type struct. It is used exactly the same way as
fprintf(), except that the address of the rtk_file struct is passed instead of
the file pointer. An example follows:
if (rtk_print(&pf_out, ps_format_fhead,
LEN_LINE_ID,
"FHEAD",
LEN_REC_NO,
atol(ps_line_seq),
LEN_FILE_TYPE, "LCAP",
LEN_DATE,
is_vdate) < 0)
return(FATAL);

Closing

The function retek_close() is used to close file streams, commit work, and
update restart-recovery tables. It has no parameters.

Function Declarations
In retek_2.h:

int retek_init(int num_args, init_parameter parameter[], ...);
   Used for initializing parameters for restart/recovery. If the
   program is being restarted, it returns the values needed to start
   the program without duplicating work.

int retek_commit(int num_args, ...);
   Commits changes to the database if the current count has been
   exceeded and a new LUW is being processed.

int commit_point_reached(int num_args, ...);
   Performs the same check as retek_commit() without committing to
   the database.

int retek_force_commit(int num_args, ...);
   As the name implies, forces a commit to the database.

int retek_close(void);
   Cleans up restart/recovery tables, commits/rolls back last changes
   to the database, and closes files.

void increment_current_count(void);
   Increments pl_current_count.

int is_new_start(void);
   Checks if the current run is a new start; if yes, returns 1;
   otherwise 0.

int parse_name_for_thread_val(char* name);
   Parses the thread value from the extension of the specified file
   name.

1. retek_init
= Pass in num_args as the number of elements in the init_parameter array, then the
init_parameter array, then the variables the batch program needs to initialize, in the
order and with the types defined in the init_parameter array;
= NOTE: all variables need to be passed by reference (incl. int and long);
= Get all global and module level values from databases;
= Initialize records for RESTART_PROGRAM_STATUS and RESTART_BOOKMARK;
= Parse out user-specified initialization variables (variable arg list);
= Return NO_THREAD_AVAILABLE if no qualified record in RESTART_CONTROL or
RESTART_PROGRAM_STATUS;
= Commit work.

2. retek_commit
= Pass in num_args, then variables for start_string first, and those for image string (if
needed) second. The num_args is the total number of these two groups. All are string
variables and are passed in the same order as in retek_init();
= Concatenate start_string either from passed in variables (table-based) or from ftell of
input file pointers (file-based);
= Check if commit point reached (counter check and, if table-based, start string
comparison);
= If reached, concatenate image_string from passed in variables (if needed) and call
internal_commit() to get out_file_string and update RESTART_BOOKMARK;
= If table-based, increment pl_current_count and update ps_cur_string.

3. commit_point_reached
= Pass in num_args, then all string variables for start_string in the same order as in
retek_init(). The num_args is the number of variables for start_string. If no start_string
(as in file-based), pass in NULL;
= For table-based, if pl_current_count reaches pl_max_counter and if the newly
concatenated bookmark string is different from ps_cur_string, return 1; otherwise return 0;
= For file-based, if pl_current_count reaches pl_max_counter return 1; otherwise return 0.

NOTE: The difference between this function and the check in retek_commit() is that here
the pl_current_count and ps_cur_string are not updated. This checking function is
designed to be used with retek_force_commit(), and the logic to ensure integrity of LUW
exists in user batch program.

4. retek_force_commit
= Pass in num_args, then variables for start_string first, and those for image string (if
needed) second. The num_args is the total number of these two groups. All are string
variables and are passed in the same order as in retek_init();
= Concatenate start_string either from passed in variables (table-based) or from ftell of
input file pointers (file-based);
= Concatenate image_string from passed in variables (if needed) and call
internal_commit() to get out_file_string and update RESTART_BOOKMARK;
= If table-based, increment pl_current_count and update ps_cur_string.

5. retek_close
= If gi_error_flag or NO_COMMIT command line option is TRUE, rollback all database
changes;
= Update RESTART_PROGRAM_STATUS according to gi_error_flag;
= If no gi_error_flag, insert record into RESTART_PROGRAM_HISTORY with
information fetched from RESTART_CONTROL, RESTART_BOOKMARK and
RESTART_PROGRAM_STATUS tables;
= If no gi_error_flag, delete RESTART_BOOKMARK record;
= Commit work;
= Close all opened file streams.
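
For file-based programs, the steps above build the start string from ftell() on the input file. The following is a minimal sketch of that idea in standard C, under the assumption that the bookmark is simply the byte offset written as a decimal string; make_file_bookmark() and seek_to_bookmark() are hypothetical helpers, not functions from retek_2.h, and the real library may store the position differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write the current read offset of fp into out as text.
 * Returns 0 on success, -1 on failure. */
static int make_file_bookmark(FILE *fp, char *out, size_t out_len)
{
    long pos;
    int n;

    assert(fp != NULL && out != NULL);
    pos = ftell(fp);
    if (pos < 0)
        return -1;
    n = snprintf(out, out_len, "%ld", pos);
    if (n < 0 || (size_t)n >= out_len)
        return -1;              /* bookmark would not fit in the buffer */
    return 0;
}

/* Hypothetical helper: on restart, reposition fp at the saved offset.
 * Returns 0 on success, -1 if the bookmark is malformed or the seek fails. */
static int seek_to_bookmark(FILE *fp, const char *bookmark)
{
    char *end;
    long pos;

    assert(fp != NULL && bookmark != NULL);
    pos = strtol(bookmark, &end, 10);
    if (end == bookmark || *end != '\0' || pos < 0)
        return -1;
    return fseek(fp, pos, SEEK_SET) == 0 ? 0 : -1;
}
```

Taking the bookmark immediately after a complete line has been read means a restarted run resumes at the first unprocessed line.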

Array Processing
Since a majority of Retek batch programs implement array processing, it is
important to note how array processing affects restart/recovery
implementation. A key determinant of the exact details is the LUW. Is it
unique or non-unique?
When the LUW is unique, given that the arrays are sized using
commit_max_ctr and given that the restart_cur_string and the
restart_new_string will differ for every record, a commit may be forced
after each set of arrays is processed (i.e. after the for loop, inside the while
loop).
With a non-unique LUW, as discussed above, there is the possibility that
the commit_max_ctr will be reached in the middle of a LUW. When
processing arrays sized equal to the commit_max_ctr, this means a LUW
may span two array fetches, thus you cannot force a commit after each set
of arrays as you did with a unique LUW. As a result, the commit logic
must take place inside the for loop (i.e. a commit can happen in the middle
of an array).
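
The rule for a non-unique LUW (commit only when the counter has reached commit_max_ctr and the next record starts a new LUW) can be sketched in plain C. This is an illustrative model, not the library source: the commit_state struct and the sketch_* functions are hypothetical, and resetting the counter to zero at each commit is an assumption.

```c
#include <assert.h>
#include <string.h>

#define KEY_LEN 32

/* Hypothetical state kept between calls. */
typedef struct {
    long current_count;     /* records processed since the last commit */
    long max_counter;       /* models commit_max_ctr */
    char cur_key[KEY_LEN];  /* LUW key of the most recently processed record */
} commit_state;

static void commit_state_init(commit_state *st, long max_counter)
{
    assert(st != NULL);
    st->current_count = 0;
    st->max_counter = max_counter;
    st->cur_key[0] = '\0';
}

/* 1 when enough records have been processed AND key starts a new LUW. */
static int sketch_commit_point_reached(const commit_state *st, const char *key)
{
    return st->current_count >= st->max_counter &&
           strcmp(st->cur_key, key) != 0;
}

/* Called once per record, before the record is processed.
 * Returns 1 if a commit would be issued at this point, 0 otherwise. */
static int sketch_commit(commit_state *st, const char *key)
{
    int committed = sketch_commit_point_reached(st, key);

    if (committed)
        st->current_count = 0;          /* assumed: counter restarts */
    st->current_count++;
    strncpy(st->cur_key, key, KEY_LEN - 1);
    st->cur_key[KEY_LEN - 1] = '\0';
    return committed;
}
```

With max_counter set to 2 and the key sequence A, A, A, B, B, C, the third A does not trigger a commit even though the counter has reached 2; the commit waits for the first B, so the LUW for A is never split.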
The following examples progress through four levels of complexity of a
process( ) function's while loop:
= No array processing, no restart/recovery
= Array processing, no restart/recovery
= Array processing & restart/recovery with unique LUW
= Array processing & restart/recovery with non-unique LUW
1. No array processing, no restart/recovery
while(1)
{
   EXEC SQL FETCH c_win_store INTO :ls_sku,
                                   :ls_store;
   if NO_DATA_FOUND
      break;
   if (calculate_price(ls_sku, ls_store, ls_price) < 0)
      return (-1);
   EXEC SQL UPDATE win_store
               SET price = :ls_price
             WHERE sku   = :ls_sku
               AND store = :ls_store;
}

2. Array processing, no restart/recovery

Note: There are two sets of arrays, one for fetching and one for updating, and they share
the same index counter, li_current_rec.
char *ls_sku, *ls_store;

li_data_to_process = 1;
while(li_data_to_process)
{
   EXEC SQL FETCH c_win_store INTO :pa_fetch_array.s_sku,
                                   :pa_fetch_array.s_store;
   /* NUM_RECORDS_PROCESSED is cumulative for the cursor, so subtract
    * the running total to get the row count of this fetch */
   li_recs_returned = NUM_RECORDS_PROCESSED - li_recs_completed;
   li_recs_completed = NUM_RECORDS_PROCESSED;
   if NO_DATA_FOUND
      li_data_to_process = 0;
   for(li_current_rec = 0;
       li_current_rec < li_recs_returned;
       li_current_rec++)
   {
      ls_sku = pa_fetch_array.s_sku[li_current_rec];
      ls_store = pa_fetch_array.s_store[li_current_rec];
      if (calculate_price(ls_sku, ls_store, ls_price) < 0)
         return(-1);
      strcpy(pa_update_array.s_price[li_current_rec],
             ls_price);
   }
   EXEC SQL FOR :li_recs_returned
      UPDATE win_store
         SET price = :pa_update_array.s_price
       WHERE sku   = :pa_fetch_array.s_sku
         AND store = :pa_fetch_array.s_store;
}

3. Array processing & restart/recovery with unique LUW


If you know your logical unit of work is unique (SKU, store is the primary key of
win_store), then it is valid for retek_force_commit() to commit after any fetch.
while(!cursor_empty)
{
   EXEC SQL FOR :pl_commit_max_counter
      FETCH c_win_store INTO :la_fetch.s_sku,
                             :la_fetch.s_store,
                             :la_fetch.s_price;
   if(NO_DATA_FOUND)
      cursor_empty = 1;
   la_fetch.l_num_recs = NUM_RECORDS_PROCESSED - total_recs;
   total_recs = NUM_RECORDS_PROCESSED;
   if(la_fetch.l_num_recs == 0) break;
   for(la_fetch.l_cur_rec = 0;
       la_fetch.l_cur_rec < la_fetch.l_num_recs;
       la_fetch.l_cur_rec++)
   {
      if(calc_price(la_fetch.s_sku[la_fetch.l_cur_rec],
                    la_fetch.s_store[la_fetch.l_cur_rec],
                    ls_price) < 0)
         return(-1);
      strcpy(la_update.s_price[la_fetch.l_cur_rec],
             ls_price);
   }
   EXEC SQL FOR :la_fetch.l_num_recs
      UPDATE win_store
         SET price = :la_update.s_price
       WHERE sku   = :la_fetch.s_sku
         AND store = :la_fetch.s_store;
   /* force a commit */
   if(retek_force_commit(2,
         la_fetch.s_sku[la_fetch.l_num_recs - 1],
         la_fetch.s_store[la_fetch.l_num_recs - 1]) < 0)
      return(-1);
}

4. Array processing & restart/recovery with non-unique LUW


When the logical unit of work is non-unique (in this example, win_store data is
being processed by SKU), the implementation is not nearly as simple. It cannot be
assumed that a commit will be permitted between array fetches (a LUW may span
two fetches). Thus, the UPDATE and the call to retek_commit() must occur
inside the for loop.
Also, because an update/commit sequence may occur in the middle of fetched
arrays, the update-array index must be counted separately from the fetch-array
index.
la_update.l_num_recs = 0;   /* update array index counter */

while(!cursor_empty)
{
   EXEC SQL FETCH c_win_store INTO :la_fetch.s_sku,
                                   :la_fetch.s_store;
   if(NO_DATA_FOUND)
      cursor_empty = 1;
   la_fetch.l_num_recs = NUM_RECORDS_PROCESSED - total_recs;
   total_recs = NUM_RECORDS_PROCESSED;
   for(la_fetch.l_cur_rec = 0;
       la_fetch.l_cur_rec < la_fetch.l_num_recs;
       la_fetch.l_cur_rec++)
   {
      /* if time to commit, then do update */
      if(commit_point_reached(2,
            la_fetch.s_sku[la_fetch.l_cur_rec],
            la_fetch.s_store[la_fetch.l_cur_rec]))
      {
         /* the update arrays carry their own keys because they are
          * indexed separately from the fetch arrays */
         EXEC SQL FOR :la_update.l_num_recs
            UPDATE win_store
               SET price = :la_update.s_price
             WHERE sku   = :la_update.s_sku
               AND store = :la_update.s_store;
         la_update.l_num_recs = 0;
      } /* end if time to commit, then UPDATE */
      if(retek_commit(2,
            la_fetch.s_sku[la_fetch.l_cur_rec],
            la_fetch.s_store[la_fetch.l_cur_rec]) < 0)
         return(-1);
      ls_sku = la_fetch.s_sku[la_fetch.l_cur_rec];
      ls_store = la_fetch.s_store[la_fetch.l_cur_rec];
      if(calculate_price(ls_sku, ls_store, ls_price) < 0)
         return(-1);
      strcpy(la_update.s_price[la_update.l_num_recs],
             ls_price);
      strcpy(la_update.s_sku[la_update.l_num_recs], ls_sku);
      strcpy(la_update.s_store[la_update.l_num_recs],
             ls_store);
      la_update.l_num_recs++;
   } /* end for */
} /* end while */

if(la_update.l_num_recs > 0 )
{
   /* last time through for update/commit may not have
    * happened. Therefore, a final update call and forced
    * commit are required here
    */
}

Lesson 4: New Restart Library Description


As mentioned at the beginning of this module, the Restart/Recovery API
has recently been rewritten to make implementation easier. Below is some
informal documentation and code examples. In addition to reading these,
see the API files retek_2.h and retek_2.pc.
Design Overview:
The current RMS restart/recovery batch library is to be modified to improve
maintainability, ease of coding, and performance. While the
current mechanism and functionality of batch restart/recovery are
preserved, the following improvements and enhancements are to be made:
= Clean up global variables associated with restart/recovery.
= Pass out restart/recovery variables under the batch coders' full control
during initialization.
= Get rid of temp write files to speed up the commit process.
= Hide more information and processing in the library code.
= Add more information to the restart/recovery tables for tuning
purposes.

Tables:

RESTART_BOOKMARK:
Additional columns:
Name                      Null?     Type
------------------------- --------  -------------
OUT_FILE_STRING           NULL      VARCHAR2(255)
NON_FATAL_ERR_FLAG        NULL      VARCHAR2(1)
NUM_COMMITS               NULL      NUMBER(12)
AVG_TIME_BTWN_COMMITS     NULL      NUMBER(12)

RESTART_PROGRAM_STATUS:
Additional columns:
Name                      Null?     Type
------------------------- --------  -------------
CURRENT_ORACLE_SID        NULL      NUMBER(15)
CURRENT_SHADOW_PID        NULL      NUMBER(15)

RESTART_PROGRAM_HISTORY:
Additional columns:
Name                      Null?     Type
------------------------- --------  -------------
SHADOW_PID                NULL      NUMBER(15)
SUCCESS_FLAG              NULL      VARCHAR2(1)
NON_FATAL_ERR_FLAG        NULL      VARCHAR2(1)
NUM_COMMITS               NULL      NUMBER(12)
AVG_TIME_BTWN_COMMITS     NULL      NUMBER(12)

Example Program
Compare the programs code_restartrecovery.pc, which is a simple
implementation of the old API, with example.pc, which implements the
same functional logic using the new r/r API.
See also example1.pc, example2, example3.pc and example4.pc available
from your instructor for more complex examples of the new API usage.

Exercise 4
Make a copy of exercise 2 and rename it for exercise 4 (cp xxx_02.pc
xxx_04.pc, where xxx = your initials)
= Add restart recovery functions to the program.
= Add restart recovery to the driving cursor.
= Implement restart recovery logic
= See handout for additional information.

What's Next
Restart/recovery and multithreading are conceptually different attributes of
a batch program. In fact, one can be implemented without the other.
However, the Retek Restart/Recovery and Multithreading API, for better
or worse, combines their implementation. The next module discusses
multithreading concepts and Retek's batch programming implementation.
Once completed, you will be able to implement either or both in a batch
program.

Module 9: Program Multithreading

This module includes the following lessons:


Lesson 1: Program Multithreading Purpose
Lesson 2: General Design
Lesson 3: Implementation Details
Lesson 4: R/R & Multithreading Data Model
Lesson 5: Starting R/R Programs, Restart and Refresh

Module Overview
As was explained in module 8, the implementations of multithreading
and restart/recovery in Retek batch programs are inextricably intertwined.
Yet, they are very separate concepts and can be implemented individually
or together in any single program.
This module discusses the benefits of multithreading, its implementation
in Retek batch programs, and finally, the database tables used for
persistent storage of restart/recovery and multithreading data.

Objectives
After completing this lesson you will be able to:
Explain the purpose of multithreading of batch programs.
Understand the variables and functions provided in the API.
Understand the restart/recovery and multithreading data model.
Implement restart/recovery or multithreading or both in a batch
program.

Lesson 1: Program Multithreading Purpose


The term multithreading is a bit of a misnomer as it is used here, in that
multithreading is generally an operating system level construct. Here it
means the ability to concurrently run multiple instances of a single program
against a single data set.
Furthermore, Retek batch program multithreading has the additional
requirement of supporting restart/recovery at the thread level. Thus, if
one instance of a program fails, others are not affected and the failed
instance(s) can be restarted to complete processing.
The motivation for including program multithreading in the Retek batch
programming repertoire was the need to improve performance. Due to the
great amount of data generated by a large retailer's production
merchandise management system, running a single instance of some batch
programs, even with array processing, took far too long.
Running multiple instances of a program takes advantage of two facts:
one, processors spend a lot of time waiting for other hardware and
communications, especially when running only one process; and two,
multiprocessor systems are generally most efficient when running multiple
tasks.
In general, two concurrent instances of a batch program can nearly halve
the run time of one instance against the same data set in ideal conditions.
And adding many more instances is possible, each adding additional
benefit (to a point). There is, however, an optimal number of instances
depending on the architecture of the runtime system, the size of the data
set, the distribution of data across key discrete data fields, etc.
As with the choice of commit_max_ctr size in restart/recovery, it would
be impossible for the designer and programmer of a batch program to
create a generic solution optimal for use in every production environment.
Rather, the Retek program multithreading API provides a means for each
system administrator to customize based on the administrator's
circumstances.

Lesson 2: General Design


Similar to restart/recovery, program multithreading requires organizing
data in a specific way. Recall restart/recovery required data to be ordered
such that, in the event of a restart, a program can return to the exact record
where it left off. In contrast, program multithreading requires data to be
chunked such that each thread operates on a specific set of records.
Threads will never attempt to operate on the same records. Furthermore,
when restarting from a failed run, a thread can be reassigned to its
original data set.
The chunking of data is accomplished using some discrete variable, like
store or dept, which is likely to produce an even distribution of data.
As a simple example, suppose a program needs to process all price
promotions, and it is determined that dividing the work by store is likely to
produce an even distribution of data across threads. Further suppose, there
are only five stores in the system (stores 1,2,3,4,5), and two instances of
the program will be run concurrently (threads a,b). Assume thread a is
assigned to process promotions for stores 1, 3, and 5, while thread b is
assigned stores 2 and 4. Then the driving cursor must be modified so that
when running as thread a it only selects promotions associated with
stores 1, 3 and 5, and when running as thread b it should only select
promotions associated with stores 2 and 4.
A standard algorithm is used to separate data. It is based on a modulus
function. Note that this requires that threading mechanisms only be
numeric values. Further, it assumes that the values are continuous and are
distributed evenly. Clearly, this may not be the case for every client, so it
has been made fairly simple to apply different algorithms to different
threading mechanisms by changing the internal function that is used in the
view. The restart_thread_return() function is a very simple modulus
routine. The following example shows the modulus function.
RESTART_THREAD_RETURN() (EXAMPLE)

CREATE OR REPLACE FUNCTION RESTART_THREAD_RETURN(in_unit_value    NUMBER,
                                                 in_total_threads NUMBER)
RETURN NUMBER IS
   ret_val NUMBER;
BEGIN
   ret_val := MOD(ABS(in_unit_value), in_total_threads) + 1;
   RETURN ret_val;
END;
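
The same modulus rule can be written in C to see which thread a given driver value lands on. This is only a sketch mirroring the PL/SQL example above; thread_for_value() is a hypothetical name and is not part of the Retek API.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Mirrors MOD(ABS(in_unit_value), in_total_threads) + 1:
 * maps a numeric driver value (store, dept, ...) to a thread
 * number in the range 1..total_threads.
 */
static long thread_for_value(long unit_value, long total_threads)
{
    assert(total_threads > 0);
    return labs(unit_value) % total_threads + 1;
}
```

With two threads and stores 1 through 5, stores 1, 3 and 5 map to thread 2 and stores 2 and 4 map to thread 1, which is the same odd/even split as in the promotions example, only with the thread labels given as numbers.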

Lesson 3: Implementation Details


Conceptually, this is an easy solution. But how does the program know
which thread number it is assigned, and which stores are assigned to this
thread value?

Which Thread Am I?
Each instance of a batch program with program multithreading must know
its thread value. This is determined during the call to retek_init( ).
There are actually two values required to answer this question completely.
Answering "Which thread am I?" actually requires knowing "Which
thread am I, and out of how many?" It is not enough to know that a
program is thread number two; it must know that it is thread two of five in
order to properly divide the data.
During retek_init( ) two variables are populated to hold these values.
pi_num_threads is an integer pointer passed as an argument; it
points to the total number of threads scheduled. pi_thread_val stores the
integer value of the current thread, and should be copied into a local
variable.

Which Data Is Assigned to Me?


There are a number of database views in the Retek data model that answer
this question. They are named V_RESTART_*, where the * is the discrete
variable used to chunk the data. Here is a partial listing:
VIEW_NAME
------------------------
V_RESTART_CLEARANCE
V_RESTART_COST_CHG
V_RESTART_DEPT
V_RESTART_PRICE_CHG
V_RESTART_SKU
V_RESTART_STORE
V_RESTART_STORE_WH
V_RESTART_SUPPLIER
V_RESTART_TRANSFER
V_RESTART_WH

These views tell the threads how to divide up the records. To better
understand the views, look at the v_restart_dept view. In this view, there is
a set of records for every number of threads (num_threads) currently on
restart_control for this particular driver. In each set, there is one record
for every existing department (driver_value). Finally, the value in
thread_val determines which thread will work on this department. For
example, assume 4 departments and 3 programs that thread by department,
one with only 1 thread, one with 2 threads, and one with 4 threads.
v_restart_dept would look like this:
DRIVER_NAME          NUM_THREADS DRIVER_VALUE THREAD_VAL
-------------------- ----------- ------------ ----------
DEPT                           1         1001          1
DEPT                           1         1002          1
DEPT                           1         1003          1
DEPT                           1         1004          1
DEPT                           2         1001          2
DEPT                           2         1002          1
DEPT                           2         1003          2
DEPT                           2         1004          1
DEPT                           4         1001          2
DEPT                           4         1002          3
DEPT                           4         1003          4
DEPT                           4         1004          1

So, given that any particular instance of a batch program knows its thread
value and the total number of threads, it only requires one more bit of
information to be able to select all its DRIVER_VALUES from the
appropriate restart view; namely the driver name. This, too, is a variable
populated by retek_init( ).

Driving Cursor
Given all these values, the driving cursor can now be modified to select
only the data assigned to the current thread. This is accomplished by
joining the existing cursor to the appropriate V_RESTART_* view.
Continuing with the previous example, the cursor might be modified as:
SELECT ...
  FROM promstore, v_restart_store, ...
 WHERE v_restart_store.driver_name  = :ls_restart_driver_name
   AND v_restart_store.driver_value = promstore.store
   AND v_restart_store.num_threads  = :ls_restart_num_threads
   AND v_restart_store.thread_val   = :ls_restart_thread_val
   AND ...

Lesson 4: R/R & Multithreading Data Model

restart_control
The RESTART_CONTROL table is the master table in the
restart/recovery table set. One record will exist on this table for each batch
program that is run with restart/recovery logic in place.
The restart/recovery initialization process uses this table to determine:
= The total number of threads used for each batch program,
= The maximum records that will be processed before a commit event
takes place,
= The driver for the threading (multi-processing) logic.

restart_program_status
The RESTART_PROGRAM_STATUS table holds record keeping
information about current program processes. The number of rows for a
program on the status table will be equal to its NUM_THREADS value on
the RESTART_CONTROL table.
The table is modified during restart/recovery initialization and close logic.
The restart/recovery initialization logic will assign the next available
thread to a program based on the program status and restart flag. Once a
thread has been assigned the program_status is updated to prevent the
assignment of that thread to another process. Information will be logged
on the current status of a given thread, as well as record keeping
information such as operator and process timing information.

restart_program_history
The RESTART_PROGRAM_HISTORY table will contain one record for
every successfully completed program thread with restart/recovery logic.
The retek_close( ) function inserts this record. Purging of this table is at
the user's discretion.

restart_bookmark
When a restart/recovery program thread is currently active (its state is
started or aborted), a record for it will exist on the restart_bookmark table.
Restart/recovery initialization logic inserts the record into the table for a
program thread. The restart/recovery commit process updates the record
with the following restart information:
= A concatenated string of key values for table processing,
= A file pointer value for file processing,
= Application context information such as counters and accumulators.
The restart/recovery closing process will delete the program thread record
if the program finishes successfully. In the event of a restart, the program
thread information on this table will allow the process to begin from the
last commit point.

v_restart_x
As discussed in lesson 3 in this module, restart views will be used for
query-based programs that require multi-threading. Separate views will be
created for each threading driver, such as department or store. A join will
be made to a view based on threading driver to force the separation of
discrete data into particular threads.

Lesson 5: Starting R/R Programs, Restart and Refresh

Before a program that implements restart/recovery or multithreading can
be run the first time, two tables must be populated.

1. Restart_Control:

Column          Initial Value
--------------  ----------------------------------------------------------
Program_name    Should always be the same as restart_name with some rare
                exceptions (defined in the program and on the
                restart_program_status table).
Program_desc    Description of the program - will be used in the online
                maintenance screen.
driver_name     Threading mechanism to be used: check out the views for
                actual names. Examples - DEPT, STORE, STORE_WH,
                PRICE_CHG. Use NONE when implementing r/r but not
                multi-threading.
num_threads     Total number of threads to be used by the program.
                Recommendation for testing is to set this to 2.
                Clients will be able to modify this number for better
                performance. Always set to 1 in file based processing.
update_allowed  File based processing (and potentially others) set this
                to N, else Y.
process_flag    F for file-based processing, T for table-based.
commit_max_ctr  The number of records to be processed before a commit
                event occurs.

2. restart_program_status:

Column               Initial Value
-------------------  -----------------------------------------------------
restart_name         Must be the same as in the program.
thread_val           There should exist as many records on this table as
                     defined in num_threads on RESTART_CONTROL;
                     thread_val is the integer counter. If
                     num_threads = 3, there should be 3 records on this
                     table, one with thread_val = 1, one with
                     thread_val = 2, and one with thread_val = 3.
start_time           Null (set by retek_init function).
program_name         Same as program_name on restart_control (probably
                     same as restart_name).
program_status       Initially enter as 'ready for start'.
restart_flag         Should be null - should only have value (Y) when
                     program is restarting. On normal start, this should
                     always be null.
Restart_time         Null (set by retek_init function).
Finish_time          Null (set by retek_close function).
current_pid          Null (possibly set by retek_init function).
current_operator_id  Null (possibly set by retek_init function).
err_message          Null (set by retek_close function).

For normal processing, the PROGRAM_STATUS on
RESTART_PROGRAM_STATUS must be 'ready for start'. After
successful completion the status will be 'completed'. A process that is
currently running will have the status of 'started', as will a process that
abends with an unhandled exception or DB problem. A process that ends
due to a handled fatal error during processing will have a status of 'aborted'.

To Start Fresh (Not Restart):


= Ensure that the program_status on the restart_program_status table is
'ready for start'
= Ensure that the restart_flag is null
= Ensure that no record exists on the restart_bookmark table for that
restart_name, thread_val combination
A utility program, refresh.pc, is available to do these three steps. A
version is included at the end of this module.

/* refresh.pc */
#include <retek.h>
#define NUM_RECORDS_PROCESSED sqlca.sqlerrd[2]
EXEC SQL INCLUDE SQLCA.H;
EXEC SQL BEGIN DECLARE SECTION;
   varchar ora_prog_name[50];
EXEC SQL END DECLARE SECTION;
long SQLCODE;

int main(int argc, char *argv[])
{
   if (argc < 3)
   {
      printf("USAGE: refresh userid/password program_name\n");
      exit(-1);
   }
   if (LOGON(argc, argv) < 0) exit(-1);
   strcpy(ora_prog_name.arr, argv[2]);
   SSL(ora_prog_name);

   /* remove any bookmark left by a previous run */
   EXEC SQL DELETE from restart_bookmark
             where restart_name = :ora_prog_name;
   if SQL_ERROR_FOUND
   {
      printf("cannot delete from restart_bookmark\n");
      return(-1);
   }

   /* reset every thread of the program to a fresh-start state */
   EXEC SQL UPDATE restart_program_status
               set program_status = 'ready for start',
                   restart_flag = null
             where restart_name = :ora_prog_name;
   if SQL_ERROR_FOUND
   {
      printf("cannot update restart_program_status\n");
      exit(-1);
   }
   if NO_DATA_FOUND
   {
      printf("cannot update...no restart_name = %s on table\n",
             ora_prog_name.arr);
      exit(-1);
   }
   EXEC SQL COMMIT;
   printf("refresh done\n");
   return(0);
}

To Restart After Failure:


There will be a record for the thread's restart_name, schema, thread_val
combination on the restart_bookmark table. The status will be 'aborted'
in the event of a handled fatal error, or 'started' in the
event of an unhandled fatal error or DB failure.
= Set the restart_flag to 'Y'
= DO NOT change any other variables, or delete any associated records.
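
The two checklists in this lesson (fresh start and restart after failure) can be summarized as a small decision function. The sketch below only illustrates the conditions listed above; it is not the logic inside retek_init(), and the start_mode enum and classify_start() function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { START_FRESH, START_RESTART, START_NOT_ALLOWED } start_mode;

/*
 * program_status : value of program_status on restart_program_status
 * restart_flag   : NULL for a null column, otherwise its value ("Y")
 * bookmark_exists: nonzero if a restart_bookmark record exists for the
 *                  restart_name, thread_val combination
 */
static start_mode classify_start(const char *program_status,
                                 const char *restart_flag,
                                 int bookmark_exists)
{
    int flag_is_y;

    assert(program_status != NULL);
    flag_is_y = restart_flag != NULL && strcmp(restart_flag, "Y") == 0;

    /* Fresh start: ready status, null flag, no bookmark. */
    if (strcmp(program_status, "ready for start") == 0 &&
        restart_flag == NULL && !bookmark_exists)
        return START_FRESH;

    /* Restart: failed thread, flag set to Y, bookmark left untouched. */
    if ((strcmp(program_status, "aborted") == 0 ||
         strcmp(program_status, "started") == 0) &&
        flag_is_y && bookmark_exists)
        return START_RESTART;

    return START_NOT_ALLOWED;
}
```

Any other combination, such as a leftover bookmark with a 'ready for start' status, falls through to START_NOT_ALLOWED in this sketch, which is the situation refresh.pc is meant to clean up.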

Exercise 5
Make a copy of exercise 4 and rename it for exercise 5 (cp xxx_04.pc
xxx_05.pc, where xxx = your initials)
= Modify Exercise 4 to include multithreading
= See handout for additional information.

What's Next
File interfaces to and from Retek products.


Module 10: File Based System Interfaces

This module includes the following lessons:


Lesson 1: File Interface Introduction
Lesson 2: Interface Library Routines
Lesson 3: Restart/Recovery for Interface Programs


Module Overview
An interface batch program is simply a program that either reads data from
or writes data to a file.
An input (or upload) program reads data from a file; the data is then
processed and entered into the RMS database (an example is posupld, which
reads daily point-of-sale data from a file). An output (or download)
program writes data to a file. That file can be sent elsewhere (for example,
to vendors) or used to interface with other systems the client uses (for
example, accounts payable systems).

Objectives
After completing this lesson, you will be able to:
• Understand how Retek products interface with external systems
• Use Retek's file interface libraries
• Implement restart/recovery in a file-based batch program


Lesson 1: File Interface Introduction


For most interface programs, a standard file format is used, making the
programming easier. Each input file line contains one record that is read
into the program all at once into a C-defined struct. Each field in the line
corresponds to an element in the struct. For example, you can declare a
structure for the detail lines from which the carton number, location, and
tran_date will later be extracted:
struct detline
{
   char carton_no[LEN_CARTON];
   char location[LEN_LOC];
   char tran_date[LEN_FILE_DATE];
} detail;

File Layout
The Retek interface library supports two standard file layouts. These are
master/detail files and detail only files. Master/detail files contain sets of
transactions with multiple detail lines within each transaction. Detail only
files consist of one transaction per detail line.
All records or lines within an input or output file, regardless of file type,
are identified by a 5-character file type record descriptor. This identifies
what type of record the line is. Each line of the file must begin with the
file type record descriptor, followed by a 10-character file line identifier.
This is the unique line number for each record.

Master/Detail Files
File layouts will have a standard file header record, a set of records for
each transaction to be processed, and a file trailer record. The transaction
set will consist of a transaction header record; one or more transaction
detail records, which specify details such as SKUs or locations; and a
transaction trailer record. Valid record types include FHEAD, THEAD,
TDETL, TTAIL, and FTAIL. Other record types may be used to
better define the line type or when there is more than one type of detail
line (e.g., TSHIP, TITEM).
The FHEAD line has a 10-character line ID plus customized
information. THEAD and TDETL lines have a 10-character line ID plus a
14-character transaction ID plus customized information.
TTAIL lines have a 10-character line ID followed by a 6-character string
giving the number of lines in the transaction (not counting the THEAD and
TTAIL lines). FTAIL lines have a 10-character line ID followed by a
10-character string giving the total number of transaction lines in the file
(all THEAD, TDETL, and TTAIL lines).
Line IDs are unique within the file. A transaction ID is unique to its
transaction and is the same on every record within that transaction.
Example:
FHEAD0000000001RTV 19960908172000
THEAD000000000200000000000001199609091202000000000003R
TDETL000000000300000000000001000001SKU10000012
TTAIL0000000004000001
THEAD000000000500000000000002199609091202001215720131R
TDETL000000000600000000000002000001UPC400100002667
TDETL000000000700000000000002000001UPC400100002643 0
TTAIL0000000008000002
FTAIL00000000090000000007

Detail Only Files


File layouts will have a standard file header record, a detail record for each
transaction to be processed, and a file trailer record. Valid record types
are FHEAD, FDETL, and FTAIL. FHEAD and FTAIL lines have the
same structure as in master/detail files; FDETL lines contain a line ID
followed by customized information.
Example:
FHEAD0000000001STKU1996010100000019960929
FDETL0000000002SKU100000040000011011
FDETL0000000003SKU100000050003002001
FDETL0000000004SKU100000050003002001
FTAIL00000000050000000003

Retek file struct: rtk_file


Interface files in Retek batch programs are accessed using the predefined
structure, rtk_file. The struct is defined as follows:
typedef struct
{
   FILE* fp;                             /* File pointer */
   char  filename[MAX_FILENAME_LEN+1];   /* Filename */
   int   pad_flag;                       /* Flag whether to pad
                                            thread_val to filename */
} rtk_file;


As shown above, the structure contains the file pointer, the filename, and a
pad flag for adding the thread value to the filename. The normal struct
operators can access any element of the structure if needed. The
library of predefined restart/recovery functions, however, only needs the
struct itself to be passed in. The restart API takes care of opening the file,
adding the thread value to the name if required, writing to the file, and
closing it. The functions that use the new Retek file struct and the new
restart/recovery are similar to the old interface functions, except that they
accept the new file struct instead of a file pointer. The new functions are
named retek_ plus the old function name; for example, the old interface
function get_record() has been recreated with the new file struct as
retek_get_record().

Reject Files
Because input files come from other systems, they are not entirely
trustworthy, and may contain bad data. Upload programs will check this,
and transactions containing bad data will be written by the upload program
to a reject file. This reject file has exactly the same format as the input file,
but only contains rejected transactions (a reject file for a run with a good
input file will only contain FHEAD and FTAIL lines). The FTAIL line
will contain the total number of lines in the reject file (not in the input
file).


Implied Decimal Places


Note that to avoid the use of decimal points in input files, all prices and
quantities are multiplied by 10000 in input and output files (giving 4
implied decimal places).

Business Validation
It is also important to do business validation of variables coming from an
input file. For example, once you have checked that a SKU is all numeric,
you must also make sure that it exists in the system.

Fatal vs. Non-Fatal Errors


While most batch programs have only fatal errors, interface programs
frequently use non-fatal errors. A non-fatal error occurs when a detail
record fails validation and is written to a reject file, but you want to
continue reading and processing the rest of the input file (as opposed to a
fatal error, when you would want the program to terminate immediately).


Lesson 2: Interface Library Routines


Retek has developed a library described in intrface.h, which contains
functions for processing files and should be #include'd in every interface
program. These functions can be divided into three categories:
1. Functions that are used to read data from input file and write data to
reject files.
2. Functions that are used to convert strings from the internal formats to
file formats and back.
3. Functions that perform validation typically found in upload programs.

File Interface Functions


get_record()
int get_record(void *struct_ptr, int struct_size,
               FILE *in_file, FILE *rej_file,
               char *rec_type)
usage:
if (get_record((char*)&detail,
               sizeof(detail),
               fp_carton,
               fp_rejrec,
               rec_type) < 0) return(-1);

Reads the next line from input file in_file into the structure (of size
struct_size) pointed at by struct_ptr. The five-letter record
identification code (FHEAD, TDETL, etc.) is returned in rec_type.
get_record() performs minimal validation on the line, making sure
that the identification code is FHEAD, FDETL, FTAIL, THEAD, TDETL,
or TTAIL and that this line type is legally positioned in the file (for
example, that a TDETL follows only a THEAD or another TDETL).
If the validation is unsuccessful for any reason, the line is written to reject
file rej_file and get_record() returns -1.


retek_write_rej()
int retek_write_rej(rtk_file *rej_file, rtk_file *in_file)
usage: retek_write_rej(fp_rejrec,fp_carton)

Writes the current transaction being read from input file in_file to the
reject file rej_file. If in_file is a detail only file, only the current FDETL
record is written to rej_file. If in_file is a master/detail file, everything
from the last THEAD record to the current record is written to rej_file.
Note that when part of a transaction has an error, the entire transaction
should be written to the reject file, so retek_write_rej() should be called
only after the current transaction's TTAIL record has been read.

Conversion Functions
left_shift()
int left_shift(char *str)

Removes all leading spaces or zeros from str.

null_pad()
int null_pad(char *str, int str_len)
usage:
nullpad(table,255)

Null terminates str, which is of length str_len, and removes all
trailing spaces (e.g., "fred " becomes "fred").

zero_pad()
void zero_pad(int str_len, char *str)
usage:
zero_pad(LEN_QTY, os_qty)

Pads str to length str_len with leading zeros (for example, zero padding
"123" to str_len = 6 would result in "000123").


Validation Functions
valid_all_numeric()
int valid_all_numeric(char *str, int desired_len)
usage:
if (valid_all_numeric(os_carton,LEN_CARTON)...

If the first desired_len characters of str are numeric,
valid_all_numeric() returns 0. If not, it returns 1. If an error
occurs in valid_all_numeric(), it returns -1. (Here, you only want
to check LEN_* characters, not NULL_* characters, since the last one
should be NULL anyway.)

valid_all_numeric_signed()
int valid_all_numeric_signed(char *str, int desired_len)

Does the same thing as valid_all_numeric(), but allows the first
character of str to be '+' or '-'.

all_blank()
int all_blank(char *str)

Returns 1 if str consists only of spaces, 0 otherwise.

valid_date()
int valid_date(char *str)

If str is a valid date of the form YYYYMMDD or
YYYYMMDDHH24MISS, valid_date() returns 0. If not,
valid_date() returns 1. If an error occurs in the function, -1 is
returned.
For example, you first copy a variable from the input structure, nullpad it,
check to see if any data is actually present (with all_blank()), and then
check to see if the data given is valid:
strncpy(os_tran_date, fileheader.file_create_date, LEN_DATE);
nullpad(os_tran_date, NULL_DATE);
if (all_blank(os_tran_date) == 0)
{
   if (valid_date(os_tran_date)) return(NON_FATAL);
}


Lesson 3: Restart/Recovery for Interface Programs
Restart/Recovery for files works on the same principles as table-based
Restart/Recovery, except that you are reading from and writing
to files instead of tables. This means that your definition of
"committing" needs to change, as does the information you store in
the restart_bookmark table.
For download programs, the change is not too great; they still have driving
cursors, so their processing is still table-based. The only change needed is
to the committing process. Upload programs, on the other hand, have
input files instead of driving cursors, and therefore need to have different
methods for bookmarking and multithreading.
Below are brief overviews of these changes and descriptions of the library
functions specific to Restart/Recovery for files.

Restart/Recovery for Downloads


Even though a download program writes records to an output file rather
than inserting rows into a table, the Restart/Recovery functionality is very
much the same as normal table-based Restart/Recovery. The driving
cursor still uses values fetched from the restart_bookmark table to
determine where it left off with the input. The output to the file, on the
other hand, involves a few changes. Writing to the file is done using the
function rtk_print(). The application image also comes into play here:
it is the appropriate place to keep track of additional information
(besides the restart string and the file pointer), such as the line sequence.
In the call to retek_commit(), the application image values should be
passed in the same order as in the call to retek_init().


After calling retek_init(), check to see if the program is being
restarted by calling the function is_new_start(). If it is a new start, an
FHEAD record should be written to the output file and the line sequence
should be set to 1. If it is being restarted, the FHEAD should not be
written (because it was already written on the initial run), and the
sequence number will be returned by retek_init().
Finally, note that download programs are considered table-based (i.e., the
process flag on restart_control should be T).

Restart/Recovery for Uploads


Restart/recovery for upload programs is true file-based restart/recovery.
Instead of having a table to base multithreading and logical units of work
on, upload programs have files. The largest difference between table-based
and file-based restart/recovery is that the file-based type uses a file
pointer instead of a restart string. The value of the file pointer in the
rtk_file struct at the point of the last commit is saved on the restart tables
in much the same way as the restart string is stored for the table-based
version.
Finally, the thread value for an upload program does not come from tables
as in table-based Restart/Recovery, but from the file name. All upload
files should have the format <file name>.<thread value>. To set the
thread value, parse_name_for_thread_val() is called. This reads
the thread value from the file name and assigns the appropriate thread to
the program. For example, if the upload file is myfile.4,
parse_name_for_thread_val() will assign thread value 4 to the
program. This function is called before set_filename().

Restart/Recovery Interface Routines


parse_name_for_thread_val()
int parse_name_for_thread_val(char *name)
if (parse_name_for_thread_val((char*)argv[2]) < 0)
exit (-1);

Takes in a file name, name, in the format <file name>.<thread value>
and returns the thread value based on the number after the period.


set_filename()
int set_filename(rtk_file *file_struct,
                 char *file_name,
                 int pad_flag)
if (set_filename(&pf_out, argv[2], PAD) < 0)
   exit(-1);

Takes in a file name from the command line (argv[2]) and copies it into
the new rtk_file structure. The thread value can be appended to the name
in the format <file name>.<thread value> by use of the pad_flag
parameter. Two preprocessor definitions in retek_2.h, PAD and NO_PAD,
control this. If the thread value is required to be
part of the name, use PAD for the pad_flag. If no thread should be
appended to the name, use NO_PAD. Since the file is actually
opened in retek_init(), this function is normally called just before the call
to retek_init().


Exercise 6
Make a copy of exercise 4 and rename it for exercise 6 (cp xxx_04.pc
xxx_06.pc, where xxx = your initials)
1. Write a download program to copy data from this table to an output
file with the following structure:


Record  Variable name                Field type  Default     Description
------  ---------------------------  ----------  ----------  -----------------------------------------
FHEAD   File record descriptor       Char(5)     FHEAD       Identifies file record type
        Line number                  Number(10)  0000000001  Identifies file line number
        Program descriptor           Char(4)     WIST        Identifies the program
        Create date                  Char(14)                File create date, YYYYMMDDHH24MISS format
FDETL   File record descriptor       Char(5)     FDETL       Detail line
        Line number                  Number(10)              Sequential line number
        sku                          Char(8)                 SKU number
        store                        Char(4)                 Store number
        unit_cost                    Char(20)                Unit cost of SKU (4 implied decimals)
        sales_type                   Char(1)                 'R' regular, 'P' promotion, 'C' clearance
        sales_units                  Char(20)                Number of units (4 implied decimals)
        sales_value                  Char(20)                Value (4 implied decimals)
FTAIL   File record identification   Char(5)     FTAIL       File trailer
        Line number                  Number(10)              Sequential line number (here = total
                                                             number of lines in file)
        Number of transaction lines  Number(6)               Total number of transaction lines in
                                                             file (not including FHEAD and FTAIL)


Exercise 7
Make a copy of exercise 4 and rename it for exercise 7 (cp xxx_04.pc
xxx_07.pc, where xxx = your initials)
1. Write an upload program that reads records from the output file produced
by the program in Exercise 6, validates the data, and writes them to
the trn_win_store_hist table.


Module 11: Course Review and Evaluation

This module includes the following lessons:


Lesson 1: Review Topics, Q & A
Lesson 2: Resources
Lesson 3: Course Evaluation


Lesson 1: Review Topics, Q & A


UNIX
C
Pro*C
Error Handling
Array Processing
Restart/Recovery
File Interfacing
Coding best practices


Lesson 2: Resources
DDL
PVCS
Man
Books: K&R, PL/SQL, UNIX Programming Environment
Oracle Documentation CD, technet
$l grep
impact grep
Dinkum C library reference
Operations Guide
Getting Started Guide
Standards Guides


Lesson 3: Course Evaluation


Please fill in the evaluation form provided.

What's Next
Your first assignment.


Appendix A: The C Programming Language


Appendix Overview
Retek's batch programs are written in C and Pro*C, an Oracle proprietary
extension to C that provides easy access to an Oracle database. This
appendix reviews the major aspects of C, a popular procedural language,
with particular emphasis on how it is used at Retek.
This appendix is not intended to teach C to those with no experience in a
procedural programming language. If you are not familiar with concepts
such as variable declaration, assignment, Boolean logic, functions,
program flow, and operator precedence, please read Retek's Amazing C
Primer by Andy Beger, or the second edition of Brian Kernighan and
Dennis Ritchie's The C Programming Language (Prentice Hall, 1988).

Objectives
After completing this appendix, you will be able to:
• Name the basic data types in C.
• Use sequence, iteration, selection and functions to control program
flow.
• Recognize the various ways strings are declared, allocated and
manipulated.
• Understand how a C program is invoked by the operating system.
• Declare, define, and call functions.
• Write a C program using Retek standards for naming variables.
• Know where to find answers to C language questions.


Lesson 1: Data Types, Control Structures, Strings
Data Types and Variables
A variable is a memory location reserved for storing one instance of a
particular data type. A char variable can hold one character, for example.
The basic structure of a C variable declaration is
<data type> <identifier>

as in
int my_integer;

A variable can be named almost anything, as long as the name follows two
simple rules:
1. It can only contain letters, numbers, and the underscore character (_).
2. The first character must be a letter.
An optional assignment may follow a variable declaration, as in
int my_integer = 12;

There are four basic data types available in C: int, char, float, and double.
A variable declared as an int can hold an integer. The min and max values
are platform dependent. An integer declaration can be modified with the
following key words: signed, unsigned, short and long. Short and long
integers are commonly used in Retek batch programming.
A char variable holds one character. The exact internal representation is
platform dependent. Integers and characters can be freely interpreted as
the other data type (implicitly cast). This practice is not often used in
Retek code, but the concept may be useful in understanding compiler
messages.
Variables of type float and double hold real numbers. Doubles allow for a
larger range of values. By convention real numbers are always declared as
type double in Retek batch programs.
Notice there are NO types for Boolean or string variables. Boolean
expressions are evaluated using integers. Any non-zero integer is
considered true. Zero is false. String variables are discussed below.


Arrays and Pointers


An array of any of the above data types can be declared using square
brackets, as in
int x[10];

which sets aside enough contiguous memory locations to store 10 integers.
The locations can be referenced via a zero-based index. So, assigning the
first integer in the array the value 12064 may look like
x[0] = 12064;

Pointers and arrays are closely related in C. A pointer is a variable declared
to hold the address of another variable. A pointer's type is that of the
variable it points to. For instance, a pointer may be declared to hold the
address of an integer variable.
int *xp;

The * in the declaration identifies the type "pointer to int". Now xp
can point to any memory location that holds an int. Consider the
following statement.
xp = &x[0];

xp now holds the address of the first integer of the array x that you
declared earlier. The & operator is the address-of operator: it returns the
address of the variable x[0].
The value 12064 that you assigned to the first integer in your array can
now be accessed in multiple ways; *xp and x[0] are two examples. In this
situation the * operator de-references the pointer. It means "the value
pointed to by", as in "the integer value pointed to by xp".
A bit more complexity is added when you examine the array name without
the square brackets that identify a specific array member. Consider the
array name x declared above. In most expressions the C language treats it
as a pointer to the first element of the array. Thus, you can add *x to your
list of ways of referring to the value 12064. Also, your earlier assignment
statement
xp = &x[0];

can be rewritten as
xp = x;


A key difference between a pointer and an array name is that a pointer is
a variable; an array name is not. Thus, the statements
xp = x;
xp = xp + 1;   /* advances the address by one element; the value
                  pointed to is unchanged */

are valid. In contrast, the following are not valid.
x = xp;        /* cannot assign a new address to an array identifier */
x = x + 1;

A final bit of caution about arrays. Arrays have very little internal
structure in C compared with some other programming languages. You cannot
ask an array how long it is at runtime, as you can in Java, for instance. In
most contexts an array is simply treated as a pointer to the first element in
a sequence of reserved memory locations.
The key point is that arrays provide no bounds checking. Given your
declaration
int x[10];

there is nothing about the C language preventing a program from referring
to x[20], as in the assignment
x[20] = 24345;

This assignment writes to memory outside the array. The result is
undefined: it may silently overwrite another variable or crash the program.
This is why a programmer must be diligent in using pointers and arrays to
access only appropriate memory locations.


Strings
Representation
Recall that there is no string data type in C. Rather, strings are represented
as arrays of characters. Even though there is no such data type, the term
"string" is used frequently in this course and generally among C
programmers. Always keep in mind that this term is informal and, strictly
speaking, refers to a character array.
Based on the above discussion of arrays and pointers, you might recognize
a difficulty with this representation of strings. That is, if arrays have no
inherent length management and strings by definition are of variable
length, then managing the length of a string becomes the problem of the
programmer.
C's solution is relatively elegant but, to say the least, not very robust.
There is a character value referred to as the null (or nil) character. The
null character is used as the terminating character for strings. Thus, a
string is actually a character array with an implicit agreement between the
programmer and the runtime system that the final character will be the null
character.
As a result, when an operation is to be performed on a string, the first
character in the string is referenced with a character pointer. The
operation continues on subsequent characters until the null character is
encountered. If, for some reason, a null character is never encountered
(and hence the implicit agreement is violated), the operation runs past the
end of the array, and the results are undefined.

Memory Allocation
A further difficulty is in ensuring that the character array in which a string
is to be stored is large enough. Imagine a program that needs a string
variable to hold customer names input from a user interface. How long
should a character array be declared to hold a customer name?
This is a very real problem for C programmers. Fortunately, Retek batch
programs deal with data fields whose lengths are usually known: they are
defined by the corresponding columns on a database table.
Nevertheless, it is worth pointing out that there are essentially two
approaches to memory allocation. The first is static, as in
char last_name[120];


The second method is to declare a pointer, then at runtime allocate enough
memory for whatever is being stored. Consider the following contrived,
but hopefully illustrative, example.
/* variable declarations */
int   last_name_length;
char *last_name;

/* allocate storage, and get last name */
get_input_int(&last_name_length);
last_name = (char *)malloc(last_name_length + 1);   /* +1 for the null */
get_input_string(last_name);

malloc() is one of a group of functions that allocate memory at runtime.
Don't spend too much effort on understanding all the details of this
example. Just realize that the declaration of the variable last_name sets
aside no memory in which to store a last name. It merely reserves enough
memory to hold an address. Thus, before you can store a last
name, you must allocate memory.
Dynamic memory allocation is used in Retek batch programming, albeit in
slightly different circumstances. Consider this discussion foreshadowing.

Structs
An array is a collection of objects of the same type that are stored in
contiguous memory locations. Structs are similar but allow for objects of
various types to be stored together.
Here's what a structure declaration looks like:
struct <structure name>
{
<field type> <field name>;
<field type> <field name>;
...
} <variable name>;

If you wanted to create a struct for a personnel program, it might look like
this:
struct personnel_record
{
   char name[30];
   int  age;
   char sex;
} Andy_record;

This declaration does two things. One, it defines a structure to hold
personnel records. Two, a variable called Andy_record is declared and
enough memory is allocated to hold the name, age, and sex fields.
This declaration is analogous to int x in that it names a type then a
variable name. In both declarations memory is reserved for the variable.


The difference is that when declaring a struct the programmer must
specify the internal structure of the type. When declaring an int, in
contrast, the internal structure of the type is already known.
The struct variable's fields are accessed by putting a period (.) between
the variable name and field name:
strcpy(Andy_record.name, "Andy Beger");
Andy_record.age = 22;
Andy_record.sex = 'M';
printf("%s is %d years old\n", Andy_record.name,
       Andy_record.age);

The structure name, in this case personnel_record, can be used as a
pseudo-data type to define more variables. For example, if you need to
create another personnel record, you don't have to type out the whole
declaration again; all you need to do is:
struct personnel_record Ann_record;

Control Structures
All logic possible using a procedural language can be accomplished
through three basic control structures: sequence, iteration, and selection.

Sequence
Sequence is the most basic control structure. It describes how control
flows from one statement to the next in the order they are entered in the
source code file. Consider the following two statements.
x = x + 1;   /* line 1 */
y = x * 2;   /* line 2 */

Note first of all that statements in C are terminated with a semicolon. As
you probably expect, line 1 is executed before line 2. Furthermore, line 2
cannot begin execution until the value of x resulting from line 1 is
resolved. This sequential ordering of the execution of statements is the
most basic control structure.

Iteration
Iteration is a generic name for the class of control structures that allow for
repeated execution of a set of statements. In C, there are two basic kinds
of iteration: the while loop and the for loop.
The while loop is relatively straightforward:
while ( <expression> )
{
   .
   .
   .
}

Basically, what this says is, "Do everything in the following block of code
over and over as long as <expression> is true. Then go on." For
example, consider the following code:
int i = 10;        /* Line 1 */
while (i <= 12)    /* Line 2 */
{
   i = i + 1;      /* Line 3 */
}
j = m + 58;        /* Line 4 */

i starts out as 10. Then, Line 2 checks: is i (10) less than or equal to 12?
Since it is, you go on to Line 3 and add one to i. Now, back to Line 2 to
check: is i (11) less than or equal to 12? Yes. So, add one to i. Since i
(12) is still less than or equal to 12, you add one more to i. Now, when
Line 2 checks, i (13) is not less than or equal to 12, so you do not do Line
3, and you go on to Line 4 and you continue from there.
The for loop is basically an advanced while loop:
for (<statement1>; <expression>; <statement2>)
{
.
.
.
}

What this says is: "First, do <statement1>. Now, do the code in the
braces over and over while <expression> is true. Each time, after you've
done the code in the braces, but before you've checked <expression>, do
<statement2>." Confused? Well, maybe this will help: a for statement
can basically be looked at like this:
<statement1>;
while (<expression>)
{
.
.
.
<statement2>;
}


The most common for loop is a counting loop:
for (i = 0; i < 10; i++)   /* Line 1 */
{
   printf("%d,", i);       /* Line 2 */
}
printf("\n");              /* Line 3 */

Here's what happens:
When you hit Line 1 the first time, i is set to 0. Is i (0) less than 10?
Sure! So, you run Line 2 and print out i (0). Now, you increment i by 1.
Back to Line 1. Is i (1) less than 10? Yes, it is, so back to Line 2 and print
out i (1). Jump ahead and look at when i is 9. Is i (9) less than 10?
Yes. You run Line 2 one more time, and then increment i. Now, is i
(10) less than 10? No. So, you go on to Line 3.
The output of this code would be:
0,1,2,3,4,5,6,7,8,9,

Selection
The if statement in C allows you to tell the program to only execute
certain commands if a certain condition is true. Remember, there is no
Boolean data type, so integers are used. Non-zero is true, and zero is
false. The Boolean comparison operator == returns 1 if true, and 0 if
false. The general structure of an if statement is
if ( <expression> )
{
<one or more statements>
}
else if ( <expression> )
{
<one or more statements>
}
else
{
<one or more statements>
}

For example, if you only want to increment integer x when integer j is
even, you would use the following if statement:
x = 5;                     /* Line 1 */
if (j % 2 == 0)            /* Line 2 */
{
   x++;                    /* Line 3 */
}
printf("x is %d.\n", x);   /* Line 4 */


If j is even, j % 2 will evaluate to 0, which (of course) equals 0, and Line
3 will execute, and when you hit Line 4, this line will be printed:
x is 6.

But, if j is odd, j % 2 will evaluate to 1 (or -1 if j is negative), which
does not equal 0, so Line 3 will not be executed, and this line will be
printed when you hit Line 4:
x is 5.

But what if you want to do one thing when j is even, and another thing
when it's odd? You could write two if statements, but that's pretty
redundant, especially since you have the else clause. The else clause says
"if the expression in the if statement is not true, do the following
commands instead." For example,
x = 5;                                  /* Line 1 */
y = 9;                                  /* Line 2 */
if (j % 2 == 0)                         /* Line 3 */
{
    x++;                                /* Line 4 */
}
else
{
    y = x * 3;                          /* Line 5 */
}
printf("x is %d.\ny is %d.\n", x, y);   /* Line 6 */

When j is even, Line 3 will evaluate to true, and so Line 4 will get
executed, and then you go on to Line 6 and print out
x is 6.
y is 9.

When j is odd, Line 3 will evaluate to a false expression, and so the
program executes the commands in the else clause, which is Line 5 in this
example. Then, when Line 6 is executed, you see
x is 5.
y is 15.

C provides another selection statement, the switch-case-default statement.
It essentially replaces a long list of if-else ifs. It is only occasionally used
in Retek batch programs.
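
As a rough illustration of how switch-case-default replaces a chain of if-else ifs, consider the sketch below. The function name sales_type_label and the one-character codes and labels are invented for this example and do not come from the Retek data model.

```c
/* Hypothetical helper: turns a one-character code into a readable label. */
const char *sales_type_label(char sales_type)
{
    const char *label;

    switch (sales_type)
    {
        case 'R':                   /* made-up code for the example */
            label = "regular";
            break;
        case 'P':                   /* made-up code for the example */
            label = "promotional";
            break;
        case 'C':                   /* made-up code for the example */
            label = "clearance";
            break;
        default:                    /* runs when no case matches */
            label = "unknown";
            break;
    }
    return(label);
}
```

Each break ends its case. Without it, execution falls through into the next case, which is rarely what you want.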


Goto
Notice that gotos are not a necessary control structure according to the list
presented above. Any logic that can be accomplished with a goto can be
done better with sequence, iteration, and selection. It is generally
accepted that using goto statements can easily result in unmanageable
code. Although goto statements are part of the C language, they are
NEVER used in Retek batch programming.


Lesson 2: Operators
The basic operators:
Mathematical Operators       Comparison Operators
+   addition                 ==  equal to
-   subtraction              !=  not equal to
*   multiplication           <   less than
/   division                 >   greater than
%   modulo                   <=  less than or equal to
++  unary increment          >=  greater than or equal to
--  unary decrement

Logical Operators
&&  logical AND
||  logical OR

Operator precedence and associativity (highest precedence first):
Operators                    Associativity
( ) [ ] -> .                 Left to right
! ++ -- * & sizeof (unary)   Right to left
* / %                        Left to right
+ -                          Left to right
< <= > >=                    Left to right
== !=                        Left to right
&&                           Left to right
||                           Left to right
= += -= *= /=                Right to left
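
The following sketch shows a few consequences of the table. All function names are invented for this illustration. When in doubt about precedence in real code, add parentheses.

```c
/* * binds tighter than +, so this is 2 + (3 * 4). */
int multiply_before_add(void)
{
    return(2 + 3 * 4);
}

/* Parentheses have the highest precedence, so the addition happens first. */
int grouped_add_first(void)
{
    return((2 + 3) * 4);
}

/* - associates left to right, so this is (10 - 4) - 3. */
int subtract_left_to_right(void)
{
    return(10 - 4 - 3);
}

/* = associates right to left, so b is assigned first, then a. */
int assign_right_to_left(void)
{
    int a;
    int b;

    a = b = 5;
    return(a + b);
}

/* % is evaluated before ==, and both comparisons before &&. */
int is_even_and_large(int x)
{
    return(x % 2 == 0 && x > 10);
}
```

In order, the first four functions return 14, 20, 3 and 10. The last returns 1 for 12 and 0 for 11 or 4.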


Lesson 3: Functions
Main()
The most basic C program looks like this
main()
{
<one or more commands>
}

Main is a function. Every C program has exactly one function called main
because this is the function invoked by the operating system when a
program begins executing. The general syntax of a function will be
discussed presently.

Function Basics
Although programs could be written as one block of code in main(), it
would be extremely wasteful and difficult to read. This approach would
be wasteful because most programs repeat certain tasks many times, which
in turn, would require repeating the same sequence of commands many
times. A function is a set of commands grouped together and named.
This allows the set of commands to be executed repeatedly by simply calling
the function. This should be a familiar concept. If not, refer to a text on
the C programming language.
The basic syntax of a C function is:
<return type> <function name> ( <set of parameter declarations> )
{
    <variable declarations>
    .
    .
    .
    <executable statements>
    .
    .
    .
    return(<return value>);
}


Key Features of C Functions


1. Braces define the scope of a function
2. Variables declared within a function may only be used or
referenced by statements in the same function. They are said to be
local to the function.
3. Local variables cease to exist after the function has completed
execution. If the same function is called again later during the
same program run, all the local variables are newly declared.
4. All variable declarations must precede the first executable
statement.
5. There may be zero or more parameters.
6. A parameter declaration has the form: <type> <name>.
7. Arguments are always passed by value, including pointers!
8. C does not permit nested functions. In other words, one function
may not be defined inside another.

Variable Scope
Variables may be declared outside the scope of any function, in which
case they are called global variables. Global variables can be seen and used
by any function in the program. Global variables keep their values for as
long as the program is running. There is occasionally good reason to use
global variables, but in general they are to be avoided.
One last thing: a local variable can have the same name as a global
variable; C isn't going to stop you. However, be warned: if you're in a
function that has a local variable that has the same name as a global
variable, C will assume that you're always referring to the local variable in
that function. Generally, it's a good idea not to override variable names
like that.
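
Here is a small sketch of what that warning means in practice. The variable gi_count and both function names are invented for this illustration.

```c
int gi_count = 100;             /* global: visible to every function below */

/* No local gi_count here, so this reads the global. */
int read_global(void)
{
    return(gi_count);
}

/* The local gi_count hides the global inside this function only. */
int use_local_copy(void)
{
    int gi_count = 5;           /* local with the same name as the global */

    gi_count++;                 /* changes the local, not the global */
    return(gi_count);
}
```

After use_local_copy() returns 6, read_global() still returns 100: the global was never touched.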


Parameter Passing
The subtle implications of feature 7 listed above, "Arguments are always
passed by value, including pointers," usually take experience to fully
appreciate, and are a key to truly being comfortable writing C. In general,
passing arguments by value means that, while a function can change its
own copy of an argument, the value of the variable in the calling function
remains unchanged. This is also called passing by copy.
int bar(int a);               /* prototype, so foo can call bar */

int foo()
{
    int x = 10;
    bar(x);
    printf("x = %d", x);      /* x = 10 */
    return(0);
}

int bar(int a)
{
    a = 15;                   /* changes only bar's copy */
    return(0);
}

Passing a pointer instead of a copy of a variable gives access in the called
function to the calling function's variable.
int bar(int *a);              /* prototype, so foo can call bar */

int foo()
{
    int x = 10;
    bar(&x);
    printf("x = %d", x);      /* x = 15 */
    return(0);
}

int bar(int *a)
{
    *a = 15;                  /* changes foo's x through the pointer */
    return(0);
}


An important bit of subtlety is that even pointers are passed by copy. This
means that if the address stored in a pointer is changed in a called
function, the calling function's pointer still holds the original address.
int bar(int *a);              /* prototype, so foo can call bar */

int foo()
{
    int x = 10;
    int *xp;
    xp = &x;
    bar(xp);
    printf("*xp = %d", *xp);  /* xp still points at x. x = 10 */
    return(0);
}

int bar(int *a)
{
    a = NULL;                 /* change the address stored in a */
    return(0);
}

This can be worked around when necessary by passing a pointer to a
pointer.
int bar(int **a);             /* prototype, so foo can call bar */

int foo()
{
    int x = 10;
    int *xp;
    xp = &x;
    bar(&xp);                 /* pass the address of the int pointer xp */
    printf("*xp = %d", *xp);  /* runtime failure! xp is not a valid address */
    return(0);
}

int bar(int **a)              /* a is a pointer to a pointer to an int */
{
    *a = NULL;                /* change the address stored in the pointer
                                 that a points to */
    return(0);
}

This final technique is rarely used in Retek batch programming. It is
presented here to round out this discussion of argument passing. If you
are a beginner to the C programming language, you should not expect all
of this to make perfect sense yet. Revisit these points from time to time to
judge your progress in using C. When you truly understand each of these
examples, you will be well on your way to being an accomplished C
programmer.
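
The pointer-to-pointer example above deliberately breaks the caller's pointer. For contrast, here is a sketch of the same mechanism used constructively: the called function hands back an address through the caller's pointer. The function find_first_negative is invented for this illustration and is not a Retek library call.

```c
#include <stddef.h>

/*
 * Points *found at the first negative element of values and returns 0.
 * If there is none, sets *found to NULL and returns -1.
 */
int find_first_negative(int *values, int count, int **found)
{
    int i;

    *found = NULL;
    for (i = 0; i < count; i++)
    {
        if (values[i] < 0)
        {
            *found = &values[i];    /* write into the caller's pointer */
            return(0);
        }
    }
    return(-1);
}
```

A caller declares an int pointer, passes its address as the third argument, and checks the return value before dereferencing the pointer.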


Function Prototypes
Functions may be declared before they are defined, usually near the top of
the source code file. This practice is encouraged because it provides the
compiler with a signature for each function. This signature is used at
compile time to verify each function call has the correct number and type
of arguments.
Without function prototypes, a program may compile, link and run with
incorrectly formatted function calls. If, for example, a non-prototyped
function is written to use two arguments, but only one is passed in, the
runtime system will use what may seem like a random value for the
second argument and the results will usually be very confusing.
This type of bug is usually quite time consuming to find. Function
prototyping is an easy way to ensure that you will never need to do so.
Here is an example with some function prototypes.
/* prototypes */
int foo(int x, char *y);
double bar(double d, double dd);

main()
{
    int a = 2;
    char *b = "three";     /* a char *, to match foo's second parameter */
    int c;
    double d = 5.23;
    double e = 6.29;
    double f;

    c = foo(a);            /* invalid call to foo which the compiler
                              will not catch without the foo prototype */
    c = foo(a, b);         /* valid call to foo */
    f = bar(d, e, e);      /* invalid call to bar which the compiler
                              will not catch without the bar prototype */
    f = bar(d, e);         /* valid call to bar */
}

int foo(int x, char *y)
{
    return (0);
}

double bar(double d, double dd)
{
    return (0);
}


Function Scope
A function can be called from any other function declared in the same
program. The order in which they are defined does not matter.
C programs may span more than one source code file, making it possible
to call functions defined in multiple files. In Retek batch programming,
you make some use of this through library functions. However, ignoring
for a moment these library functions which most programmers will never
be asked to change, your batch programs will be contained in one source
file. As a result, this course will not discuss in any depth the syntax or
semantics of creating and using functions in multiple files.

Main() Revisited
Now that you have a fuller understanding of function declarations and
parameter passing, return to main( ), the function that is called by the
operating system to begin processing a C program.
This is the complete signature for main.
int main ( int argc, char *argv[ ] )

Notice three features not mentioned before:
- Return type of int: an integer value can be returned to the calling
  environment. By convention a return value of 0 means success, 1
  implies non-fatal error(s), and -1 implies a fatal error occurred during
  processing.
- An integer parameter: the first parameter, by convention called argc,
  is a count of the arguments in the command that started the program.
  Argc is at least one because the name of the program is entered to
  invoke a program.
- An array of character pointers: the second parameter is effectively an
  array of strings. The strings are those of the command that started the
  program. The first string, argv[0], is the program name. Succeeding
  strings are arguments to the program.
Imagine a program called tria that calculates the area of a triangle, given
the height and width as floating point numbers. The command to run
might be:
tria 3.24 5.4444


In main, this would be represented as:
Argument    Value
argc        3
argv[0]     tria
argv[1]     3.24
argv[2]     5.4444

Notice the numeric arguments are stored in strings. The program would
need to convert them to double before calculating the triangle's area.
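
One possible way to do that conversion is with the standard library function atof, as sketched below. The helper names triangle_area and area_from_args are invented for this illustration. Note that atof does no validation: text that is not a number simply converts to 0.

```c
#include <stdlib.h>

/* Area of a triangle from its height and width. */
double triangle_area(double height, double width)
{
    return(0.5 * height * width);
}

/*
 * Expects the program name plus exactly two arguments.
 * Stores the area in *area and returns 0, or returns -1 on a bad count.
 */
int area_from_args(int argc, char *argv[], double *area)
{
    double height;
    double width;

    if (argc != 3)
    {
        return(-1);
    }
    height = atof(argv[1]);     /* string "3.24" becomes the double 3.24 */
    width  = atof(argv[2]);
    *area  = triangle_area(height, width);
    return(0);
}
```

A main function would pass its own argc and argv straight through to area_from_args and print a usage message when -1 comes back.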


Commonly Used Library Functions


There are many library functions available in C. Since Retek batch
programs process almost all data as strings, the following table shows
some that will be among the most useful in your context.
Function Signature                             Description
char *strcat(char *s1, char *s2);              Concatenates s2 on to s1.
char *strchr(char *s, int c);                  Searches s for the first occurrence
                                               of c; returns the address of c,
                                               else NULL.
int strcmp(char *s1, char *s2);                Compares s1 and s2. Returns 0 if
                                               they are the same, else non-zero.
char *strcpy(char *s1, char *s2);              Copies s2 into s1.
char *strncpy(char *s1, char *s2, int n);      Copies s2 into s1, but only up to
                                               at most n characters. If s2 has n
                                               or more characters, s1 is not
                                               null-terminated.
int printf(char *format, ...);                 Prints formatted text string to
                                               stdout.
int fprintf(FILE *stream, char *format, ...);  Prints formatted text string to
                                               stream.
int sprintf(char *s, char *format, ...);       Stores formatted text string in
                                               string s.
int scanf(char *format, ...);                  Reads text from stdin, using the
                                               format string as a guide to the
                                               format of the input. Each of the
                                               variables used for storage must
                                               be passed as a pointer.
int atoi(char *s);                             Returns string s converted into
                                               an integer.
                                               Ex. atoi("12") returns the
                                               integer 12
                                               Ex. atoi("2.3") returns the
                                               integer 2
                                               Ex. atoi("hi") returns 0
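
The short sketch below exercises several of these functions together. The helpers build_key and store_from_key, and the "sku:store" key layout, are invented for this illustration and are not part of the Retek libraries.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Builds "sku:store" in key. The caller must supply a buffer big enough
 * for both strings, the colon and the null terminator.
 */
void build_key(char *key, const char *sku, const char *store)
{
    strcpy(key, sku);           /* start with the sku */
    strcat(key, ":");           /* append the separator */
    strcat(key, store);         /* append the store */
}

/* Returns the store part of a key as an int, or -1 if there is no colon. */
int store_from_key(const char *key)
{
    const char *colon = strchr(key, ':');

    if (colon == NULL)
    {
        return(-1);
    }
    return(atoi(colon + 1));    /* convert the text after the colon */
}
```

With a sufficiently large buffer, build_key(key, "1001", "42") leaves "1001:42" in key, and store_from_key(key) then returns 42.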


Formatted String Arguments


Several of these functions use formatted text strings. Format strings
provide a way to handle one or more variables of different types as a
string. They work by defining a string in which is embedded references to
different types. Each of these embedded references is replaced by a
variable of the specified type.
For example, suppose you want to print to the screen a message like,
"Employee Andy Beger is 22 years old and is 1.82 meters tall." for all
employees in your department. You could create a different string for
each employee, but using a format string will save a lot of work.
Assume variables name, age and height are declared as types char *, int,
and double. Then you want a string something like
Employee <name> is <age> years old and is <height> meters tall.

The syntax to print this message to stdout is:
printf("Employee %s is %d years old and is %lf meters tall.\n",
       name, age, height);

Notice first that the character % is used to specify that a variable's value
should be inserted into the string in the current position. The format string
is followed by a comma-separated list of variables holding the values to be
inserted.
As you might guess the letter(s) following the % specify the type of
variable. As shown in the example, s specifies a string, d specifies an
integer and lf specifies a double. Refer to a text for an exhaustive list of
type specifiers.
More complex examples of format strings will be part of the file
interfacing appendix later in this course.
The \n at the end of the format string is the new line character.
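
As a sketch, the same format string can be used with sprintf to build the message in a buffer instead of printing it. The function name format_employee is invented here. The .2 precision is an addition for this illustration so that the height prints with two decimals; for the printf family, %f also accepts a double.

```c
#include <stdio.h>

/*
 * Writes the employee message into buffer and returns the number of
 * characters stored (not counting the null terminator). The caller must
 * supply a buffer large enough for the finished message.
 */
int format_employee(char *buffer, const char *name, int age, double height)
{
    return(sprintf(buffer,
                   "Employee %s is %d years old and is %.2f meters tall.\n",
                   name, age, height));
}
```

Calling format_employee(buffer, "Andy Beger", 22, 1.82) stores the sentence quoted earlier, followed by a new line.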


Lesson 4: Preprocessor Macros


The first steps in compiling C source code are done by a preprocessor.
Two preprocessor commands used in Retek batch code are #define and
#include.
- #include is used to add text to the source code before continuing
  with compilation. A file is named after the #include keyword. The
  preprocessor finds the file and inserts its full text in place of the
  #include command. In this way header files may be included without
  making the source code file unreasonably large. Header files typically
  declare variables, functions and #defined macros that are intended for
  use by multiple programs.
The file named by an #include statement may be specified in two ways.
- #include "my_header.h"
- #include <my_header.h>
The difference is in how the preprocessor searches for the named file, and it is
implementation dependent.
All Retek batch programs begin like this.
#include "retek.h"

- #define replaces all instances of some specified text. The following
  example replaces all occurrences of the text FALSE with the number
  zero.
#define FALSE 0

This example might be used for Boolean comparisons.
if ( valid_store(store_id) == FALSE )
{

The preprocessor would replace this and all other instances of the string
FALSE with the constant 0.
Most often, Retek #defines are used to implement constant values as
described in the next lesson.


Lesson 5: Libraries and Header Files


The header file retek.h mentioned above #includes the following Retek
and C library headers.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <memory.h>
#include <termio.h>
#include <math.h>
#include <floatingpoint.h>
#include <restart.h>
#include <std_err.h>
#include <std_len.h>

/* macros */
#define MATCH(a,b) !strcmp(a,b)

Of the eleven header files listed, the first eight are standard C headers. To
view their contents visit the directory /usr/include. For more information
and examples, there are many web sites and texts with descriptions of the
standard C libraries.
The final three header files are Retek specific. They can be viewed in
your $l directory.
- Restart.h provides the functionality for restart and recovery which will
  be discussed later in this course.
- Std_err.h provides several variables and macros used to simplify error
  handling in Pro*C. Here is the text of std_err.h:
char PROGRAM[30];
char err_data[355];
char table[255];
/* macros */
#define NO_DATA_FOUND         (SQLCODE == 1403)
#define SQL_ERROR_FOUND       (SQLCODE != 0 && SQLCODE != 1403)
#define NUM_RECORDS_PROCESSED sqlca.sqlerrd[2]
#define DUP_VAL_FOUND         (SQLCODE == -1)
/* define error codes to be passed into WRITE_ERROR function */
#define RET_FUNCTION_ERR 103
#define RET_FILE_ERR     104
#define RET_PROC_ERR     105
#define RET_EDI_ERR      108

- Std_len.h defines many integer constants that correspond to data
  lengths of columns in the Retek database. These are used throughout
  Retek batch programs for declaring string lengths.


For example, a store id has 4 characters. Thus, the following constant
definition in std_len.h:
#define LEN_STORE 4

Recall that strings in C need a null terminator. Thus:
#define NULL_STORE 5

defines a constant to be used in the declaration of a character array to hold
a store id:
char store_id[NULL_STORE];

String variables should never be defined with a hard coded integer value.
Rather, always use a #defined constant.
char store_id1[5];           /* works, but very bad practice */
char store_id2[NULL_STORE];  /* much better */
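
The two constants also work together when copying data into such an array. The sketch below repeats the two #defines quoted above so that it stands alone; the function copy_store_id is invented for this illustration.

```c
#include <string.h>

#define LEN_STORE  4            /* repeated here from the text above */
#define NULL_STORE 5            /* LEN_STORE plus the null terminator */

/*
 * Copies at most LEN_STORE characters of source into store_id and
 * always null-terminates the result. store_id must have room for
 * NULL_STORE characters.
 */
void copy_store_id(char *store_id, const char *source)
{
    strncpy(store_id, source, LEN_STORE);
    store_id[LEN_STORE] = '\0'; /* strncpy alone may not terminate */
}
```

Because both the array size and the copy length come from the constants, a change to the column length only has to be made in one place.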


Exercises
1. Write a C program to print the sum of two integer command line
arguments to stdout. Here is an example compile and run.
> cc -o add add.c
> add
usage: add <int1> <int2>
> add 12 38
12 + 38 = 50
> add 2.5 34
2 + 34 = 36
> add 3.33 4.2232
3 + 4 = 7
> add one two
0 + 0 = 0

2. Write a program to count from 1 to 10 printing each integer to the
screen. Print one line at a time in the following format:
> cc -o oddeven oddeven.c
> oddeven
ODD     EVEN
1
        2
3
        4
5
        6
7
        8
9
        10

Hint: the \t character can be used in your format string.


3. Write a program to calculate the area of a triangle given the height
and width. Use scanf to accept input from the user. Include a
function to calculate the triangle's area. Call this new function
from main( ).
> cc -o tria tria.c
> tria
Enter the triangle's height: 1.45
Enter the triangle's length: .76
The triangle's area is 0.551000
Hint: use %lf in format strings for reading or writing a double.
4. A professor generates letter grades using the following table:
% correct   grade
0-60        F
61-70       D
71-80       C
81-90       B
91-100      A

She wants a program to enter grades. The program optionally accepts as
an input argument the course name. While running, the program prompts
for student_name, integer_grade pairs, then prints the student name and
letter grade to a file called grades<_course_name>.out.
Before beginning, consider how to use stepwise refinement to make
solving the problem easier (accept input correctly for one student, convert
input to letter grade, print output to screen, put in a loop to accept multiple
students' input (ctrl-d or some other character to end), open file, print
input to file).


Appendix B: Programming Best Practices


Appendix Overview
Creating quality in the first place saves money, time, and frustration in the
long run. As one batch trainer so succinctly phrased it, "Coding is never a
race." It is also important to keep in mind that other people will be
reading (and maintaining) your code. Creating quality code makes their
(and your) job easier.

Objectives
After completing this appendix, you will be able to:
- Understand how implementation (a.k.a. coding) fits into the context of
  a software development organization.
- Approach a difficult programming task in a thoughtful and structured
  manner.
- Know where to begin looking for background information relative to
  your programming tasks.
- Understand the importance of modularity and have a vocabulary for
  discussing modularity issues.


Lesson 1: Coding in Context


The Software Lifecycle
There are many different models presented by many different people in
the software industry that present ways in which to approach the
development of a product. Many of these are based on the type of
technology being used; an object-oriented development environment, for
example, is very different from more traditional types (Retek
being grouped with the latter). One of the more traditional models, and
one that Retek in general adheres to, is the waterfall model. This model
presents the software lifecycle as a series of steps, each step leading to the
next and only the next step. It ensures that development is done in a
consistent and logical manner that allows for all parts of the process to be
well thought-out and executed.
It is the construction part of this process that you will mainly be
concerning yourself with, but it is equally important that you realize the
parts of the process that come before and after you have played your part.
Before a developer is handed a detail design, requirements must be pinned
down, specifications made and approved by both the client and/or business
analysts, and high-level designs written, ones that encompass the total
scope of the business issues being addressed. After a developer has gotten
the code in question through unit testing and code review, the code has to
survive a system test, integration with existing software and maintenance.
However cliched the idea is, it cannot be denied that a developer who is
aware of the processes that precede and follow them will produce better
quality code because of an awareness of the big picture.


Lesson 2: Approaching a Programming Task
The Design
Read the design critically: expect bugs and you will find them. Challenge
the designer to explain questionable parts. Remember that there are times
when issues that appear obvious to you might be overlooked by a designer
and it is your job to make the designer aware of these issues and to change
the design, if necessary. Just as you as a developer should never make the
assumption (however fabulous you are) that your code is perfect, the same
goes for designers. And although it might seem obvious, never make a
design change decision independently of the designer. Even if it seems
like the simplest change in the world, sometimes there are reasons for the
way a certain task was done with ramifications outside of whatever you
are working on.

Functional Knowledge
Whenever a new project begins, many developers are eager to begin
coding and only do a brief and perfunctory read through of the functional
design before concentrating on the individual piece or pieces of
functionality that has been assigned to them. THIS IS A MISTAKE.
Take the time to read through all the functional and technical designs.
Ask questions of the lead designers about how your piece fits in with the
rest of the project. Many problems and issues that were not apparent at a
high level, can be found and fixed early if the developers have a real
functional knowledge of what is going on. Designers can often miss
issues that can become critical at later points. . . if you as a developer have
a sturdy understanding of the functional whole AND you are actually
getting down and dirty with the code, you will find things that others miss.
Guaranteed.
Resources for a greater functional understanding can be:
- the data model (DDL)
- the operations guide (CD)
- the designer
- other developers who have worked on the module before you


Unit Testing
While many developers groan at the prospect of organized unit testing, the
quality of code that results from researched, thoughtful unit test cases is so
superior to that code which goes through a partial, disorganized and
undocumented testing phase it boggles the mind. Test cases should be
thought through before a single line of code is typed out. . . and while this
may seem illogical, it ensures that the test cases are written to test what is
designed and not what is coded. Of course, when coding, additional
scenarios can always be added to test certain aspects of the code that
present themselves later on in the process.
One of the most useful, frustration-easing tools to help in the testing
process is the script. If your data can be re-setup simply by running a
script, hours of online work can be avoided. A script can, for example,
repeatedly re-insert records that you delete. However, remember that all
data should always initially be set up online to ensure that all business
rules and dependencies are followed and that all necessary tables are
populated.

Stepwise Refinement
Stepwise refinement is a method for coding that focuses initially on the
big picture, the module as a whole, and only after all parts of the code
have been mapped out is complexity and functionality added. For
example, if the task given is to code a new program from scratch, a
developer using stepwise refinement as a mantra would initially map out
the totality of the program, perhaps drawing some flow or bubble charts
(informally) and picking the names and purposes of all of the
subfunctions. Only after this program outline had been coded (and perhaps
even compiled) would any of the guts of the functions be added. And
again, only after all the business functionality is in place and working
would features like restart/recovery and multithreading be added. One of
the benefits of this method is that it allows the developer to see where
functionality is repeated and hence will help in further modularizing the
code.
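
A first pass under stepwise refinement might look something like the sketch below: every function exists and compiles, but none has any guts yet. The names init, process and final follow the solution programs in Appendix C. The driver run_batch and the TODO notes are invented for this illustration.

```c
#include <stdio.h>

/* Stub: will later fetch the vdate and do other setup. */
int init(void)
{
    /* TODO: setup logic goes here in a later pass */
    return(0);
}

/* Stub: will later hold the main business logic. */
int process(void)
{
    /* TODO: business logic goes here in a later pass */
    return(0);
}

/* Stub: will later do cleanup and reporting. */
int final(void)
{
    /* TODO: cleanup logic goes here in a later pass */
    return(0);
}

/* Runs the three phases in order and stops at the first failure. */
int run_batch(void)
{
    if (init() < 0)
    {
        fprintf(stderr, "Aborted in init...\n");
        return(-1);
    }
    if (process() < 0)
    {
        fprintf(stderr, "Aborted in process...\n");
        return(-1);
    }
    if (final() < 0)
    {
        fprintf(stderr, "Aborted in final...\n");
        return(-1);
    }
    return(0);
}
```

Once this outline compiles and runs, each stub can be filled in and tested one at a time without changing the overall flow.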

Lesson 3: Functions
- Atomic code units (do one thing, do it well)
- Modularity (cohesion, coupling)
- Balance fan-in, fan-out
- Parameters & arguments
- Global variables (common coupling)

Lesson 4: Quality
Internal Quality
The internal quality of any given piece of code is determined by several
factors:
- how it is put together
- the quality of the code (easy to read, not tricky, standardized)
- whether or not it is modularized
- how easy it is to debug
- how much white space there is
- how well the functions are named (i.e. Can you tell exactly what a
  given program does by a read-through of the function names? And if
  some functions are too hard to name, perhaps they are not modularized
  enough?)
- how well it is documented
- how the variables are named (ls_curr_code v. ls_order_curr_code)

External Quality
The external quality of code is determined by measuring very different
factors:
- how easy it is for an end user to implement without error
- how robust it is
- what type of response time any given piece takes


Appendix C: Solutions to Exercises


Exercise 1
/*********************************\
| 9.0 Batch Development Standards |
| xxx_01.pc                       |
| Revised mm/dd/yy - Your Name    |
| Get vdate and get store         |
| and sku from trn_win_store      |
\*********************************/
#include "retek_2.h"
EXEC SQL INCLUDE SQLCA.H;
long SQLCODE;
/* Vdate assumed for most batch programs */
char ps_vdate[NULL_DATE];
int main(int argc, char *argv[])
{
int init_results;
if (argc < 2)
{
fprintf(stderr, "Usage: %s userid/passwd\n", argv[0]);
exit(-1);
}
if (LOGON(argc, argv) < 0)
exit(-1);
if (init() < 0)
{
fprintf(stderr,"Aborted in init...\n");
exit(-1);
}
if (process() < 0)
{
fprintf(stderr,"Aborted in process...\n");
exit(-1);
}
if (final() < 0)
{
fprintf(stderr,"Aborted in final...\n");
exit(-1);
}
else
exit(0);
} /* end main */
int init()
{
char *function = "init";
EXEC SQL VAR ps_vdate IS STRING(NULL_DATE);
EXEC SQL DECLARE c_get_vdate CURSOR FOR
SELECT TO_CHAR( vdate, 'YYYYMMDD' )
FROM trn_period;
EXEC SQL OPEN c_get_vdate;
EXEC SQL FETCH c_get_vdate INTO :ps_vdate;
EXEC SQL CLOSE c_get_vdate;



return(0);
} /* end init */
int process()
{
char *function = "process";
char ls_sku[NULL_SKU]; /* Sku fetched into string */
char ls_store[NULL_STORE]; /* Store fetched into a string */
EXEC SQL DECLARE c_trn_win_store CURSOR FOR
SELECT sku,
store
FROM trn_win_store;
EXEC SQL OPEN c_trn_win_store;
while(1)
{
EXEC SQL FETCH c_trn_win_store INTO :ls_sku,
:ls_store;
if (NO_DATA_FOUND)
break;
printf("sku = %s, store = %s\n", ls_sku, ls_store);
}
printf("\nthe date is: %s\n", ps_vdate);
return(0);
} /* end process */
int final()
{
char *function = "final";
printf("Program completed successfully.\n");
return(0);
} /* end final */


Exercise 2
/***********************************\
| 9.0 Batch Development Standards   |
| Exercise 2 - xxx_02.pc            |
| Revised mm/dd/yy - Your Name      |
| Gross product calculation with    |
| Get vdate function and get store, |
| sku, unit cost and sales info     |
| from trn_win_store                |
| error handling added. With gross  |
| profit calculation                |
\***********************************/
#include "retek_2.h"
#define NULL_SATYPE

EXEC SQL INCLUDE SQLCA.H;


long SQLCODE;
/* Vdate assumed for most batch programs */
char ps_vdate[NULL_DATE];
main(int argc, char *argv[])
{
char ls_logmessage[255];
if (argc < 2)
{
fprintf(stderr, "Usage: %s userid/passwd\n", argv[0]);
exit(-1);
}
if (LOGON(argc, argv) < 0) exit(-1);
if (init() < 0)
{
strcpy(ls_logmessage,"Aborted in init");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
if (process() < 0)
{
strcpy(ls_logmessage,"Aborted in process");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
if (final() < 0)
{
strcpy(ls_logmessage,"Aborted in final");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
else
{
strcpy(ls_logmessage,"Terminated ok");
LOG_MESSAGE(ls_logmessage);
exit(0);
}
} /* end main */
int init()



{
char *function = "init";
/* Assumed lib call (later) for vdate */
if (get_vdate(ps_vdate) < 0)
return(-1);
return(0);
} /* end init */
int process()
{
char *function = "process";
/* Declare Variables */
char   ls_sku[NULL_SKU];
char   ls_store[NULL_STORE];
double ld_unit_cost;
char   ls_sales_type[NULL_SATYPE];
double ld_sales_units;
double ld_sales_value;
double ld_gp;

/* Declare Cursor */
EXEC SQL DECLARE c_get_grossproduct
CURSOR FOR
select sku,
store,
NVL(unit_cost,0),
NVL(sales_type, 'R'),
NVL(sales_units,0),
NVL(sales_value,0)
from trn_win_store;
EXEC SQL OPEN c_get_grossproduct;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Opening c_get_grossproduct");
strcpy(table, "trn_win_store");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
while(1)
{
/* fetch cursor */
EXEC SQL FETCH c_get_grossproduct
INTO :ls_sku,
:ls_store,
:ld_unit_cost,
:ls_sales_type,
:ld_sales_units,
:ld_sales_value;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Fetch c_get_grossproduct");
strcpy(table, "trn_win_store");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
/* if no data found then break out of while loop */


if (NO_DATA_FOUND)
break;
ld_gp = ld_sales_value - (ld_sales_units * ld_unit_cost);
EXEC SQL INSERT INTO trn_win_store_hist
(sku,
store,
eow_date,
sales_type,
value,
gp)
VALUES
(:ls_sku,
:ls_store,
to_date(:ps_vdate, 'YYYYMMDD'),
:ls_sales_type,
:ld_sales_value,
:ld_gp);
if (SQL_ERROR_FOUND || NO_DATA_FOUND)
{
sprintf(err_data, "Inserting data into win_store_hist");
strcpy(table, "trn_win_store_hist");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
/* printf ("SKU:%s STORE:%d \n", os_sku, ol_store); */

} /* end while */
printf("\nthe date is: %s\n", ps_vdate);
return(0);
} /* end process */
int final()
{
char *function = "final";
printf("Program completed successfully.\n");
return(0);
} /* end final */
/* Assumed future library function */
int get_vdate(char* os_vdate)
{
EXEC SQL VAR os_vdate IS STRING(NULL_DATE);
char* function = "get_vdate";
EXEC SQL DECLARE c_get_vdate CURSOR FOR
SELECT TO_CHAR( vdate, 'YYYYMMDD' )
FROM trn_period;
EXEC SQL OPEN c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Opening c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL FETCH c_get_vdate INTO :os_vdate;
if (SQL_ERROR_FOUND || NO_DATA_FOUND)
{
sprintf(err_data,"Fetching from c_get_vdate");



WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL CLOSE c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Closing from c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
return(0);
} /* end of get_vdate() */

Appendix C: Solutions to Exercises

Exercise 3
/************************************\
| 9.0 Batch Development Standards    |
| Exercise 3 - xxx_03.pc             |
| Revised mm/dd/yy - Your Name       |
| Gross product calculation with     |
| array processing.                  |
\************************************/
#include "retek_2.h"
#define NULL_TYPE 2
#define ARRAYSIZE 1000
struct fetch_array
{
char s_sku[ARRAYSIZE][NULL_SKU];
char s_store[ARRAYSIZE][NULL_STORE];
double d_unit_cost[ARRAYSIZE];
char s_sales_type[ARRAYSIZE][NULL_TYPE];
double d_sales_units[ARRAYSIZE];
short i_units_ind[ARRAYSIZE];
double d_sales_value[ARRAYSIZE];
short i_value_ind[ARRAYSIZE];
double d_gp[ARRAYSIZE];
short i_gp_ind[ARRAYSIZE];
short i_type_ind[ARRAYSIZE];
char s_vdate[ARRAYSIZE][NULL_DATE];
} pa_fetch_array;
EXEC SQL INCLUDE SQLCA.H;
long SQLCODE;
/* Vdate assumed for most batch programs */
char ps_vdate[NULL_DATE];
main(int argc, char *argv[])
{
char ls_logmessage[255];
if (argc < 2)
{
fprintf(stderr, "Usage: %s userid/passwd\n", argv[0]);
exit(-1);
}
if (LOGON(argc, argv) < 0) exit(-1);
if (init() < 0)
{
strcpy(ls_logmessage,"Aborted in init");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
if (process() < 0)
{
strcpy(ls_logmessage,"Aborted in process");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
if (final() < 0)
{

strcpy(ls_logmessage,"Aborted in final");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
else
{
strcpy(ls_logmessage,"Terminated ok");
LOG_MESSAGE(ls_logmessage);
exit(0);
}
} /* end main */
int init()
{
char *function = "init";
int li_current_record;
EXEC SQL DECLARE c_get_vdate CURSOR FOR
SELECT TO_CHAR( vdate, 'YYYYMMDD' )
FROM trn_period;
EXEC SQL OPEN c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Opening c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL FETCH c_get_vdate INTO :ps_vdate;
if (SQL_ERROR_FOUND || NO_DATA_FOUND)
{
sprintf(err_data,"Fetching from c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL CLOSE c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Closing from c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
for(li_current_record=0;
li_current_record < ARRAYSIZE;
li_current_record++)
{
strcpy(pa_fetch_array.s_vdate[li_current_record],
ps_vdate);
}
return(0);
} /* end init */
int process()
{
char *function = "process";
int li_current_record;
int li_ndf=0;
int li_nrp=0;
int li_recs_returned=0;
/* driving cursor */
EXEC SQL DECLARE c_get_win_store CURSOR FOR
SELECT ws.sku,
ws.store,
NVL(ws.unit_cost,0),
ws.sales_type,

ws.sales_units,
ws.sales_value
FROM trn_win_store ws;
EXEC SQL OPEN c_get_win_store;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Opening c_get_win_store");
WRITE_ERROR(SQLCODE,function,"trn_win_store",err_data);
return(-1);
}
/*start of while loop to insert into trn_win_store_hist*/
while (1)
{
EXEC SQL FOR :ARRAYSIZE
FETCH c_get_win_store INTO :pa_fetch_array.s_sku,
:pa_fetch_array.s_store,
:pa_fetch_array.d_unit_cost,
:pa_fetch_array.s_sales_type:pa_fetch_array.i_type_ind,
:pa_fetch_array.d_sales_units:pa_fetch_array.i_units_ind,
:pa_fetch_array.d_sales_value:pa_fetch_array.i_value_ind;
if (SQL_ERROR_FOUND)
{
strcpy(err_data,"Fetch c_get_win_store");
WRITE_ERROR(SQLCODE,function,"trn_win_store",err_data);
return(-1);
}
if (NO_DATA_FOUND) li_ndf=1;
li_recs_returned = NUM_RECORDS_PROCESSED - li_nrp;
li_nrp=NUM_RECORDS_PROCESSED;
for(li_current_record=0;
li_current_record < ARRAYSIZE;
li_current_record++)
{
/* calculate GP--mark as NULL if no units were sold */
if (pa_fetch_array.i_units_ind[li_current_record] != -1
&& pa_fetch_array.i_value_ind[li_current_record] != -1)
{
pa_fetch_array.d_gp[li_current_record] =
pa_fetch_array.d_sales_value[li_current_record] -
(pa_fetch_array.d_sales_units[li_current_record] *
pa_fetch_array.d_unit_cost[li_current_record]);
/* clear any NULL indicator left over from an earlier batch */
pa_fetch_array.i_gp_ind[li_current_record] = 0;
}
else
{
pa_fetch_array.i_gp_ind[li_current_record] = -1;
}
}
EXEC SQL FOR :li_recs_returned
INSERT INTO trn_win_store_hist(sku,
store,
eow_date,
sales_type,
value,
gp)
values(:pa_fetch_array.s_sku,
:pa_fetch_array.s_store,
TO_DATE(:pa_fetch_array.s_vdate,'YYYYMMDD'),
:pa_fetch_array.s_sales_type:pa_fetch_array.i_type_ind,
:pa_fetch_array.d_sales_value:pa_fetch_array.i_value_ind,
:pa_fetch_array.d_gp:pa_fetch_array.i_gp_ind);
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Inserting");
WRITE_ERROR(SQLCODE,function,"trn_win_store_hist",err_data);
return(-1);
}

if (li_ndf) break;
}
/*end of while loop*/
printf("\nthe date is: %s\n", ps_vdate);
return(0);
}/* end of process */
int final()
{
char *function = "final";
printf("Program completed successfully.\n");
return(0);
} /* end final */
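
The array fetch in process() depends on one detail that is easy to miss: NUM_RECORDS_PROCESSED is treated as a running total across all fetches on the cursor, so the size of the latest batch is the difference between the current total and the total saved in li_nrp. The plain C sketch below simulates that bookkeeping without a database. The names rows_in_batch and simulate_fetch_loop are hypothetical and are not part of the Retek library, and a short batch stands in for the NO_DATA_FOUND condition.

```c
#include <assert.h>

/* Hypothetical helper: rows delivered by the latest array fetch, given the
   cumulative row count reported after it and the count saved before it. */
static int rows_in_batch(int cumulative_now, int *cumulative_prev)
{
    int batch = cumulative_now - *cumulative_prev;
    *cumulative_prev = cumulative_now;
    return batch;
}

/* Simulates fetching total_rows rows in batches of array_size and returns
   how many rows would be handed to the batched INSERT in total. */
static int simulate_fetch_loop(int total_rows, int array_size)
{
    int cumulative = 0;   /* plays the role of NUM_RECORDS_PROCESSED */
    int prev = 0;         /* plays the role of li_nrp */
    int inserted = 0;

    assert(array_size > 0);
    for (;;) {
        int remaining = total_rows - cumulative;
        int got = remaining < array_size ? remaining : array_size;
        int no_data_found = got < array_size;  /* short batch: end of data */

        cumulative += got;
        inserted += rows_in_batch(cumulative, &prev);
        if (no_data_found)
            break;   /* the last partial batch has already been counted */
    }
    return inserted;
}
```

Note that the end-of-data flag is checked only after the partial batch has been counted, which mirrors why the solution sets li_ndf and breaks at the bottom of the loop instead of breaking immediately.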

Exercise 3 Part 2
/************************************\
| 9.0 Batch Development Standards    |
| Exercise 3 - Part 2 - xxx_03.pc    |
| Revised mm/dd/yy - Your Name       |
| Gross product calculation with     |
| dynamic array processing.          |
\************************************/
#include "retek_2.h"
#define NULL_TYPE 2
int gi_size;
struct fetch_array
{
char (* s_sku)[NULL_SKU];
char (* s_store)[NULL_STORE];
double *d_unit_cost;
char (* s_sales_type)[NULL_TYPE];
double *d_sales_units;
short *i_units_ind;
double *d_sales_value;
short *i_value_ind;
double *d_gp;
short *i_gp_ind;
short *i_type_ind;
char (* s_vdate)[NULL_DATE];
} pa_fetch_array;
EXEC SQL VAR pa_fetch_array.s_sku IS STRING(NULL_SKU);
EXEC SQL VAR pa_fetch_array.s_store IS STRING(NULL_STORE);
EXEC SQL VAR pa_fetch_array.s_sales_type IS STRING(NULL_TYPE);
EXEC SQL INCLUDE SQLCA.H;
long SQLCODE;
/* Vdate assumed for most batch programs */
char ps_vdate[NULL_DATE];
/* Function prototypes */
int size_array();
main(int argc, char *argv[])
{
char ls_logmessage[255];
if (argc < 3)
{
fprintf(stderr,"Usage: %s userid/passwd ARRAYSIZE\n",
argv[0]);
exit(-1);
}
if (LOGON(argc, argv) < 0) exit(-1);
gi_size = atoi(argv[2]);
if (init() < 0)
{
strcpy(ls_logmessage,"Aborted in init");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}

if (process() < 0)
{
strcpy(ls_logmessage,"Aborted in process");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
if (final() < 0)
{
strcpy(ls_logmessage,"Aborted in final");
LOG_MESSAGE(ls_logmessage);
exit(-1);
}
else
{
strcpy(ls_logmessage,"Terminated ok");
LOG_MESSAGE(ls_logmessage);
exit(0);
}
} /* end main */
int init()
{
char *function = "init";
int li_current_record;
/* Assumed lib call (later) for vdate */
if (get_vdate(ps_vdate) < 0)
return(-1);
if (size_array() < 0)
return(-1);
for(li_current_record=0;li_current_record <
gi_size;li_current_record++)
{
strcpy(pa_fetch_array.s_vdate[li_current_record],ps_vdate);
}
return(0);
} /* end init */
int process()
{
char *function = "process";
int li_current_record;
int li_ndf=0;
int li_nrp=0;
int li_recs_returned=0;
/* driving cursor */
EXEC SQL DECLARE c_get_win_store CURSOR FOR
SELECT ws.sku,
ws.store,
NVL(ws.unit_cost,0),
ws.sales_type,
ws.sales_units,
ws.sales_value
FROM trn_win_store ws;
EXEC SQL OPEN c_get_win_store;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Opening c_get_win_store");
WRITE_ERROR(SQLCODE,function,"trn_win_store",err_data);
return(-1);
}
/*start of while loop to insert into trn_win_store_hist*/

while (1)
{
EXEC SQL FOR :gi_size
FETCH c_get_win_store INTO :pa_fetch_array.s_sku,
:pa_fetch_array.s_store,
:pa_fetch_array.d_unit_cost,
:pa_fetch_array.s_sales_type:pa_fetch_array.i_type_ind,
:pa_fetch_array.d_sales_units:pa_fetch_array.i_units_ind,
:pa_fetch_array.d_sales_value:pa_fetch_array.i_value_ind;
if (SQL_ERROR_FOUND)
{
strcpy(err_data,"Fetch c_get_win_store");
WRITE_ERROR(SQLCODE,function,"trn_win_store",err_data);
return(-1);
}
if (NO_DATA_FOUND) li_ndf=1;
li_recs_returned=NUM_RECORDS_PROCESSED-li_nrp;
li_nrp=NUM_RECORDS_PROCESSED;
for(li_current_record=0;
li_current_record < li_recs_returned;
li_current_record++)
{
/* calculate GP--mark as NULL if no units were sold */
if (pa_fetch_array.i_units_ind[li_current_record] != -1
&& pa_fetch_array.i_value_ind[li_current_record] != -1)
{
pa_fetch_array.d_gp[li_current_record] =
pa_fetch_array.d_sales_value[li_current_record] -
(pa_fetch_array.d_sales_units[li_current_record] *
pa_fetch_array.d_unit_cost[li_current_record]);
/* clear any NULL indicator left over from an earlier batch */
pa_fetch_array.i_gp_ind[li_current_record] = 0;
}
else
{
pa_fetch_array.i_gp_ind[li_current_record] = -1;
}
}
EXEC SQL FOR :li_recs_returned
INSERT INTO trn_win_store_hist(sku,
store,
eow_date,
sales_type,
value,
gp)
values(:pa_fetch_array.s_sku,
:pa_fetch_array.s_store,
TO_DATE(:pa_fetch_array.s_vdate,'YYYYMMDD'),
:pa_fetch_array.s_sales_type:pa_fetch_array.i_type_ind,
:pa_fetch_array.d_sales_value:pa_fetch_array.i_value_ind,
:pa_fetch_array.d_gp:pa_fetch_array.i_gp_ind);
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Inserting");
WRITE_ERROR(SQLCODE,function,"trn_win_store_hist",err_data);
return(-1);
}
if (li_ndf) break;
}
/*end of while loop*/
printf("\nthe date is: %s\n", ps_vdate);
return(0);
}/* end of process */
int final()
{
char *function = "final";

printf("Program completed successfully.\n");
return(0);
} /* end final */
/* Assumed future library function */
int get_vdate(char* os_vdate)
{
EXEC SQL VAR os_vdate IS STRING(NULL_DATE);
char* function = "get_vdate";
EXEC SQL DECLARE c_get_vdate CURSOR FOR
SELECT TO_CHAR( vdate, 'YYYYMMDD' )
FROM trn_period;
EXEC SQL OPEN c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Opening c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL FETCH c_get_vdate INTO :os_vdate;
if (SQL_ERROR_FOUND || NO_DATA_FOUND)
{
sprintf(err_data,"Fetching from c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
EXEC SQL CLOSE c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data,"Closing from c_get_vdate");
WRITE_ERROR(SQLCODE,function,"trn_period",err_data);
return(-1);
}
return(0);
} /* end of get_vdate() */

int size_array()
{
char *function = "size_array";
int li_no_mem = 0;
if ((pa_fetch_array.s_sku =
(char(*)[NULL_SKU])calloc(gi_size,NULL_SKU)) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.s_store =
(char(*)[NULL_STORE])calloc(gi_size,NULL_STORE)) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.d_unit_cost =
(double*)calloc(gi_size,sizeof(double))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.s_sales_type =
(char(*)[NULL_TYPE])calloc(gi_size,NULL_TYPE)) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.d_sales_units =
(double*)calloc(gi_size,sizeof(double))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.i_units_ind =
(short*)calloc(gi_size,sizeof(short))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.d_sales_value =
(double*)calloc(gi_size,sizeof(double))) ==
NULL) li_no_mem = 1;

if ((pa_fetch_array.i_value_ind =
(short*)calloc(gi_size,sizeof(short))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.d_gp =
(double*)calloc(gi_size,sizeof(double))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.i_gp_ind =
(short*)calloc(gi_size,sizeof(short))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.i_type_ind =
(short*)calloc(gi_size,sizeof(short))) ==
NULL) li_no_mem = 1;
if ((pa_fetch_array.s_vdate =
(char(*)[NULL_DATE])calloc(gi_size,NULL_DATE)) ==
NULL) li_no_mem = 1;
if (li_no_mem)
{
sprintf(err_data,"Unable to allocate memory");
WRITE_ERROR(RET_FUNCTION_ERR,function,"",err_data);
return(-1);
}
return(0);
} /* end size_array */
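
The string members of the dynamic fetch structure are pointers to fixed-width character arrays, so a single calloc() call yields a block that can be indexed exactly like the static char[ARRAYSIZE][N] arrays of Part 1. The plain C sketch below isolates that idea. DEMO_LEN, demo_row, alloc_string_column and fill_column are hypothetical names chosen for illustration; DEMO_LEN merely stands in for a width such as NULL_DATE.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_LEN 9   /* hypothetical width: 8 characters plus terminator */

/* One fixed-width string slot; demo_row * is the same type as
   char (*)[DEMO_LEN] used in the casts of size_array(). */
typedef char demo_row[DEMO_LEN];

/* Allocates n zero-filled slots in one contiguous block, or returns NULL. */
static demo_row *alloc_string_column(size_t n)
{
    return (demo_row *)calloc(n, sizeof(demo_row));
}

/* Copies the same value into every slot, as init() does with the vdate.
   Returns 0 on success, -1 if col is NULL or value does not fit a slot. */
static int fill_column(demo_row *col, size_t n, const char *value)
{
    size_t i;

    if (col == NULL || strlen(value) >= DEMO_LEN)
        return -1;
    for (i = 0; i < n; i++)
        strcpy(col[i], value);
    return 0;
}
```

Because the slots are contiguous and equally sized, col[i] is simply the block start plus i * DEMO_LEN bytes, which is the layout a precompiler host array is expected to have.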

Exercise 4
/*********************************\
| 9.0 Batch Development Standards |
| Exercise 4 - xxx_04.pc          |
| Revised mm/dd/yy - Your Name    |
| Gross product calculation with  |
| table-based Restart/Recovery    |
| , sku, unit cost and sales info |
| from trn_win_store              |
\*********************************/
#include <retek_2.h>
#define NUM_INIT_PARAMETERS 3 /* Number of elements init_parameter array */
#define NULL_CODE /* Program specific: for length of sales_type */
#define NULL_SATYPE 2

EXEC SQL INCLUDE SQLCA.H;


long SQLCODE;
/************************************\
| Array of retek_init() parameters: |
| See retek_2.h for correct format. |
\************************************/
init_parameter parameter[NUM_INIT_PARAMETERS] =
{
/* NAME ---------- TYPE ------ SUB_TYPE */
"commit_max_ctr", "long",   "",
"restart_store",  "string", "S",
"restart_sku",    "string", "S"
};
/* Variables used as start string (to limit query of driving cursor)
*/
long pl_commit_max_ctr;
char ps_restart_store[NULL_STORE];
char ps_restart_sku[NULL_SKU];

/* Vdate assumed for most batch programs */
char ps_vdate[NULL_DATE];
main(int argc, char* argv[])
{
char* function = "main";
int li_init_results;
if (argc < 2)
{
fprintf(stderr,"Usage: %s userid/passwd\n",argv[0]);
exit(-1);
}
if (LOGON(argc, argv) < 0)
exit(-1);
li_init_results = init();
if (li_init_results < 0)
{
LOG_MESSAGE("Aborted in init");
exit(-1);
}
else if (li_init_results == NO_THREAD_AVAILABLE)
{
LOG_MESSAGE("Terminated OK. No available threads for processing");
exit(0);
}
/**********************************************************************\
| If processing fails, set gi_error_flag to be used by retek_close()   |
| and for use of message selection in call to LOG_MESSAGE()            |
\**********************************************************************/
if (process() < 0)
gi_error_flag = 1;
if (final() < 0)
{
LOG_MESSAGE("Aborted in Final");
exit(-1);
}
else
{
if (!gi_error_flag)
{
LOG_MESSAGE("Terminated OK");
exit(0);
}
else
{
LOG_MESSAGE("Aborted while processing.");
exit(-1);
}
}
exit(0);
} /* end of main() */

int init( )
{
char* function = "init";
int li_init_return;
/********************************\
| retek_init:                    |
| Initialize restart/recovery.   |
\********************************/

li_init_return = retek_init(NUM_INIT_PARAMETERS,
parameter,
&pl_commit_max_ctr,
ps_restart_store,
ps_restart_sku);
if (li_init_return != 0)
return(li_init_return);
/* Assumed lib call (later) for vdate */
if (get_vdate(ps_vdate) < 0)
return(-1);
return(0);
} /* end of init() */

int process()
{
char* function = "process";
char ls_sku[NULL_SKU];
char ls_store[NULL_STORE];
double ld_unit_cost;
char ls_sales_type[NULL_SATYPE];
double ld_sales_units;
double ld_sales_value;

/* Driving Cursor */
EXEC SQL DECLARE c_trn_win_store CURSOR FOR
SELECT sku,
store,
NVL(unit_cost,'0'),
'R',
'0',
'0'
FROM trn_win_store
WHERE (store > NVL(:ps_restart_store, -999) OR
(store = :ps_restart_store AND
(sku > :ps_restart_sku)))
ORDER BY store, sku;
/* Open driving cursor */
EXEC SQL OPEN c_trn_win_store;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Opening c_trn_win_store");
strcpy(table, "trn_win_store");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
while (1)
{
/* Fetch a record */
EXEC SQL FETCH c_trn_win_store INTO :ls_sku,
:ls_store,
:ld_unit_cost,
:ls_sales_type,
:ld_sales_units,
:ld_sales_value;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Fetching from c_trn_win_store");
strcpy(table, "trn_win_store");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
else if (NO_DATA_FOUND)
break;

/*******************************************************\
| retek_commit:                                         |
| Check if commit point reached;                        |
| If yes, commit db changes, update restart_bookmark.   |
\*******************************************************/
if (retek_commit(2,
ls_store,
ls_sku) < 0)
return(-1);
/* Do processing 1: insert record to a hist table */
EXEC SQL INSERT INTO trn_win_store_hist
(sku,
store,
eow_date,
sales_type,
value,
gp)
VALUES (:ls_sku,
:ls_store,
TO_DATE(:ps_vdate, 'YYYYMMDD'),
:ls_sales_type,
:ld_sales_value,
(:ld_sales_value - (:ld_unit_cost * :ld_sales_units)));
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Inserting to trn_win_store_hist");
strcpy(table, "trn_win_store_hist");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
} /* end of while loop */
return(0);
} /* end of process() */

int final()
{
char* function = "final";
int li_final_return;
/*************************************\
| retek_close:                        |
| Clean up the restart tables;        |
| Perform final rollback or commit.   |
\*************************************/
li_final_return = retek_close();
return(li_final_return);
} /* end of final() */

/* Assumed future library function */


int get_vdate(char* os_vdate)
{
EXEC SQL VAR os_vdate IS STRING(NULL_DATE);
char* function = "get_vdate";
EXEC SQL DECLARE c_get_vdate CURSOR FOR
SELECT TO_CHAR( vdate, 'YYYYMMDD' )
FROM trn_period;
EXEC SQL OPEN c_get_vdate;
if (SQL_ERROR_FOUND)
{

sprintf(err_data, "Opening c_get_vdate");
strcpy(table, "trn_period");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
EXEC SQL FETCH c_get_vdate INTO :os_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Fetching c_get_vdate");
strcpy(table, "trn_period");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
EXEC SQL CLOSE c_get_vdate;
if (SQL_ERROR_FOUND)
{
sprintf(err_data, "Closing c_get_vdate");
strcpy(table, "trn_period");
WRITE_ERROR(SQLCODE,function,table,err_data);
return(-1);
}
return(0);
} /* end of get_vdate() */
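
The WHERE clause of the driving cursor in Exercise 4 is what makes a restarted run resume after the last committed bookmark: with rows ordered by store and then sku, a row still needs processing if its store is greater than the bookmarked store, or if the store matches and its sku is greater than the bookmarked sku. The plain C sketch below expresses the same comparison outside SQL. The struct bookmark type, the numeric key types, and the function names are assumptions made for illustration only; in the solution the bookmark values arrive as strings from retek_init().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookmark; has_bookmark == 0 models a fresh (non-restart) run,
   which the cursor handles with NVL(:ps_restart_store, -999). */
struct bookmark {
    int  has_bookmark;
    long store;
    long sku;
};

/* Returns 1 when (store, sku) sorts strictly after the bookmark under
   ORDER BY store, sku, i.e. when the row has not been committed yet. */
static int needs_processing(const struct bookmark *b, long store, long sku)
{
    if (!b->has_bookmark)
        return 1;
    if (store > b->store)
        return 1;
    return store == b->store && sku > b->sku;
}

/* Counts the rows of a (store, sku) list that a restarted run would fetch. */
static int count_remaining(const struct bookmark *b,
                           const long rows[][2], size_t n)
{
    size_t i;
    int count = 0;

    for (i = 0; i < n; i++)
        if (needs_processing(b, rows[i][0], rows[i][1]))
            count++;
    return count;
}
```

The comparison is strict on purpose: the bookmarked row itself was committed before the failure, so fetching it again would insert a duplicate history record.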
