
Computer Assisted Audit Tools and Techniques
Selection and Application of CAATTs

2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated,
or posted to a publicly accessible website, in whole or in part.
Today's Environment
Internal Auditors are advising organizations
on internal control attributes and ways to
gain assurance from information.
SOX compliance efforts have led companies
to delve more deeply into their financial
statement reporting elements and into the
data that feeds and supports the financial
data.

Today's Environment
Internal audit groups face growing
workloads and heightened accountability
They are discovering that Computer Assisted
Auditing Tools and Techniques (CAATTs)
offer much-needed help
Audit technology tools facilitate more granular
analysis of data and help to determine the
accuracy of the information

CAATTs: Review 100% of data
Comprehensive testing of the full population
contrasts with traditional audit sampling
methods (extracting small data sets and
extrapolating conclusions about the
population of transactions)
Sampling techniques require audit judgment
and confidence levels, whereas CAATTs deliver
more definitive results because the entire
population of data can be tested

CAATTs: Review 100% of data
Filtering large volumes of data is much
more practical and effective
Work with greater quantities of data
Work with data that is more complex
Ability to identify financial leakage, policy
noncompliance, and mistakes or errors in data
processing
For example: duplicate vendor payments, fraudulent
transactions, circumvention of invoice approval limits
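
The two examples above can be sketched as full-population tests. This is a minimal illustration, not the method of any particular audit tool: the record layout, the field names, and the 5% "just under the limit" margin are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical payment records; field names are assumptions for illustration.
payments = [
    {"id": 1, "vendor": "V100", "invoice": "INV-001", "amount": 1200.00},
    {"id": 2, "vendor": "V100", "invoice": "INV-001", "amount": 1200.00},
    {"id": 3, "vendor": "V200", "invoice": "INV-777", "amount": 4900.00},
    {"id": 4, "vendor": "V200", "invoice": "INV-778", "amount": 4950.00},
    {"id": 5, "vendor": "V300", "invoice": "INV-010", "amount": 300.00},
]

def find_duplicate_payments(records):
    """Group every payment by (vendor, invoice, amount); return groups with more than one record."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["vendor"], rec["invoice"], rec["amount"])
        groups[key].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

def find_near_limit_payments(records, approval_limit, margin=0.05):
    """Return ids of payments just under the approval limit (within `margin` of it)."""
    floor = approval_limit * (1 - margin)
    return [r["id"] for r in records if floor <= r["amount"] < approval_limit]

duplicates = find_duplicate_payments(payments)
near_limit = find_near_limit_payments(payments, approval_limit=5000.00)
print(duplicates)
print(near_limit)
```

Every record is examined, so the result is a complete list of exceptions rather than an estimate extrapolated from a sample. Each exception still needs auditor follow-up before any conclusion is drawn.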

Tool selection
The challenge
Make sure you are looking at the right tools to
deliver the benefits your company needs
It is the user's responsibility to become familiar
with the tools available in order to pick the right
one
Have a solid knowledge of your business, your
data, and the accounting practices in your
industry

Tool selection
The IIA conducted an audit software
analysis and reported several key
recommendations for internal auditors to
consider in the selection of CAATTs:
1. Determine the enterprise's audit mission,
objectives and priorities
2. Determine the types and scope of audits
3. Consider the enterprise's technology
environment
4. Be aware of the risks

1. Determine the enterprise's audit
mission, objectives and priorities

Auditors must consult with management
regarding what audit functions are of the
highest priority and where computer audit tools
may be applied to help meet those priorities.

2. Determine the types and
scope of audits

What is the stated objective of the audits?
What kinds of questions will auditors be asking
and what will be the boundaries?
Arriving at answers to these questions will be
critical in making an appropriate software
decision.

3. Consider the enterprise's
technology environment

Any audit tools selected will have to mesh with
the other software, hardware and network
systems already in place.
In some cases, the existing IT infrastructure
may incorporate tools that auditors can use in
concert with automated software tools for
improved effect.

4. Be aware of the risks

Applying software to any mission-critical
function carries some risks, and auditing
software is no different.
Automated software tools can prompt auditors
to jump to faulty conclusions or make
assumptions that run counter to enterprise
operations.

Tool Selection

Consider:
How many data sources you have
Volume of transactions

Characteristics to look for in CAATTs:
Ease of use
Ease of data extraction
Ability to access a wide variety of data files from different platforms
Ability to integrate data with different formats
Ability to define fields and select from standard formats
Menu-driven functionality for processing analysis commands
Simplified query building and adjustments
Logging features

Audit data analysis
techniques
Execute tests for virtually all industries and almost all
types of data:
Accounts Receivable
Payroll
Cash Disbursements
Purchasing
Sales
General Ledger
Work in Progress
Loss Prevention
Asset Management
Limiting factors:
Access to data
Understanding of the data fields
Creativity of the auditor

ACL (Generalized Audit
Software)
Data is locked down as read-only
No chance of inadvertently changing the data
Much higher risk when using spreadsheets
Commands are auditor-friendly
Fairly easy to grasp what the commands will do
once explained
Reasonably short learning curve

ACL
Automatically records all of the commands
that are run and the results of the
procedures in its log
LOG feature enables automation of workpapers
Export the log to a word processor or other file
type

ACL
Batch feature (Writing Scripts)
Develop audit procedures to run in ACL
Auditor puts together the various routines in a
batch (similar to a macro)
Next time the auditor can run one command
(push a button), and all of those procedures will
run on autopilot with ACL dumping the results
into the log
Become much more efficient over time by
running same tests periodically, adding new
procedures to the batch
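
The batch-and-log idea can be sketched in a few lines. This is plain Python, not ACL script syntax; the procedure names, the ledger layout, and the log format are hypothetical and only illustrate how a set of routines can be run with one call while each result is recorded.

```python
from datetime import datetime, timezone

def count_records(rows):
    return len(rows)

def total_amount(rows):
    return round(sum(r["amount"] for r in rows), 2)

def negative_amounts(rows):
    return [r["id"] for r in rows if r["amount"] < 0]

def run_batch(procedures, rows):
    """Run each procedure in order and append a log entry with its name and result."""
    log = []
    for proc in procedures:
        log.append({
            "run_at": datetime.now(timezone.utc).isoformat(),
            "procedure": proc.__name__,
            "result": proc(rows),
        })
    return log

# The data is held in a tuple, so the batch cannot add or remove records.
ledger = (
    {"id": 1, "amount": 250.00},
    {"id": 2, "amount": -40.00},
    {"id": 3, "amount": 99.50},
)

audit_log = run_batch([count_records, total_amount, negative_amounts], ledger)
for entry in audit_log:
    print(entry["procedure"], "->", entry["result"])
```

Adding a new test means appending one function to the list passed to `run_batch`, which mirrors how a batch grows over successive audits.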

Additional Keys to Success
Identify a Champion- person with ability to
motivate, supervise, and generally make
sure the technology is employed and
becomes successful
General Training- for the users of the
software (www.acl.com)
Identify power users- given more specific
training and become leaders of
implementing the chosen software; assist
other auditors; conduct in-house training.

Audit data analysis
techniques
CAATTs especially valuable in environments
that have:
High volumes of transactions
Complex processes
Distributed operations
Unrelated applications and systems

Advantage of CAATTs
Organizations gain assurance about the
accuracy of transactional data, and the
extent to which business transactions
adhere to controls and comply with policies
Through consistent use of automated transaction
analysis and continuous monitoring, CAATTs
enable real-time independent testing and
validation of critical enterprise data.

Advantage to Management
Management can use such information to
proactively identify exceptions to controls
and compliance policies and take
immediate action.
Implementing these programs can lead to
increased confidence in the corporate data
underlying financial reporting.

Computer-Assisted Audit Tools
and Techniques

TESTING THE
APPLICATION
CONTROLS
I-P-O

Hall, 3e
Introduction to Input Controls
Designed to ensure that the transactions that bring
data into the system are valid, accurate, and
complete
Data input procedures can be either:
Source document-triggered (batch)
Direct input (real-time)

Source document input requires human
involvement and is prone to clerical errors.

Direct input employs real-time editing techniques to
identify and correct errors immediately
Classes of Input Controls
1) Source document controls
2) Data coding controls
3) Batch controls
4) Validation controls
5) Input error correction
6) Generalized data input
systems
#1-Source Document Controls
Controls in systems using physical source
documents
Source document fraud
To control for exposure, control procedures
are needed over source documents to
account for each one

Use pre-numbered source documents
Use source documents in sequence
Periodically audit source documents

#2-Data Coding Controls
Checks on data integrity during processing
Transcription errors
Addition errors, extra digits
Truncation errors, digit removed
Substitution errors, digit replaced
Transposition errors
Single transposition: adjacent digits transposed (reversed)
Multiple transposition: non-adjacent digits are transposed
Control = Check digits
Added to code when created (suffix, prefix,
embedded)
Sum of digits (ones): transcription errors only
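
A short sketch shows why a sum-of-digits check digit catches transcription errors but misses transpositions. The weighted modulus-11 variant is not on the slide; it is included only as a commonly used alternative, and the sample code `5372` is an arbitrary assumption.

```python
def sum_of_digits_check(code: str) -> int:
    """Check digit = ones digit of the sum of the code's digits."""
    return sum(int(d) for d in code) % 10

def modulus_11_check(code: str) -> int:
    """Weighted check digit: weights count down to 2 from the left, then modulus 11.

    A result of 10 needs special handling in practice (for example, not issuing that code).
    """
    weights = range(len(code) + 1, 1, -1)
    total = sum(int(d) * w for d, w in zip(code, weights))
    return (11 - total % 11) % 11

original = "5372"
substituted = "5382"   # substitution error: 7 keyed as 8
transposed = "3572"    # single transposition: first two digits reversed

for label, code in [("original", original), ("substituted", substituted), ("transposed", transposed)]:
    print(label, sum_of_digits_check(code), modulus_11_check(code))
```

The transposed code has the same digit sum as the original, so the simple check digit matches and the error passes unnoticed. The position weights in the modulus-11 version change the result when digits swap places.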

#3-Batch Controls
Method for handling high volumes of
transaction data, especially in paper-fed information systems

Control of the batch continues through all phases of the
system and all processes (i.e., not JUST an
input control)

1) All records in the batch are processed together
2) No records are processed more than once
3) An audit trail is maintained from input to output

Requires grouping of similar input transactions

#3-Batch Controls
Requires controlling batch throughout
Batch transmittal sheet (batch control record)
Unique batch number (serial #)
A batch date
A transaction code
Number of records in the batch
Total dollar value of a financial field
Sum of a unique non-financial field (hash total),
e.g., customer number

Batch control log

Hash totals
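
The control figures on a batch transmittal sheet can be sketched as below. The record layout and field names are assumptions; the point is only that the record count, dollar total, and hash total are computed once when the batch is assembled and then recomputed and compared at later stages.

```python
from decimal import Decimal

batch = [
    {"customer_no": 1001, "amount": Decimal("150.00")},
    {"customer_no": 1042, "amount": Decimal("75.25")},
    {"customer_no": 1077, "amount": Decimal("310.10")},
]

def control_totals(records):
    """Compute the three control figures carried on the transmittal sheet."""
    return {
        "record_count": len(records),
        "dollar_total": sum((r["amount"] for r in records), Decimal("0")),
        "hash_total": sum(r["customer_no"] for r in records),
    }

def reconcile(transmittal, records):
    """Return the names of control totals that disagree with the transmittal sheet."""
    actual = control_totals(records)
    return [name for name, value in actual.items() if transmittal[name] != value]

transmittal_sheet = control_totals(batch)   # prepared when the batch is assembled
processed = batch[:2]                       # simulate a record lost in processing
discrepancies = reconcile(transmittal_sheet, processed)
print(transmittal_sheet)
print(discrepancies)
```

The hash total has no business meaning of its own. Summing customer numbers is useful only because the sum changes if a record is dropped, added, or swapped for another.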

#4-Validation Controls
Intended to detect errors in data
before processing

Most effective if performed close to
the source of the transaction

Some require referencing a master
file

#4-Validation Controls
Field Interrogation
Missing data checks
Numeric-alphabetic data checks
Zero-value checks
Limit checks
Range checks
Validity checks
Check digit
Record Interrogation
Reasonableness checks
Sign checks
Sequence checks
File Interrogation
Internal label checks (tape)
Version checks
Expiration date check
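
Several of the field-level and record-level checks listed above can be sketched as one validation routine. The timecard layout, the limits, and the department list standing in for a master file are hypothetical values chosen for illustration.

```python
VALID_DEPARTMENTS = {"FIN", "OPS", "HR"}   # stand-in for a master-file lookup

def validate_timecard(record):
    """Return a list of validation errors for one record (an empty list means it passed)."""
    errors = []
    # Missing data check
    for field in ("employee_id", "department", "hours", "rate"):
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    if errors:
        return errors
    # Numeric-alphabetic data check
    if not str(record["employee_id"]).isdigit():
        errors.append("employee_id must be numeric")
    # Limit check
    if record["hours"] > 60:
        errors.append("hours exceed limit of 60")
    # Range check
    if not 15.00 <= record["rate"] <= 95.00:
        errors.append("rate outside range 15.00-95.00")
    # Validity check
    if record["department"] not in VALID_DEPARTMENTS:
        errors.append("unknown department")
    # Sign check
    if record["hours"] < 0:
        errors.append("hours must not be negative")
    return errors

good = {"employee_id": "1234", "department": "FIN", "hours": 40, "rate": 20.0}
bad = {"employee_id": "12A4", "department": "XYZ", "hours": 72, "rate": 10.0}
print(validate_timecard(good))
print(validate_timecard(bad))
```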
#5-Input Error Correction
Batch correct and resubmit
Controls to make sure errors are dealt with
completely and accurately
1) Immediate Correction
2) Create an Error File
Reverse the effects of partially
processed transactions; resubmit corrected records
Reinsert corrected records in
processing stage where error was
detected
3) Reject the Entire Batch

#6-Generalized Data Input Systems
(GDIS)
Centralized procedures to manage data input
for all transaction processing systems
Eliminates need to create redundant routines
for each new application
Advantages:
Improves control by having one common
system perform all data validation
Ensures each AIS application applies a
consistent standard of data validation
Improves systems development efficiency

#6-GDIS

Major components:
1) Generalized Validation Module
2) Validated Data File
3) Error File
4) Error Reports
5) Transaction Log
Classes of Processing Controls

1) Run-to-Run Controls

2) Operator Intervention
Controls

3) Audit Trail Controls

#1-Run-to-Run (Batch)
Use batch figures to monitor
the batch as it moves from
one process to another
1) Recalculate Control Totals
2) Check Transaction Codes
3) Sequence Checks

#2-Operator Intervention
When an operator manually enters
controls into the system

Preference is for values to be derived by logic
or provided by the system

#3-Audit Trail Controls
Every transaction becomes traceable from
input to output
Each processing step is documented
Preservation is key to auditability of AIS
Transaction logs
Log of automatic transactions
Listing of automatic transactions
Unique transaction identifiers
Error listing

Output Controls
Ensure system output:
1) Not misplaced
2) Not misdirected
3) Not corrupted
4) Privacy policy not violated
Batch systems more susceptible to exposure,
require greater controls
Controlling Batch Systems Output
Many steps from printer to end user
Data control clerk check point
Unacceptable printing should be shredded
Cost/benefit basis for controls
Sensitivity of data drives levels of controls
Output Controls
Output spooling risks:
Access the output file and change
critical data values
Access the file and change the
number of copies to be printed
Make a copy of the output file so
illegal output can be generated
Destroy the output file before printing
takes place
Output Controls
Print Programs
Operator Intervention:
1) Pausing the print program to load output paper
2) Entering parameters needed by the print run
3) Restarting the print run at a prescribed checkpoint after a
printer malfunction
4) Removing printer output from the printer for review and
distribution
Print Program Controls Exposure
Production of unauthorized copies
Employ output document controls similar to source document
controls
Unauthorized browsing of sensitive data by employees
Special multi-part paper that blocks certain fields

Output Controls
Bursting
Supervision
Waste
Proper disposal of aborted copies
and carbon copies
Data control
Data control group verifies and logs
Report distribution
Supervision
Output Controls
End user controls
End user detection

Report retention:
Statutory requirements (govt)
Number of copies in existence
Existence of softcopies (backups)
Destroyed in a manner consistent
with the sensitivity of its contents

Output Controls
Controlling real-time systems output
Eliminates intermediaries

Threats:
Interception
Disruption
Destruction
Corruption

Exposures:
Equipment failure
Subversive acts

Systems performance controls

Chain of custody controls


Testing Computer Application
Controls
1) Black box (around)

2) White box (through)

Testing Computer Application
Controls Black Box
Ignore internal logic of application
Use functional characteristics
Flowcharts
Interview key personnel
Advantages:
Do not have to remove application from
operations to test it
Appropriately applied:
Simple applications
Relatively low level of risk

Testing Computer Application
Controls White Box
Relies on in-depth understanding of the
internal logic of the application
Uses small volume of carefully crafted,
custom test transactions to verify specific
aspects of logic and controls
Allows auditors to conduct precise tests with
known outcomes, which can be compared
objectively to actual results

White Box Test Methods
1) Authenticity tests:
Individuals / users
Programmed procedure
Messages to access system (e.g., logons)
2) Accuracy tests:
System only processes data values that conform
to specified tolerances
3) Completeness tests:
Identify missing data (fields, records, files)

White Box Test Methods
4) Redundancy tests:
Process each record exactly once
5) Audit trail tests:
Ensure application and/or system creates an
adequate audit trail
Transactions listing
Error files or reports for all exceptions

6) Rounding error tests:
Salami slicing
Monitor activities; excessive ones are serious
exceptions, e.g., rounding and thousands of
entries into a single account for $1 or 1¢
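
One way to monitor for the pattern described above is to count very small entries per account and flag accounts that receive an unusual number of them. This is only a sketch: the thresholds (entries of $1.00 or less, at least 1,000 of them) and the ledger layout are assumptions, not prescribed values.

```python
from collections import Counter
from decimal import Decimal

def flag_salami_accounts(entries, max_amount=Decimal("1.00"), min_count=1000):
    """Return accounts that received at least `min_count` entries of `max_amount` or less."""
    small = Counter(
        acct for acct, amount in entries if Decimal("0") < amount <= max_amount
    )
    return sorted(acct for acct, n in small.items() if n >= min_count)

# Synthetic ledger: account "9999" collects 1,500 one-cent entries.
entries = [("9999", Decimal("0.01"))] * 1500
entries += [("1200", Decimal("0.50"))] * 20
entries += [("1300", Decimal("2500.00"))] * 3

suspicious = flag_salami_accounts(entries)
print(suspicious)
```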

Computer Assisted Audit Tools and
Techniques (CAATTs)
1) Test data method
2) Base case system evaluation
3) Tracing
4) Integrated Test Facility [ITF]
5) Parallel simulation
GAS

#1 Test Data
Used to establish the application processing
integrity
Uses a test deck
Valid data
Purposefully selected invalid data
Every possible:
Input error
Logical processes
Irregularity

Procedures:
1) Predetermined results and expectations
2) Run test deck
3) Compare
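
The three procedure steps can be sketched as follows. The payroll function stands in for the application under test and its overtime rule is invented for the example; the test deck mixes valid transactions with deliberately invalid ones, each paired with a predetermined result.

```python
def compute_gross_pay(hours, rate):
    """Hypothetical application logic under test: overtime at 1.5x beyond 40 hours."""
    if hours < 0 or rate <= 0:
        raise ValueError("invalid hours or rate")
    regular = min(hours, 40) * rate
    overtime = max(hours - 40, 0) * rate * 1.5
    return round(regular + overtime, 2)

# Test deck: valid and deliberately invalid transactions with predetermined results.
test_deck = [
    {"hours": 40, "rate": 20.0, "expected": 800.0},
    {"hours": 45, "rate": 20.0, "expected": 950.0},
    {"hours": -5, "rate": 20.0, "expected": "REJECT"},
    {"hours": 40, "rate": 0.0, "expected": "REJECT"},
]

def run_test_deck(app, deck):
    """Run every test transaction; return those whose actual result differs from expected."""
    exceptions = []
    for case in deck:
        try:
            actual = app(case["hours"], case["rate"])
        except ValueError:
            actual = "REJECT"
        if actual != case["expected"]:
            exceptions.append({**case, "actual": actual})
    return exceptions

exceptions = run_test_deck(compute_gross_pay, test_deck)
print(exceptions)
```

An empty exception list means the application handled every test transaction as predicted. Any entry in the list points to a specific case where the processing logic or an input control did not behave as expected.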
#2 Base Case System
Evaluation (BCSE)
Variant of Test Data method

Comprehensive test data

Repetitive testing throughout SDLC

When application is modified, subsequent test
(new) results can be compared with previous
results (base)

#3 Tracing
Test data technique that performs a step-by-step
walk-through of the application

1) The trace option must be enabled for the application

2) Specific data or types of transactions are created as
test data

3) Test data is traced through all processing steps of
the application, and a listing is produced of all lines of
code as executed (variables, results, etc.)

Excellent means of debugging a faulty
program
#4 Integrated Test Facility
ITF is an automated technique that allows auditors
to test logic and controls during normal operations
1) Set up a dummy entity within the application system
2) System must be able to discriminate between ITF audit module
transactions and routine transactions
3) Auditor analyzes ITF results against expected results
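
A minimal sketch of the ITF idea follows. The dummy customer ID, the transaction layout, and the posting logic are hypothetical. What matters is that test transactions pass through the same processing logic as live ones while their results are kept out of the real ledger.

```python
ITF_CUSTOMER = "ITF-0001"   # hypothetical dummy entity created for the auditor

def post_sales(transactions):
    """Process all transactions with the same logic; keep ITF results out of the real ledger."""
    ledger, itf_results = {}, {}
    for txn in transactions:
        total = round(txn["qty"] * txn["unit_price"], 2)   # production logic being tested
        target = itf_results if txn["customer"] == ITF_CUSTOMER else ledger
        target[txn["customer"]] = round(target.get(txn["customer"], 0) + total, 2)
    return ledger, itf_results

transactions = [
    {"customer": "C-100", "qty": 3, "unit_price": 19.99},
    {"customer": ITF_CUSTOMER, "qty": 10, "unit_price": 2.50},   # auditor's test transaction
    {"customer": "C-200", "qty": 1, "unit_price": 100.00},
]

ledger, itf_results = post_sales(transactions)
expected_itf = {ITF_CUSTOMER: 25.00}
print(itf_results == expected_itf)
```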

#5 Parallel Simulation
Auditor writes or obtains a copy of the program that
simulates key features or processes to be reviewed
or tested
1) Auditor gains a thorough understanding of the application
under review
2) Auditor identifies those processes and controls critical to
the application
3) Auditor creates the simulation using a program or
Generalized Audit Software (GAS)
4) Auditor runs the simulated program using selected data
and files
5) Auditor evaluates results and reconciles differences
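
Steps 3 to 5 can be sketched as an independent recalculation that is reconciled against production output. The simple daily interest rule, the field names, and the sample records are assumptions made for the illustration.

```python
def simulate_interest(balance, annual_rate, days):
    """Auditor's independent re-implementation of simple daily interest (assumed rule)."""
    return round(balance * annual_rate * days / 365, 2)

# Hypothetical production output extracted for the audit period.
production_output = [
    {"account": "A1", "balance": 10000.00, "rate": 0.05, "days": 30, "interest": 41.10},
    {"account": "A2", "balance": 2500.00, "rate": 0.04, "days": 90, "interest": 24.66},
    {"account": "A3", "balance": 800.00, "rate": 0.05, "days": 365, "interest": 45.00},
]

def reconcile(records, tolerance=0.005):
    """Return (account, production value, simulated value) wherever the two disagree."""
    differences = []
    for rec in records:
        simulated = simulate_interest(rec["balance"], rec["rate"], rec["days"])
        if abs(simulated - rec["interest"]) > tolerance:
            differences.append((rec["account"], rec["interest"], simulated))
    return differences

differences = reconcile(production_output)
print(differences)
```

A difference does not by itself prove that the production program is wrong. The simulation may rest on a mistaken understanding of the rule, so each difference is investigated before a conclusion is drawn.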


PRACTICAL EXERCISES

1
Which statement is not correct? The audit
trail in a computerized environment
a. consists of records that are stored
sequentially in an audit file
b. traces transactions from their source to
their final disposition
c. is a function of the quality and integrity
of the application programs
d. may take the form of pointers, indexes,
and embedded keys

2
All of the following concepts are associated with
the black box approach to auditing computer
applications except
a. the application need not be removed from
service and tested directly
b. auditors do not rely on a detailed knowledge
of the application's internal logic
c. the auditor reconciles previously produced
output results with production input transactions
d. this approach is used for complex transactions
that receive input from many sources

3
Which test is not an example of a white
box test?
a. determining the fair value of inventory
b. ensuring that passwords are valid
c. verifying that all pay rates are within a
specified range
d. reconciling control totals

4
When analyzing the results of the test
data method, the auditor would spend
the least amount of time reviewing
a. the test transactions
b. error reports
c. updated master files
d. output reports

5
All of the following are advantages of
the test data technique except
a. auditors need minimal computer
expertise to use this method
b. this method causes minimal
disruption to the firm's operations
c. the test data is easily compiled
d. the auditor obtains explicit evidence
concerning application functions

6
All of the following are disadvantages of the test
data technique except
a. the test data technique requires extensive
computer expertise on the part of the auditor
b. the auditor cannot be sure that the application
being tested is a copy of the current application
used by computer services personnel
c. the auditor cannot be sure that the application
being tested is the same application used
throughout the entire year
d. preparation of the test data is time-consuming

7
Program testing
a. involves individual modules only, not
the full system
b. requires creation of meaningful test
data
c. need not be repeated once the
system is implemented
d. is primarily concerned with usability

8
The correct purchase order
number is 123456. All of the following
are transcription errors except
a. 1234567
b. 12345
c. 124356
d. 123454

9
Which of the following is correct?
a. check digits should be used for all
data codes
b. check digits are always placed at the
end of a data code
c. check digits do not affect processing
efficiency
d. check digits are designed to detect
transcription and transposition errors

10
Which statement is not correct? The
goal of batch controls is to ensure that
during processing
a. transactions are not omitted
b. transactions are not added
c. transactions are free from clerical
errors
d. an audit trail is created

Data Structures and
CAATTs for Data
Extraction

Data Structures
Two fundamental components:

Organization: the way records are
physically arranged on the secondary
storage device

Access method: technique used to
locate records and to navigate through
the database or file

File Processing Operations
1. Retrieve a record by key
2. Insert a record
3. Update a record
4. Read a file
5. Find next record
6. Scan a file
7. Delete a record
[Slide annotation: Individual Records]

Data Structures
Flat file structures
Sequential structure
All records in contiguous storage spaces in specified
sequence (key field)

Sequential files are simple and easy to process
Application reads from beginning in sequence
If only a small portion of the file is being processed, inefficient
method
Does not permit accessing a record directly

Data Structures
Flat file structures
Indexed structure
In addition to data file, separate index
file
Contains physical address in data file
of each indexed record

Data Structures
Flat file structures
Indexed random file VS Indexed sequential file
Records are created without regard to physical
proximity to other related records

Physical organization of index file itself may be
sequential or random

Random indexes are easier to maintain, sequential
more difficult

Advantage over sequential: rapid searches

Other advantages: processing individual records,
efficient usage of disk storage

Data Structures
Flat file structures
Virtual Storage Access Method (VSAM)
Large files, routine batch processing
Moderate degree of individual record processing
Used for files across cylinders
Uses number of indexes, with summarized content
Access time for single record is slower than Indexed
Sequential or Indexed Random
Has 3 physical components: indexes, prime data storage area,
overflow area
Might have to search index, prime data area, and overflow
area slowing down access time
Integrating overflow records into prime data area, then
reconstructing indexes reorganizes VSAM files

Hashing Structure
Employs algorithm to convert primary key
into physical record storage address
No separate index necessary
Advantage: access speed
Disadvantages:
Inefficient use of storage
Different keys may produce the same address (a collision)
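
The trade-off can be sketched in Python. The division-remainder function `hash_address`, the seven-slot storage area, and the sample keys are all assumptions chosen for illustration; real systems use their own hashing algorithms.

```python
def hash_address(key, num_slots=7):
    """Hypothetical hashing algorithm: remainder of the key divided by a prime."""
    return key % num_slots

slots = [None] * 7  # storage area; slots left empty illustrate wasted space

collisions = []
for key in (15, 22, 31, 40):
    address = hash_address(key)
    if slots[address] is None:
        slots[address] = key
    else:
        # A different key already occupies this address.
        collisions.append((key, slots[address], address))

print(slots)       # [None, 15, None, 31, None, 40, None]
print(collisions)  # [(22, 15, 1)]
```

No index is consulted: the address comes straight from the key, which is the source of the speed advantage. The empty slots and the collision between keys 15 and 22 show the two disadvantages.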

Pointer Structure
Stores the address (pointer) of related record in a
field with each data record
Records stored randomly
Pointers provide connections between records
Pointers may also link records between files
Types of pointers:
Physical address: actual disk storage location
Advantage: access speed
Disadvantage: if the related record moves, the pointer must be changed;
without a logical reference, a pointer could be lost, causing the
referenced record to be lost
Relative address: relative position in the file
Must be manipulated to convert it to a physical address
Logical address: primary key of the related record
Key value is converted by hashing to a physical address
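
A short Python sketch can illustrate a pointer chain and the conversion of a relative address. The fixed record length, the file start address, and the helper names are assumptions for this example, not values from the slides.

```python
RECORD_LENGTH = 64   # assumed fixed record size in bytes
FILE_START = 4096    # assumed physical address where the file begins

def relative_to_physical(relative_position):
    """Convert a relative address (record number) to a physical byte address."""
    return FILE_START + relative_position * RECORD_LENGTH

# Records stored in no particular order; each holds a pointer (relative
# address) to the next related record, or None at the end of the chain.
records = {
    0: {"key": "INV-3", "next": None},
    1: {"key": "INV-1", "next": 2},
    2: {"key": "INV-2", "next": 0},
}

def follow_chain(start):
    """Walk the pointer chain and return the keys in linked order."""
    keys, position = [], start
    while position is not None:
        keys.append(records[position]["key"])
        position = records[position]["next"]
    return keys

print(follow_chain(1))          # ['INV-1', 'INV-2', 'INV-3']
print(relative_to_physical(2))  # 4224
```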

Database Conceptual
Models
Refers to the particular method used to
organize records in a database.
a.k.a. logical data structures
Objective: develop the database efficiently so
that data can be accessed quickly and easily.
There are three main models:
hierarchical (tree structure)
network
relational
Most existing databases are relational. Some
legacy systems use hierarchical or network
databases.

The Relational Model
The relational model portrays data in
the form of two-dimensional tables.
Its strength is the ease with which
tables may be linked to one another.
Difficulty in linking records is a major weakness of
hierarchical and network databases.
Relational model is based on the
relational algebra functions of restrict,
project, and join.
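
The three functions can be sketched in plain Python over lists of dictionaries. The `customers` and `invoices` tables and their column names are hypothetical; a real DBMS performs these operations through its query language.

```python
customers = [
    {"cust_id": 1, "name": "Acme", "city": "Austin"},
    {"cust_id": 2, "name": "Birch", "city": "Boston"},
]
invoices = [
    {"inv_id": 10, "cust_id": 1, "amount": 500},
    {"inv_id": 11, "cust_id": 2, "amount": 75},
    {"inv_id": 12, "cust_id": 1, "amount": 120},
]

def restrict(table, predicate):
    """Keep only the rows that satisfy a condition."""
    return [row for row in table if predicate(row)]

def project(table, columns):
    """Keep only the named columns."""
    return [{col: row[col] for col in columns} for row in table]

def join(left, right, key):
    """Combine rows from two tables that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

large = restrict(join(customers, invoices, "cust_id"), lambda r: r["amount"] > 100)
result = project(large, ["name", "inv_id", "amount"])
print(result)
```

Here join links the two tables on `cust_id`, restrict keeps invoices above 100, and project drops the columns that are not needed in the result.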

Associations and
Cardinality
Association
Represented by a line connecting two entities
Described by a verb, such as ships, requests, or receives
Cardinality: the degree of association
between two entities
The number of possible occurrences in one
table that are associated with a single
occurrence in a related table
Used to determine primary keys and foreign
keys

Examples of Entity Associations

Properly Designed Relational
Tables
Each row in the table must be unique in at
least one attribute, which is the primary
key.
Tables are linked by embedding the primary key
into the related table as a foreign key.
The attribute values in any column must all
be of the same class or data type.
Each column in a given table must be
uniquely named.
Tables must conform to the rules of
normalization, i.e., free from structural
dependencies or anomalies.
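
The primary key and foreign key rules can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `customer` and `invoice` tables and the `try_insert` helper are hypothetical, and SQLite enforces foreign keys only after the pragma shown is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE invoice (
        inv_id  INTEGER PRIMARY KEY,
        cust_id INTEGER NOT NULL REFERENCES customer(cust_id),
        amount  REAL NOT NULL
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO invoice VALUES (10, 1, 500.0)")

def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A second row with primary key 1 violates row uniqueness.
duplicate_key_ok = try_insert("INSERT INTO customer VALUES (1, 'Duplicate')")
# An invoice for customer 99 has no matching primary key to link to.
orphan_invoice_ok = try_insert("INSERT INTO invoice VALUES (11, 99, 20.0)")
print(duplicate_key_ok, orphan_invoice_ok)  # False False
```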

Three Types of Anomalies
Insertion Anomaly: A new item cannot
be added to the table until at least one
entity uses a particular attribute item.
Deletion Anomaly: If an attribute item
used by only one entity is deleted, all
information about that attribute item is
lost.
Update Anomaly: A modification on an
attribute must be made in each of the
rows in which the attribute appears.
Anomalies can be corrected by creating
additional relational tables.
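
A small Python sketch can make the update and deletion anomalies concrete. The unnormalized `inventory` table, in which supplier details repeat on every part row, is a hypothetical example.

```python
# Hypothetical unnormalized table: supplier details repeat on every part row.
inventory = [
    {"part": "P1", "supplier": "S22", "supplier_phone": "555-0100"},
    {"part": "P2", "supplier": "S22", "supplier_phone": "555-0100"},
    {"part": "P3", "supplier": "S27", "supplier_phone": "555-0199"},
]

# Update anomaly: changing S22's phone on only one row leaves conflicting values.
inventory[0]["supplier_phone"] = "555-0111"
phones_for_s22 = {row["supplier_phone"] for row in inventory if row["supplier"] == "S22"}

# Deletion anomaly: removing the only part supplied by S27 loses S27 entirely.
inventory = [row for row in inventory if row["part"] != "P3"]
suppliers_left = {row["supplier"] for row in inventory}

print(sorted(phones_for_s22))  # ['555-0100', '555-0111']
print(sorted(suppliers_left))  # ['S22']
```

The insertion anomaly follows from the same layout: a new supplier cannot be recorded in this table until at least one part row refers to it.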

Advantages of Relational Tables
Removes all three types of
anomalies.
Various items of interest
(customers, inventory, sales)
are stored in separate tables.
Space is used efficiently.
Very flexible: users can form
ad hoc relationships.

The Normalization Process
A process which systematically splits
unnormalized complex tables into
smaller tables that meet two conditions:
all nonkey (secondary) attributes in the table are
dependent on the primary key
all nonkey attributes are independent of the
other nonkey attributes
When unnormalized tables are split and
reduced to third normal form, they must
then be linked together by foreign keys.
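
The split can be sketched in Python. The table contents and the `part_with_supplier` helper are hypothetical; the point is that the supplier phone depends on the supplier rather than on the part key, so it moves to its own table and the part table keeps the supplier key as a foreign key.

```python
unnormalized = [
    {"part": "P1", "desc": "bolt", "supplier": "S22", "supplier_phone": "555-0100"},
    {"part": "P2", "desc": "nut", "supplier": "S22", "supplier_phone": "555-0100"},
    {"part": "P3", "desc": "washer", "supplier": "S27", "supplier_phone": "555-0199"},
]

# supplier_phone depends on supplier, not on the part key (a transitive
# dependency), so supplier data moves to its own table keyed by supplier.
suppliers = {row["supplier"]: {"phone": row["supplier_phone"]} for row in unnormalized}

# The part table keeps the supplier key as a foreign key.
parts = {
    row["part"]: {"desc": row["desc"], "supplier": row["supplier"]}
    for row in unnormalized
}

def part_with_supplier(part_id):
    """Re-link the two tables through the foreign key."""
    part = parts[part_id]
    return {"part": part_id, **part, "supplier_phone": suppliers[part["supplier"]]["phone"]}

print(len(suppliers), part_with_supplier("P2"))
```

Each supplier phone is now stored once, so an update touches a single row.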

Accountants and Data
Normalization
Update anomalies can generate conflicting
and obsolete database values.
Insertion anomalies can result in
unrecorded transactions and incomplete
audit trails.
Deletion anomalies can cause the loss of
accounting records and the destruction of
audit trails.
Accountants should understand the data
normalization process and be able to
determine whether a database is properly
normalized.
Six Phases in Designing
Relational Databases
1. Identify entities
identify the primary entities of the
organization
construct a data model of their
relationships
2. Construct a data model showing
entity associations
determine the associations between
entities
model associations into an ER
diagram

Six Phases in Designing
Relational Databases
3. Add primary keys and attributes
assign primary keys to all entities in the
model to uniquely identify records
every attribute should appear in one or
more user views
4. Normalize and add foreign keys
remove repeating groups, partial and
transitive dependencies
assign foreign keys to be able to link
tables
Six Phases in Designing
Relational Databases
5. Construct the physical database
create physical tables
populate tables with data
6. Prepare the user views
normalized tables should support all
required views of system users
user views restrict users from having
access to unauthorized data
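
A user view can be sketched with Python's built-in `sqlite3` module. The `employee` table and the `employee_directory` view are hypothetical. Note that SQLite has no user permissions, so this only shows how a view hides columns; in a multi-user DBMS, access would additionally be granted on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Lee", "Sales", 52000.0), (2, "Ortiz", "Audit", 61000.0)],
)

# Hypothetical user view for a directory clerk: salary is not exposed.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(columns)  # ['emp_id', 'name', 'dept']
print(rows)     # [(1, 'Lee', 'Sales'), (2, 'Ortiz', 'Audit')]
```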

Auditors and Data Normalization
Database normalization is a technical matter
that is usually the responsibility of systems
professionals.
The subject has implications for internal control
that make it the concern of auditors also.
Most auditors will never be responsible for
normalizing an organization's databases; however, they
should have an understanding of the process
and be able to determine whether a table is
properly normalized.
In order to extract data from tables to perform
audit procedures, the auditor first needs to
know how the data are structured.

Embedded Audit Module (EAM)
Identify important transactions live
while they are being processed and
extract them
Examples

Errors
Fraud
Compliance
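
The idea can be sketched in Python as a hook that the host application calls for every transaction. The `EmbeddedAuditModule` class, its selection criteria, and the sample transactions are hypothetical; an actual EAM is built into the production application by its developers.

```python
from dataclasses import dataclass, field

@dataclass
class EmbeddedAuditModule:
    """Hypothetical EAM: copies transactions that meet auditor criteria."""
    threshold: float
    audit_file: list = field(default_factory=list)

    def inspect(self, txn):
        # Flag material amounts and payments with no approver on record.
        if txn["amount"] >= self.threshold or not txn.get("approved_by"):
            self.audit_file.append(txn)

def process(transactions, eam):
    """Stand-in for the host application; every transaction passes the EAM."""
    posted = 0
    for txn in transactions:
        eam.inspect(txn)  # the audit hook runs while processing is live
        posted += 1
    return posted

eam = EmbeddedAuditModule(threshold=10_000)
posted = process(
    [
        {"id": 1, "amount": 250.0, "approved_by": "jdoe"},
        {"id": 2, "amount": 18_500.0, "approved_by": "jdoe"},
        {"id": 3, "amount": 90.0, "approved_by": ""},
    ],
    eam,
)
print(posted, [t["id"] for t in eam.audit_file])  # 3 [2, 3]
```

Because the hook runs on every transaction, its cost is paid during live processing, which is the performance concern usually raised about EAMs.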

Embedded Audit Module
Disadvantages:
Operational efficiency: EAMs can decrease
performance, especially if testing is
extensive
Verifying EAM integrity: difficult in
environments with a high level of
program maintenance
Status: increasing need, demand, and
usage of COA/EAM/CA
Generalized Audit Software (GAS)
Brief history
Most widely used CAATT
Usages include:
1) Footing and balancing entire files or selected data
items (e.g., extending inventory)
2) Selecting and reporting detail data
3) Selecting stratified statistical samples from data files
4) Formatting results into audit reports (auto work papers!)
5) Printing confirmations
6) Screening / filtering data
7) Comparing multiple files for differences
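
A few of these GAS-style tests (footing, extending inventory, and screening for duplicates) can be sketched in plain Python. The `inventory` and `payments` data are invented for illustration, and the code is not taken from any GAS product.

```python
from collections import Counter

inventory = [
    {"item": "A1", "qty": 10, "unit_cost": 2.50, "extended": 25.00},
    {"item": "B2", "qty": 4, "unit_cost": 10.00, "extended": 40.00},
    {"item": "C3", "qty": 3, "unit_cost": 7.00, "extended": 20.00},  # should be 21.00
]
payments = [
    {"vendor": "V9", "invoice": "1001", "amount": 300.0},
    {"vendor": "V9", "invoice": "1001", "amount": 300.0},
    {"vendor": "V4", "invoice": "2001", "amount": 80.0},
]

# Footing: total a field across the entire file.
footed_total = sum(row["extended"] for row in inventory)

# Extending inventory: recompute qty * unit cost and report mismatches.
extension_errors = [
    row["item"] for row in inventory
    if round(row["qty"] * row["unit_cost"], 2) != row["extended"]
]

# Screening: flag vendor/invoice/amount combinations that appear more than once.
counts = Counter((p["vendor"], p["invoice"], p["amount"]) for p in payments)
duplicates = [key for key, n in counts.items() if n > 1]

print(footed_total)      # 85.0
print(extension_errors)  # ['C3']
print(duplicates)        # [('V9', '1001', 300.0)]
```

Every record is tested rather than a sample, so the extension error and the duplicate payment are found with certainty.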
Generalized Audit Software
Popular because:
1. GAS is easy to use and requires
little computer background
2. Many products are platform independent
and work on both mainframes and PCs
3. Auditors can perform tests independently
of IT staff
4. GAS can be used to audit the data
currently being stored in most file
structures and formats
Generalized Audit Software
Simple structures
Complex structures
Auditing issues:
Auditor must sometimes rely on IT personnel to produce
files/data
Risk that data integrity is compromised by extraction
procedures
Auditors skilled in programming are better prepared to avoid
these pitfalls

ACL
ACL is a proprietary version of GAS
Leader in the industry
Designed as an auditor-friendly meta-language
(i.e., contains commonly used auditor tests)
Access to data generally easy with
ODBC interface

Multiple Choice
Questions

1
An inventory record contains part
number, part name, part color, and part
weight. These individual items are
called
a. fields.
b. stored files.
c. bytes.
d. occurrences.

2
It is appropriate to use a sequential file
structure when
a. records are routinely inserted.
b. single records need to be retrieved.
c. records need to be scanned using
secondary keys.
d. a large portion of the file will be
processed in one operation.

3
Which of the following statements is not true?
a. Indexed random files are dispersed throughout
the storage device without regard for physical
proximity with related records.
b. Indexed random files use disk storage space
efficiently.
c. Indexed random files are efficient when
processing a large portion of a file at one time.
d. Indexed random files are easy to maintain in
terms of adding records.

4
Which characteristic is associated with
the database approach to data
management?
a. data sharing
b. multiple storage procedures
c. data redundancy
d. excessive storage costs

5
Which statement is not correct? The
VSAM structure
a. is used for very large files that need
both direct access and batch
processing.
b. may use an overflow area for records.
c. provides an exact physical address
for each record.
d. is appropriate for files that require
few insertions or deletions.


6
Which statement is true about a
hashing structure?
a. The same address could be
calculated for two records.
b. Storage space is used efficiently.
c. Records cannot be accessed rapidly.
d. A separate index is required.

7
In a hashing structure,
a. two records can be stored at the same
address.
b. pointers are used to indicate the location
of all records.
c. pointers are used to indicate the location
of a record with the same address as
another record.
d. all locations on the disk are used for
record storage.

8
Pointers can be used for all of the
following except
a. to locate the subschema address of
the record.
b. to locate the physical address of the
record.
c. to locate the relative address of the
record.
d. to locate the logical key of the record.

9
Pointers are used
a. to link records within a file.
b. to link records between files.
c. to identify records stored in overflow.
d. all of the above.

10
In a hierarchical model
a. links between related records are
implicit
b. the way to access data is by following
a predefined data path
c. an owner (parent) record may own
just one member (child) record
d. a member (child) record may have
more than one owner (parent)
