
DATA-DRIVEN FRAUD DETECTION
BWANIKA NAJIB
W10017OO14
FRAUD AUDIT
MASTER OF ACCOUNTING
UNIVERSITAS MUHAMMADIYAH SURAKARTA
bwanikanajib@gmail.com
PRESENTATION STRUCTURE
• Introduction
• Difference between anomalies and frauds
• The data analysis process
• Data analysis software packages
• Data access
• Data analysis techniques or procedures
• Real-time analysis
• Analyzing financial statement reports
DATA-DRIVEN FRAUD DETECTION
Introduction
Data-driven fraud analysis is one of the most exciting developments in fraud
investigation. This emerging field is a synthesis of many different knowledge
areas, including fraud, auditing, investigation, database theory, and analysis
techniques.
An example of data-driven fraud detection

On April 26, 2007, the Daily Tar Heel, a newspaper at the University of North
Carolina at Chapel Hill, published the following story … The auditor’s office has
investigated the Department of Motor Vehicles (DMV), N.C. Central University, and
the Department of Justice. At the DMV, it was found that about 27,000 fraudulent
SSNs had been provided to obtain driver’s licenses, Mears said. The office takes
SSNs from the payroll office of the agency being audited and cross references them
with valid numbers in the Social Security Administration’s database. This example
shows the application of a very simple data-driven fraud detection method:
comparing SSNs with a list of valid numbers or with a list of known invalid
numbers.
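
In code, this cross-referencing is little more than a set lookup or validity check. A minimal sketch in Python, assuming a hypothetical payroll.csv export with employee_id and ssn columns; the structural rules below follow published SSA conventions, whereas the real audit matched numbers against the Social Security Administration's own database:

import csv
import re

def is_invalid_ssn(ssn: str) -> bool:
    # Structural checks per published SSA rules: area 000, 666, or 900-999,
    # group 00, and serial 0000 are never issued.
    m = re.fullmatch(r"(\d{3})-?(\d{2})-?(\d{4})", ssn)
    if not m:
        return True  # malformed entry
    area, group, serial = m.groups()
    return (area in ("000", "666") or area >= "900"
            or group == "00" or serial == "0000")

# payroll.csv is a hypothetical export, not the auditor's actual file.
with open("payroll.csv", newline="") as f:
    for row in csv.DictReader(f):
        if is_invalid_ssn(row["ssn"]):
            print("Suspect SSN:", row["employee_id"], row["ssn"])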
Difference between anomalies and frauds

First, accounting anomalies are unintentional mistakes; they do not represent
fraud and normally do not result in legal action being taken. Second,
anomalies will be found throughout a data set. For example, the double-
payment-of-invoices anomaly would likely occur every time a printer failure
happens.
Fraud, on the other hand, is the intentional subversion of controls by
intelligent human beings. Perpetrators cover their tracks by creating false
documents or changing records in database systems. Detecting a fraud is
therefore like finding the proverbial “needle in the haystack.”
THE DATA ANALYSIS PROCESS
Step 1: Understand the Business.
Since each business environment is different even within the same industry
or firm, examiners must have a good understanding of the business processes
and procedures.
Several potential methods to gather information about a business include:
touring the business, becoming familiar with competitor processes,
interviewing key personnel, analyzing financial statements and other
accounting information, reviewing process documentation, and working with
auditors and security personnel.
Step 2: Identify Possible Frauds That Could Exist.
This risk assessment step requires an understanding of the nature of different
frauds, how they occur, and what symptoms they exhibit.
In this step, people involved in the business are interviewed, and fraud
examiners should ask questions such as the following: How do insiders and
outsiders interact with each other? What types of fraud have occurred or
been suspected in the past? What types of fraud could be committed against
the company or on behalf of the company? How could employees or
management acting alone commit fraud?
Step 3: Catalog Possible Fraud Symptoms

Fraud itself is rarely seen; only its symptoms are usually observed, so
identifying fraud symptoms is often the best, and sometimes the only,
practical method of proactive fraud detection.
These fraud symptoms include accounting anomalies, internal control
weaknesses, analytical anomalies, extravagant lifestyles, unusual behaviors,
and tips and complaints.
Step 4: Use Technology to Gather Data about Symptoms
Once symptoms are defined and correlated (catalogued) with specific frauds,
supporting data are extracted from corporate databases, online Web sites, and
other sources. The deliverable of this step is a set of data that matches the
symptoms identified in the previous step.
Step 5: Analyze Results
Since computer-based analysis is often the most efficient method of
investigation, every effort should be made to screen results using computer
algorithms. Several fraud analysis techniques like discovery of outliers,
digital analysis, stratification and summarization, trending, and text
matching are used in this step.
Step 6: Investigate Symptoms
The final step of the data-driven approach is investigation into the most
promising indicators. Investigators should continue to use computer analyses
to provide support and detail.
Note
The primary advantage of the data-driven approach is that the investigator
takes charge of the fraud investigation process. Instead of merely waiting
for tips or other indicators to become egregious enough to show on their own,
the data-driven approach can highlight frauds while they are still small.
The primary drawback to the data-driven approach is that it can be more
expensive and time intensive than the traditional approach. Since the
brainstorming process in Steps 2 and 3 usually results in hundreds of
potential indicators, it can take a significant amount of time to complete
Steps 4 and 5.
DATA ANALYSIS SOFTWARE PACKAGES
• ACL Audit Analytics is the data application used most widely by auditors
worldwide. ACL also includes a programming language called ACLScript that
makes automation of procedures possible.
• CaseWare’s IDEA is ACL’s primary competitor. Feature for feature, it is
very similar to ACL, but the interface is slightly different.
• Picalo is similar in features to ACL and IDEA, but it adds the concept of
detectlets, which are small plug-ins that discover specific indicators such as
the matching of vendor addresses with employee addresses.
• Microsoft Office + ActiveData. ActiveData is a plug-in for Microsoft Office
that provides enhanced data analysis procedures.
DATA ACCESS
1. Open Database Connectivity (ODBC). This is a standard method of querying data
from corporate relational databases. Suppose Najib is asked to investigate a large
chain of ice cream shops for fraud. Najib could set up an ODBC connection as
follows:
• Najib contacts the IT department and asks for a description of the databases at
the company.
• Najib asks the IT department what kind of database the genjournal is running on.
He is told it is PostgreSQL version 8.1.9. He asks for a read-only ODBC
connection to the database, and after some discussion, is given access.
• The IT department gives Najib the server IP address, a username and password,
and other technical information required to set up the connection.
• Najib searches www.postgresql.org for the appropriate driver and installs it to
his computer. He sets up a connection in the control panel using the user
information provided by the IT department.
• Najib opens Picalo (or ACL, IDEA, MS Access, etc.) and selects File | New
Database Connection. He configures his connection and finishes the connection
wizard. He now has access to all the tables in genjournal as if they were regular
Picalo tables.
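
Once the connection exists, any ODBC-aware tool can issue SQL against it. A minimal sketch in Python with pyodbc, assuming the connection above; the DSN, credentials, and the payments table and its columns are illustrative, not the actual schema:

import pyodbc

# Connection details come from the IT department, as described above.
conn = pyodbc.connect("DSN=genjournal;UID=najib;PWD=secret")
cursor = conn.cursor()

# Pull payments above a threshold for further analysis (table and
# column names are hypothetical).
cursor.execute(
    "SELECT invoice_id, vendor_id, amount FROM payments WHERE amount > ?",
    10000,
)
for invoice_id, vendor_id, amount in cursor.fetchall():
    print(invoice_id, vendor_id, amount)
conn.close()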
2. Text import
Several text formats exist for manually transferring data from one application
(e.g., a database) to another (e.g., an analysis application). These include:
A delimited plain text file, which contains one row per database record;
columns are separated by a delimiting character such as a comma or a tab.
A fixed-width format, which again uses one row in the file per record in the
database. However, rather than using a delimiting character like a comma to
denote columns, spaces are used to pad each field value to a standard position.
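
A minimal sketch of reading both formats in Python; the file names and the fixed-width column positions are illustrative assumptions:

import csv

# Delimited: one row per record, columns separated by a delimiter (a tab here).
with open("payments.tsv", newline="") as f:
    for record in csv.reader(f, delimiter="\t"):
        print(record)

# Fixed-width: spaces pad each field to a standard position
# (character positions 0-9, 10-29, and 30-39 here).
with open("payments.txt") as f:
    for line in f:
        invoice_id = line[0:10].strip()
        vendor = line[10:30].strip()
        amount = line[30:40].strip()
        print(invoice_id, vendor, amount)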
3. Hosting a data warehouse
Many investigators simply import data directly into their analysis
application, effectively creating a data warehouse. For example, data are
imported, stored, and analyzed within ACL.
Once the data are in the data warehouse, the investigator connects (via
ODBC) his or her analysis application to the data warehouse for primary
analysis procedures.
DATA ANALYSIS TECHNIQUES OR PROCEDURES
1. Data preparation
This includes type conversion and ensuring consistency of values. Investigators
should ensure that number columns are correctly typed as numbers and that text
columns are correctly typed as text. For example, the text “1” added to the text “1”
yields the text “11” because the two values are concatenated.
To correctly prepare data for time trend analysis, the time scale must be
standardized per some value of time, such as sales per day, hours worked per week,
and so forth.
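
A minimal sketch of both preparation issues in Python, with illustrative data:

# The concatenation pitfall from the text: "1" + "1" is text, not arithmetic.
print("1" + "1")                 # '11' (concatenation)
print(float("1") + float("1"))   # 2.0 (numeric addition after conversion)

# Standardizing a time scale: total sales per day (dates and amounts assumed).
from collections import defaultdict

sales = [("2007-04-26", 120.0), ("2007-04-26", 80.0), ("2007-04-27", 50.0)]
per_day = defaultdict(float)
for day, amount in sales:
    per_day[day] += amount
print(dict(per_day))             # {'2007-04-26': 200.0, '2007-04-27': 50.0}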
2. Digital analysis
Digital analysis is the art of analyzing the digits that make up numbers like invoice amounts,
reported hours, and costs. For example, the numbers 987.59 and 9,811.02 both have a 9 in
the first position and an 8 in the second position. The distribution of digits actually follows
Benford’s Law, which is the primary method of digital analysis in fraud investigations.
According to Benford’s Law, the first digit of naturally occurring data sets will
be a 1 more often than a 2, a 2 more often than a 3, and so on. In fact, for many
kinds of financial data, Benford’s Law accurately predicts the proportion of
numbers in a data set that begin with each digit.
Table: Benford’s Law first-digit probabilities, P(d) = log10(1 + 1/d):
digit 1: 30.1%, digit 2: 17.6%, digit 3: 12.5%, digit 4: 9.7%, digit 5: 7.9%,
digit 6: 6.7%, digit 7: 5.8%, digit 8: 5.1%, digit 9: 4.6%
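
A minimal sketch of a Benford first-digit test in Python; the amounts list is illustrative, and a real test would run over a full invoice or payment column:

from collections import Counter
from math import log10

amounts = [987.59, 9811.02, 132.50, 1741.00, 118.30, 2250.75, 1029.99]

def first_digit(x):
    # First nonzero digit of a positive number.
    for ch in str(x):
        if ch in "123456789":
            return ch

observed = Counter(first_digit(a) for a in amounts)
n = len(amounts)
for d in "123456789":
    expected = log10(1 + 1 / int(d))  # Benford's expected proportion
    print(f"digit {d}: expected {expected:.1%}, observed {observed[d] / n:.1%}")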
3. Outlier Investigation
By focusing on outliers, investigators can easily identify cases that do not match the norm.
The statistical z-score calculation is one of the most powerful and yet simple methods for
identifying outliers. It converts data to a standard scale and distribution, regardless of the
amounts and variances in the data. The calculation for a z-score is as follows:
Z-score = (Value − Mean) / Standard Deviation
Statistical theory predicts that 68 percent of the data have scores between -1 and 1, 95
percent will have scores between -2 and 2, and 99.7 percent will have scores between -3 and
3. With real-world data, cases sometimes have z-scores of 7, 9, or even 12. As a
general rule, values with z-scores greater than 2 or 3 should be investigated.
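
A minimal z-score screen in Python; the amounts and the cutoff of 2 are illustrative:

from statistics import mean, stdev

amounts = [120.0, 135.5, 128.0, 119.9, 131.2, 2450.0]  # one planted outlier
mu, sigma = mean(amounts), stdev(amounts)

for value in amounts:
    z = (value - mu) / sigma  # Z-score = (Value - Mean) / Standard Deviation
    if abs(z) > 2:
        print(f"Investigate: {value} (z = {z:.1f})")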
4. Stratification and Summarization
Stratification is the splitting of complex data sets into case-specific tables.
Summarization is an extension of stratification. Instead of producing a number of
subtables (one for each case value), summarization runs one or more calculations
on the subtables to produce a single record summarizing each case value.
Remember this …
Stratification and summarization are similar analysis methods. The difference is
that stratification provides the record detail for each group, while summarization
provides summary calculations only.
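
A minimal sketch of both methods with pandas, using illustrative payment data:

import pandas as pd

payments = pd.DataFrame({
    "vendor": ["Acme", "Acme", "Bolt", "Bolt", "Bolt"],
    "amount": [500.0, 750.0, 120.0, 130.0, 9800.0],
})

# Stratification: one detail table per case value (here, per vendor).
for vendor, subtable in payments.groupby("vendor"):
    print(vendor)
    print(subtable)

# Summarization: one record per case value with summary calculations.
summary = payments.groupby("vendor")["amount"].agg(["count", "sum", "mean"])
print(summary)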
5. Time Trend Analysis
Before time trend analysis can be performed, data must be standardized for
time. The most basic technique for time trend analysis is easy: simply graph
each case. For example, a graph of the price of each product over time will
reveal the products that are increasing abnormally. This can be done in a
spreadsheet program or in a more advanced analysis application.
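
A minimal sketch with pandas and matplotlib, using illustrative prices standardized per day:

import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "date": pd.to_datetime(["2007-01-01", "2007-01-02", "2007-01-01",
                            "2007-01-02", "2007-01-03"]),
    "product": ["cone", "cone", "sundae", "sundae", "sundae"],
    "price": [1.50, 1.55, 3.00, 3.10, 4.80],
})

# Standardize per day, then graph each product as its own line;
# abnormal increases stand out visually.
per_day = sales.pivot_table(index="date", columns="product",
                            values="price", aggfunc="mean")
per_day.plot()
plt.show()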
6. Fuzzy Matching
The classic use of this technique is the matching of employee and vendor
addresses, ZIP Codes, phone numbers, or other personal information.
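
A minimal sketch using Python’s standard library difflib; the addresses and the 0.85 similarity threshold are illustrative:

from difflib import SequenceMatcher

employees = ["123 N Main St, Durham NC", "45 Oak Ave, Raleigh NC"]
vendors = ["123 North Main St., Durham NC", "900 Industrial Pkwy"]

for emp in employees:
    for ven in vendors:
        # Ratio near 1.0 means the two strings are nearly identical.
        score = SequenceMatcher(None, emp.lower(), ven.lower()).ratio()
        if score > 0.85:
            print(f"Possible match ({score:.2f}): {emp} <-> {ven}")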
REAL-TIME ANALYSIS
Data-driven analysis is usually performed periodically (i.e., during periodic
audits), but it can be integrated directly into existing systems to perform
real-time analysis on transactions.
Real-time analysis specifically analyzes each transaction for fraud (rather than for
correctness).
One way to view the results of multiple indicators is to use a chart called a Matosas matrix.
This matrix lists one record per contract for which vendors bid. Each column in the table
represents an indicator run by the system. The Matosas matrix is a high-level view of which
contracts have indicator hits that need to be investigated. It allows the
investigator to mentally match different indicators to different schemes.
Example of a Matosas matrix for contract bidding.
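
A minimal sketch of assembling such a matrix with pandas; the contract IDs, indicator names, and hit values are illustrative:

import pandas as pd

# One column per indicator run by the system, one row per contract.
indicators = {
    "benford_deviation": {"C-101": 1, "C-102": 0, "C-103": 1},
    "single_bidder":     {"C-101": 0, "C-102": 1, "C-103": 1},
    "round_amounts":     {"C-101": 0, "C-102": 0, "C-103": 1},
}

matrix = pd.DataFrame(indicators).fillna(0).astype(int)
matrix["total_hits"] = matrix.sum(axis=1)
# Contracts with the most indicator hits float to the top for investigation.
print(matrix.sort_values("total_hits", ascending=False))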
ANALYZING FINANCIAL STATEMENT
REPORTS
Financial statements are the end product of the accounting cycle, and the
primary financial statements are the balance sheet, the income statement, and
the statement of cash flows.
In order to detect fraud, balance sheets and income statements are converted
from position and period statements to change statements in four ways:
(1) comparing account balances in the statements from one period to the next,
(2) calculating key ratios and comparing them from period to period,
(3) performing vertical analysis, and (4) performing horizontal analysis.
Detecting fraud through financial statement ratios is much easier than
assessing changes in the financial statement numbers themselves. Common
ratios that can be used to detect fraud are shown in Table 6.2.
Vertical analysis is a very useful fraud detection technique, because
percentages are easily understood.
Horizontal analysis is the most direct method of focusing on changes: the
changes in amounts from period to period are converted to percentages
(Change / Year 1 Amount = % Change).
Table 6.2: Common ratios
Vertical analysis of a balance sheet
Vertical analysis of an income statement
Horizontal analysis of a balance sheet
Horizontal analysis of an income statement
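
A minimal sketch of both calculations with pandas; the statement lines and amounts are illustrative and not taken from the slides:

import pandas as pd

stmt = pd.DataFrame(
    {"year1": [1000.0, 600.0, 250.0], "year2": [1100.0, 720.0, 255.0]},
    index=["Revenue", "Cost of goods sold", "Operating expenses"],
)

# Vertical analysis: each line as a percentage of revenue within its year.
vertical = stmt.div(stmt.loc["Revenue"]) * 100
print(vertical)

# Horizontal analysis: Change / Year 1 Amount = % Change. Here cost of
# goods sold grows 20% while revenue grows only 10%, a possible symptom.
horizontal = (stmt["year2"] - stmt["year1"]) / stmt["year1"] * 100
print(horizontal)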
