
QualityStage Essentials 7.0

QualityStage Essentials
(QS320)
Lab Exercises
Contact:
Learning Services
Ascential Software
50 Washington Street
Westboro, MA 01581
1-888-486-9636 x3345
EdServices@AscentialSoftware.com

This document and the software described herein are the property of Ascential Software Corporation
and its licensors and contain confidential trade secrets. All rights to this publication are reserved. No
part of this document may be reproduced, transmitted, transcribed, stored in a retrieval system or
translated into any language, in any form or by any means, without prior permission from Ascential
Software Corporation.
© 2004 Ascential Software Corporation.
Ascential Software Corporation reserves the right to make changes to this document and the software
described herein at any time without any notice. No warranty is expressed or implied other than any
contained in the terms and conditions of sale.
Ascential Software Corporation
50 Washington Street
Westboro, MA 01581-1021 USA
Phone: (508) 366-3888
Fax: (508) 366-3669
Ascential is a trademark of Ascential Software Corporation. Any other product or company names
mentioned are used for identification purposes only, and may be trademarks or service marks of their
respective owners.
May 26, 2004 /V7.0.1

Table of Contents
AUDIENCE
PREREQUISITES
STRUCTURE OF THIS COURSE
FILE LOCATIONS
MODULE 1: DATA QUALITY
EXERCISE 1-1 COURSE PROJECT
MODULE 2: INTRODUCTION TO QUALITYSTAGE
EXERCISE 2-1 CONFIGURING QUALITYSTAGE
MODULE 3: DEVELOPMENT BASICS
EXERCISE 3-1 DEPLOY & RUN
EXERCISE 3-2 DEFINE A PROJECT
EXERCISE 3-3 DEFINE A DATA FILE
EXERCISE 3-4 DEFINE DATAFIELD DEFINITIONS
EXERCISE 3-5: COPY AUTOHOME DATA FILE AND FIELD DEFINITIONS TO CREATE LIFE DATA FILE AND FIELD DEFINITIONS
MODULE 4: INVESTIGATE
EXERCISE 4-1: CHARACTER DISCRETE TYPE C AUTOHOME POLICIES
LAB 4-1: CHARACTER DISCRETE TYPE T AUTOHOME POLICIES
EXERCISE 4-2: CHARACTER CONCATENATE INVESTIGATE
LAB 4-2: CHARACTER DISCRETE TYPE C & T LIFE POLICIES
EXERCISE 4-3: WORD INVESTIGATION - NAME AUTOHOME POLICIES
LAB 4-3: WORD INVESTIGATION ADDRESS & AREA AUTOHOME POLICIES
MODULE 5: ADDING A UNIQUE KEY
EXERCISE 5-1: TRANSFER STAGE ADD RECORD KEY TO AUTOHOME
MODULE 6: STANDARDIZE
EXERCISE 6-1 COUNTRY STANDARDIZE
EXERCISE 6-2: SELECT STAGE: SPLIT US DATA FROM NON-US DATA
EXERCISE 6-3: STANDARDIZE: DOMAIN PRE-PROCESSING
EXERCISE 6-4: STANDARDIZE US NAME, ADDRESS AND AREA DATA
EXERCISE 6-5: INVESTIGATE UNHANDLED NAME PATTERNS
LAB 6-6: INVESTIGATE UNHANDLED ADDRESS AND AREA PATTERNS
MODULE 7: RULE SET OVERRIDES
EXERCISE 7-1: USNAME RULE SET OVERRIDES
LAB 7-1: US ADDRESS RULE SET OVERRIDES
MODULE 8: MATCHING
EXERCISE 8-1: UNDUPLICATION MATCH
EXERCISE 8-2: CUSTOM MATCH EXTRACT
EXERCISE 8-3: IMPROVE MATCH PASS 1: SET CRITICAL VARTYPES
EXERCISE 8-4: SET MATCH CUTOFFS
EXERCISE 8-5: MATCH PASS 2
MODULE 9: SURVIVORSHIP
EXERCISE 9-1: SURVIVE BEST OF BREED CUSTOMER RECORD
APPENDIX A: COURSE APPLICATION DESIGN
APPENDIX B: FILE LAYOUTS

Course Introduction

Welcome to the QualityStage Essentials course!


Ascential Software presents this QualityStage Essentials curriculum to provide you
with knowledge about QualityStage software and data quality methods. This course
is designed to teach participants how to use QualityStage, how to build a
custom application, how to view and interpret results, and how to approach various
problems using the examples illustrated in this Student Guide. Participation in group
discussions, hands-on labs, and practice exercises will prepare you for the
real-world application of QualityStage and data quality methods. This workbook can be
used as either a self-paced study guide or as a comprehensive tutorial with Ascential
Software instructors.
Throughout this course, you use known data sets with known problems and
solutions. To add value to your training experience, the instructor may opt to
perform similar exercises with a sample of your project data.

Audience
Ascential Software customers currently using or preparing to use QualityStage or
starting data quality and data integration initiatives.

Prerequisites
This course is intended for system/data analysts, developers, and business analysts
working or beginning work on a data integration project. Familiarity with Windows,
the operating system (OS) running QualityStage Server, and a text editor is
required.

Structure of this Course


Each chapter of this course follows these steps:
1. Introduction of the concepts to be covered
2. Explanation of concepts through classroom instruction, group discussions,
and a walk-through of the practice exercise
3. Individual practice exercises
4. Review of concepts learned

File Locations
In this course the QualityStage Designer and Server are installed locally on your PC.
QualityStage uses the master projects directory to store project libraries and the
results of jobs that you execute. The location of the master projects directory is on
the server. In this course, the location of the Master Projects directory is
C:\Projects.
Note: The sub-folder projectname under the Master Projects directory refers
to the project that you are currently working on in QualityStage. You may
have multiple projects and, consequently, multiple project sub-folders under
the Master Projects directory.
PATH: C:\ASCENTIAL\QUALITYSTAGEDESIGNER70
USAGE: This is the Designer application folder for QualityStage. This folder also
contains the Designer repository (QualityStageDesigner.mdb) which stores the
information that you've entered through the Designer.
CONTENTS: Data files and subfolders including:
QualityStageDesigner.mdb
RULES directory
Object repository
Documentation

PATH: C:\ASCENTIAL\QUALITYSTAGEDESIGNER70\RULES
USAGE: The Rules folder contains all of the rule sets available to QualityStage.
CONTENTS: Rule Sets (both prebuilt and custom-built) including:
USADDR
USNAME
CAAREA
GBPREP

PATH: C:\ASCENTIAL\QUALITYSTAGEDESIGNER\DOCUMENTATION
USAGE: The Documentation folder contains all of the user guides for this version of
QualityStage. The User Guide can be accessed at any time through F1, the help
function.
CONTENTS: Documentation for this version of QualityStage including:
Getting Started Guide
User Guide
Rules User Guide
Stage Guide
Match Concepts

PATH: C:\PROJECTS\PROJECTNAME\DATA
USAGE: This is the main folder for your data files. This includes Input files,
Output files, and reports. Remember that Output files for one procedure
will generally be the Input file for the next procedure (unless otherwise noted in
this guide).

PATH: C:\PROJECTS\PROJECTNAME\SCRIPTS
USAGE: The scripts folder contains the instructions from the GUI to the Server
component of QualityStage.
CONTENTS: Execute scripts with information captured from the GUI

PATH: C:\PROJECTS\projectname\LOGS
USAGE: The Logs folder contains the error and run logs for every procedure you
perform in QualityStage. The logs can help you determine in what part of the
procedure you may have encountered errors.
CONTENTS:
Error Logs
Run Logs

PATH: C:\PROJECTS\projectname\CONTROLS
USAGE: The Controls folder keeps track of scripts and processes performed during an
execution.
CONTENTS:
Scripts run
Match Reports
.MAT files
Module 1: Data Quality


Exercise 1-1 Course Project
Goal: Introduce the customer consolidation/initial load course project
Tasks:
1. Review the business case and project design
Course Business Case: WINN Insurance CRM Project
Executive Summary
WINN Insurance is a leader in the insurance marketplace, providing their customers with a
wide range of homeowner, automobile, and life insurance policies. WINN Corporation has
made the strategic decision to change their account-based systems to a customer-based
view. A comprehensive understanding of each of their customers is critical to reach WINN
Corporation's goals for strategic growth initiatives.
Company Stats:
Name of Business: WINN Insurance
Type: Insurance
Lines of Business: Homeowners, Life, Automobile
The Business Challenge
WINN Corporation has recognized that their current data systems do not lend themselves to
providing a comprehensive view of their customer base. While one customer may have
purchased several automobile policies and life insurance policies, the systems do not give
WINN Corporation visibility to all of the policies purchased by each unique customer. To
achieve a customer-centric focus, WINN Corporation is looking to create a single customer
repository.
WINN Corporation believes that with a single customer repository they will be able to
enhance their organizational efficiencies and profits by getting a true count and accurate
view of their unique customers. With this information, they will be able to better market the
correct products to their existing customer base and save money by alleviating duplicate
mailings.
WINN System and Data Information
WINN Corporation has identified multiple sources as feeds to this repository. The challenge
is to identify rules for cleansing the data and providing consolidated views of the data across
all sources. They currently maintain three systems, one for each insurance line. They
would ultimately like to create a customer information system that represents a consolidated
view of their customers with an audit trail showing where the data originates.

WINN has already identified some data quality issues within their existing systems. In
addition, they have identified several requirements for establishing their Customer
Information (CIF) system. These issues are a serious concern of the management and they
would like to see a comprehensive plan for addressing these problems and requirements.
US records should be split from international address records
There is customer name and address information spread across free-form text fields. They would like to see this organized into specified fields.
They want to match records across all sources
They want to remove all duplicate customer records*
They have inconsistent naming conventions across their systems. They would like to see the name fields separate rather than in freeform text fields.
They need to standardize their address formats.
They want to establish a unique customer profile
They have found blank entries in their account fields
They want to maintain a cross-reference file to the legacy systems

Existing Systems
Source System   Description               LOB/Operational Function
AUTOHOME        Homeowners & Automobile   Policy Maintenance
LIFE            Life                      Policy Maintenance

*Business rules for identifying duplicate customers


1. Same PID number
2. Same First Name, Last Name, Address, City, State, Zip
3. Similar Last Name, Same Address, City, State, Zip
4. Similar Address, Same City, State, Zip, Last Name
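
To make the four business rules concrete, here is a minimal Python sketch that tests a pair of records against them in order. It is an illustration only and is not how QualityStage matching works (matching is covered in Module 8). The record keys (pid, first, last, address, city, state, zip), the 0.85 similarity threshold, and the use of difflib to decide what counts as "similar" are all assumptions, since the business case does not define them.

```python
from difflib import SequenceMatcher

# Assumption: the business case does not say how close "similar" must be.
SIMILARITY_THRESHOLD = 0.85


def same(a, b, *fields):
    """True when every listed field is equal, ignoring case and padding."""
    return all(a[f].strip().upper() == b[f].strip().upper() for f in fields)


def similar(x, y):
    """True when two strings are close according to difflib's ratio."""
    ratio = SequenceMatcher(None, x.strip().upper(), y.strip().upper()).ratio()
    return ratio >= SIMILARITY_THRESHOLD


def duplicate_rule(a, b):
    """Return the number (1-4) of the first rule that flags a and b, else None."""
    # Assumption: a blank PID should never count as "same PID".
    if a["pid"].strip() and same(a, b, "pid"):
        return 1
    if same(a, b, "first", "last", "address", "city", "state", "zip"):
        return 2
    if similar(a["last"], b["last"]) and same(a, b, "address", "city", "state", "zip"):
        return 3
    if similar(a["address"], b["address"]) and same(a, b, "city", "state", "zip", "last"):
        return 4
    return None


# Made-up records for illustration.
base = {"pid": "", "first": "ANN", "last": "JOHNSON", "address": "12 MAIN STREET",
        "city": "BOSTON", "state": "MA", "zip": "02101"}
typo_last = dict(base, last="JOHNSTON")
typo_addr = dict(base, address="12 MAINE STREET")
print(duplicate_rule(base, typo_last), duplicate_rule(base, typo_addr))
```

Checking the rules in order means a pair is reported under the strictest rule it satisfies, which is convenient when reviewing why two records were grouped.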

Module 2: Introduction to QualityStage

Exercise 2-1 Configuring QualityStage


Goal: Familiarize yourself with QualityStage Designer and configure the
Designer to communicate with the QualityStage server
Tasks:
1. Open QualityStage Designer
2. Create a Run Profile
Required Information:
Location where the QualityStage server is installed:
C:\ASCENTIAL\QUALITYSTAGESERVER70
Name and location of the master projects directory:
C:\PROJECTS
Instructions:
Run profiles provide QualityStage Designer with the locations of the QualityStage
server software and master projects directory on that server.
Exercise steps:
1. Start QualityStage Designer
2. Choose FILE, RUN PROFILES

3. Choose NEW, to create a new profile

4. Select LOCAL WINDOWS, and click OK


5. Profile: MYLOCALPC
6. Host Type: WINDOWS LOCAL
7. Host Server Path: C:\ASCENTIAL\QUALITYSTAGESERVER70
8. Master Project Directory: C:\PROJECTS
9. Alternate Locale: <LEAVE BLANK>
10. Local Report Data Location: C:\PROJECTS
11. Check the box MAKE DEFAULT FOR ALL PROJECTS
12. Click OK
13. Click OK again

When completing a run profile for a remote server you will need the
following information:
1. Host name or IP address of the host where the QualityStage server is
installed.
2. Directory location of QualityStage server executables.
3. Name and location of the master projects directory on the server.
4. Login ID and password for the host where the QualityStage server is
installed.
5. Port that the QualityStage server is using (this port is chosen when you
start the QualityStage server). When running locally on your PC, you do
not need to start the QualityStage server.
6. Local location for QualityStage temporary report files.
Please refer to the appropriate Server User Guide for additional details.
Alternate Locale: If you are processing data files that contain information in a
language that is different from that of the server you are running on, you can use the
Alternate Locale field to point to a location with the correct parameters.
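
As a checklist aid, the run profile settings above can be modeled as a small Python data structure. This is a hypothetical sketch for organizing the information before you open the dialog; QualityStage stores run profiles itself, and the class name RunProfile, its attribute names, and the "REMOTE" host type value are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunProfile:
    """Hypothetical container mirroring the fields of the run profile dialog."""
    profile: str
    host_type: str
    host_server_path: str
    master_project_directory: str
    local_report_data_location: str
    alternate_locale: str = ""          # left blank in this course
    host_name: Optional[str] = None     # remote servers only
    login_id: Optional[str] = None      # remote servers only
    password: Optional[str] = None      # remote servers only
    port: Optional[int] = None          # remote servers only

    def missing_fields(self) -> List[str]:
        """List the settings that still need a value for this host type."""
        required = ["profile", "host_type", "host_server_path",
                    "master_project_directory", "local_report_data_location"]
        if self.host_type != "WINDOWS LOCAL":
            required += ["host_name", "login_id", "password", "port"]
        return [name for name in required if getattr(self, name) in (None, "")]


# The local profile built in this exercise.
local = RunProfile(
    profile="MYLOCALPC",
    host_type="WINDOWS LOCAL",
    host_server_path=r"C:\ASCENTIAL\QUALITYSTAGESERVER70",
    master_project_directory=r"C:\PROJECTS",
    local_report_data_location=r"C:\PROJECTS",
)
# A remote profile with placeholder values that is not yet complete.
remote = RunProfile("MYREMOTE", "REMOTE", "", "", r"C:\PROJECTS")
print(local.missing_fields())
print(remote.missing_fields())
```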

Module 3: Development Basics

Exercise 3-1 Deploy & Run


Goal: Learn how to execute a job and find the resulting project libraries on the
server
Exercise steps:
1. Select the demo QUALITY project, and expand the project (left pane)
2. Expand the JOBS folder
3. Select DEMO1 job (left pane)
4. Select the RUN button on the Toolbar

5. Uncheck the Run box, leave DEPLOY checked


6. Select the EXECUTE FILE MODE button
7. Select the RUN FROM START TO END button
8. Click OK

9. Open Windows Explorer and locate the project folders (libraries) under
the Master Projects Directory

Exercise 3-2 Define a Project


Goal: Define the course project in QualityStage Designer
Exercise steps:
1. From the Designer main window, on the left pane, select the folder
PROJECTS
2. Select the New button on the toolbar, and highlight PROJECT

3. In the Add a New Project window, type in the following:
Name: WINNCRM
Description: WINN INSURANCE CRM PROJECT
4. Click OK

Exercise 3-3 Define a Data File


Goal: Create a file definition for the source data file AUTOHOME.
Exercise steps:
1. From the Designer main window, on the left pane select the DATAFILE
DEFINITION folder under the WINNCRM project folder
2. Select the New button from the toolbar and select DATAFILE DEFINITION,
or, right-click on the right pane and select NEW FILE

3. Complete the information on the Add a New Datafile screen
Name: AUTOHOME
Description: AUTO & HOME POLICY RECORDS
Code Page: DEFAULT
Language / Locale: ENGLISH
File Format: FIXED LENGTH TERMINATED
4. Select OK

Exercise 3-4 Define Datafield Definitions


Goal: Define the AUTOHOME field definitions
Exercise steps:
1. From the Designer main window, on the left pane select the data file
AUTOHOME
2. On the right pane, right-click, from the pop-up menu choose NEW FIELD
3. Complete the Add a New Datafield screen for each of the following fields:
Name     Description       DataType  UseType  Missing  Start  End  Length
SYSSRC   Source System ID  ALPHANUM  S        S            1    1       1
POLNUMB  Policy Number     ALPHANUM  S        S            2   13      12
NAME     Full Name         ALPHANUM  S        S           14   59      46
ADDR1    Address 1         ALPHANUM  S        S           60   94      35
ADDR2    Address 2         ALPHANUM  S        S           95  129      35
CITY     City Name         ALPHANUM  S        S          130  164      35
STATE    State             ALPHANUM  S        S          165  169       5
ZIPCODE  Zip Code          ALPHANUM  S        S          170  179      10
FEDID    Federal ID        ALPHANUM  S        S          180  189      10
DOB      Date of Birth     ALPHANUM  S        S          190  197       8
DOD      Date of Death     ALPHANUM  S        S          198  205       8


4. Select APPLY after completing each field definition and a new definition
screen will pop-up
5. After defining the last field select OK
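
If you want to check the field positions outside of QualityStage, the layout above can be applied to a record with a short Python sketch. It assumes the start and end positions are 1-based and inclusive, as the lengths in the field list imply. The function names and the sample values are made up for illustration.

```python
# Field layout copied from the field list above: (name, start, end),
# with 1-based inclusive positions.
AUTOHOME_LAYOUT = [
    ("SYSSRC", 1, 1), ("POLNUMB", 2, 13), ("NAME", 14, 59),
    ("ADDR1", 60, 94), ("ADDR2", 95, 129), ("CITY", 130, 164),
    ("STATE", 165, 169), ("ZIPCODE", 170, 179), ("FEDID", 180, 189),
    ("DOB", 190, 197), ("DOD", 198, 205),
]


def record_length(layout):
    """Check that the fields are contiguous and return the record length."""
    expected_start = 1
    for name, start, end in layout:
        if start != expected_start or end < start:
            raise ValueError(f"gap or overlap at field {name}")
        expected_start = end + 1
    return expected_start - 1


def parse_record(line, layout=AUTOHOME_LAYOUT):
    """Slice one fixed-width record into a dict of trimmed field values."""
    record = line.rstrip("\r\n").ljust(record_length(layout))
    return {name: record[start - 1:end].strip() for name, start, end in layout}


# Made-up sample values, one per field, padded to each field's width.
sample_values = ["A", "POL000000001", "JOHN Q SMITH", "12 MAIN ST", "",
                 "BOSTON", "MA", "02101", "0123456789", "19600101", ""]
sample_line = "".join(
    value.ljust(end - start + 1)
    for value, (_, start, end) in zip(sample_values, AUTOHOME_LAYOUT)
)

parsed = parse_record(sample_line)
print(record_length(AUTOHOME_LAYOUT))
print(parsed["NAME"], "|", parsed["CITY"], "|", parsed["DOB"])
```

The contiguity check is a quick way to catch a mistyped Start or End value before the definitions are used in later exercises.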

Exercise 3-5: Copy AUTOHOME Data File and Field Definitions to Create LIFE
Data File and Field Definitions
Goal: Practice copying data files and fields
Lab information:
The source file name is LIFE; it contains life insurance policy records
It is a flat file with fixed record length
The LIFE source file has the same layout as AUTOHOME


1. From the Designer main window, on the left pane, expand the DATAFILE
DEFINITION folder under the WinnCRM project folder
2. On the right pane, select the AUTOHOME data file
3. Right-click on the file and select COPY
4. Right-click and select PASTE
5. When prompted, name the new file LIFE
6. Right-click on the LIFE file and choose MODIFY
7. Update the description to reflect life insurance policy information.

Module 4: Investigate

EXERCISE 4-1: Character Discrete Type C AUTOHOME Policies


Goal: Assess the data quality issues in the single domain fields of the AUTOHOME
source data file
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Advanced settings:
Leave the sample size equal to 1
The frequency should also be set to 1
Name the Investigate stage: IAHSDC (Investigate AUTOHOME Single
Domain fields, Type C)
Name the job the same as the stage
Tasks:
1. Build an Investigate Character Discrete stage to investigate the data in the
following fields:
SYSSRC - Source System ID
POLNUMB - Policy Number
FEDID - Federal ID
DOB - Date of Birth
DOD - Date of Death
2. Run the Job
3. Move the source data
4. Review the results
5. Summarize the results
Task 1: Build the Stage and Design the Job
Exercise steps:
1. On the left pane, expand the WinnCRM project
2. On the left pane select STAGES
3. Right-click on the right pane and choose NEW STAGE
4. From the pop-up list CHOOSE INVESTIGATE
5. Complete the QualityStage Investigate Stage Wizard as follows:

Name: IAHSDC
Description: Char Disc C All Single Domain fields
Options: Character Discrete
Data File: AUTOHOME

6. Click NEXT
7. Select the fields, one at a time
Choose ADD TO SELECT FIELDS
OR
CLICK AND DRAG the field to the SELECT FIELDS BOX
8. The Field Mask Selection screen will pop-up
9. Choose the ALL C BUTTON
10. Click OK

11. Repeat steps 7-10 for all fields


12. Choose FINISH
13. On the left pane select the JOBS folder
14. Right-click on the right pane and add a New Job
15. Name the job IAHSDC, Description CHAR DISC C INV AH SINGLE DOMAIN FIELDS

16. Select OK
17. Expand the JOBS folder
18. On the right pane select the stage and drag it into the job
Task 2: Run the Job
Exercise steps:
1. Select the job to verify that the stage was moved into the job
2. Click the Run button on the toolbar
3. From the Job Run Options screen
Check DEPLOY
Uncheck RUN
Choose EXECUTE FILE MODE
On the FILE MODE EXECUTION SCREEN Choose RUN FROM START TO END

Task 3: Move the Source Data


Exercise steps:
1. Place the Student CD in the CD drive
2. From the EXERCISEFILES folder, copy the files AUTOHOME and LIFE
3. Paste them into the Data library, under the project folder, under the Master
Projects location on the server:
C:\Projects\WinnCRM\Data
4. Re-run the job, this time checking the RUN box on the JOB RUN OPTIONS
screen
Task 4: Review Results
Exercise steps:
1. On the left pane select the JOBS folder
2. On the right pane, right-click the job IAHSDC and choose SERVER
REPORTS AND DATAFILES
3. Choose the report IAHSDCP.FRQ (frequency distribution sorted by
frequency in descending order)
4. Select VIEW FILE
5. Repeat these steps to view the other report, IAHSDCp.SRT (frequency
distribution sorted alphabetically in ascending order)
Task 5: Summarize Results
Exercise steps:
How often is the field populated?
How often is the field blank?
Do the data values match the field label?
Can you identify any potential default values or data anomalies?
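
To see what the two report orderings mean, the following Python sketch builds a frequency distribution for one field and sorts it both ways. It is a rough stand-in for reading the .FRQ and .SRT reports, not a reproduction of their format. The function names and the sample DOB values are made up.

```python
from collections import Counter


def frequency_reports(values):
    """Return (by_frequency, by_value) listings of (value, count) pairs.

    by_frequency: highest count first (like the .FRQ report ordering).
    by_value:     ascending by value (like the .SRT report ordering).
    """
    counts = Counter(values)
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    by_value = sorted(counts.items())
    return by_frequency, by_value


def blank_rate(values):
    """Fraction of values that are empty or contain only spaces."""
    if not values:
        return 0.0
    return sum(1 for value in values if not value.strip()) / len(values)


# Made-up DOB values: two blanks and one suspicious all-zero default.
dob_values = ["19600101", "", "19600101", "00000000", "", "19751231"]
by_frequency, by_value = frequency_reports(dob_values)
for value, count in by_frequency:
    print(f"{count:>3}  {value!r}")
print(f"blank rate: {blank_rate(dob_values):.0%}")
```

A value such as 00000000 appearing in the distribution is the kind of potential default value the summary questions ask you to look for.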

LAB 4-1: Character Discrete Type T AUTOHOME Policies


Goal: Assess the data quality issues in the single domain fields of the AUTOHOME
source data file
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Advanced settings:
Leave the sample size equal to 1
The frequency should also be set to 1
Name the Investigate stage: IAHSDT (Investigate AUTOHOME Single
Domain fields, Type T)
Name the job the same as the stage
Tasks:
1. Build an Investigate Character Discrete stage to investigate the data in the
following fields:
SYSSRC - Source System ID
POLNUMB - Policy Number
FEDID - Federal ID
DOB - Date of Birth
DOD - Date of Death
2. Run the Job
3. Review the results
4. Summarize the results
Task 1: Build the Stage and Design the Job
Exercise steps:
1. On the left pane, expand the WinnCRM project
2. On the left pane select STAGES
3. Right-click on the right pane and choose NEW STAGE
4. From the pop-up list CHOOSE INVESTIGATE

5. Complete the QualityStage Investigate Stage Wizard as follows:


Name: IAHSDT
Description: Char Disc T All Single Domain fields
Options: Character Discrete
Data File: AUTOHOME
6. Click NEXT
7. Select the fields, one at a time
Choose ADD TO SELECT FIELDS
OR
CLICK AND DRAG the field to the SELECT FIELDS BOX
8. The Field Mask Selection screen will pop-up
9. Choose the ALL T BUTTON
10. Click OK
11. Repeat steps 7-10 for all fields
12. Choose FINISH
13. On the left pane select the JOBS folder
14. Right-click on the right pane and add a New Job
15. Name the job IAHSDT, Description CHAR DISC T INV AH SINGLE DOMAIN FIELDS

16. Select OK
17. Expand the JOBS folder
18. On the right pane select the stage and drag it into the job

Task 2: Run the Job


Exercise steps:
1. Select the job to verify that the stage was moved into the job
2. Click the Run button on the toolbar
3. From the Job Run Options screen
Check DEPLOY AND RUN
Choose EXECUTE FILE MODE
On the FILE MODE EXECUTION SCREEN Choose RUN FROM START TO END

Task 3: Review Results


Exercise steps:
1. On the left pane select the JOBS folder
2. On the right pane, right-click the job IAHSDT and choose SERVER REPORTS
AND DATAFILES
3. Choose the report IAHSDTP.FRQ (frequency distribution sorted by
frequency in descending order)
4. Select VIEW FILE
5. Repeat these steps to view the other report, IAHSDTp.SRT (frequency
distribution sorted alphabetically in ascending order)
Task 4: Summarize Results
Exercise steps:
Does the data in the fields conform to the same structure?
How many different date formats are in the data?
Do all the FEDID values consist of 10 digits?
Do the POLNUMB values have the same length and the same structure?
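
The questions above are about structure rather than values, which is what a type mask is for. The Python sketch below reduces each value to a pattern of character types and counts the patterns. The pattern symbols used here (a for a letter, n for a digit, b for a blank, other characters kept as they are) are an assumption made for this sketch; check the Investigate stage documentation for the symbols your version actually writes to the report. The sample values are made up.

```python
from collections import Counter


def type_pattern(value):
    """Replace each character with a symbol for its type.

    Assumed symbols: 'a' letter, 'n' digit, 'b' blank; anything else is kept.
    """
    symbols = []
    for ch in value:
        if ch.isalpha():
            symbols.append("a")
        elif ch.isdigit():
            symbols.append("n")
        elif ch == " ":
            symbols.append("b")
        else:
            symbols.append(ch)
    return "".join(symbols)


# Made-up date values written in several different formats.
dates = ["01/02/1960", "19600102", "1960-01-02", "JAN 2 60", "19751231"]
patterns = Counter(type_pattern(value) for value in dates)
for pattern, count in sorted(patterns.items(), key=lambda item: (-item[1], item[0])):
    print(f"{count:>3}  {pattern}")
```

Each distinct pattern in the output corresponds to one date format, so counting the patterns answers the question of how many formats are present.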

Exercise 4-2: Character Concatenate Investigate


Goal: Assess the correlation of the data in the DOB and DOD fields in the
AUTOHOME source data file
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Advanced settings:
Leave the sample size equal to 1
The frequency should also be set to 1
Name the Investigate stage: IAHDATES (Investigate AUTOHOME Date
fields, Character Concatenate- Type C)
Name the job the same as the stage
Tasks:
1. Build an Investigate Character Concatenate stage to investigate the data in
the following fields:
DOB - Date of Birth
DOD - Date of Death
2. Run the Job
3. Review the results
4. Summarize the results
Task 1: Build the Stage and Design the Job
Exercise steps:
1. On the left pane, expand the WinnCRM project
2. On the left pane select STAGES
3. Right-click on the right pane and choose NEW STAGE
4. From the pop-up list CHOOSE INVESTIGATE
5. Complete the QualityStage Investigate Stage Wizard as follows:
Name: IAHDATES
Description: Char Conc C INV DOB and DOD fields
Options: Character Concatenate
Data File: AUTOHOME

6. Click NEXT
7. Select the fields, one at a time
Choose ADD TO SELECT FIELDS
OR
CLICK AND DRAG the field to the SELECT FIELDS BOX
8. The Field Mask Selection screen will pop-up
9. Choose the ALL C BUTTON
10. Click OK

11. Repeat steps 7-10 for all fields


12. Choose FINISH
13. On the left pane select the JOBS folder
14. Right-click on the right pane and add a New Job
15. Name the job IAHDATES, Description CHAR CONC C INV DOB & DOD
FIELDS
16. Select OK
17. Expand the JOBS folder
18. On the right pane select the stage and drag it into the job

Task 2: Run the Job


Exercise steps:
1. Select the job to verify that the stage was moved into the job
2. Click the Run button on the toolbar
3. From the Job Run Options screen
Check DEPLOY AND RUN
Choose EXECUTE FILE MODE
On the FILE MODE EXECUTION SCREEN Choose RUN FROM START TO END
Task 3: Review Results
Exercise steps:
1. On the left pane select the JOBS folder
2. On the right pane, right-click the job IAHDATES and choose SERVER
REPORTS AND DATAFILES
3. Choose the report IAHDATEP.FRQ (frequency distribution sorted by
frequency in descending order)
4. Select VIEW FILE
5. Repeat these steps to view the other report, IAHDATEp.SRT (frequency
distribution sorted alphabetically in ascending order)
Task 4: Summarize Results
Exercise steps:
How often is the field populated?
How often is the field blank?
Do the data values match the field label?
Can you identify any potential default values or data anomalies?
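
A concatenate investigation looks at combinations of values across fields instead of one field at a time. The Python sketch below shows the idea for DOB and DOD by counting each pair of values that occurs together. It illustrates the concept only and does not reproduce the report layout. The function name and the sample records are made up.

```python
from collections import Counter


def concatenate_investigation(records, fields):
    """Count how often each combination of the given fields occurs.

    Returns (combination, count) pairs, most frequent first.
    """
    combos = Counter(
        tuple(record.get(field, "").strip() for field in fields)
        for record in records
    )
    return sorted(combos.items(), key=lambda item: (-item[1], item[0]))


# Made-up records: two living policy holders, one record with a DOD but no
# DOB, and one record whose DOD is earlier than its DOB.
records = [
    {"DOB": "19600101", "DOD": ""},
    {"DOB": "19600101", "DOD": ""},
    {"DOB": "", "DOD": "20010315"},
    {"DOB": "19751231", "DOD": "19700101"},
]
report = concatenate_investigation(records, ["DOB", "DOD"])
for (dob, dod), count in report:
    print(f"{count:>3}  DOB={dob!r:<12} DOD={dod!r}")
```

Pairs such as a populated DOD with a blank DOB only become visible when the two fields are examined together, which is the point of the concatenate option.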

LAB 4-2: Character Discrete Type C & T LIFE Policies


Perform similar character discrete investigations on the LIFE datafile using both C
and T masks as you did on the AUTOHOME file. Is the data structured the same
between the two files? Do the files have some of the same data quality issues?

EXERCISE 4-3: Word Investigation - Name AUTOHOME Policies


Goal: Assess the data quality issues in the Name field of the AUTOHOME source
data file
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Name the Investigate stage: IAHNAME (Investigate AUTOHOME Name
Word Investigation)
Name the job the same as the stage
Use the USNAME rule set
Tasks:
1. Build an Investigate Word stage to investigate the data in the following
field(s):
NAME - Name of Policy Holder
2. Run the Job
3. Review the results
4. Summarize the results
Task 1: Build the Stage and Design the Job
Exercise steps:
1. On the left pane, expand the WinnCRM project
2. On the left pane select STAGES
3. Right-click on the right pane and choose NEW STAGE
4. From the pop-up list CHOOSE INVESTIGATE
5. Complete the QualityStage Investigate Stage Wizard as follows:
Name: IAHNAME
Description: WORD INV NAME FIELD AH SOURCE
Options: WORD
Data File: AUTOHOME
6. Click NEXT


7. Select the rule set USNAME


8. Select the NAME field and drag to the Standard Fields box on the right of the
screen
9. Choose ADD RULE

10. Choose the ADVANCED OPTIONS button

11. Choose Include Unclassified Alphas in Word Frequency Files


12. Choose OK
13. Choose FINISH
14. Add a new job with the same name as the stage


15. Right-click on the right pane and add a New Job


16. Drag the stage into the Job
Task 2: Run the Job
Exercise steps:
1. Select the job to verify that the stage was moved into the job
2. Click the Run button on the toolbar
3. From the Job Run Options screen
Check DEPLOY AND RUN
Choose EXECUTE FILE MODE
On the FILE MODE EXECUTION SCREEN Choose RUN FROM START TO END
Task 3: Review Results
Exercise steps:
1. On the left pane select the JOBS folder
2. On the right pane, right-click the job IAHNAME and choose SERVER REPORTS
AND DATAFILES
3. Choose the report IAHNAME P.FRQ Frequency Distribution of patterns
sorted by frequency in descending order
4. Select VIEW FILE
5. Repeat these steps to view the other reports
Task 4: Summarize Results
Exercise steps:
How often is the field populated?
Are there address components in the Name field?
Are the names personal names, business names, or both?


LAB 4-3: Word Investigation Address & Area Autohome Policies


Goal: Assess the data quality issues in the Address and City, State and Zip fields
of the AUTOHOME source data file
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Name the job the same as the stage
Use the USADDR & USAREA rule sets
Tasks:
1. Build two separate Investigate Word Stages and Jobs:
ADDRESS
a. Use USADDR rule set
b. Drag over Address1 and Address 2
c. Include Unclassified Alphas in Word Frequency Files
AREA
a. Use USAREA rule set
b. Drag over City, State and Zip
c. Include Unclassified Alphas in Word Frequency Files
2. Run the Job
3. Review the results
4. Summarize the results


Module 5: Adding a Unique Key


Exercise 5-1: Transfer Stage Add Record Key to AUTOHOME
Goal: Add a Unique Record Key to the AUTOHOME and LIFE data files
Exercise Information:
The Transfer stage reads in one input file and writes out one output file
The output file and output fields must be defined in the QualityStage
Designer
The input and output files have a similar structure so we will start by
copying the AUTOHOME data file definition to the output file definition
Transfer stages can be added to Jobs with other stages (except
Investigate)
Name the Transfer stage: AHKEY
Name the Job: ADDKEY
Tasks:
1. Create the output file definition
Add fields for the Record Key, FileID, and Recnum
2. Build the Transfer stage
3. Design the Job and add the Stage to the Job
4. Run the Job
5. Review Results
Task 1: Create the Output File
Exercise steps:
1. On the left pane, select the file AUTOHOME
2. Click the COPY icon on the toolbar
3. Select the DATAFILE DEFINITIONS folder
4. Click the PASTE icon on the toolbar
5. When prompted to SELECT DATAFILE NAME type in COMBINED, click OK


6. On the left pane, select the new COMBINED file to display the field definitions
on the right pane
7. On the right pane, right-click and choose NEW FIELD
8. Complete the NEW DATAFIELD screen as follows:
Name: RECKEY
Start position: 1
Length: 12
Description: UNIQUE RECORD KEY
Check the SHIFT ALL SUBSEQUENT FIELDS BOX
Click APPLY
9. Add the FILEID field
Name: FILEID
Start position: 1
Length: 2
Description: SOURCE FILE IDENTIFIER
Click APPLY
10. Add the Recnum field
Name: RECNUM
Start position: 3
Length: 9
Description: SEQUENTIAL RECORD NUMBER
Click OK
Task 2: Build Transfer Stage and Design Job
Exercise steps:
1. Define a Transfer Stage and name it AHKEY
Data File: AUTOHOME
Results File: COMBINED


2. Assignment command: ASSIGN


In the FIELD ASSIGNMENT box type in: AH (AUTOHOME source), use
uppercase
Output File: select field FILEID
Choose ADD COMMAND button
3. SEQUENCE COMMAND: Click radio button
Output File: select field RECNUM
Choose ADD COMMAND button
4. Movement Commands: MOVE LEFT
Input field: SYSSRC
Output File: SYSSRC
Choose ADD COMMAND button
5. Repeat step four for the remainder of the fields, selecting the identical input
and output field names


6. Choose FINISH
7. Design Job named: ADDKEY
8. Add the AHKEY stage to the ADDKEY Job
Task 3: Run the Job
Task 4: Review results
Task 5: Repeat steps 1-8 for the LIFE file (be sure to check Append to File)
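The effect of the ASSIGN, SEQUENCE and MOVE commands in this exercise can be pictured with a short Python sketch. It assumes that the 2-byte FILEID and the 9-byte RECNUM sit inside the 12-byte RECKEY, as the start positions in Task 1 suggest, that the sequence number is zero-padded, and that numbering restarts for each input file. The function `add_record_key` and the file ID "LF" for the LIFE file are hypothetical.

```python
def add_record_key(lines, file_id, start=1):
    """Prefix each input record with a 12-byte key: FILEID (2) + RECNUM (9) + 1 pad byte."""
    keyed = []
    for recnum, line in enumerate(lines, start=start):
        key = f"{file_id:<2.2}{recnum:09d}"   # ASSIGN literal, then SEQUENCE counter
        keyed.append(key.ljust(12) + line)    # remaining fields are moved unchanged
    return keyed


# Hypothetical records; the second call mimics appending the LIFE file.
autohome = add_record_key(["A-RECORD-1", "A-RECORD-2"], "AH")
life = add_record_key(["L-RECORD-1"], "LF")
combined = autohome + life
```

Because the file ID is part of the key, records from the two sources stay unique in the combined file even when their sequence numbers are the same.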


Module 6: Standardize
Exercise 6-1 Country Standardize
Goal: Assign a two-byte ISO country code to the records so that we can then split
the US data from the non-US data
Exercise Information:
The Standardize stage reads in one input file and writes out one output
file
The output file must be defined in the QualityStage Designer. No output
fields need to be defined. The dictionary file of the rule set will populate
the output field definitions
Define the output file and name it CNTRYOUT
Standardize stages can be added to Jobs with other stages (except
Investigate)
Name the Standardize Country stage: CNTRYSTN
Name the Job: STAN
Tasks:
1. Define the output file
2. Define the Stage
3. Design the Job
4. Add the CNTRYSTN Stage to the STAN Job
5. Run the Job
6. Review Results
Task 1: Define the Output File
Exercise steps:
1. Create a Datafile Definition for the file: CNTRYOUT
Task 2: Define the Stage
Exercise steps:
1. Create a new Standardize stage named: CNTRYSTN
Data File: COMBINED
Results File: CNTRYOUT
Options: APPEND ALL
Default Output format: UPPERCASE ALL


NEXT
Choose the rule set COUNTRY
In the LITERAL box type in the meta data label for US: ZQUSZQ
Click the ARROW BOX to add the meta data label to the STANDARD FIELDS
box
Add the fields below to the STANDARDS FIELDS box:
A. Address1
B. Address2
C. City
D. State
E. Zip
Choose ADD RULE


Choose FINISH
Task 3: Run the Job
Task 4: Review results


Exercise 6-2: Select Stage: Split US Data From Non-US Data


Goal: Separate the US data records from the non-US data records. We will
continue processing the US data and set aside the international data
Exercise Information:
The Select stage reads in one input file and writes out up to two output
file(s)
The output file(s) have the same structure as the input file
Copy the input Datafile Definition, CNTRYOUT, to the two output
files:
USDATA
INTLDATA
Name the Select stage: SPLIT
Add the stage to the Job: STAN
Tasks:
1. Copy the input file datafile definition to create the output file definitions
2. Define the Select Stage
3. Add the stage to the STAN Job. Order is important. So, this stage should
come after the CNTRYSTN stage
4. Run the Job
5. Review Results
Task 1: Define the Output File
Exercise steps:
1. Copy the CNTRYOUT file definition to create the output files
Task 2: Define the Stage
Exercise steps:
1. Create a new Select stage named: SPLIT
Data File: CNTRYOUT
Options: SPLIT
Accept File: USDATA
Reject File: INTLDATA


NEXT
Choose the field to split the data, the two byte ISO country code:
CCCOUNT
NEXT
Enter the value, US, to select the records with an ISO country code of
US and reject the records that do not have US as their ISO country code

FINISH
Add the SPLIT stage to the STAN Job
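The accept/reject logic of the SPLIT stage can be sketched in a few lines of Python. This is an illustration only: records are represented as dictionaries, the function `split_records` is hypothetical, and the sample records are made up.

```python
def split_records(records, field, value):
    """Route each record to the accept list if field equals value, else to the reject list."""
    accept, reject = [], []
    for record in records:
        target = accept if record[field].strip() == value else reject
        target.append(record)
    return accept, reject


# Hypothetical CNTRYOUT records carrying the country code in CCCOUNT.
cntryout = [
    {"CCCOUNT": "US", "RECKEY": "AH000000001"},
    {"CCCOUNT": "CA", "RECKEY": "AH000000002"},
    {"CCCOUNT": "US", "RECKEY": "LF000000001"},
]
usdata, intldata = split_records(cntryout, "CCCOUNT", "US")
```

Every input record lands in exactly one of the two outputs, so the record counts of USDATA and INTLDATA should add up to the count of CNTRYOUT.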

Task 3: Run the Job


Task 4: Review results


US Data

International Data


Exercise 6-3: Standardize: Domain Pre-processing


Goal: Apply the USPREP rule set to filter name components from address fields,
and area components from address fields.
Exercise Information:
The US Data is the input to the USPREP rule set
Define the output file: PREPOUT
Name the Standardize stage: PREPSTAN
Add the stage to the Job: STAN
Tasks:
1. Define the output file
2. Define the Standardize Stage
3. Add the stage to the STAN Job. Order is important: this stage should come
after the SPLIT stage
4. Run the Job
5. Review Results
Task 1: Define the Output File
Exercise steps:
1. Define the output file PREPOUT
Task 2: Define the Stage
Exercise steps:
1. Create a new Standardize stage named: PREPSTAN
Data File: USDATA
Output File: PREPOUT
Options: APPEND ALL
Case Formatting: ALL UPPERCASE
2. CHOOSE NEXT
Select the rule set USPREP
Type the Name meta data label in the literal field: ZQNAMEZQ
Select the Arrow button to move the meta data label to the Standard
Fields window:


Add the field NAME


Add meta data label ZQADDRZQ
Add the field ADDR1
Add meta data label ZQADDRZQ
Add the field ADDR2
Add meta data label ZQAREAZQ
Add the field CITY
Add meta data label ZQAREAZQ
Add the field STATE
Add meta data label ZQAREAZQ
Add the field ZIP
Add the RULE
Select FINISH
Add the PREPSTAN stage to the STAN Job
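The sequence of meta data labels and fields built in Task 2 can be pictured as one labelled string handed to the USPREP rule set. The Python sketch below assumes that each label is simply placed in front of the field value that follows it and that the pieces are separated by single spaces; the function `build_prep_input` and the sample record are hypothetical.

```python
# Label/field pairs in the order they are added to the Standard Fields window.
PREP_LAYOUT = [
    ("ZQNAMEZQ", "NAME"),
    ("ZQADDRZQ", "ADDR1"),
    ("ZQADDRZQ", "ADDR2"),
    ("ZQAREAZQ", "CITY"),
    ("ZQAREAZQ", "STATE"),
    ("ZQAREAZQ", "ZIP"),
]


def build_prep_input(record):
    """Interleave meta data labels with the (stripped) field values of one record."""
    parts = []
    for label, field in PREP_LAYOUT:
        parts.append(label)
        value = record.get(field, "").strip()
        if value:
            parts.append(value)
    return " ".join(parts)


# Hypothetical record; the address values are made up.
record = {"NAME": "PEPE NANCY J", "ADDR1": "12 MAIN ST", "ADDR2": "",
          "CITY": "BOSTON", "STATE": "MA", "ZIP": "02111"}
prep_input = build_prep_input(record)
```

The labels tell the rule set which domain each piece of text was expected to belong to, so that misplaced name, address or area components can be moved to the right domain.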
Task 3: Run the Job
Task 4: Review results


Exercise 6-4: Standardize US Name, Address and Area Data


Goal: Standardize the NAME, ADDRESS, and AREA (City, State and Zip) fields
Exercise Information:
The Standardize stage reads in one input file and writes out one output file
The output file must be defined in the QualityStage Designer. No output
fields need to be defined. The dictionary file of the rule set(s) will populate
the output field definitions
Define the output file and name it STANOUT
Standardize stages can be added to Jobs with other stages (except
Investigate)
Name the Standardize stage: STANALL
Add this stage to the Job: STAN
Tasks:
1. Define the output file
2. Define the Stage
3. Add the Stage to the Job
4. Run the Job
5. Review Results
Task 1: Define the Output File
Exercise steps:
1. Create a Datafile Definition for the file: STANOUT

Task 2: Define the Stage


Exercise steps:
1. Create a new Standardize stage named: STANALL
Data File: PREPOUT
Results File: STANOUT
Options: APPEND ALL
Default Output format: UPPERCASE ALL
NEXT
2. Choose the rule set USNAME
Add the fields below to the STANDARDS FIELDS box:
A. NAUSPRE
Choose ADD RULE
3. Choose the rule set USADDR
Add the fields below to the STANDARDS FIELDS box:
A. ADUSPRE
Choose ADD RULE
4. Choose the rule set USAREA
Add the fields below to the STANDARDS FIELDS box:
A. ARUSPRE
Choose ADD RULE

Choose FINISH
5. Add the stage to the Job: STAN

Task 3: Run the Job


Task 4: Review results


EXERCISE 6-5: Investigate Unhandled Name Patterns


Goal: Use the Investigate Stage to review the results of the Name standardization
Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Advanced settings:
Increase the sample size to 5
The frequency should also be set to 1
Name the Investigate stage: INUPNM (Investigate Unhandled Patterns for
Name data)
Name the job the same as the stage
Tasks:
1. Build an Investigate Character Concatenate stage to investigate the data in
the following fields:
UPUSNAME - Unhandled Name Patterns
UDUSNAME Unhandled Data
IPUSNAME Input Pattern Name
NAME Original Name data
2. Design the Job
3. Add the stage to the Job
4. Run the Job
5. Review the results
Task 1: Build the Stage and Design the Job
Exercise steps:
1. Add the Investigate stage
2. Complete the QualityStage Investigate Stage Wizard as follows:
Name: INUPNM
Description: Char Concat C Unhandled Name Patterns
Options: Character Concatenate
Data File: STANOUT
3. Click NEXT


4. Select the UPUSNAME field


The Field Mask Selection screen will pop-up
Choose the ALL C BUTTON
5. Click OK
6. Select the remaining fields, one at a time
UDUSNAME
IPUSNAME
NAME
The Field Mask Selection screen will pop-up
Choose the ALL X BUTTON

7. Choose ADVANCED OPTIONS


8. Increase the Sample Size to: 5

9. Choose OK
10. Choose FINISH

11. Design the Job, name it: INUPNM


12. Add the stage to the Job
Task 2: Run the Job
Task 3: Review Results


LAB 6-6: Investigate Unhandled Address and Area Patterns


Goal: Assess the standardize results by investigating the UNHANDLED ADDRESS and
AREA patterns

Exercise Information:
The Investigate stage output files are reports
Reports are pre-formatted text files and do not require a file definition in
QualityStage
Investigate stages must be the only stage within a job
Advanced settings:
Increase the sample size to 5
Leave the frequency set to 1
Name the Investigate stage: INUPADDR (Investigate unhandled address
and area patterns)
Name the job the same as the stage


Module 7: Rule Set Overrides


Exercise 7-1: USName Rule Set Overrides
Goal: Apply User Overrides to process the records with unhandled patterns and
data.
Exercise Information:
Review the file INUPNM, the Investigate results of Unhandled Name
patterns
Summary of Investigate of Unhandled Name patterns:
82.379% have no unhandled patterns or data
6.483% have the unhandled pattern +FI
1.345% have the unhandled pattern +,+
1.172% have the unhandled pattern FFI
82 different unhandled patterns
Let's concentrate on the unhandled patterns that occur most
frequently
Remember Investigate stages cannot be added to Jobs with other stages
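The percentages in the summary show why it pays to start with the most frequent patterns. The following Python sketch adds them up; it assumes, optimistically, that an override fully resolves every record that has the pattern. The variable names are hypothetical.

```python
# Percentages copied from the summary of unhandled name patterns.
handled_now = 82.379
top_patterns = {"+FI": 6.483, "+,+": 1.345, "FFI": 1.172}

gain = sum(top_patterns.values())   # share of records the three overrides could fix
projected = handled_now + gain      # share handled if all three overrides work
remaining = 100.0 - projected       # share spread over the other unhandled patterns
```

Under that assumption three overrides move about 9 percentage points of the records into the handled group, leaving roughly 8.6 percent spread across the remaining unhandled patterns.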
Tasks:
1. Review Investigate results of Unhandled Name Patterns
2. Add User Override
3. Use Standardization Rules Analyzer to test the override
4. Apply all overrides
5. Re-run STAN job to apply the overrides to the results file
6. Review Results
Task 1: Review Results of the Investigate Unhandled Name Patterns
1. On the left pane, select the JOBS folder
2. On the right pane, select the job INUPNM, right-click and choose SERVER
REPORTS & DATAFILES
3. Select INUPNMP.FRQ, and choose VIEW FILE
UNHANDLED PATTERN +FI

This pattern +FI represents unclassified last name (standard rule set approach),
a first name, followed by a middle initial

The data appears to have been classified correctly, so a classification override
won't process this data correctly
Notice that the input pattern is identical to the unhandled pattern. This is an
indication that the pattern did not match any patterns in the pattern action file and
an INPUT PATTERN OVERRIDE would cause that pattern to be processed

Unhandled Pattern  Unhandled Data       Input Pattern  Input Name Text
+FI                DAMORA WILLIAM H     +FI            DAMORA WILLIAM H
+FI                PEPE NANCY J         +FI            PEPE NANCY J
+FI                KOPPLIN ELDEN E      +FI            KOPPLIN ELDEN E
+FI                KRATOCHWILL TOMAS R  +FI            KRATOCHWILL TOMAS R
+FI                LEININGER SALLY P    +FI            LEININGER SALLY P

Task 2: Apply Input Pattern Override


1. From the QualityStage main menu select Rules from the menu bar, and then
choose STANDARDIZATION OVERRIDES. Select the USNAME rule set.
2. Select the INPUT PATTERN tab

3. Enter Input Pattern: +FI


4. From the CURRENT PATTERN LIST select the first entry, +
5. From the USER OVERRIDE OPTIONS choose:
Dictionary Field: LN-PRIMARY NAME
Move Current

Original Value
No Leading Space
6. Repeat the process for the remaining tokens using the following settings
F TOKEN
Dictionary Field: FN-FIRST NAME
Move Current
Original Value
Leading Space
I TOKEN
Dictionary Fields: MN-MIDDLE NAME
Move Current
Original Value
Leading Space
7. Under Override Summary choose ADD
8. Select APPLY, then OK
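What the input pattern override does to a record can be sketched in Python. The sketch assumes that each token is moved unchanged ("Original Value") into the chosen dictionary field and that "Leading Space" only matters when the field already holds a value. The function `apply_input_pattern_override` is hypothetical; LN, FN and MN are the dictionary field codes shown in the override screens.

```python
# One entry per token of the +FI pattern: (token class, dictionary field, leading space).
PLUS_F_I_OVERRIDE = [
    ("+", "LN", False),   # unclassified word -> primary (last) name
    ("F", "FN", True),    # first name
    ("I", "MN", True),    # initial -> middle name
]


def apply_input_pattern_override(text, override):
    """Move each token of the input text into the dictionary field named by the override."""
    tokens = text.split()
    if len(tokens) != len(override):
        raise ValueError("token count does not match the override pattern")
    fields = {}
    for token, (_token_class, field, leading_space) in zip(tokens, override):
        current = fields.get(field, "")
        separator = " " if leading_space and current else ""
        fields[field] = current + separator + token
    return fields


parsed = apply_input_pattern_override("DAMORA WILLIAM H", PLUS_F_I_OVERRIDE)
```

For the first sample record this yields DAMORA as the primary name, WILLIAM as the first name and H as the middle name, which is the result to look for in the Standardization Rules Analyzer.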
Task 3: Test the Override Using the Standardization Rules Analyzer
1. From the QualityStage main menu choose Rules from the menu bar, and then
choose STANDARDIZATION RULES ANALYZER. Then select the rule set
USNAME.
2. Enter the text string from the first record listed in the Investigate Unhandled
Pattern report
3. Select TEST THIS STRING


4. See how the pattern is now correctly processed by the rule set
5. Test all the names for this pattern
6. Select EXIT
UNHANDLED PATTERN +,+

This pattern +,+ represents unclassified last name (standard rule set approach),
a comma, and an unclassified first name
The first name values appear to not be classified as first names. One approach
would be to review the first names for addition to the classification table
You may also want to check the report IAHNMN.DLT to check the frequency of
the data value, the more often it occurs the more likely we are to classify that
word

Unhandled Pattern  Unhandled Data         Input Pattern  Input Name Text
+,+                HOCHREITER, CAROLYNNE  +,+            HOCHREITER, CAROLYNNE
+,+                HAYWARD, WINSLOW       +,+            HAYWARD, WINSLOW
+,+                ESHAGHIAN, JOUBIN      +,+            ESHAGHIAN, JOUBIN
+,+                SODIA, MARVYN          +,+            SODIA, MARVYN
+,+                SAKURAZAWA, HARUKO     +,+            SAKURAZAWA, HARUKO

Task 4: Apply Classification Override


1. From the QualityStage main menu select Rules from the menu bar, then
choose STANDARDIZATION OVERRIDES. Select the USNAME rule set.
2. Select the CLASSIFICATION tab

3. Enter Input token: CAROLYNNE


4. Standard Form: CAROLYNNE
5. From the classification drop down menu choose, F-FIRST NAME
6. Under Override Summary choose ADD
7. Repeat the process for the remaining tokens F TOKEN
WINSLOW
JOUBIN
MARVYN
HARUKO
8. Test the Override with the Standardization Rules Analyzer
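The effect of a classification override on the input pattern can be illustrated with a simplified Python sketch. It assumes that any word missing from the classification table gets the class + and that a comma is kept as its own token; the real rule set has more token classes than this. The function `pattern_for` is hypothetical.

```python
def pattern_for(text, classifications):
    """Build a pattern string: table class for known words, + for unknown words."""
    pattern = []
    for token in text.replace(",", " , ").split():
        if token == ",":
            pattern.append(",")
        else:
            pattern.append(classifications.get(token, "+"))
    return "".join(pattern)


# The five tokens added as F (first name) in the steps above.
overrides = {name: "F" for name in ["CAROLYNNE", "WINSLOW", "JOUBIN", "MARVYN", "HARUKO"]}

before = pattern_for("HOCHREITER, CAROLYNNE", {})
after = pattern_for("HOCHREITER, CAROLYNNE", overrides)
```

Once the first name is classified, the record no longer produces the +,+ pattern, so the pattern action file has a better chance of handling it.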
UNHANDLED PATTERN FFI

This pattern FFI represents a last name that was recognized as a first name, a
classified first name and an initial. Notice this data does not include a comma
providing context to the first and last name tokens
In this case the input pattern for all the sample records is not the same as the
unhandled pattern. There are 2 distinct input patterns and one distinct unhandled
pattern

Applying the override to the unhandled pattern will allow us to add one override.
If we had chosen the input pattern override then we would need to add an
override for each pattern.
This is an indication that the pattern did not match any patterns in the pattern
action file and an UNHANDLED PATTERN OVERRIDE would cause this pattern to be
processed

Unhandled Pattern  Unhandled Data     Input Pattern  Input Name Text
FFI                HARRIS MARJORIE M  +FI.           HARRIS MARJORIE M.
FFI                ROSS JOSEPH P      +FI            ROSS JOSEPH P
FFI                YOUNG THERESA C    +FI.           YOUNG THERESA C.
FFI                OLIVA LAWRENCE M   +FI            OLIVA LAWRENCE M
FFI                LANG LEE B         +FI            LANG LEE B

Task 5: Apply Unhandled Pattern Override


1. From the QualityStage main menu select Rules from the menu bar, and then
choose STANDARDIZATION OVERRIDES. Select the USNAME rule set.
2. Select the UNHANDLED PATTERN tab
3. Enter Unhandled Pattern: FFI
4. From the CURRENT PATTERN LIST select the first entry, F
5. From the USER OVERRIDE OPTIONS choose:
Dictionary Fields: LN-PRIMARY NAME
Move Current
Original Value
No Leading Space
6. Repeat the process for the remaining tokens using the following settings
F TOKEN
Dictionary Fields: FN-FIRST NAME
Move Current
Original Value
No Leading Space
I TOKEN

Dictionary Fields: MN-MIDDLE NAME


Move Current
Original Value
No Leading Space

7. Under OVERRIDE SUMMARY choose ADD


8. Select APPLY, then OK
9. Test the Overrides using the STANDARDIZATION RULES ANALYZER


Lab 7-1: US Address Rule Set Overrides


Goal: Apply User Overrides to process the records with unhandled patterns and
data.
Exercise Information:
Review the file INUPADDR, the Investigate results of Unhandled Address
patterns
Summary of Investigate of Unhandled Address patterns:
97.59% have no unhandled patterns or data
Let's concentrate on the unhandled patterns that occur most
frequently
Remember Investigate stages cannot be added to Jobs with other stages

Tasks:
1. Review Investigate results of Unhandled Address Patterns
2. Add User Override
3. Use Standardization Rules Analyzer to test the override
4. Apply all overrides
5. Re-run STAN job to apply the overrides to the results file
6. Review Results


Module 8: Matching


EXERCISE 8-1: Unduplication Match


Goal: Find duplicate customers from the policy data
Exercise Information:
Business rules for identifying unique customers are:
Same Federal ID
Same Primary Name at the same address
Same First Name at the same address
The Match stage can output up to three files:
Match Extract
Match Report
Match Statistics File
The Match Extract file requires a file definition, the Match report & Match
Statistics file do not need a file definition
Name the Match stage: MATCH
Name the job the same as the stage
Tasks:
1. Define the Match Extract file (Undup output file)
2. Build the Match stage: Add Pass 1
Task 1: Define the Match Extract File
1. From the QualityStage main window, on the left pane select the Datafile
Definition Stanout.
2. Select the Copy icon from the tool bar, select the folder Datafile Definitions,
select the Paste icon on the toolbar. Name the new Datafile Definition Undup
3. Select the Undup Datafile Definition, on the right pane right-click and add the
following fields:
FIELD   DESCRIPTION              START  LENGTH  SHIFT SUBSEQUENT FIELDS
TEMP    TEMP SHIFT FIELD                22
SETID   MATCH SET ID (GROUP ID)
TYPE    MATCH RECORD TYPE        11
PASS    MATCH PASS NUMBER        13
WEIGHT  MATCH COMPOSITE WEIGHT   15

Task 2: Define Match Stage and Pass 1



1. Add Match Stage

Name: MATCH

Description: IDENTIFY DUPLICATE CUSTOMERS

Options: UNDUP

Data File: STANOUT

2. SELECT CUSTOM
2. Choose ADD to add Pass Number 1
Description: P1 NAME & STREET ADDRESS

3. Choose BLOCK criteria


N1USNAM NYSIIS Match Primary Word 1 (position 396)
ATUSADD Address Type (position 843)
NSUSADD NYSIIS Street Name (position 844)
ZIP3 First Three Bytes of Standardized Zip code
A. Add this field overlay definition on the fly
B. Right-click in the Available Data Fields window and choose ADD field
C. Start position: 1055
D. Length: 3

4. Click NEXT


5. Add the following fields for match comparison


FIELD    START POSITION  DESCRIPTION         TYPE OF COMPARISON  PARAMETERS
GCUSNAM                  Gender Code         CHAR
MNUSNAM  48              Middle Name         CHAR
MFUSNAM  203             Match First Name    NAME_UNCERT         800
LNUSNAM  73              Primary Name        UNCERT              800
HNUSADD  607             House Number        CHAR
HSUSADD  617             House Suffix        CHAR
PDUSADD  627             Pre-direction       CHAR
SNUSADD  650             Street name         UNCERT              800
SDUSADD  685             Suffix Directional  CHAR
RVUSADD  691             Rural Route Value   CHAR
BVUSADD  708             Box Value           CHAR
BNUSADD  763             Building Name       UNCERT              800
ZCUSARE  1055            Zip code            CNT_DIFF
FEDID    2040            Federal ID          CNT_DIFF

6. Click OK
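Blocking limits the comparisons to records that agree on all block fields, and the ZIP3 overlay is just a substring of the record. The Python sketch below illustrates both ideas under simplifying assumptions: records are dictionaries, the NYSIIS codes are already present as field values, and the functions `overlay` and `candidate_pairs` as well as the sample values are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

BLOCK_FIELDS = ("N1USNAM", "ATUSADD", "NSUSADD", "ZIP3")


def overlay(record_text, start, length):
    """Return the bytes of a fixed-width record at a 1-based start position."""
    return record_text[start - 1:start - 1 + length]


def candidate_pairs(records):
    """Pair up only those records whose block fields are all identical."""
    blocks = defaultdict(list)
    for record in records:
        blocks[tuple(record[f] for f in BLOCK_FIELDS)].append(record["RECKEY"])
    pairs = []
    for keys in blocks.values():
        pairs.extend(combinations(keys, 2))
    return pairs


# Hypothetical records with made-up block values.
records = [
    {"RECKEY": "AH000000001", "N1USNAM": "PAP", "ATUSADD": "S", "NSUSADD": "MAN", "ZIP3": "021"},
    {"RECKEY": "LF000000007", "N1USNAM": "PAP", "ATUSADD": "S", "NSUSADD": "MAN", "ZIP3": "021"},
    {"RECKEY": "AH000000002", "N1USNAM": "PAP", "ATUSADD": "S", "NSUSADD": "MAN", "ZIP3": "100"},
]
pairs = candidate_pairs(records)
```

Only the pairs produced by blocking are then scored with the match comparisons listed in step 5.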


EXERCISE 8-2: Custom Match Extract


Goal: Create a custom Match extract specification for the match Undup
Exercise Information:
The Match stage can output up to three files:
Match Extract
Match Report
Match Statistics File
The Match Extract file requires a file definition, the Match report &
Match Statistics file do not need a file definition
Tasks:
1. Define the Custom Extract
2. Run the Match job and review results
Task 1: Define Custom Match Extract
1. From the Match Wizard Match Specification screen choose the EXTRACT
button
2. From Select outputs box select the UNDUP file definition
3. Select the GROUPALL button


4. Match Extract Specification


STATEMENT TYPE  ARGUMENT  LITERAL/VARIABLE
MOVE            Variable  @SET9
MOVE            Literal   Type in a space (no quotes)
MOVE            Variable  @TYPE
MOVE            Variable  @PASS
MOVE            Literal   Type in a space (no quotes)
MOVE            Variable  @WGT
MOVE            Literal   Type in a space (no quotes)
MOVEALL         Of A
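The extract specification writes a fixed prefix in front of every input record. The Python sketch below shows how such a prefix could line up with the UNDUP layout. The widths are assumptions, not documented values: 9 bytes for the set ID, 2 for the type, 1 for the pass and 7 for the weight, chosen so that the prefix is 22 bytes long with TYPE at position 11, PASS at 13 and WEIGHT at 15. The function `extract_line` and the sample values are hypothetical.

```python
def extract_line(set_id, rec_type, pass_no, weight, record):
    """Build one extract line: set ID, space, type, pass, space, weight, space, record."""
    prefix = (
        f"{set_id:09d}"        # stands in for @SET9
        " "
        f"{rec_type:<2.2}"     # stands in for @TYPE
        f"{pass_no:1d}"        # stands in for @PASS
        " "
        f"{weight:7.2f}"       # stands in for @WGT
        " "
    )
    return prefix + record     # MOVEALL OF A: the whole input record follows


line = extract_line(42, "XA", 1, 31.5, "REST-OF-RECORD")
prefix_length = len(line) - len("REST-OF-RECORD")
```

With these widths the prefix fills exactly the 22 bytes reserved by the TEMP shift field, so the fields copied from STANOUT keep their relative positions.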

Task 2: Run the Match Job


1. Choose FINISH on the Match Wizard Match Specifications screen
2. Define the Match Job
3. Add the Match stage to the Match job
4. RUN the Match job
5. From the QUALITYSTAGE JOB RUN OPTIONS screen choose EXECUTE FILE
MODE


6. Check the RUN box


7. Check the EXTRACT box
8. Choose ADVANCED RUN OPTIONS

Next to Match Debug type in: STATS

9. Choose OK
10. Choose RUN FROM START TO END

Task 3: Review Results


1. Locate the UNDUP output file


EXERCISE 8-3: Improve Match Pass 1: Set Critical Vartypes


Goal: Improve Match results by making the FEDID field a critical field. If the
FEDIDs are populated and do not match exactly then the records may not come
together.
Tasks:
1. Modify the Match stage and add a VarType
2. Re-run the match
Task 1: Modify Match Stage
1. On the left pane select the STAGES folder, on the right pane select the MATCH
STAGE and right-click, choose MODIFY
2. Click Next on the Match Stage Wizard screen
3. From the MATCH WIZARD MATCH SPECIFICATIONS screen select VARTYPE

4. Set the Action to CRITICAL MISSING OK


5. Select the FEDID field


6. Click the arrow button to the right of the FedID field
7. Select Add Vartype
8. Select OK
9. Click FINISH
10. RUN the Match job
11. From the QUALITYSTAGE JOB RUN OPTIONS screen choose EXECUTE FILE
MODE
12. Check the RUN box
13. Check the EXTRACT box
14. Choose ADVANCED RUN OPTIONS
15. Next to Match Debug type in: STATS
16. Choose OK
17. Choose RUN FROM START TO END

Task 2: Review Results


1. Locate the UNDUP output file
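The rule behind CRITICAL MISSING OK, as described in the goal of this exercise, fits in a few lines of Python. This is a sketch of the stated rule only, assuming that a blank value counts as missing; the function `critical_field_agrees` and the sample IDs are hypothetical.

```python
def critical_field_agrees(value_a, value_b, missing_ok=True):
    """Return False when both values are populated and differ; blanks pass if missing_ok."""
    a, b = value_a.strip(), value_b.strip()
    if not a or not b:
        return missing_ok      # at least one side is missing
    return a == b              # both populated: they must match exactly


same_id = critical_field_agrees("123456789", "123456789")
different_id = critical_field_agrees("123456789", "987654321")
one_missing = critical_field_agrees("123456789", "          ")
```

A pair that fails this test is kept apart no matter how well its other fields agree, which is why the number of matched records may drop after this change.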


EXERCISE 8-4: Set Match Cutoffs


Goal: Set cutoffs for Match pass 1
Tasks:
1. Modify the Match stage and add match cutoffs
2. Re-run the match
Task 1: Modify Match Stage
1. On the left pane select the STAGES folder, on the right pane select the MATCH
STAGE and right-click, choose MODIFY
2. Click Next on the Match Stage Wizard screen
3. From the MATCH WIZARD MATCH SPECIFICATIONS screen select PASS 1,
choose MODIFY PASS
4. Choose NEXT on the MATCH WIZARD BLOCKING VARIABLES screen
5. Add the following match cutoffs

MATCH: 25

CLERICAL: 25

6. Click OK
7. RUN the Match job
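How the two cutoffs partition the composite weights can be sketched as follows. The sketch assumes the usual reading: a weight at or above the match cutoff is a match, a weight at or above the clerical cutoff but below the match cutoff goes to clerical review, and anything lower is a nonmatch. The function `classify_pair` is hypothetical.

```python
def classify_pair(weight, match_cutoff=25, clerical_cutoff=25):
    """Classify a composite weight against the match and clerical cutoffs."""
    if clerical_cutoff > match_cutoff:
        raise ValueError("clerical cutoff must not exceed the match cutoff")
    if weight >= match_cutoff:
        return "match"
    if weight >= clerical_cutoff:
        return "clerical"
    return "nonmatch"


results = [classify_pair(w) for w in (31.5, 25, 24.9)]
with_clerical_band = classify_pair(22, match_cutoff=25, clerical_cutoff=20)
```

With both cutoffs set to 25, as in this exercise, the clerical band is empty and every pair is either a match or a nonmatch.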
Task 2: Review Results
1. Locate the UNDUP output file


EXERCISE 8-5: Match Pass 2


Goal: Add a second Match Pass
Tasks:
1. Modify the Match stage and add second match pass
2. Re-run the match
Task 1: Modify Match Stage and add Match Pass 2
1. On the left pane select the STAGES folder, on the right pane select the MATCH
STAGE and right-click, choose MODIFY
2. Click Next on the Match Stage Wizard screen
3. From the MATCH WIZARD MATCH SPECIFICATIONS screen select ADD
Description: P2 NAME & BOX ADDRESS
1. Choose BLOCK criteria
N1USNAM NYSIIS Match Primary Word 1
ATUSADD Address Type
BVUSADD Box Value
ZIP3 First Three Bytes of Standardized Zip code


Module 9: Survivorship
EXERCISE 9-1: Survive Best of Breed Customer Record
Goal: Survive one customer record for load into Winn's new CRM system
Tasks:
1. Add field overlay definitions to the input file (UNDUP)
2. Copy UNDUP Datafile Definition to SURVOUT
3. Define Survive stage
4. Run Survive job and review results
Task 1: Add Field Overlay Definitions to the Input File (UNDUP)
1. On the left pane select the UNDUP DATAFILE DEFINITION folder, on the right
pane right-click, choose NEW FIELD
2. Add the following fields:
FIELD NAME  DESCRIPTION                START  LENGTH
SURVREC     Fields to Survive                 1270
NAMEC       Name fields to survive     25     150
ADDRC       Address fields to survive  629    236
AREAC       Area fields to survive     1044   44

Task 2: Copy Datafile Definition


1. Copy the Datafile Definition UNDUP to SURVOUT
Task 3: Define Survive stage
1. Add a SURVIVE stage, name it: SURVIVE

Description: BEST OF BREED CUSTOMER

Options: PRE-SORT INPUT DATAFILE

Data File: UNDUP

Results File: SURVOUT

2. Click NEXT
3. From the QUALITYSTAGE SURVIVE WIZARD, choose the SETID field
4. Click NEXT


5. On the Survivorship Rules Definition screen add the following rules:


TARGET FIELD  ANALYZE FIELD  TECHNIQUE                 DATA
SURVREC       TYPE           EQUALS                    XA
NAMEC         NAMEC          Most Frequent (Nonblank)
ADDRC         ADDRC          Most Frequent (Nonblank)
AREAC         AREAC          Most Frequent (Nonblank)
SURVREC       TYPE           EQUALS                    RA

6. Click EXIT
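The Most Frequent (Nonblank) technique can be illustrated with a short Python sketch that works on one match set (one SETID group) at a time. It assumes that ties go to the value seen first, which may differ from what the Survive stage does. The functions `most_frequent_nonblank` and `survive_group` and the sample values are hypothetical.

```python
from collections import Counter

SURVIVE_FIELDS = ("NAMEC", "ADDRC", "AREAC")


def most_frequent_nonblank(values):
    """Return the most common non-blank value, or an empty string if all are blank."""
    counts = Counter(v for v in values if v.strip())
    if not counts:
        return ""
    return counts.most_common(1)[0][0]   # ties: first value encountered wins


def survive_group(group):
    """Pick the surviving value of each field from the records of one match set."""
    return {f: most_frequent_nonblank([r[f] for r in group]) for f in SURVIVE_FIELDS}


# Hypothetical match set of three records.
group = [
    {"NAMEC": "NANCY J PEPE", "ADDRC": "12 MAIN ST", "AREAC": "BOSTON MA 02111"},
    {"NAMEC": "NANCY PEPE",   "ADDRC": "12 MAIN ST", "AREAC": ""},
    {"NAMEC": "NANCY J PEPE", "ADDRC": "",           "AREAC": "BOSTON MA 02111"},
]
survivor = survive_group(group)
```

Each field is decided independently, so the surviving record can combine a name from one record with an address from another.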
Task 4: Add Stage to Job, Run & Review Results
1. Define the SURVIVE job
2. Add the SURVIVE stage to the SURVIVE job
3. RUN the SURVIVE job
Task 5: Review Results


Appendix A: Course Application Design


Appendix B: File Layouts


Table Of Contents
Name      Description
AUTOHOME  Auto and Home Policies
CNTRYOUT  Data with Appended ISO Country Code
COMBINED  Auto and Home Policies
INTLDATA  Records with International Address Data
LIFE      Life Policies
PREPOUT   Pre-processed US Data
STANOUT   Standardized Name
UNDUP     Identify duplicate records
USDATA    Records with USDATA for further process
SURVOUT   Survived Best of Breed Customer Records

AUTOHOME - Auto and Home Policies


Name     Description         DataType  UseType  Missing  Start  End  Length
SYSSRC   Source System ID    ALPHANUM  S        S        1      0    1
POLNUMB  Policy Number       ALPHANUM  S        S        2      0    12
NAME     Full Name           ALPHANUM  S        S        14     0    46
ADDR1    Address 1           ALPHANUM  S        S        60     0    35
ADDR2    Address 2           ALPHANUM  S        S        95     0    35
CITY     City Name           ALPHANUM  S        S        130    0    35
STATE    State Abbreviation  ALPHANUM  S        S        165    0    5
ZIP      Zip Code            ALPHANUM  S        S        170    0    10
FEDID    Federal ID          ALPHANUM  S        S        180    0    10
DOB      Date of Birth       ALPHANUM  S        S        190    0    8
DOD      Date of Death       ALPHANUM  S        S        198    0    8
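The AUTOHOME start positions and lengths describe a fixed-width record, which can be read with plain string slicing. The Python sketch below uses the Start and Length values of the layout and checks that the fields follow each other without gaps; the functions `parse_record` and `layout_is_contiguous` and the sample record are hypothetical.

```python
# (name, 1-based start, length) taken from the AUTOHOME layout.
AUTOHOME_LAYOUT = [
    ("SYSSRC", 1, 1), ("POLNUMB", 2, 12), ("NAME", 14, 46),
    ("ADDR1", 60, 35), ("ADDR2", 95, 35), ("CITY", 130, 35),
    ("STATE", 165, 5), ("ZIP", 170, 10), ("FEDID", 180, 10),
    ("DOB", 190, 8), ("DOD", 198, 8),
]
RECORD_LENGTH = max(start + length - 1 for _, start, length in AUTOHOME_LAYOUT)


def layout_is_contiguous(layout):
    """Check that every field starts right after the previous field ends."""
    expected = 1
    for _name, start, length in layout:
        if start != expected:
            return False
        expected = start + length
    return True


def parse_record(line, layout=AUTOHOME_LAYOUT):
    """Slice one fixed-width line into a dict of right-trimmed field values."""
    return {name: line[start - 1:start - 1 + length].rstrip()
            for name, start, length in layout}


# Hypothetical record: source ID, policy number and name, padded to full length.
line = ("A" + "P123".ljust(12) + "PEPE NANCY J".ljust(46)).ljust(RECORD_LENGTH)
fields = parse_record(line)
```

The same slicing works for the other layouts in this appendix once their start positions and lengths are substituted.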


LIFE - Life Policies


Name     Description       DataType  UseType  Missing  Start  End  Length
SYSSRC   Source System ID  ALPHANUM
POLNUMB  Policy Number     ALPHANUM                                12
NAME     Full Name         ALPHANUM                    14          46
ADDR1    Address 1         ALPHANUM                    60          35
ADDR2    Address 2         ALPHANUM                    95          35
CITY     City Name         ALPHANUM                    130         35
STATE    State             ALPHANUM                    165
ZIPCODE  Zip Code          ALPHANUM                    170         10
FEDID    Federal ID        ALPHANUM                    180         10
DOB      Date of Birth     ALPHANUM                    190
DOD      Date of Death     ALPHANUM                    198

COMBINED - Auto and Home Policies


Name     Description               DataType  UseType  Missing  Start  End  Length
RECKEY   Unique Record Key         ALPHANUM                                12
FILEID   Source File Indicator     ALPHANUM
RECNUM   Sequential Record Number  ALPHANUM
SYSSRC   Source System ID          ALPHANUM                    13
POLNUMB  Policy Number             ALPHANUM                    14          12
NAME     Full Name                 ALPHANUM                    26          46
ADDR1    Address 1                 ALPHANUM                    72          35
ADDR2    Address 2                 ALPHANUM                    107         35
CITY     City Name                 ALPHANUM                    142         35
STATE    State Abbreviation        ALPHANUM                    177
ZIP      Zip Code                  ALPHANUM                    182         10
FEDID    Federal ID                ALPHANUM                    192         10
DOB      Date of Birth             ALPHANUM                    202
DOD      Date of Death             ALPHANUM                    210


CNTRYOUT - Data with Appended ISO Country Code

Name     Description               DataType  UseType  Missing  Start  End  Length
CCCOUNT  ISOCountryCode            ALPHANUM
IFCOUNT  IdentifierFlag            ALPHANUM
RECKEY   Unique Record Key         ALPHANUM                                12
FILEID   Source File Indicator     ALPHANUM
RECNUM   Sequential Record Number  ALPHANUM
SYSSRC   Source System ID          ALPHANUM                    18
POLNUMB  Policy Number             ALPHANUM                    19          12
NAME     Full Name                 ALPHANUM                    31          46
ADDR1    Address 1                 ALPHANUM                    77          35
ADDR2    Address 2                 ALPHANUM                    112         35
CITY     City Name                 ALPHANUM                    147         35
STATE    State Abbreviation        ALPHANUM                    182
ZIP      Zip Code                  ALPHANUM                    187         10
FEDID    Federal ID                ALPHANUM                    197         10
DOB      Date of Birth             ALPHANUM                    207
DOD      Date of Death             ALPHANUM                    215
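
Comparing this layout with the preceding combined-file layout, every listed Start is five positions higher, which is what one would expect if CCCOUNT and IFCOUNT are prepended and together occupy five bytes. The Python sketch below checks that arithmetic. The function name start_offsets is hypothetical and the five-byte reading is an inference from the Start values, not something the layouts state.

```python
# Start positions as listed in the combined-file layout and in CNTRYOUT.
COMBINED_STARTS = {
    "SYSSRC": 13, "POLNUMB": 14, "NAME": 26, "ADDR1": 72, "ADDR2": 107,
    "CITY": 142, "STATE": 177, "ZIP": 182, "FEDID": 192, "DOB": 202, "DOD": 210,
}
CNTRYOUT_STARTS = {
    "SYSSRC": 18, "POLNUMB": 19, "NAME": 31, "ADDR1": 77, "ADDR2": 112,
    "CITY": 147, "STATE": 182, "ZIP": 187, "FEDID": 197, "DOB": 207, "DOD": 215,
}


def start_offsets(before, after):
    """Return the set of distinct shifts between two layouts' Start values."""
    return {after[name] - before[name] for name in before if name in after}


# A single value in the set means every shared field moved by the same amount.
shift = start_offsets(COMBINED_STARTS, CNTRYOUT_STARTS)
```

A result containing more than one value would indicate that a field was inserted or resized somewhere in the middle of the record rather than at the front.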

USDATA - Records with US data for further processing

Name     Description               DataType  UseType  Missing  Start  End  Length
CCCOUNT  ISOCountryCode            ALPHANUM                                3
IFCOUNT  IdentifierFlag            ALPHANUM
RECKEY   Unique Record Key         ALPHANUM                                12
FILEID   Source File Indicator     ALPHANUM
RECNUM   Sequential Record Number  ALPHANUM
SYSSRC   Source System ID          ALPHANUM                    18
POLNUMB  Policy Number             ALPHANUM                    19          12
NAME     Full Name                 ALPHANUM                    31          46
ADDR1    Address 1                 ALPHANUM                    77          35
ADDR2    Address 2                 ALPHANUM                    112         35
CITY     City Name                 ALPHANUM                    147         35
STATE    State Abbreviation        ALPHANUM                    182
ZIP      Zip Code                  ALPHANUM                    187         10
FEDID    Federal ID                ALPHANUM                    197         10
DOB      Date of Birth             ALPHANUM                    207
DOD      Date of Death             ALPHANUM                    215


PREPOUT - Pre-processed US Data

Name     Description               DataType  UseType  Missing  Start  End  Length
NAUSPRE  NameDomain                ALPHANUM                                100
ADUSPRE  AddressDomain             ALPHANUM                    101         100
ARUSPRE  AreaDomain                ALPHANUM                    201         100
P1USPRE  Field1Pattern             ALPHANUM                    301         20
P2USPRE  Field2Pattern             ALPHANUM                    321         20
P3USPRE  Field3Pattern             ALPHANUM                    341         20
P4USPRE  Field4Pattern             ALPHANUM                    361         20
P5USPRE  Field5Pattern             ALPHANUM                    381         20
P6USPRE  Field6Pattern             ALPHANUM                    401         20
IPUSPRE  InputPattern              ALPHANUM                    421         88
OPUSPRE  OutboundPattern           ALPHANUM                    509         88
UOUSPRE  UserOverrideFlag          ALPHANUM                    597
CFUSPRE  CustomFlag                ALPHANUM                    599
CCCOUNT  ISOCountryCode            ALPHANUM                    601
IFCOUNT  IdentifierFlag            ALPHANUM                    604
RECKEY   Unique Record Key         ALPHANUM                    606         12
FILEID   Source File Indicator     ALPHANUM                    606
RECNUM   Sequential Record Number  ALPHANUM                    608
SYSSRC   Source System ID          ALPHANUM                    618
POLNUMB  Policy Number             ALPHANUM                    619         12
NAME     Full Name                 ALPHANUM                    631         46
ADDR1    Address 1                 ALPHANUM                    677         35
ADDR2    Address 2                 ALPHANUM                    712         35
CITY     City Name                 ALPHANUM                    747         35
STATE    State Abbreviation        ALPHANUM                    782
ZIP      Zip Code                  ALPHANUM                    787         10
FEDID    Federal ID                ALPHANUM                    797         10
DOB      Date of Birth             ALPHANUM                    807
DOD      Date of Death             ALPHANUM                    815
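
In the PREPOUT layout, RECKEY and FILEID share Start 606, and RECNUM begins inside the 12 bytes that RECKEY covers, so RECKEY appears to be an overlay of FILEID followed by RECNUM. The Python sketch below illustrates that reading. It assumes FILEID ends where RECNUM starts and RECNUM ends where SYSSRC starts (lengths of 2 and 10 that the layout does not list), and the function name split_reckey and the sample key are invented.

```python
# Positions taken from the PREPOUT layout (1-based Start values).
RECKEY_START, RECKEY_LENGTH = 606, 12
FILEID_START = 606
RECNUM_START = 608
SYSSRC_START = 618  # assumed to mark the end of RECNUM


def split_reckey(record):
    """Return (reckey, fileid, recnum) sliced from a PREPOUT-style record."""
    def field(start, end_exclusive):
        # Convert 1-based positions to Python's 0-based slice bounds.
        return record[start - 1:end_exclusive - 1]

    reckey = field(RECKEY_START, RECKEY_START + RECKEY_LENGTH)
    fileid = field(FILEID_START, RECNUM_START)
    recnum = field(RECNUM_START, SYSSRC_START)
    return reckey, fileid, recnum
```

With a record whose positions 606 to 617 hold the invented key "AH0000000042", the function returns that key together with "AH" and "0000000042", and the first value equals the other two concatenated.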


STANOUT - Standardized Name, Address & Area

Name     Description                    DataType  UseType  Missing  Start  End  Length
NTUSNAM  NameType                       ALPHANUM
GCUSNAM  GenderCode                     ALPHANUM
NPUSNAM  NamePrefix                     ALPHANUM                                20
FNUSNAM  FirstName                      ALPHANUM                    23          25
MNUSNAM  MiddleName                     ALPHANUM                    48          25
LNUSNAM  PrimaryName                    ALPHANUM                    73          50
NGUSNAM  NameGeneration                 ALPHANUM                    123         10
NSUSNAM  NameSuffix                     ALPHANUM                    133         20
ANUSNAM  AdditionalNameInformation      ALPHANUM                    153         50
MFUSNAM  MatchFirstName                 ALPHANUM                    203         25
NFUSNAM  NYSIISofMatchFirstName         ALPHANUM                    228
SFUSNAM  RSoundexofMatchFirstName       ALPHANUM                    236
MLUSNAM  MatchPrimaryName               ALPHANUM                    240         50
HKUSNAM  HashKeyofMatchPrimaryName      ALPHANUM                    290         10
PKUSNAM  PackedKeyofMatchPrimaryName    ALPHANUM                    300         20
NWUSNAM  NumberofMatchPrimaryWords      ALPHANUM                    320
W1USNAM  MatchPrimaryWord1              ALPHANUM                    321         15
W2USNAM  MatchPrimaryWord2              ALPHANUM                    336         15
W3USNAM  MatchPrimaryWord3              ALPHANUM                    351         15
W4USNAM  MatchPrimaryWord4              ALPHANUM                    366         15
W5USNAM  MatchPrimaryWord5              ALPHANUM                    381         15
N1USNAM  NYSIISofMatchPrimaryWord1      ALPHANUM                    396
S1USNAM  RSoundexofMatchPrimaryWord1    ALPHANUM                    404
N2USNAM  NYSIISofMatchPrimaryWord2      ALPHANUM                    408
S2USNAM  RSoundexofMatchPrimaryWord2    ALPHANUM                    416
UPUSNAM  UnhandledPattern               ALPHANUM                    420         30
UDUSNAM  UnhandledData                  ALPHANUM                    450         100
IPUSNAM  InputPattern                   ALPHANUM                    550         30
EDUSNAM  ExceptionData                  ALPHANUM                    580         25
UOUSNAM  UserOverrideFlag               ALPHANUM                    605
HNUSADD  HouseNumber                    ALPHANUM                    607         10
HSUSADD  HouseNumberSuffix              ALPHANUM                    617         10
PDUSADD  StreetPrefixDirectional        ALPHANUM                    627
PTUSADD  StreetPrefixType               ALPHANUM                    630         20
SNUSADD  StreetName                     ALPHANUM                    650         25
STUSADD  StreetSuffixType               ALPHANUM                    675
SQUSADD  StreetSuffixQualifier          ALPHANUM                    680
SDUSADD  StreetSuffixDirectional        ALPHANUM                    685
RTUSADD  RuralRouteType                 ALPHANUM                    688
RVUSADD  RuralRouteValue                ALPHANUM                    691         10
BTUSADD  BoxType                        ALPHANUM                    701
BVUSADD  BoxValue                       ALPHANUM                    708         10
FTUSADD  FloorType                      ALPHANUM                    718
FVUSADD  FloorValue                     ALPHANUM                    723         10
UTUSADD  UnitType                       ALPHANUM                    733
UVUSADD  UnitValue                      ALPHANUM                    738         10
MTUSADD  MultiUnitType                  ALPHANUM                    748
MVUSADD  MultiUnitValue                 ALPHANUM                    753         10
BNUSADD  BuildingName                   ALPHANUM                    763         30
AAUSADD  AdditionalAddressInformation   ALPHANUM                    793         50
ATUSADD  AddressType                    ALPHANUM                    843
NSUSADD  NYSIISofStreetName             ALPHANUM                    844
SSUSADD  ReverseSoundexofStreetName     ALPHANUM                    852
UPUSADD  UnhandledPattern               ALPHANUM                    856         30
UDUSADD  UnhandledData                  ALPHANUM                    886         50
IPUSADD  InputPattern                   ALPHANUM                    936         30
EDUSADD  ExceptionData                  ALPHANUM                    966         50
UOUSADD  UserOverrideFlag               ALPHANUM                    1016
CNUSARE  CityName                       ALPHANUM                    1022        30
SAUSARE  StateAbbreviation              ALPHANUM                    1052
ZCUSARE  ZipCode                        ALPHANUM                    1055
ZIP3     First three bytes of zip code  ALPHANUM                    1055
Z4USARE  Zip4AddonCode                  ALPHANUM                    1060
CCUSARE  CountryCode                    ALPHANUM                    1064
NCUSARE  CityNYSIIS                     ALPHANUM                    1066
SCUSARE  CityReverseSoundex             ALPHANUM                    1074
UPUSARE  UnhandledPattern               ALPHANUM                    1078        30
UDUSARE  UnhandledData                  ALPHANUM                    1108        50
IPUSARE  InputPattern                   ALPHANUM                    1158        30
EDUSARE  ExceptionData                  ALPHANUM                    1188        50
UOUSARE  UserOverrideFlag               ALPHANUM                    1238
NAUSPRE  NameDomain                     ALPHANUM                    1244        100
ADUSPRE  AddressDomain                  ALPHANUM                    1344        100
ARUSPRE  AreaDomain                     ALPHANUM                    1444        100
P1USPRE  Field1Pattern                  ALPHANUM                    1544        20
P2USPRE  Field2Pattern                  ALPHANUM                    1564        20
P3USPRE  Field3Pattern                  ALPHANUM                    1584        20
P4USPRE  Field4Pattern                  ALPHANUM                    1604        20
P5USPRE  Field5Pattern                  ALPHANUM                    1624        20
P6USPRE  Field6Pattern                  ALPHANUM                    1644        20
IPUSPRE  InputPattern                   ALPHANUM                    1664        88
OPUSPRE  OutboundPattern                ALPHANUM                    1752        88
UOUSPRE  UserOverrideFlag               ALPHANUM                    1840
CFUSPRE  CustomFlag                     ALPHANUM                    1842
CCCOUNT  ISOCountryCode                 ALPHANUM                    1844
IFCOUNT  IdentifierFlag                 ALPHANUM                    1847
RECKEY   Unique Record Key              ALPHANUM                    1849        12
FILEID   Source File Indicator          ALPHANUM                    1849
RECNUM   Sequential Record Number       ALPHANUM                    1851
SYSSRC   Source System ID               ALPHANUM                    1861
POLNUMB  Policy Number                  ALPHANUM                    1862        12
NAME     Full Name                      ALPHANUM                    1874        46
ADDR1    Address 1                      ALPHANUM                    1920        35
ADDR2    Address 2                      ALPHANUM                    1955        35
CITY     City Name                      ALPHANUM                    1990        35
STATE    State Abbreviation             ALPHANUM                    2025
ZIP      Zip Code                       ALPHANUM                    2030        10
FEDID    Federal ID                     ALPHANUM                    2040        10
DOB      Date of Birth                  ALPHANUM                    2050
DOD      Date of Death                  ALPHANUM                    2058


UNDUP - Identify duplicate records

Name     Description                           DataType  UseType  Missing  Start  End  Length
TEMP     Temporary shift field                 ALPHANUM                                22
SETID    Match Set Number                      ALPHANUM
SURVREC  Best of Breed Customer Data           ALPHANUM                                1270
PASS     Match Pass Number                     ALPHANUM                    11
TYPE     Match Record Type                     ALPHANUM                    13
WEIGHT   Match Composite Weight (Match Score)  ALPHANUM                    15
NTUSNAM  NameType                              ALPHANUM                    23
GCUSNAM  GenderCode                            ALPHANUM                    24
NPUSNAM  NamePrefix                            ALPHANUM                    25          20
NAMEC    name fields to survive                ALPHANUM                    25          150
FNUSNAM  FirstName                             ALPHANUM                    45          25
MNUSNAM  MiddleName                            ALPHANUM                    70          25
LNUSNAM  PrimaryName                           ALPHANUM                    95          50
NGUSNAM  NameGeneration                        ALPHANUM                    145         10
NSUSNAM  NameSuffix                            ALPHANUM                    155         20
ANUSNAM  AdditionalNameInformation             ALPHANUM                    175         50
MFUSNAM  MatchFirstName                        ALPHANUM                    225         25
NFUSNAM  NYSIISofMatchFirstName                ALPHANUM                    250
SFUSNAM  RSoundexofMatchFirstName              ALPHANUM                    258
MLUSNAM  MatchPrimaryName                      ALPHANUM                    262         50
HKUSNAM  HashKeyofMatchPrimaryName             ALPHANUM                    312         10
PKUSNAM  PackedKeyofMatchPrimaryName           ALPHANUM                    322         20
NWUSNAM  NumberofMatchPrimaryWords             ALPHANUM                    342
W1USNAM  MatchPrimaryWord1                     ALPHANUM                    343         15
W2USNAM  MatchPrimaryWord2                     ALPHANUM                    358         15
W3USNAM  MatchPrimaryWord3                     ALPHANUM                    373         15
W4USNAM  MatchPrimaryWord4                     ALPHANUM                    388         15
W5USNAM  MatchPrimaryWord5                     ALPHANUM                    403         15
N1USNAM  NYSIISofMatchPrimaryWord1             ALPHANUM                    418
S1USNAM  RSoundexofMatchPrimaryWord1           ALPHANUM                    426
N2USNAM  NYSIISofMatchPrimaryWord2             ALPHANUM                    430
S2USNAM  RSoundexofMatchPrimaryWord2           ALPHANUM                    438
UPUSNAM  UnhandledPattern                      ALPHANUM                    442         30
UDUSNAM  UnhandledData                         ALPHANUM                    472         100
IPUSNAM  InputPattern                          ALPHANUM                    572         30
EDUSNAM  ExceptionData                         ALPHANUM                    602         25
UOUSNAM  UserOverrideFlag                      ALPHANUM                    627
HNUSADD  HouseNumber                           ALPHANUM                    629         10
ADDRC    address fields to survive             ALPHANUM                    629         236
HSUSADD  HouseNumberSuffix                     ALPHANUM                    639         10
PDUSADD  StreetPrefixDirectional               ALPHANUM                    649
PTUSADD  StreetPrefixType                      ALPHANUM                    652         20
SNUSADD  StreetName                            ALPHANUM                    672         25
STUSADD  StreetSuffixType                      ALPHANUM                    697
SQUSADD  StreetSuffixQualifier                 ALPHANUM                    702
SDUSADD  StreetSuffixDirectional               ALPHANUM                    707
RTUSADD  RuralRouteType                        ALPHANUM                    710
RVUSADD  RuralRouteValue                       ALPHANUM                    713         10
BTUSADD  BoxType                               ALPHANUM                    723
BVUSADD  BoxValue                              ALPHANUM                    730         10
FTUSADD  FloorType                             ALPHANUM                    740
FVUSADD  FloorValue                            ALPHANUM                    745         10
UTUSADD  UnitType                              ALPHANUM                    755
UVUSADD  UnitValue                             ALPHANUM                    760         10
MTUSADD  MultiUnitType                         ALPHANUM                    770
MVUSADD  MultiUnitValue                        ALPHANUM                    775         10
BNUSADD  BuildingName                          ALPHANUM                    785         30
AAUSADD  AdditionalAddressInformation          ALPHANUM                    815         50
ATUSADD  AddressType                           ALPHANUM                    865
NSUSADD  NYSIISofStreetName                    ALPHANUM                    866
SSUSADD  ReverseSoundexofStreetName            ALPHANUM                    874
UPUSADD  UnhandledPattern                      ALPHANUM                    878         30
UDUSADD  UnhandledData                         ALPHANUM                    908         50
IPUSADD  InputPattern                          ALPHANUM                    958         30
EDUSADD  ExceptionData                         ALPHANUM                    988         50
UOUSADD  UserOverrideFlag                      ALPHANUM                    1038
CNUSARE  CityName                              ALPHANUM                    1044        30
AREAC    area fields to survive                ALPHANUM                    1044        44
SAUSARE  StateAbbreviation                     ALPHANUM                    1074
ZCUSARE  ZipCode                               ALPHANUM                    1077
Z4USARE  Zip4AddonCode                         ALPHANUM                    1082
CCUSARE  CountryCode                           ALPHANUM                    1086
NCUSARE  CityNYSIIS                            ALPHANUM                    1088
SCUSARE  CityReverseSoundex                    ALPHANUM                    1096
UPUSARE  UnhandledPattern                      ALPHANUM                    1100        30
UDUSARE  UnhandledData                         ALPHANUM                    1130        50
IPUSARE  InputPattern                          ALPHANUM                    1180        30
EDUSARE  ExceptionData                         ALPHANUM                    1210        50
UOUSARE  UserOverrideFlag                      ALPHANUM                    1260
NAUSPRE  NameDomain                            ALPHANUM                    1266        100
ADUSPRE  AddressDomain                         ALPHANUM                    1366        100
ARUSPRE  AreaDomain                            ALPHANUM                    1466        100
P1USPRE  Field1Pattern                         ALPHANUM                    1566        20
P2USPRE  Field2Pattern                         ALPHANUM                    1586        20
P3USPRE  Field3Pattern                         ALPHANUM                    1606        20
P4USPRE  Field4Pattern                         ALPHANUM                    1626        20
P5USPRE  Field5Pattern                         ALPHANUM                    1646        20
P6USPRE  Field6Pattern                         ALPHANUM                    1666        20
IPUSPRE  InputPattern                          ALPHANUM                    1686        88
OPUSPRE  OutboundPattern                       ALPHANUM                    1774        88
UOUSPRE  UserOverrideFlag                      ALPHANUM                    1862
CFUSPRE  CustomFlag                            ALPHANUM                    1864
CCCOUNT  ISOCountryCode                        ALPHANUM                    1866
IFCOUNT  IdentifierFlag                        ALPHANUM                    1869
RECKEY   Unique Record Key                     ALPHANUM                    1871        12
FILEID   Source File Indicator                 ALPHANUM                    1871
RECNUM   Sequential Record Number              ALPHANUM                    1873
SYSSRC   Source System ID                      ALPHANUM                    1883
POLNUMB  Policy Number                         ALPHANUM                    1884        12
NAME     Full Name                             ALPHANUM                    1896        46
ADDR1    Address 1                             ALPHANUM                    1942        35
ADDR2    Address 2                             ALPHANUM                    1977        35
CITY     City Name                             ALPHANUM                    2012        35
STATE    State Abbreviation                    ALPHANUM                    2047
ZIP      Zip Code                              ALPHANUM                    2052        10
FEDID    Federal ID                            ALPHANUM                    2062        10
DOB      Date of Birth                         ALPHANUM                    2072
DOD      Date of Death                         ALPHANUM                    2080
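
The UNDUP layout defines three group fields (NAMEC, ADDRC and AREAC) whose Start and Length span several of the individual standardized fields. The Python sketch below works out which fields each group covers by testing whether a field's Start falls inside the group's range. It is an illustration only: the function name covered_fields is hypothetical, and only a subset of the UNDUP fields is listed to keep the example short.

```python
# Group (overlay) fields from the UNDUP layout as (start, length).
OVERLAYS = {"NAMEC": (25, 150), "ADDRC": (629, 236), "AREAC": (1044, 44)}

# Start positions of some individual UNDUP fields, in record order.
FIELD_STARTS = {
    "NPUSNAM": 25, "FNUSNAM": 45, "MNUSNAM": 70, "LNUSNAM": 95,
    "NGUSNAM": 145, "NSUSNAM": 155, "ANUSNAM": 175,
    "CNUSARE": 1044, "SAUSARE": 1074, "ZCUSARE": 1077,
    "Z4USARE": 1082, "CCUSARE": 1086, "NCUSARE": 1088,
}


def covered_fields(overlay, overlays=OVERLAYS, starts=FIELD_STARTS):
    """List the fields whose Start lies within the overlay's byte range."""
    start, length = overlays[overlay]
    end_exclusive = start + length
    return [name for name, pos in starts.items() if start <= pos < end_exclusive]
```

On these values NAMEC spans NPUSNAM through NSUSNAM and stops just before ANUSNAM at 175, while AREAC spans CNUSARE through CCUSARE and stops just before NCUSARE at 1088.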


SURVOUT - Survived Best of Breed Customer Records

Name     Description                           DataType  UseType  Missing  Start  End  Length
TEMP     Temporary shift field                 ALPHANUM                                22
SETID    Match Set Number                      ALPHANUM
SURVREC  Best of Breed Customer Data           ALPHANUM                                1270
PASS     Match Pass Number                     ALPHANUM                    11
TYPE     Match Record Type                     ALPHANUM                    13
WEIGHT   Match Composite Weight (Match Score)  ALPHANUM                    15
NTUSNAM  NameType                              ALPHANUM                    23
GCUSNAM  GenderCode                            ALPHANUM                    24
NPUSNAM  NamePrefix                            ALPHANUM                    25          20
NAMEC    name fields to survive                ALPHANUM                    25          150
FNUSNAM  FirstName                             ALPHANUM                    45          25
MNUSNAM  MiddleName                            ALPHANUM                    70          25
LNUSNAM  PrimaryName                           ALPHANUM                    95          50
NGUSNAM  NameGeneration                        ALPHANUM                    145         10
NSUSNAM  NameSuffix                            ALPHANUM                    155         20
ANUSNAM  AdditionalNameInformation             ALPHANUM                    175         50
MFUSNAM  MatchFirstName                        ALPHANUM                    225         25
NFUSNAM  NYSIISofMatchFirstName                ALPHANUM                    250
SFUSNAM  RSoundexofMatchFirstName              ALPHANUM                    258
MLUSNAM  MatchPrimaryName                      ALPHANUM                    262         50
HKUSNAM  HashKeyofMatchPrimaryName             ALPHANUM                    312         10
PKUSNAM  PackedKeyofMatchPrimaryName           ALPHANUM                    322         20
NWUSNAM  NumberofMatchPrimaryWords             ALPHANUM                    342
W1USNAM  MatchPrimaryWord1                     ALPHANUM                    343         15
W2USNAM  MatchPrimaryWord2                     ALPHANUM                    358         15
W3USNAM  MatchPrimaryWord3                     ALPHANUM                    373         15
W4USNAM  MatchPrimaryWord4                     ALPHANUM                    388         15
W5USNAM  MatchPrimaryWord5                     ALPHANUM                    403         15
N1USNAM  NYSIISofMatchPrimaryWord1             ALPHANUM                    418
S1USNAM  RSoundexofMatchPrimaryWord1           ALPHANUM                    426
N2USNAM  NYSIISofMatchPrimaryWord2             ALPHANUM                    430
S2USNAM  RSoundexofMatchPrimaryWord2           ALPHANUM                    438
UPUSNAM  UnhandledPattern                      ALPHANUM                    442         30
UDUSNAM  UnhandledData                         ALPHANUM                    472         100
IPUSNAM  InputPattern                          ALPHANUM                    572         30
EDUSNAM  ExceptionData                         ALPHANUM                    602         25
UOUSNAM  UserOverrideFlag                      ALPHANUM                    627
HNUSADD  HouseNumber                           ALPHANUM                    629         10
ADDRC    address fields to survive             ALPHANUM                    629         236
HSUSADD  HouseNumberSuffix                     ALPHANUM                    639         10
PDUSADD  StreetPrefixDirectional               ALPHANUM                    649
PTUSADD  StreetPrefixType                      ALPHANUM                    652         20
SNUSADD  StreetName                            ALPHANUM                    672         25
STUSADD  StreetSuffixType                      ALPHANUM                    697
SQUSADD  StreetSuffixQualifier                 ALPHANUM                    702
SDUSADD  StreetSuffixDirectional               ALPHANUM                    707
RTUSADD  RuralRouteType                        ALPHANUM                    710
RVUSADD  RuralRouteValue                       ALPHANUM                    713         10
BTUSADD  BoxType                               ALPHANUM                    723
BVUSADD  BoxValue                              ALPHANUM                    730         10
FTUSADD  FloorType                             ALPHANUM                    740
FVUSADD  FloorValue                            ALPHANUM                    745         10
UTUSADD  UnitType                              ALPHANUM                    755
UVUSADD  UnitValue                             ALPHANUM                    760         10
MTUSADD  MultiUnitType                         ALPHANUM                    770
MVUSADD  MultiUnitValue                        ALPHANUM                    775         10
BNUSADD  BuildingName                          ALPHANUM                    785         30
AAUSADD  AdditionalAddressInformation          ALPHANUM                    815         50
ATUSADD  AddressType                           ALPHANUM                    865
NSUSADD  NYSIISofStreetName                    ALPHANUM                    866
SSUSADD  ReverseSoundexofStreetName            ALPHANUM                    874
UPUSADD  UnhandledPattern                      ALPHANUM                    878         30
UDUSADD  UnhandledData                         ALPHANUM                    908         50
IPUSADD  InputPattern                          ALPHANUM                    958         30
EDUSADD  ExceptionData                         ALPHANUM                    988         50
UOUSADD  UserOverrideFlag                      ALPHANUM                    1038
CNUSARE  CityName                              ALPHANUM                    1044        30
AREAC    area fields to survive                ALPHANUM                    1044        44
SAUSARE  StateAbbreviation                     ALPHANUM                    1074
ZCUSARE  ZipCode                               ALPHANUM                    1077
Z4USARE  Zip4AddonCode                         ALPHANUM                    1082
CCUSARE  CountryCode                           ALPHANUM                    1086
NCUSARE  CityNYSIIS                            ALPHANUM                    1088
SCUSARE  CityReverseSoundex                    ALPHANUM                    1096
UPUSARE  UnhandledPattern                      ALPHANUM                    1100        30
UDUSARE  UnhandledData                         ALPHANUM                    1130        50
IPUSARE  InputPattern                          ALPHANUM                    1180        30
EDUSARE  ExceptionData                         ALPHANUM                    1210        50
UOUSARE  UserOverrideFlag                      ALPHANUM                    1260
NAUSPRE  NameDomain                            ALPHANUM                    1266        100
ADUSPRE  AddressDomain                         ALPHANUM                    1366        100
ARUSPRE  AreaDomain                            ALPHANUM                    1466        100
P1USPRE  Field1Pattern                         ALPHANUM                    1566        20
P2USPRE  Field2Pattern                         ALPHANUM                    1586        20
P3USPRE  Field3Pattern                         ALPHANUM                    1606        20
P4USPRE  Field4Pattern                         ALPHANUM                    1626        20
P5USPRE  Field5Pattern                         ALPHANUM                    1646        20
P6USPRE  Field6Pattern                         ALPHANUM                    1666        20
IPUSPRE  InputPattern                          ALPHANUM                    1686        88
OPUSPRE  OutboundPattern                       ALPHANUM                    1774        88
UOUSPRE  UserOverrideFlag                      ALPHANUM                    1862
CFUSPRE  CustomFlag                            ALPHANUM                    1864
CCCOUNT  ISOCountryCode                        ALPHANUM                    1866
IFCOUNT  IdentifierFlag                        ALPHANUM                    1869
RECKEY   Unique Record Key                     ALPHANUM                    1871        12
FILEID   Source File Indicator                 ALPHANUM                    1871
RECNUM   Sequential Record Number              ALPHANUM                    1873
SYSSRC   Source System ID                      ALPHANUM                    1883
POLNUMB  Policy Number                         ALPHANUM                    1884        12
NAME     Full Name                             ALPHANUM                    1896        46
ADDR1    Address 1                             ALPHANUM                    1942        35
ADDR2    Address 2                             ALPHANUM                    1977        35
CITY     City Name                             ALPHANUM                    2012        35
STATE    State Abbreviation                    ALPHANUM                    2047
ZIP      Zip Code                              ALPHANUM                    2052        10
FEDID    Federal ID                            ALPHANUM                    2062        10
DOB      Date of Birth                         ALPHANUM                    2072
DOD      Date of Death                         ALPHANUM                    2080
