
Pervasive Integration Platform Fundamental End User Training

2 Data Integrator Fundamentals Training

2010 Pervasive Software Inc. All rights reserved. Design by Pervasive. Pervasive is a registered trademark, and "Integrating the Interconnected World" is a trademark of Pervasive Software Inc. Cosmos, Integration Architect, Process Designer, Map Designer, Structured Schema Designer, Extract Schema Designer, Document Schema Designer, Content Extractor, CXL, Pervasive Integration Engine, DJIS, Data Junction Integration Suite, Data Junction Integration Engine, XML Junction, HIPAA Junction, and Integration Engineering are trademarks of Pervasive Software Inc. All names of databases, formats and corporations are trademarks or registered trademarks of their respective companies. This exercise scenario workbook is written for Pervasive's Integration Platform software, version 9.x. (Dewberry)

Table of Contents
Foreword
The Pervasive Integration Platform
Architectural Overview of the Integration Platform
Design Tools
MetaData Tools
Production Tools
Inside a Simple Integration
Course Setup Instructions
Course Setup Instructions
Workspaces and Repositories
Workspaces and Repositories Defined
Repository Explorer
Repository Explorer - Defined
Splash Screen - Licensing and Version Information
Map Designer - Fundamentals of Transformation
Map Designer - The Foundation
Interface Familiarization
Basic Map
Connectors and Connections - Methods of Accessing Data
Factory Connections
Macro Definitions
User Defined Connections
Basic Transformation Features
Source Data Features - Sort
Source Data Features - Filter
Target Output Modes - Replace, Append, Clear and Append
Target Output Modes - Delete
Target Output Modes - Update
The Rapid Integration Flow Language (RIFL) Script Editor
RIFL Script - Functions
RIFL Script - Flow Control
Transformation Map Properties
Reject Connection Info
Event Handlers & Actions
Understanding Event Handlers
Source and Target Buffers - ClearMapPut Action
Event Sequence Issues
Using Action Parameters - Conditional Put
Using OnDataChange Events
Trapping Processing Errors with Events
Error and Exception Handling Review
Comprehensive Review
Metadata - Using the Schema Designers
Structured Schema Designer
No Metadata Available (ASCII Fixed)
External Metadata (Cobol Copybook)
Extract Schema Designer
Interface Fundamentals & CXL
Data Collection/Output Options
Extract Schema Designer: Extracting Variable Fixed Field Definitions
Process Designer for Data Integrator
Process Designer Fundamentals
Creating a Process
Parallel vs. Sequential Processing
Conditional Branching - The Step Result Wizard
FileList - Batch Processing Multiple Files
Pervasive Integration Engine
Syntax: Version Information
Options and Switches
Execute a Transformation
Using a -Macro_File Option
Executing a Process
Additional Sample Exercises - Integration Engine
Command Line Overrides - Source Connection
Ease of Use: Options File
Checklist - Integration Engine
Intermediate Mapping Techniques
Multiple Record Type Structures
Multiple Record Type 1 - One-to-Many
Multiple Record Type 2 - Many-to-One
User Defined Functions
Code Reuse Save/Open a RIFL script Code Modules
Code Reuse - Code Modules
Lookup Wizards
Incore Table Lookup
Relational Database Management System (RDBMS) Mapping
Select Statements - SQL Passthrough
DJX in Select Statements - Dynamic Row sets
Multimode Introduction
Multimode - Data Normalization
Multimode - Implementation with Upsert Action
Reference
Checklist - Starting Your Integration Project
Upgrading from 8.x to 9.x
Cosmos.ini Settings
Windows Default Installation Locations
Design Tool User Interfaces
Setting Properties
Reading a Log File
Examples of Complex Process Layouts
Additional Documentation Resources
Glossary
Appendix
Additional Exercises
Extract Schema Designer: Extracting Fixed Field Definitions
Integration Engine: Using the -Set Variable Option
Integration Engine: Scheduling Executions
Lookup Wizard: Flat File Lookup
Lookup Wizard: Dynamic SQL Lookup
RDBMS: Integration Querybuilder
Structured Schema Designer: Binary Data and Code Pages
Structured Schema Designer: Reuse Metadata (Reusing a Structured Schema)
Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer
Structured Schema Designer: Conflict Resolution


Foreword

This course is designed to be presented in a classroom environment in which each student has access to their own computer with the Pervasive Integration Products and the Fundamentals courseware installed. It can also be used as a stand-alone tutorial if the student is already familiar with the interface of the Pervasive tools.

The Fundamentals course is not meant to be a comprehensive tutorial of all of our products. By the end of this course, the student should have a basic understanding of Map Designer, Structured Schema Designer, Extract Schema Designer, Process Designer, and the Integration Engine, and should know both how to use these tools and how to expand their own knowledge of them. Further training can be obtained from Pervasive Training Services.

Any path mentioned in this document assumes a default installation of the Pervasive software and the Fundamentals courseware. If the student installs to a different location, that must be taken into account when doing exercises or following links.

We hope that the student enjoys this class and takes away everything needed. We welcome any feedback.


The Pervasive Integration Platform

This section describes the integration stack from the user's perspective.


Architectural Overview of the Integration Platform


This presentation depicts the architecture of the Integration Platform from the end user's perspective. It briefly discusses all of the Integration tools and how they work together.

Presentation file: Integration General Overview.ppt


Design Tools
Data Integrator includes 6 tools used to create maps (transformations), schemas, profiles and processes. Each of the tools is discussed below.

Map Designer
Map Designer is the heart of the integration product tool set. It transfers data among a wide variety of data types. To transfer data in Map Designer, the user designs and runs what is called a Transformation or a Map. Each Transformation contains all the information Map Designer needs to transform data from an existing Source data file or table to a new Target data file or table, including any modifications made to the data during the transformation.

Map Designer solves complex Transformation problems by allowing the user to:
- transform data between applications
- combine data from external Sources
- change data types
- add, delete, rearrange, split or concatenate fields
- parse and select substrings; pad or truncate data fields
- clean address fields and execute unlimited string and numerical manipulations
- control log errors and events
- define external table lookups

Map Designer creates two files (tf.xml and map.xml) that contain all the information necessary to run a transformation. A transformation can be run from Map Designer, Process Designer or the Integration Engine. Map Designer is covered extensively in this course and is also explored in the Advanced and the EDI/HIPAA courses.

Process Designer
Process Designer is a graphical data transformation management tool that can be used to arrange a complete transformation project. Listed below are some of the Steps that a user can put into a process:
- Map Designer Transformation
- SQL Command
- Decision
- RIFL Scripting
- Command Line Application
- SQL Server DTS Package
- Sub-process
- Validation
- XSLT
- Queue
- Iterator
- Aggregator
- Invoker
- Transformer

Once the user has organized these Steps in the order of execution, the entire workflow sequence can be run as one unit. This workflow is saved as an .ip.xml file, which can be run from the Process Designer or from the Integration Engine. Process Designer processes can also be packaged using the Repository Manager. This packaging gathers all of the files that are required by the process and puts them into a single DJAR file that can then be run from the Integration Engine.

This courseware covers some basic functionality of the Process Designer. Both the Advanced and the EDI/HIPAA courses cover the more advanced functionality of this tool.

Structured Schema Designer
The Structured Schema Designer provides a visual user interface for designing structural data files. The resulting metadata is stored as Structured Schema files with an .ss.xml extension. The .ss.xml files include schema, record recognition rule and record validation rule information.

The Data Parser is used to manually parse flat binary, fixed-length ASCII, or record manager files. The Data Parser defines Source record length, Source field sizes and data types, and Source data properties. It also assigns Source field names and defines Schemas with multiple record types. Structured Schema Designer can also import and read schemas from outside sources such as Cobol Copybooks, XML DTDs, or Oracle DDLs.

The .ss.xml files that are created by Structured Schema Designer are used as input in Map Designer as part of a source or target connection. There are courseware and exercises on the Structured Schema Designer in this document.

Extract Schema Designer
The Extract Schema Designer is a parser tool that allows the user to visually select fields and records from text files that are of an irregular format. Some examples are:
- Printouts from programs captured as disk files
- Reports of any size or dimension
- ASCII or any type of EBCDIC text files
- Spooled print files
- Fixed length sequential files
- Complex multi-line files
- Downloaded text files (e.g., news retrieval, financial, real estate...)
- HTML and other structured documents
- Internet text downloads
- E-mail header and body
- On-line textual databases
- CD-ROM textbases
- Files with tagged data fields

Extract Schema Designer creates schemas that are stored as CXL files. These files are then used as input in Map Designer as part of a source connection. There are courseware and exercises on the Extract Schema Designer in this document.

Document Schema Designer
Document Schema Designer is a Java-based tool that allows you to build templates for e-document files. You can custom-build schema subsets for specific EDI Trading Partner and TranType scenarios. In addition, the Document Schema Designer is also very useful to those working with HL7, HIPAA, SAP (IDoc), SWIFT and FIX data files. You can develop schema files for all e-documents that are compatible with Map Designer. The document schemas serve several useful purposes: File Structure Metadata Support Parsing Capabilities Validation Support

In an easy-to-use graphical interface, the user selects desired segments from the "template" document schemas that are generated from the controlling standards documentation. The segments are saved in a schema file that can be edited. The user may also add segments from a "master" segment library, add loops/segments/composites/elements by hand, add discrimination rules for distinguishing loops/segments of the same type at the same level, and use code tables for data validation. The user can copy, paste and delete any part of the structure, including the segments, elements, composites, loops, and fields (and their subordinate loops/segments/subcomponents).

The Document Schema Designer produces DS.XML document schema files that can be used as input in Map Designer as part of a source or target connection. These files can also be used in a Process as part of a Validation step. This document does not have exercises or courseware on Document Schema Designer, though there is a one-day course available from Pervasive Training Services.

Join Designer
Join Designer is an application that allows the user to join two or more single-record type data sources prior to running a Map Designer Transformation on them. These sources do not have to be of the same type. For example, an SQL database table could be joined with a simple ASCII text file.

The user first uses Source View Designer to create Source View Files that hold metadata about the Sources. From these a Join View File is created, which contains the metadata needed by Map Designer to treat the Source files as if they were a single Source. The user then supplies this Join View File to Map Designer using "Join Engine" as the connection type. The original Source files and the Source View Files must still be available in the locations specified in the Join View File.

When a join is saved, a Join View File (.join.xml) is created. This can be supplied to Map Designer as a Source file or used to create further joins. While a single join is limited to two Source files, you can use another join as a Source, thus building nested joins to any level of complexity.

This document does not have exercises or courseware on Join Designer. There are exercises in the Advanced course available from Pervasive Training Services.
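
The idea of nesting two-source joins can be illustrated outside the product. The Python sketch below is not how Join Designer works internally; the function name join_pair, the sample records, and the inner-join behavior are all assumptions chosen for illustration. It only shows that the result of one two-source join can serve as a source for the next join.

```python
def join_pair(left, right, key):
    """Inner-join two lists of dict records on a shared key field."""
    # Index the right-hand source by key so each left row finds its matches.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


# Hypothetical sample sources (not taken from the courseware).
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"cust_id": 1, "order_id": 10}, {"cust_id": 1, "order_id": 11}]
regions = [{"cust_id": 1, "region": "West"}]

# Each join takes exactly two sources; the first result feeds the second join.
customers_orders = join_pair(customers, orders, "cust_id")
all_three = join_pair(customers_orders, regions, "cust_id")
```

Three sources are combined here with two nested joins, mirroring the way a saved join can be used as a Source for a further join.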


MetaData Tools
The design tools create artifacts that are XML files (except Extract Schema Designer). The Metadata tools organize these files during development, and manipulate these files to be used for production.
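
As a quick reference, the artifact types named in the Design Tools section can be summarized in a small lookup. This is a hedged Python sketch rather than anything shipped with the product: the function name classify_artifact is hypothetical, and the exact suffixes .cxl, .ds.xml and .djar are assumptions inferred from the names "CXL files", "DS.XML" and "DJAR" used in the text.

```python
# Suffix-to-tool lookup built from the file types mentioned in this document.
ARTIFACT_SUFFIXES = {
    ".tf.xml": "Map Designer transformation",
    ".map.xml": "Map Designer map",
    ".ip.xml": "Process Designer process",
    ".ss.xml": "Structured Schema Designer schema",
    ".ds.xml": "Document Schema Designer schema",  # assumed spelling of DS.XML
    ".join.xml": "Join Designer join view",
    ".cxl": "Extract Schema Designer schema",      # assumed extension
    ".djar": "packaged process (DJAR)",            # assumed extension
}


def classify_artifact(filename):
    """Return a description of the design artifact based on its suffix."""
    name = filename.lower()
    for suffix, description in ARTIFACT_SUFFIXES.items():
        if name.endswith(suffix):
            return description
    return "unknown"
```

For example, classify_artifact("Orders.tf.xml") returns "Map Designer transformation", and a file with no recognized suffix returns "unknown".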

Repository Explorer
The Repository Explorer is the central location from which the user can launch all of the Designers, including the Map Designer, Process Designer, Join Designer, Extract Schema Designer, Structured Schema Designer, Source View Designer and Document Schema Designer. The user can also open any Repository that has been created, and then open Transformations, Processes or Schema files in that Repository list. The Repository Explorer can also access the version control functionality of CVS or Microsoft Visual SourceSafe, and can check files in and out of repositories using commands in Repository Explorer. There is courseware about the Repository Explorer in this document.

Repository Manager
Repository Manager is designed to facilitate the tasks of managing large numbers of Pervasive design documents, contained in multiple repositories in multiple workspaces. Repository Manager provides a single application to directly access any number of Pervasive design documents, view their contents, make simple updates, bundle them into a package, and generate reports. The features of Repository Manager include:
- Open and work with any number of defined Workspaces.
- Browse the hierarchy of Workspaces, Repositories, Collections, and Documents.
- Search for documents based on text strings, regular expressions, date ranges, Document Types, and document-specific fields.
- Make minor updates to documents.
- Generate an impact analysis of proposed document modifications.
- Import and export Documents and Collections.
- Package Processes and related documents into a single entity (DJAR) that can be more easily managed and transported.
- View and print documents and Reports.

This document does not have exercises or courseware on Repository Manager, though there is an exercise in the Advanced course available from Pervasive Training Services.


Production Tools
These are the tools that allow the user to automate their Transformations and Processes in their production environment.

Integration Engine
Integration Engine is an embedded data Transformation engine used to deploy runtime data replication, migration and Transformation jobs on Windows or Unix-based systems. Because Integration Engine is a pure execution engine with no user interface components, it can perform automatic, runtime data transformations quickly and easily, making it ideal for environments where regular data transformations need to be scheduled and launched.

Integration Engine supports the following operating systems: Windows 2000, Windows XP, Windows Server 2003, HPUX, Sun Solaris, IBM AIX, and Linux. The Integration Engine has the capability to work with multiple threads if a multi-threaded license is purchased. There is courseware about the Integration Engine in this document.

Integration Server
Integration Server is actually an SDK that is installed by default when the integration platform is installed. The core components of the Integration Server SDK are the Engine Controller, Engine Instances (Managed Pool), and the Client API that accesses the Engine Controller through a proxy. Server stability is maintained, scalability enhanced, and resources are spared through the use of a control-managed pool of EngineExe objects. This allows the Integration Engine to be called as a service.

This document does not have exercises or courseware on the Integration Server, though there is a one-day course available from Pervasive Training Services that covers the Integration Server and the Integration Manager.

Integration Manager
Through a browser-based interface, Integration Manager performs deployment, scheduling, on-going monitoring, and real-time reporting on individual or groups of distributed Integration Engines. Since all management is performed from a single administration point, Integration Manager improves operational efficiency in the management of geographically distributed Integration Engines. With the ability to remotely administer any number of integration points throughout the organization, customers can build out their integration infrastructure as required, using a flexible and scalable architecture designed for easy manageability. In other words, the Integration Manager allows the user to schedule and deploy multiple packages (DJAR) amongst multiple Integration Servers across an enterprise.

This document does not have exercises or courseware on the Integration Manager, though there is a one-day course available from Pervasive Training Services that covers the Integration Server and the Integration Manager.


Inside a Simple Integration


Course Setup Instructions


Course Setup Instructions


Installing the Software
When installing on a Windows system you may be required to log on as a local administrator for the installation to succeed. Exit all programs before running the setup. Run the setup and follow the wizard instructions. Select one of the following two installation options:
1. Design Studio - Installs the Designers, utilities, and the Integration Engine.
2. Integration Engine - Installs the Integration Engine and its utilities.

For the purposes of this course we will install the Design Studio. If you are taking this class on site with Pervasive, the software has already been installed on the training modules. Launch the software by clicking on the Repository Explorer 9 icon on the desktop. You will be prompted to load a valid license.

For more information on Windows Default Installation Locations see the Reference Section.

Licensing
A temporary license file will be provided to you by the training services manager. This temporary license will allow you to utilize all of the capabilities of the Integration Platform for at least two weeks. If you are receiving training on site with Pervasive, the license may appear on your desktop. The license file will have a .slc extension.

You may store your license in any directory that you wish. The default location for storing a license on a Windows machine is C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License, and you can store the license in this directory.

After you have determined where the license will reside, double click on the Repository Explorer9 icon on your desktop to launch the software. Choose the option to Browse to a valid license file on disk, and browse to the location where you have stored the license.

Setup Course Directory Structure


Create a folder directly on the C drive and name it Cosmos9_Work. We will use this directory to store all of the course materials. You should possess a zip file named Fundamentals9.zip. This zip file should be provided to you via download by the training services manager. If you are taking this class on site with Pervasive, the zip file should be on your desktop. Unzip the contents of Fundamentals9.zip into the Cosmos9_Work directory that was just created. The resulting directory structure should be C:\Cosmos9_Work\Fundamentals. See the image below.
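
For students who prefer to script this step, the following Python sketch shows one way the folder creation and unzip could be automated. It is an illustration only: the function names are hypothetical, and it assumes the archive contains a top-level Fundamentals folder, which is inferred from the resulting path given above. The demo function uses a stand-in archive in a temporary directory because the real contents of Fundamentals9.zip are not described here.

```python
import tempfile
import zipfile
from pathlib import Path


def setup_course_directory(zip_path, work_root):
    """Create the course work folder and unzip the courseware into it."""
    work_root = Path(work_root)
    # Equivalent to creating the Cosmos9_Work folder by hand.
    work_root.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(work_root)
    # Assumption: the archive unpacks into a Fundamentals subfolder.
    return work_root / "Fundamentals"


def demo():
    """Exercise the helper against a stand-in archive in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        fake_zip = tmp / "Fundamentals9.zip"
        # Stand-in archive with a single placeholder file.
        with zipfile.ZipFile(fake_zip, "w") as archive:
            archive.writestr("Fundamentals/Data/placeholder.txt", "sample")
        course_dir = setup_course_directory(fake_zip, tmp / "Cosmos9_Work")
        found = (course_dir / "Data" / "placeholder.txt").is_file()
        return course_dir.name, found
```

On a course machine the call would look like setup_course_directory(r"C:\path\to\Fundamentals9.zip", r"C:\Cosmos9_Work"), where the zip location is whatever folder the file was downloaded to.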


Configuring Database Connectivity for Hands-on Exercises


The exercises in the Solutions folder of the Fundamentals training bundle are built using an ODBC connection. To set up, the student must establish an ODBC connection called TrainingDB to their preferred database. Any relational non-production database can be used in the classroom. Be aware that the login used for the connection must have sufficient permissions for creating and deleting tables in the database. There is an Access database provided in the training bundle. The Access database requires no additional software. Follow the steps below to create the ODBC connection to the Access Database in the training bundle. While ODBC allows us to use a more flexible middleware connection to databases, please be aware that you will generally have better performance and more functionality if you use the native client interfaces instead of an ODBC driver. 1. From the Start Menu choose Programs Administrative Tools Data Sources (ODBC), or from the Start Menu choose Control Panel Administrative Tools Data Sources (ODBC). 2. In the ODBC Data Source Administrator create a new User DSN by clicking Add. See image below.


3. Choose the Microsoft Access Driver (*.mdb).
4. Set the Data Source Name to TrainingDB, and click the Select button.
5. Browse to the folder C:\Cosmos9_Work\Fundamentals\Data and select the TrainingDB.mdb database.
6. Click OK.


Workspaces and Repositories


Workspaces and Repositories Defined


Workspaces
A workspace is a directory location on your system that allows you to organize your integration designs. All workspaces will reside inside of a common Workspace Root Directory. Your default workspace root directory is C:\Documents and Settings\username\Cosmos9_Work. Your default workspace location is C:\Documents and Settings\username\Cosmos9_Work\Workspace1. Every workspace must have at least one repository.

Repositories
Repositories are used to store the maps, processes, and schemas that make up your integration designs. The repository is typically a folder in a workspace directory; however, the repository folder does not have to physically reside within the workspace folder to belong to the workspace. You may have many repositories within a workspace. You are required to have at least one. A default repository is created within the default workspace. There is more information about repositories in the next section. Your default repository location is C:\Documents and Settings\username\Cosmos9_Work\Workspace1\xmldb.


Repository Explorer


Repository Explorer - Defined


Repository Explorer is at the heart of the integration product design environment. From this central location, you can launch all of the Design Tools: Map Designer, Process Designer, Join Designer, Extract Schema Designer, Structured Schema Designer, Source View Designer and Document Schema Designer.

Repository Explorer allows you to create and explore multiple Repositories for any given Workspace. This lets you separate your metadata according to the needs of your project. The following are two scenarios that you might choose.

1. Create separate repositories for the Development, QA, and Production phases of your project, and promote your specification files from one repository to the next as you advance through each of these phases. Choose this scenario when projects are defined to belong to separate Workspaces. Projects should belong to separate Workspaces when the data being transformed does not use the same input and output directories or the same databases.
2. Create separate repositories representing different projects altogether. Choose this scenario when projects are designed to belong to the same Workspace. These projects should have common threads, i.e., they use the same input/output directories or access the same databases.

Change the Current Workspace Root Directory
Select File > Manage Workspaces (Ctrl+Alt+W). Change the Workspaces Root Directory to the Cosmos9_Work folder that was created on your C drive. This will allow you to use a list of Repositories and Macro definitions specific to your current Workspace.

Modify the Default Repository in the Current Workspace
Click on the Repositories button in the bottom right-hand corner of the Workspaces dialog box. When you change the Root Directory, a default Workspace and Repository will be created. We are going to modify the default for use during training. Change the name xmldb to Fundamentals and navigate to the folder C:\Cosmos9_Work\Fundamentals by clicking the Find button. We will use this Repository to store all of the XML schema and metadata for the training exercises.


Splash Screen Licensing and Version Information


Description
Splash Screen - Shows the Splash Screen for Repository Explorer.
Credits - Gives a list of credits for third-party software components used by the Product.
Version - Displays the following sections:
o License Name: Displays the path to the Product License file and the License file name.
o Serial Number: Displays the Product serial number.
o Version: Displays the Product build version number.
o Subscription Ends: Displays the date the license file will expire.
o Users: Displays the number of users licensed for the Product.
o Single User License For:
  Name: Name of the person licensed for the Product.
  Company: Name of the company licensed for the Product.
Licensed Features - Displays all of the Connectors, Features and Products that are licensed in the Product.
Support - Displays the Technical Support address, phone/fax number, and web address.


Map Designer Fundamentals of Transformation


Map Designer The Foundation


The Map Designer delivers the ease of an intuitive GUI for visually and directly mapping Source data to Target structures while allowing the user to manipulate the data in virtually limitless ways. The Map Designer tool enables the user to create the specifications for a transformation. A transformation reads one or more source files record by record, applies to each record whatever calculations, filters, checks, etc., are defined and then may write one or more records to one or more target files. The user employs a three-tab, graphical interface to describe the source(s), target(s) and processing logic. Source connectors describe the source file(s) and target connectors describe the target file(s).
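The record-by-record flow described above can be pictured outside the product with a small sketch. The Python below is only an illustration and not how Map Designer is implemented; the sample data, the field widths, and the function name transform are assumptions made for the example. It reads a delimited source one record at a time and writes one fixed-width target record for each.

```python
import csv
import io

# Hypothetical sample data standing in for a delimited source such as Accounts.txt.
SOURCE_TEXT = "Account Number,Name,State\n1001,Ann Lee,TX\n1002,Bob Ray,CA\n"

# Hypothetical fixed widths for each target field.
WIDTHS = {"Account Number": 6, "Name": 10, "State": 2}


def transform(source_text, widths):
    """Read delimited records one at a time and emit fixed-width lines."""
    reader = csv.DictReader(io.StringIO(source_text))  # header row supplies field names
    target_lines = []
    for record in reader:  # one source record per iteration
        # Map each source field to the target field of the same name,
        # truncating or padding the value to the fixed width.
        line = "".join(record[name][:w].ljust(w) for name, w in widths.items())
        target_lines.append(line)  # one target record per source record
    return target_lines


fixed_records = transform(SOURCE_TEXT, WIDTHS)
for line in fixed_records:
    print(line)
```

Each target line has the same total width (18 characters here), which is the property that distinguishes a fixed-width target from a delimited one.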


Interface Familiarization
Objectives The Map Designer icons offer you shortcuts when you are creating, modifying, and viewing maps. Here is information pulled from the Help File about the icons and their descriptions. Descriptions


Basic Map
Objectives: At the end of this lesson you should understand the Source and Target tabs and be able to use the new Simple Map view to create a Transformation.
Keywords: Drag and Drop Mapping
Description: In this exercise we will follow the basic steps in the flow chart below and create a simple map.

Exercise

Define the Source:
1. Open Map Designer.
2. There are 3 tabs. The first tab is selected for defining the source.
3. Locate the textbox labeled Source Connection and click the down arrow. This will open the Select Connection dialog box pictured below. Notice there are three additional tabs.
Note: The first time you open this dialog it will open on the Factory Connections tab. Afterwards it will default to the Most Recently Used tab. We will discuss the User Defined Connections tab in a future exercise.
4. Choose the ASCII (Delimited) connector and click OK.
5. Next to the textbox labeled Source File/URI, click the down arrow to select a file. Browse to the Accounts.txt file in the C:\Cosmos9_Work\Fundamentals\Data folder.
6. In the ASCII (Delimited) Properties box on the right side of the Source tab, find the Header property and set it to True. Then click the Apply button under the Properties list.
Note: Any time you make a change in the source or target properties, you will have to click Apply to save the changes.
7. Use the toolbar icon to open the Source Data Browser. If you see data records, then you have connected to the source. Close the Browser.

Define the Target:
8. Click on the Target Connection tab.
9. In steps 3-6 above, we chose a source connection. Create a Target connection in the same way. This time choose ASCII (Fixed) as the connector type.
10. In the Target File/URI drop-down, browse to the C:\Cosmos9_Work\Fundamentals\Data folder.
11. Type Accounts_Fixed.txt as the file name, and click Open.
Note: This file does not exist and will be created when we run the transformation.

Map the Fields:
12. Click on the Map tab (yellow tab).
13. If you see two quadrants on this page, then you are set to the Map Fields view and you will need to follow the next steps. If not, you can skip to step 16.
14. From the menu click View > Preferences. Click the General tab. Check Always show Map All view. We will be working in the Map All view for the remainder of the course.
15. To return to the Simple Map View, simply click on the Simple Map View icon in the toolbar.
16. To map the fields, drag the asterisk from the box labeled All Fields in the source, and drop it under the Target field name header.
17. Notice that the target has been filled out with field names identical to the source, and that the Target Field Expressions are filled out as well. Validate the Transformation using the check mark icon on the toolbar.
18. If the map is valid, click OK.
19. Save the Map as m_BasicMap.map.xml in the C:\Cosmos9_Work\Fundamentals\Development folder.
20. Click the Run Map icon to run the transformation.
21. Click the Target Data Browser and note your results.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: C:\Cosmos9_Work\Fundamentals\Data\Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Fixed)
Target Data: C:\Cosmos9_Work\Fundamentals\Data\Accounts_Fixed.txt
Target OutputMode: Replace

Target Field Expressions:
Target Field            Target Field Expression
R1.Account Number       Records("R1").Fields("Account Number")
R1.Name                 Records("R1").Fields("Name")
R1.Company              Records("R1").Fields("Company")
R1.Street               Records("R1").Fields("Street")
R1.City                 Records("R1").Fields("City")
R1.State                Records("R1").Fields("State")
R1.Zip                  Records("R1").Fields("Zip")
R1.Email                Records("R1").Fields("Email")
R1.Birth Date           Records("R1").Fields("Birth Date")
R1.Favorites            Records("R1").Fields("Favorites")
R1.Standard Payment     Records("R1").Fields("Standard Payment")
R1.Payments             Records("R1").Fields("Payments")
R1.Balance              Records("R1").Fields("Balance")

Define Events:
Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1


Connectors and Connections Methods of Accessing Data


The connectors are what the Integration Platform uses to read and write data in Map Designer and the other design tools. They are an integral part of the software in that all of the low-level, complex data access programming has been abstracted to a simple form for the user to complete by using drop-down menus and pick lists.


Factory Connections
Objectives: At the end of this lesson you will be able to find and use the appropriate data access Connector.
Keywords: Connectors List, Connection Menu, and Source Connection tab

Description: Factory Connections contains a list of all of the Connectors available to you in Map Designer. Type the first letter of a Connector name to jump to that Connector in the list (or the first one in the list with that letter). For instance, suppose you want to choose Btrieve v7. Type "B", and BAF will appear. From there, you can scroll down to Btrieve v7 and select it.

The Map Designer Connector Toolbar offers you shortcuts to this dialog. Here are the icons and their descriptions:

New - Allows you to clear the Source tab and define a new source connection.
Open Source Connection - Allows you to open the Select Connection dialog to access the:
o Most Recently Used Tab
o Factory Connections Tab
o User Defined Connections Tab
Save Source Connection - Allows you to save the selected connector type, and any properties that you have defined for a source, as an .sc.xml file. The advantage of doing this is that you can reuse the Connection in any subsequent Map design. This saved connection will become a User Defined Connection. We will discuss User Defined Connections in a later topic.
Source Connector Properties - Opens the Source Properties dialog box. These are the same properties available via the Source Connection tab, and are dependent upon the Connector to which you are connected. This icon will be active only when you are on the Map tab.


Macro Definitions
Objectives: At the end of this lesson you will be able to define and use Macros in connection strings.
Keywords: Macros, Macro Definition File, Workspace
Description: Macros are symbolic names assigned to text strings, and are usually used to represent file paths. You should use macros as a tool to aid in the movement of integration files from one life cycle environment to the next. A macro definition file is an XML file that contains name/value pairs. This file is named macrodef.xml and resides in your Workspace directory. Each workspace reads only one macrodef file, so the scope of the macros in a macrodef file is the whole workspace. These macro names can be used throughout a map or process to provide connection information. For example, a macro name can be substituted in the following connector options:
o Server Name or IP Address
o Database name
o UserID
o Password
o File or table connection paths
We will create a new macro that we can use to represent the Data sub-directory for our Training Repository. This will allow us to port the schema files more readily from one workstation to another or deploy to servers for execution by Integration Engine.

Exercise
1. Select the menu item Tools > Define Macros. Notice there is already a macro that is set to the default location of the current Workspace.
2. Click New.
3. Enter FUN_DATA as the Macro Name.
4. Click the Macro Value drop-down button, navigate to our workspace, highlight the C:\Cosmos9_Work\Fundamentals\Data folder, and click OK.
5. Add a backslash (\) to the end of the macro value.
6. Enter a description if you wish and click OK.
7. On the Source Connection tab, highlight the portion of the connection string you wish to replace (e.g., C:\Cosmos9_Work\Fundamentals\Data\).
8. From the menu bar, select Tools > Paste Macro String.
9. Click on the row of the Macro you want to use (e.g., FUN_DATA). Map Designer uses the syntax $(FUN_DATA) to represent the entire path to the Data folder.
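To make the $(NAME) substitution concrete, here is a minimal Python sketch of how such a token could be expanded into a full path. It is an illustration only: the dictionary stands in for the name/value pairs that the product keeps in macrodef.xml, and the function name expand_macros and the choice to leave unknown macros untouched are assumptions, not documented product behavior.

```python
import re

# Hypothetical macro table; in the product these name/value pairs live in macrodef.xml.
MACROS = {"FUN_DATA": "C:\\Cosmos9_Work\\Fundamentals\\Data\\"}

# Matches tokens of the form $(NAME).
MACRO_PATTERN = re.compile(r"\$\(([A-Za-z0-9_]+)\)")


def expand_macros(text, macros):
    """Replace each $(NAME) token with its value; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return macros.get(name, match.group(0))
    return MACRO_PATTERN.sub(substitute, text)


resolved = expand_macros("$(FUN_DATA)Accounts.txt", MACROS)
print(resolved)
```

Because the macro value ends with a backslash (step 5), the file name can follow the token directly. Moving the designs to another machine then only requires changing the one macro value rather than every connection string.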


Root Macro
If you will often be selecting files from the same directory, you can set the Root Macro for automatic substitution. Click Tools > Define Macros. Highlight the Macro you want to use as the root directory and click the Set as Root button.

Then set the automatic substitution switch in Map Designer under View > Preferences > Directory Paths: choose Substitute root MACRO. This step is optional, and is a design preference.


User Defined Connections


Objectives: At the end of this lesson you will be able to define and reuse a User Defined Connection.
Description: User Defined Connections are created by saving a Source or Target Connection along with any property options. The Connections are saved as either a .sc.xml (Source) or .tc.xml (Target) file in your Workspace/connections directory. User Defined Connections are reusable. You can create as many as you would like.

Exercise
1. Reopen the Transformation built previously named m_BasicMap.map.xml and view the Source Connection tab.
2. Use the Macro created in the last exercise for the path to the Accounts.txt file on the Source tab.
3. Using the Connector Toolbar to the right side of the Connection field, click the Save icon.
4. Save the source connection as Accounts_Delimited.sc.xml.
5. Close the current Map and open a new map design.
6. Select the Source Connection drop-down and click the User Defined Connections tab. Click on the Connections folder and select the Accounts_Delimited.sc.xml connection.


Basic Transformation Features


This section describes certain features for manipulating data that are built into the Map Designer such as sorting, filtering, updating or deleting data.


Source Data Features - Sort


Objectives: At the end of this lesson you should be able to apply a sorting function to your source data.
Keywords: Source Key and Sorting Dialog
Description: We view sorting in our transformations from two angles. First, it is often necessary that the target file be in a certain order. While this doesn't usually matter in database targets, it can be essential when other file structures are being produced. Second, and perhaps more importantly, transformations may be designed much more efficiently if we can rely on the source file being in a certain sequence. Assume that you have a code of some sort in each source record and that you must do a lookup or some complicated processing using that code. If the source file were in code sequence, we could perform this logic only once for each code, and then save and use the results until a new code was encountered.

At the outset, we realize that it is not possible to sort the target file itself. Transformations write target records one at a time. However, it is quite possible to sort the source file before it is processed. Doing so will achieve either of the requirements for sorting mentioned above. If the source file is already in the sequence needed for the target, then writing the target one record at a time is no problem. Also, we could sort the source file into a sequence that would enable us to minimize processing time.

To sort the source file before processing, we simply use the Source Keys and Sorting dialog. In this dialog we can specify the field(s) by which we want to sort the source file before processing. We can even sort on a constructed or calculated value. We should realize, however, that when we use the Source Data Browser to view the file, we will not see it in its sorted order, since the sorting is performed once the transformation begins and is done dynamically, in memory. The original file is not changed.

Sorting has its own overhead. Extremely large files can take a long time to sort. If this time becomes a factor, then other strategies may need to be employed. But the benefits gained from having the source file in sequence can be even greater. We will learn in later lessons how sorted source files are a requirement for on-data-change processing, a processing strategy that can dramatically reduce the execution time of a transformation.
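The efficiency argument above (do the costly work once per code when the source is in code sequence) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's implementation: the records, the expensive_lookup function, and its lookup table are all hypothetical stand-ins.

```python
# Hypothetical records standing in for the Accounts source.
records = [
    {"Name": "Ann", "State": "TX"},
    {"Name": "Bob", "State": "CA"},
    {"Name": "Cy", "State": "TX"},
    {"Name": "Di", "State": "CA"},
]

lookup_calls = 0


def expensive_lookup(state):
    """Stand-in for a costly lookup keyed on the State code."""
    global lookup_calls
    lookup_calls += 1
    return {"TX": "Texas", "CA": "California"}[state]


# Sort a copy in memory; the original list (the "file") is left unchanged.
sorted_records = sorted(records, key=lambda r: r["State"])

previous_state = None
cached_name = None
output = []
for record in sorted_records:
    if record["State"] != previous_state:  # key changed: do the lookup once
        cached_name = expensive_lookup(record["State"])
        previous_state = record["State"]
    output.append((record["Name"], cached_name))

print(output)
print("lookups:", lookup_calls)
```

With four records and two distinct State values, the lookup runs twice instead of four times. On an unsorted source the same loop would repeat the lookup every time the State value alternated.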

Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source. Hint: Use the User Defined Connection created in class in an earlier exercise.
2. Click the Source Keys and Sorting icon in the toolbar.
3. On the Sort Options tab, click in the Key Expression box to see the down arrow. Click on the down arrow.
4. Choose the State field to use as a key.
Note: You can choose Build if you want to build a key using an expression to parse out or concatenate parts of different fields. Also, the sort will default to ascending order. If you would prefer to sort in descending order, select "Descending" from the drop-down list.
5. Create a target connection to an ASCII Delimited file called AccountsSortedbyState.txt. This file doesn't yet exist, so you'll have to type in the file name.
6. Set the header to true and click the Apply button.
7. Go to the Map Step by clicking on the Map All tab.
8. Validate the Map.
Note: You may see a dialog box that looks like this. We will go into greater detail on the Default Event Handler and Event Handlers in general later in this courseware.
9. Click OK to accept the Default Event Handler.
10. Save this Map as m_SourceDataFeatures_Sort.map.xml in the Development folder.
11. Run the Map.
12. Notice the results in the status bar.
13. Open the Target Data Browser and notice that the records are sorted by state.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Automatic Transformation Feature:
Sort Fields on Source Data: Fields("State") type=Text ascending=yes length=2

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)AccountsSortedByState.txt
Target Options: Header = True
Target OutputMode: Replace

Target Field Expressions:
Target Field            Target Field Expression
R1.Account Number       Records("R1").Fields("Account Number")
R1.Name                 Records("R1").Fields("Name")
R1.Company              Records("R1").Fields("Company")
R1.Street               Records("R1").Fields("Street")
R1.City                 Records("R1").Fields("City")
R1.State                Records("R1").Fields("State")
R1.Zip                  Records("R1").Fields("Zip")
R1.Email                Records("R1").Fields("Email")
R1.Birth Date           Records("R1").Fields("Birth Date")
R1.Favorites            Records("R1").Fields("Favorites")
R1.Standard Payment     Records("R1").Fields("Standard Payment")
R1.Payments             Records("R1").Fields("Payments")
R1.Balance              Records("R1").Fields("Balance")

Define Events:
Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1


Source Data Features - Filter


Objectives: At the end of this lesson you should be able to apply simple filters to your source data.
Keywords: Source Filter Window, Sample Size, and Target Filter Window
Description: There are two ways to restrict the target file to contain only certain source records. The most flexible way is to supply processing logic in the body of your transformation. Using this approach allows you to implement any desired business rules for filtering data. For example, you might want to exclude from the Target file any account records with invalid Zip codes.

The second way to restrict the number of Source records that are placed in the Target file is to use a filter. You can do almost anything in a filter that you can do in processing logic; however, the virtue of filters is that they are usually easier to establish, change and remove. For example, you may be testing a new Transformation against a file with more than a million records. You have a complex calculation that needs to work properly, but you won't be able to tell if it is working until you look at the very first record in the Target file. You should not have to process all the records just to see the results for the first one. Filters are available for this type of situation.

A source or target filter is a simple criterion that determines whether a source record is to be processed or a target record is to be written. The user has the option of using one of four methods to test each source or target record to see if it should be processed or written. You may (1) process/write only the first N records, (2) process/write all records from record number X to record number Y, (3) process/write every Nth record or (4) supply an expression which, if evaluated to True, causes the record to be processed/written. All of these options are controlled through the Source Filters and Target Filters dialogs.

The user can use either type of filter or even both types at once. Using both types in the same transformation, however, requires some thought. If your objective is to obtain a target file with 100 records, you can use either a source or target filter. You will get the result you want, but only if you do not bypass any records in your own processing logic. As another example, if you filter a 5000-record source file to process only the first 1000 records, and then also supply a target filter to write every 10th record to the target, you will only get 100 target records, not 500. The target filter will be applied to those source records that make it through the source filter. As in sorting, filtering is performed dynamically when the transformation runs. Therefore source filter results are not shown when the Source Data Browser is used.
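The four filter methods and the 5000-record example can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes that an "every Nth record" filter always keeps the first record it sees and then every Nth one after it; it illustrates the arithmetic only and is not the product's implementation.

```python
def first_n(records, n):
    """(1) Keep only the first n records."""
    return records[:n]


def record_range(records, start, end):
    """(2) Keep records numbered start..end (1-based, inclusive)."""
    return records[start - 1:end]


def every_nth(records, n):
    """(3) Keep record 1, then N+1, 2N+1, and so on (assumed behavior)."""
    return records[::n]


def by_expression(records, predicate):
    """(4) Keep records for which the expression evaluates to True."""
    return [r for r in records if predicate(r)]


source = list(range(1, 5001))                 # record numbers of a 5000-record source
after_source_filter = first_n(source, 1000)   # source filter: first 1000 records
after_target_filter = every_nth(after_source_filter, 10)  # target filter: every 10th
print(len(after_target_filter))

# An expression filter comparable to Records("R1").Fields("State") == "TX".
accounts = [{"State": "TX"}, {"State": "CA"}, {"State": "TX"}]
texas_only = by_expression(accounts, lambda r: r["State"] == "TX")
print(len(texas_only))
```

The target filter only ever sees the 1000 records that passed the source filter, which is why the combined result is 100 records rather than 500.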

Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source.
2. Click the Source Filters icon in the toolbar.
Note: The radio buttons at the bottom of the window, under Define Source Sample, let us choose a range of records or process every Nth record from the source. (With the latter option you always get the first record, then every Nth record after it: 1, N+1, 2N+1, 3N+1, and so on.)
3. In this exercise we will keep only the Account records from the state of Texas. We will use the Source Record Filtering Expressions box. This allows us to use the RIFL Scripting Language (see The RIFL Script Editor section) to write an expression that evaluates to True or False. Only the records for which the expression evaluates to True are processed. The expression to use is: Records("R1").Fields("State") == "TX"
4. Create a target connection to an ASCII Delimited file called AccountsinTX.txt. This file does not exist, so type in the file name.
5. Set the header property to true and click the Apply button.
6. Go to the Map Step. Drag all Source Fields to the Target.
7. Validate the Map.
Note: You may see the dialog box pictured below. We will go into greater detail about the Default Event Handler and Event Handlers in general later in this course.
8. Click OK to accept the Default Event Handler.
9. Save this Map as m_SourceDataFeatures_Filter.map.xml in the Development folder.
10. Run the Map, and notice the Results Status Bar.
11. Open the Target Data Browser and notice that the target data set only contains records from Texas.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Automatic Transformation Feature:
Filter Expression: Records("R1").Fields("State") == "TX"

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: $(FUN_DATA)AccountsinTX.txt
Target Options: Header = True
Target OutputMode: Replace

Target Field Expressions:
Target Field            Target Field Expression
R1.Account Number       Records("R1").Fields("Account Number")
R1.Name                 Records("R1").Fields("Name")
R1.Company              Records("R1").Fields("Company")
R1.Street               Records("R1").Fields("Street")
R1.City                 Records("R1").Fields("City")
R1.State                Records("R1").Fields("State")
R1.Zip                  Records("R1").Fields("Zip")
R1.Email                Records("R1").Fields("Email")
R1.Birth Date           Records("R1").Fields("Birth Date")
R1.Favorites            Records("R1").Fields("Favorites")
R1.Standard Payment     Records("R1").Fields("Standard Payment")
R1.Payments             Records("R1").Fields("Payments")
R1.Balance              Records("R1").Fields("Balance")

Define Events:
Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1


Target Output Modes - Replace, Append, Clear and Append


Objectives: At the end of this lesson you should be able to understand and implement each of the target output modes: Replace, Append, and Clear and Append.
Keywords: Output Mode, Replace Mode, Append Mode, Clear and Append Mode, and Schema Mismatch
Description: The target output mode Replace is used in two situations. In the first, the file or table does not yet exist, and in this case Map Designer creates it using the layout you have specified on the Map tab. In the second situation, where the file or table already exists, the Replace mode deletes the file (or drops the table) first, and then recreates it using the layout you have specified on the Map tab.

The target output mode Append adds rows to a target file or table that already exists. If you are working with flat files as your targets, then the only available output modes are Replace and Append.

The output modes are different when a database is the target. Database tables can have indexes and constraints built into them, and there is a critical difference between Replace and Clear and Append. Replace mode effectively drops the table and then recreates it, whereas Clear and Append mode only truncates the table. When you drop a table, you also drop any indexes or constraints that the table might have, while truncation preserves them. You can use Clear and Append as an output mode even if the table does not exist, and the table will be created automatically.

Usually when mapping to a database, one will choose an existing table from a drop-down of the tables that the database contains. As soon as you choose a table, the Target Output Mode will change to Append and the structure of the table will be defined on the Map tab. You can then change the Output Mode to Clear and Append and map your target fields on the Map tab.
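The difference between dropping and truncating a table can be demonstrated with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for the training database, the index name idx_account and the helper functions are hypothetical, and because SQLite has no TRUNCATE statement an unqualified DELETE plays that role. It does not model the case where Clear and Append creates a missing table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def create_table(conn):
    conn.execute("CREATE TABLE tblAccounts (AccountNumber TEXT, Name TEXT)")


def index_names(conn):
    """Return the names of the indexes currently defined on tblAccounts."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'tblAccounts'"
    ).fetchall()
    return [name for (name,) in rows]


def load(conn, mode, rows):
    """Write rows using a stand-in for Replace, Append, or Clear and Append."""
    if mode == "replace":
        conn.execute("DROP TABLE IF EXISTS tblAccounts")  # indexes go with the table
        create_table(conn)
    elif mode == "clear_and_append":
        conn.execute("DELETE FROM tblAccounts")  # rows removed; table and indexes kept
    conn.executemany("INSERT INTO tblAccounts VALUES (?, ?)", rows)


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM tblAccounts").fetchone()[0]


rows = [("1001", "Ann"), ("1002", "Bob")]
create_table(conn)
conn.execute("CREATE INDEX idx_account ON tblAccounts (AccountNumber)")

load(conn, "append", rows)
load(conn, "append", rows)
count_after_append = row_count(conn)        # rows accumulate across runs

load(conn, "clear_and_append", rows)
count_after_clear = row_count(conn)
indexes_after_clear = index_names(conn)     # the index survives

load(conn, "replace", rows)
count_after_replace = row_count(conn)
indexes_after_replace = index_names(conn)   # the index is gone
```

Both Clear and Append and Replace leave two rows in the table, but only the first keeps the index, which is the reason to prefer it for existing database tables that carry indexes or constraints.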

Exercise
1. Connect to the ASCII Delimited file, Accounts.txt, as your source.
2. Create a target connection to the TrainingDB database that we have set up previously. The table is called tblAccounts. Note that when we connect to this table, the output mode is automatically set to Append because the table already exists. Let's change the output mode to Replace.
3. Go to the Map Step.
Note: In this case we already have target fields defined. This metadata (field names, field lengths, and data types) is defined by the database. Notice also that some fields are mapped and some are not. The Simple Map view does an automatic Match by Name that pulls in field names that are exact matches from source to target. We will have to do the rest by hand.
4. For the AccountNumber field, click inside the target field expression, and then click the down arrow.
5. Choose Account Number. (Note the space, which is not present in the target field name. That's why Match by Name failed.)
6. Now do the same for each of the remaining fields. Look at the charts below for specific mapping if needed.
7. Alternatively, we could have right-clicked in the AccountNumber Target Field Expression and chosen Match by Position. In this case, we would have mapped all of our source fields into the target fields correctly. However, it is not always the case that field names will be in the same position order in the source and target.
8. Run the map by clicking the Run button.
9. Accept the Default Event Handler if necessary.
10. Notice the results in the Target Data Browser. Note the number of records in the table.
11. Now let's go back to the Target Connection tab and set the Output Mode to Append.
12. Click the Run button.
13. Notice the results in the Target Data Browser. Note the number of records in the table.
14. Now change the Output Mode to Clear File/Table contents and Append.
15. Run the map and note the results.
16. Save this map as m_OutputModes_Clear_Append.map.xml.
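The two matching strategies mentioned in steps 3 and 7 can be contrasted with a small Python sketch. The function names and the three-field sample are hypothetical, and the sketch assumes that Match by Name requires an exact, character-for-character name match, as the note in step 3 suggests.

```python
# Hypothetical subset of the source and target field names from this exercise.
source_fields = ["Account Number", "Name", "Birth Date"]
target_fields = ["AccountNumber", "Name", "BirthDate"]


def match_by_name(source, target):
    """Map a target field only when a source field has exactly the same name."""
    return {t: (t if t in source else None) for t in target}


def match_by_position(source, target):
    """Map the first source field to the first target field, and so on."""
    return dict(zip(target, source))


by_name = match_by_name(source_fields, target_fields)
by_position = match_by_position(source_fields, target_fields)
print(by_name)
print(by_position)
```

Matching by name leaves AccountNumber and BirthDate unmapped because of the space in the source names, while matching by position maps all three. Matching by position is only safe when both field lists are known to be in the same order.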

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: none
Target OutputMode: Clear File/Table Contents and Append

Target Field Expressions:
Target Field            Target Field Expression
R1.AccountNumber        Fields("Account Number")
R1.Name                 Fields("Name")
R1.Company              Fields("Company")
R1.Street               Fields("Street")
R1.City                 Fields("City")
R1.State                Fields("State")
R1.Zip                  Fields("Zip")
R1.Email                Fields("Email")
R1.BirthDate            Fields("Birth Date")
R1.Favorites            Fields("Favorites")
R1.StandardPayment      Fields("Standard Payment")
R1.LastPayment          Fields("Payments")
R1.Balance              Fields("Balance")

Define Events:
Source R1 Events
Event Name: AfterEveryRecord
Event Actions: ClearMapPut Record
Event Parameters: target name = Target, record layout = R1


Target Output Modes - Delete


Objectives: At the end of this lesson you should be able to understand and implement the target output mode: Delete.
Keywords: Output Mode, Delete
Description: The Delete mode is only available when your target is a relational database or an ODBC data source. When you select Delete From File/Table, Map Designer will search the Target data for a match in a key field or fields which you have defined. Therefore, when you select Delete From File/Table, you must also define a key using the Target Keys/Index window.

When you want to delete specific records from an existing table, you should use the target output mode Delete. Using the Delete mode requires that at least one field in the existing target contains values that match those in one field in the source file. Since the target table already exists, as soon as you specify it and set the output mode to Delete on the Target tab, you will find the target table's fields listed on the Map tab. Map Designer assumes that the first target field is the key field. If there are additional key fields, highlight them, right-click in the highlight, and choose the Set as Action Key option. (If you need to remove a key, simply highlight the field, right-click in the highlight and choose the Unset Action Key option.) Next, you must map values to each of the fields that have action keys. Finally, use the Target Keys and Output Mode Options button to specify whether you want to delete all matching records from the target or just the first one found.

When the ClearMapPut Action is triggered, the contents of the key field(s) in the target buffer are compared to all records in the target file, and either the first match or all matches are deleted.
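The key-matching behavior of Delete mode can be sketched with Python's built-in sqlite3 module. This is an illustration only: SQLite stands in for the training database, the sample rows and the function name delete_matching are hypothetical, and "first match" is assumed to mean the matching row with the lowest SQLite rowid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblAccounts (AccountNumber TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO tblAccounts VALUES (?, ?)",
    [("1001", "Ann"), ("1002", "Bob"), ("1002", "Bob (duplicate)"), ("1003", "Cy")],
)


def delete_matching(conn, key_value, all_matches=True):
    """Delete target rows whose key equals key_value; return how many were removed."""
    if all_matches:
        cursor = conn.execute(
            "DELETE FROM tblAccounts WHERE AccountNumber = ?", (key_value,)
        )
    else:
        # Remove only the first matching row, identified by SQLite's rowid.
        cursor = conn.execute(
            "DELETE FROM tblAccounts WHERE rowid = ("
            "SELECT MIN(rowid) FROM tblAccounts WHERE AccountNumber = ?)",
            (key_value,),
        )
    return cursor.rowcount


inactive_accounts = ["1002", "9999"]  # hypothetical source keys; 9999 has no match
deleted_first_run = sum(delete_matching(conn, key) for key in inactive_accounts)
deleted_second_run = sum(delete_matching(conn, key) for key in inactive_accounts)
remaining = conn.execute("SELECT COUNT(*) FROM tblAccounts").fetchone()[0]
print(deleted_first_run, deleted_second_run, remaining)
```

The second run removes nothing because the matching rows are already gone, so a Delete mode map shows visible results only the first time it runs against a freshly loaded table.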

Exercise
1. Connect to the ASCII Delimited file InactiveAccounts.txt as your source.
2. Set the Header property to True and click Apply, as we have done previously.
3. Create a target connection to the TrainingDB database that we set up previously. Connect to tblAccounts. Note that when we connected to this table, because it already existed, our output mode was automatically set to Append. Let's change it to Delete.
4. Note that the target field AccountNumber was automatically set as the key field. Map the source field Account Number to the target field AccountNumber.
5. Validate the map, and accept the Default Event Handler.
6. Click the Run button.
7. Notice the results in the Target Data Browser. Note the number of records in the table.
8. Be aware that you will only see results the first time you run the map, because the matching records are removed on the first run and no longer exist. You will need to load the original source records into the target table before you run the Delete mode map a second time. Assuming that you correctly ran the previous map in Clear and Append mode, you can run it again to prime the table.
9. Save this map as m_OutputModes_Delete.map.xml.

Map Summary:

Define the Source:
Source Connector:    ASCII(Delimited)
Source Data:         $(FUN_DATA)InactiveAccounts.txt
Source Options:      Header = True

Define the Target:
Target Connector:    ODBC 3.x
Target Data:         Database: TrainingDB; Table: tblAccounts
Target Options:      none

Target OutputMode: Delete

Target Field Expressions:
R1.AccountNumber    Fields("Account Number")

Define Events: Source R1 Events
Event Name          Event Action          Event Parameters
AfterEveryRecord    ClearMapPut Record    target name: Target; record layout: R1


Target Output Modes - Update


Objectives
At the end of this lesson you should be able to explain exactly what the target output mode Update does. You should also be able to write a transformation that updates specific records in an existing file or table.

Keywords: Output Mode, Update File and Schema Mismatch

Description
The Update mode is only available when your target is a relational database or an ODBC data source. When you select Update File/Table, Map Designer searches the target data for a match with a key field or fields that you have defined. You must also define a key using the Target Keys/Index window.

Update mode is similar in operation to Delete mode. You must indicate which of the current target fields will be used as the key to identify records to be updated, and each of these key fields must have a mapping expression. You must also determine whether the target table may contain records with duplicate keys, because you may wish to update all of them or just the first one found.

When Map Designer finds a matching record, the options set in the Target Keys and Output Mode Options dialog control whether and how an update is performed. You may update just the first matching record found or all of them (if the target contains records with duplicate keys). For each of those options, you may decide whether or not to insert new records (ones that don't match any record in the target). Alternatively, you can ignore matching records and simply insert those that don't match any record currently in the target file.

Most importantly, you must specify in your design which fields will be updated, and with what new values. Your options are to update every target field with the current result of its Target Field Expression (even if that result is null) or to update only the fields that actually have Target Field Expressions. The second option is faster if the fields you need to update are a small subset of all the fields in the target file.

Mapping plays a much more critical role in Update mode than in Delete mode, where no mapping other than that of the key fields has any meaning. In Update mode, you can choose to update all the fields or just the ones with expressions. Be careful when choosing the Update All Fields option. Because it is not the common practice, you must explicitly select the radio button marked "Allow null values to overwrite data in target fields" in the Target Keys, Indexes and Options dialog box. When you do, fields that don't have expressions won't simply be left alone; they will be cleared.
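The interaction between the Update options is easier to see in a small model. This is a hedged Python sketch of the semantics described above, not the product's code; the function apply_update and its keyword flags (all_matches, insert_non_matching, only_mapped_fields) are hypothetical names standing in for the dialog options.

```python
def apply_update(target_rows, key_fields, mapped, *, all_matches=True,
                 insert_non_matching=True, only_mapped_fields=True):
    """Update target_rows in place and return the number of rows updated.

    `mapped` holds the evaluated Target Field Expressions; it must contain
    a value for every key field.
    """
    matches = [row for row in target_rows
               if all(row[field] == mapped[field] for field in key_fields)]
    if not matches:
        if insert_non_matching:
            target_rows.append(dict(mapped))   # insert the non-matching record
        return 0
    if not all_matches:
        matches = matches[:1]                  # update the first match only
    for row in matches:
        if only_mapped_fields:
            row.update(mapped)                 # unmapped fields are left alone
        else:
            for field in row:                  # unmapped fields become null (None)
                row[field] = mapped.get(field)
    return len(matches)


# Hypothetical sample row.
table = [{"AccountNumber": "01", "Name": "Old", "City": "Austin"}]

updated = apply_update(table, ["AccountNumber"], {"AccountNumber": "01", "Name": "New"})
city_after_mapped_only = table[0]["City"]

apply_update(table, ["AccountNumber"], {"AccountNumber": "01", "Name": "Newer"},
             only_mapped_fields=False)
city_after_all_fields = table[0]["City"]
```

The second call shows the risk the text warns about: with the "all fields" behavior, City has no expression and is overwritten with a null.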

Exercise
From within the Transformation Map Designer:
1. Connect to the ASCII Delimited file AccountsUpdate.txt as your source.
2. Set Header to True and click Apply, as we have done previously.
3. Create a target connection to the TrainingDB database that we set up previously. The table is called tblAccounts. Note that when we connected to this table, because it already existed, our output mode was automatically set to Append. Let's set it to Update.
4. Go to the Map step. Note: In this case the target fields are already defined. This metadata (field names, field lengths, data types) comes from the description of the table in the database.
5. Click inside the Target Field Expression for the AccountNumber field, and then click the down arrow.
6. Choose Fields("Account Number"). Note that the source field name contains a space that the target field name does not, so using Match by Name to map the data would fail.
7. Choose the corresponding source field name for each Target Field Expression as we have just done. Look at the charts below for the specific mapping if needed.
8. Alternatively, we could have right-clicked in the AccountNumber Target Field Expression and chosen Match by Position. In this case, that would have mapped all of our source fields into the target fields correctly. However, source fields and target fields will not always be in the same position.
9. Notice that the target field AccountNumber was automatically set as the key field.
10. Open the Target Keys, Indexes and Options dialog box. Note all the options that are possible in Update mode. In this case the defaults, "Update all matching records and insert non-matching records" and "Update only mapped fields", are sufficient. The Update All Fields option would give us the same results here, since we have mapped all fields.
11. Click the Run button.
12. Accept the Default Event Handler.
13. Notice the results in the Target Data Browser. Note the number of records in the table.
14. When we run this map we update the records, so unless you restore the table to its original contents before you run the map again, you won't see any change. You can run the map we created for the Clear and Append mode exercise and then run the Delete mode map before re-running this map.
15. Save this map as m_OutputModes_Update.map.xml.

Map Summary:

Define the Source:
Source Connector:    ASCII(Delimited)
Source Data:         $(FUN_DATA)AccountsUpdate.txt
Source Options:      Header = True

Define the Target:
Target Connector:    ODBC 3.x
Target Data:         Database: TrainingDB; Table: tblAccounts
Target Options:      none

Target OutputMode: Update

Target Field Expressions:
R1.AccountNumber      Fields("Account Number")
R1.Name               Fields("Name")
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          Fields("Birth Date")
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("Standard Payment")
R1.LastPayment        Fields("Payments")
R1.Balance            Fields("Balance")

Define Events: Source R1 Event Handlers
Event Name          Event Action          Event Parameters
AfterEveryRecord    ClearMapPut Record    target name: Target; record layout: R1


The Rapid Integration Flow Language (RIFL) Script Editor


The RIFL Script Editor is where you write your own scripts (expressions) to include in your transformations. The editor lists all of the functions available in the Rapid Integration Flow Language (RIFL) and shows the syntax for each one. Examples for each function are included in the help files. Using point-and-click and drag-and-drop, with very little typing, you can build accurate and valid RIFL scripts that manipulate and validate data during transformations.


RIFL Script - Functions


Objectives
At the end of this lesson you should be able to use one-line RIFL scripts that are simple calls to pre-defined functions like NamePart or DateValMask, or that concatenate two source fields into a single target field. Local variables, line continuation and comments will also be discussed.

Keywords: Function Builder, Len, Trim, NamePart, DateValMask, comments, continuation character

Description
In this exercise we will manipulate our data as we run the transformation. To do this we will use RIFL in the Target Field Expressions of the fields that we want to manipulate. We'll work with both the Field Mapping Wizard and the RIFL Script Editor. We'll be working with the name and the date fields.

Exercise
In Map Designer:
1. Connect to the ASCII Delimited file Accounts.txt as the source, using the user-defined connection Accounts_Delimited.sc.xml.
2. Create a target connection to the TrainingDB database using the ODBC 3.x connector. The table is tblAccounts. Set the output mode to Clear File/Table contents and Append.
3. Go to the Map step.
4. Map all fields correspondingly except for the Name and BirthDate fields. The first field that we'll work with is the BirthDate field. In our source, the birth date field holds string data that appears as 11/12/1975. Most databases will not accept a string value in a date or datetime field, so we have to convert the date using the RIFL function DateValMask in the Target Field Expression.
5. Double-click in the BirthDate field's Target Field Expression, or select the drop-down and choose Build Expression.
6. The RIFL Script Editor for the BirthDate field opens. The built-in functions are listed in the lower right-hand pane.
7. Find the function DateValMask and select it. In the windows below you will see a description of the function and its parameters.
8. Double-click the function to add it to the Script Editor. The function appears along with its parameters. Use the next steps to define the parameters.
9. In the script, highlight the parameter DateString. In the lower left pane click Source R1. Then, in the lower right pane, double-click Birth Date.
10. Highlight Mask and type "mm/dd/yyyy". Masks are used in many RIFL functions. To find out what values to use for masks, look in the Help files for the topic Picture Mask.
11. The next field we will manipulate is the Name target field. The source data names are in the format First Middle Last. A sample from the first record is George P Schell. We would like the Name field in the target to have the format Last, First Middle Initial. Example: Schell, George P.
12. In the RIFL Script toolbar, click the Show Expression Tree icon.
13. Select the Name field for the target.
14. Delete the Fields("Name") value, or any other value, in the editor pane so that it's blank.
15. In the lower right pane select the NamePart function.
16. Double-click the NamePart function to add it to the Script Editor.
17. In the editor window select the Mask parameter. Type "l" (a lowercase L in double quotes).
18. Select the Name parameter. Pull in the source field Name as we did above for Birth Date. (See step 9.)
19. The script that we have created returns only the last name. We will have to parse the other parts of the name and use the concatenation icon to create the full name in the desired format.
20. Use the concatenation operator to add a comma and whitespace to the name format. Write the following script:

    NamePart("l", Records("R1").Fields("Name")) & ", " & _
    NamePart("f", Records("R1").Fields("Name")) & " " & _
    NamePart("mi", Records("R1").Fields("Name"))

    Note: Logically, this script is a single line. A space followed by an underscore acts as a continuation character that lets us carry the script over to the next line, which makes the script easier to read.
21. Validate the script syntax by selecting the Validation icon.
    Troubleshooting tips: Verify that the script is written as it appears above. Make sure there aren't any trailing spaces after the continuation characters (underscores).
22. Click OK in the RIFL Script Editor and save this map as m_RIFLScript_Functions.map.xml in the Development folder.
23. Run the map and note the results.
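To check what the two target expressions in this exercise should produce, it can help to reproduce them in a general-purpose language. The Python sketch below is an approximation of the intended results, not a reimplementation of NamePart or DateValMask; the helper names reorder_name and parse_birth_date are hypothetical, and the sketch assumes every name has at least a first and a last part.

```python
from datetime import datetime


def reorder_name(full_name):
    """Turn 'First Middle Last' into 'Last, First M' (middle initial only if present)."""
    parts = full_name.split()
    first, last = parts[0], parts[-1]
    middle_initial = parts[1][0] if len(parts) > 2 else ""
    # rstrip removes the trailing space left behind when there is no middle name.
    return f"{last}, {first} {middle_initial}".rstrip()


def parse_birth_date(text):
    """Convert an mm/dd/yyyy string into a date value, as the mask in the exercise implies."""
    return datetime.strptime(text, "%m/%d/%Y").date()


name = reorder_name("George P Schell")
birth_date = parse_birth_date("11/12/1975")
```

Comparing a few source rows against this kind of reference is a quick way to confirm that the mask strings in the RIFL expressions were typed correctly.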

Map Summary:

Define the Source:
Source Connector:    ASCII(Delimited)
Source Data:         $(FUN_DATA)Accounts.txt
Source Options:      Header = True

Define the Target:
Target Connector:    ODBC 3.x
Target Data:         Database: TrainingDB; Table: tblAccounts
Target Options:      none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:
R1.AccountNumber      Fields("Account Number")
R1.Name               NamePart("l", Records("R1").Fields("Name")) & ", " & _
                      NamePart("f", Records("R1").Fields("Name")) & " " & _
                      NamePart("mi", Records("R1").Fields("Name"))
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          DateValMask(Fields("Birth Date"), "mm/dd/yyyy")
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("Standard Payment")
R1.LastPayment        Fields("Payments")
R1.Balance            Fields("Balance")

Define Events: Source R1 Event Handlers
Event Name          Event Action          Event Parameters
AfterEveryRecord    ClearMapPut Record    target name: Target; record layout: R1


RIFL Script - Flow Control


Objectives
At the end of this lesson you should be able to use multi-line RIFL scripts that use one of the flow-control structures, such as an If-Then-Else statement.

Keywords: Flow Control, If Then Else, Discard, IsDate and DateValMask functions, Editor Properties

Description
Flow control is the management of data flow. As used in RIFL, it is the management of where and/or how a particular piece of source data is mapped into the target. The most commonly used flow-control construct is the If Then Else statement:

    If this statement about my data is true Then
        Execute this statement.
    Else
        Execute this statement.
    End If

This exercise evaluates the dates in the source file to determine whether they are valid. If the date for a record is valid, we write the record to the target. If the date is not valid, we write a message to the log file and discard the record so it isn't written to the target.

Exercise
1. Create a new map and connect to the source and target listed below.
2. On the Map tab, map all fields as before except for the BirthDate field.
3. Open the RIFL Script Editor in the BirthDate field.
4. In the lower left pane of the RIFL Script Editor, above All Functions, click Flow Control. In the lower right pane, double-click If Then Else.
   Notice that the RIFL Script Editor puts the syntax for the If Then Else statement into the editor window. We will replace "condition" with a statement that evaluates to true or false. Statement block one becomes the actions that take place if the statement is true. Statement block two becomes the actions that take place if the statement is false.
5. Enter the following script, replacing what is in the editor. The script occupies ten lines, with line 03 left blank:

    Dim d
    d = Records("R1").Fields("Birth Date")

    If IsDate(d) Then
        DateValMask(d, "mm/dd/yyyy")
    Else
        LogMessage("Warn", "Account Number " & Records("R1").Fields("Account Number") & _
            " has an invalid date: " & d)
        Discard()
    End If

   Line 01 declares a local variable d that is available only in this script.
   Line 02 sets d to the value contained in the Birth Date field in the source.
   Line 04 uses the IsDate function to determine whether the string can be converted to a valid date.
   Line 05 converts the date for use in the target using the DateValMask function.
   Lines 07 and 08 use the LogMessage function. The first parameter of LogMessage is always either Info, Warn, Error, or Debug. The second parameter is the string written to the log file. In this case it is a combination of literal strings and data contained in the source record. Note the continuation character (a space followed by an underscore) at the end of line 07.
   Line 09 uses the Discard function, which causes the source record not to be written to the target.
6. Click the Validate icon. We should see "Expression contains no syntax errors" at the bottom of the RIFL Script Editor. Click OK.
7. Validate the map and save it as m_RIFLScript_FlowControl.map.xml.
8. Run the map and note the results in the target. There are only 201 records in the target.
9. Click the Transformation Log icon. Note the results of the LogMessage function.
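The validate-or-discard pattern in this exercise is not specific to RIFL. Below is a minimal Python sketch of the same control flow, offered as an analogy rather than as the engine's behavior. The function names are hypothetical, the standard logging module stands in for the transformation log, and the check assumes the strict mm/dd/yyyy format, which may be narrower than what IsDate accepts.

```python
import logging
from datetime import datetime

log = logging.getLogger("transformation")


def birth_date_or_none(record):
    """Return the parsed birth date, or None (after logging a warning) if it is invalid."""
    raw = record["Birth Date"]
    try:
        return datetime.strptime(raw, "%m/%d/%Y").date()
    except ValueError:
        log.warning("Account Number %s has an invalid date: %s",
                    record["Account Number"], raw)
        return None


def transform(records):
    """Write valid records to the target list and discard the rest."""
    target = []
    for record in records:
        birth_date = birth_date_or_none(record)
        if birth_date is None:
            continue                      # plays the role of Discard()
        target.append({**record, "Birth Date": birth_date})
    return target


# Hypothetical sample rows: the second has an impossible month and day.
rows = [
    {"Account Number": "01", "Birth Date": "11/12/1975"},
    {"Account Number": "02", "Birth Date": "13/45/1975"},
]
written = transform(rows)
```

As in the exercise, the record count in the target drops by the number of rows whose date fails validation, and each failure leaves one warning in the log.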

Map Summary:

Define the Source:
Source Connector:    ASCII(Delimited)
Source Data:         $(FUN_DATA)Accounts.txt
Source Options:      Header = True

Define the Target:
Target Connector:    ODBC 3.x
Target Data:         Database: TrainingDB; Table: tblAccounts
Target Options:      none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:
R1.AccountNumber      Fields("Account Number")
R1.Name               Fields("Name")
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          Dim d
                      d = Records("R1").Fields("Birth Date")
                      If IsDate(d) Then
                          DateValMask(d, "mm/dd/yyyy")
                      Else
                          LogMessage("Warn", "Account Number " & _
                              Records("R1").Fields("Account Number") & " has an invalid date: " & d)
                          Discard()
                      End If
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("Standard Payment")
R1.LastPayment        Fields("Payments")
R1.Balance            Fields("Balance")

Define Events: Source R1 Event Handlers
Event Name          Event Action          Event Parameters
AfterEveryRecord    ClearMapPut Record    target name: Target; record layout: R1


Transformation Map Properties


The property-sheet tool bar button accesses the Properties dialog for all global settings.

Settings in the Transformation and Map Properties dialog affect many areas of the Transformation Map, including the log file settings, runtime execution properties, error handling and definitions of external code modules.


Reject Connection Info


Objectives
Create an additional target file that contains rejected records.

Keywords: Reject function, Reject Connect Info, Connection String, OnReject Event Handler

Description
The Reject Connect Info dialog allows you to specify a reject file. You must specify a connection string to define where the rejected records will be written. You can type the connection string manually, or you can use the buttons Build New Connection String, Build Connection String from Source, Build Connection String from Target, or Clear Reject.

Exercise
1. Using the previous map, change the Discard() function call to a Reject() function call.
2. Go to the Map Properties dialog and click Build Connection String from Source.
3. Change the file name in the connect string to BadDateRejects.txt.
4. Using the Target Event Handler OnReject, add a ClearMapPut Record action.
5. Change the target name parameter from Target to Reject.
6. Save the map as m_RIFL_RejectConnectInfo.map.xml.
7. Execute the map by clicking the Run icon.
8. Note the results in the Target Data Browser.
9. Navigate to the reject file BadDateRejects.txt. You should see all records that contain invalid dates.
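The difference between discarding and rejecting is that a rejected record is still written somewhere. The Python sketch below models that routing under stated assumptions: an in-memory io.StringIO buffer stands in for BadDateRejects.txt, the function names are hypothetical, and only two of the source fields are included to keep the example short.

```python
import csv
import io
from datetime import datetime


def is_valid_date(text):
    """True if text is a real calendar date in mm/dd/yyyy form."""
    try:
        datetime.strptime(text, "%m/%d/%Y")
        return True
    except ValueError:
        return False


def route_records(records, target, reject_writer):
    """Send valid records to the target list and invalid ones to the reject writer."""
    for record in records:
        if is_valid_date(record["Birth Date"]):
            target.append(record)
        else:
            reject_writer.writerow(record)   # plays the role of Reject() plus OnReject


# Hypothetical sample rows.
source = [
    {"Account Number": "01", "Birth Date": "11/12/1975"},
    {"Account Number": "02", "Birth Date": "13/45/1975"},
]
target_rows = []
reject_file = io.StringIO()                  # stand-in for BadDateRejects.txt
writer = csv.DictWriter(reject_file, fieldnames=["Account Number", "Birth Date"])
writer.writeheader()
route_records(source, target_rows, writer)
reject_lines = reject_file.getvalue().splitlines()
```

Because the reject connection in the exercise is built from the source connection, the reject file keeps the source layout, which is why the sketch writes the unmodified source record.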

Map Summary:

Define the Source:
Source Connector:    ASCII(Delimited)
Source Data:         $(FUN_DATA)Accounts.txt
Source Options:      Header = True

Define the Target:
Target Connector:    ODBC 3.x
Target Data:         Database: TrainingDB; Table: tblAccounts
Target Options:      none

Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:
R1.AccountNumber      Fields("Account Number")
R1.Name               Fields("Name")
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          Dim d
                      d = Records("R1").Fields("Birth Date")
                      If IsDate(d) Then
                          DateValMask(d, "mm/dd/yyyy")
                      Else
                          LogMessage("Warn", "Account Number " & _
                              Records("R1").Fields("Account Number") & " has an invalid date: " & d)
                          Reject()
                      End If
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("Standard Payment")
R1.LastPayment        Fields("Payments")
R1.Balance            Fields("Balance")

Define Events: Source R1 Event Handlers
Event Name          Event Action          Event Parameters
AfterEveryRecord    ClearMapPut Record    target name: Target; record layout: R1

Define Events: Target R1 Event Handlers
Event Name          Event Action          Event Parameters
OnReject            ClearMapPut Record    target name: Reject; record layout: R1


Event Handlers & Actions


The event handling capabilities in the Map Designer are designed to allow tremendous flexibility in the handling of data. Actions can be triggered at virtually any point in the Transformation process. Messages can be logged, expressions executed, possible errors can be traced, normal data manipulation and memory clearing can be done, and the Transformation itself can be terminated. You have complete control over when these Actions occur, what Actions occur, and how many Actions occur.


Understanding Event Handlers


Objectives
At the end of this lesson you should understand the relationship between an event and an Event Handler, between an Event Handler and Event Actions, and between Event Actions and Event Action Parameters.

Keywords: Event Action, Event Handlers, Event Precedence, Default Event Handler, ClearMapPut Action, Execute Action

Event Concepts
An event is a point in time in the life of a transformation, similar to an event in your lifetime. As in your own life, some events occur only once (e.g., you graduate from high school) while other events occur repeatedly (e.g., you have a birthday). In a transformation, two events are the start and the end of the transformation; each of these occurs only once. Another event is the reading of a source record, which will probably occur many times.

A transformation may be thought of as a long sequence of events. Some events occur one time while others occur many times, and some groups of events may repeat over and over.

As part of your transformation design process, you may choose to perform one or more tasks when one or more of these events occur. Your transformation will use at least one event, and that event will perform at least one task. The tasks that events perform are called Actions. There is a wide range of actions available for each event.

When you decide to perform an action, you can control just how that action is performed. These control specifications are called Action Parameters.

As a simple example, you might decide to use the event that occurs every time a source record is read (the AfterEveryRecord event) and to perform the action that causes a target record to be written (the ClearMapPut action). Because you might have multiple target record layouts from which to choose, you supply an action parameter that specifies the target record layout you wish to write to.

The diagram below demonstrates a logical flow of an event, its actions, and the action parameters.

Event:               A source record has been read from the source file.
Event Action:        Transform the data and write a record to the target file.
Action Parameter:    Fire this action only if Fields("Status") of the source record == "Active".

Using an Event
The first task is to choose an event to use. Events are grouped in a number of places. There are events that apply to the transformation as a whole (e.g., BeforeTransformation). These can be found in the Transformation and Map Properties dialog. Next, there are source record events that apply to each specific source record type (e.g., AfterEveryRecord). These can be found in the source hierarchy on the Map tab under each record type's heading. Next, there are source record events that apply to each and every source record that is read, and these can be found under the General Event Handlers heading in the source hierarchy. Finally, there are two groups of target record events: one group that applies to target records of a specific type and one that applies to each and every target record (no matter what type). These are found in the target hierarchy under headings like those for the source record events.

The Default Event Handler
A transformation must have at least one event, and that event must have at least one action. To ensure that your transformations meet this requirement, Map Designer defines an event and an action if you do not create any events. The event that it creates is the AfterEveryRecord event for the source file, and the action that it supplies is the ClearMapPut action for the target file. This event and its associated action are collectively referred to as the Default Event Handler. For each source record that is read, the Default Event Handler clears the target buffer, executes all of the mapping expressions, and then writes the target buffer contents to the target file.

When Map Designer supplies this default event handler, you are informed via an on-screen message box. However, Map Designer supplies the default event handler ONLY if you do not create any event handlers. If you do, then Map Designer WILL NOT ADD the default event handler. Map Designer will, however, warn you when you are about to run a transformation that has no event action that will cause a target record to be written.

Commonly Used Events
Some events are very basic and are used frequently. Most of these events will be discussed and used in the exercises in this course module. You should be aware of these events and when they occur.

BeforeTransformation: This is the first event that occurs in any transformation, and it is very useful for all the housekeeping and set-up tasks that you may wish to perform.

AfterTransformation: This is the last event that occurs before a transformation ends, and it is very useful for accessing final totals and other values, and for performing housekeeping and clean-up tasks.
Specific AfterEveryRecord: The word "specific" refers to an event that is tied to a particular source or target record type. This event occurs whenever a source record of a specific type is read, and it is the ideal place to perform the actions that use the values from each source record.

Specific AfterFirstRecord: This event occurs only when the first record of a specific type is read, and it is the ideal event in which to perform housekeeping and set-up tasks that relate to a single record type.

General AfterFirstRecord: The word "general" refers to an event that is not tied to a particular source or target record type. This particular event occurs only when the first record is read from the source file, and it is a good place to perform general housekeeping and set-up tasks that relate to all record types.

General AfterEveryRecord: This event occurs whenever a source record is read from the source file, no matter what type it may be. It is the best place to put common tasks, those that apply to all source records.

Commonly Used Actions
There are many actions that you can perform when a particular event occurs. Some actions are used very often and are common to many events. The two most common actions are:

ClearMapPut: This action is three actions in one. First, it clears the target buffer (for the record type specified in its Layout parameter). Next, it executes all the mapping expressions that you have supplied for each field in the target buffer, in effect filling the target buffer fields with the data specified by the Target Field Expressions. Finally, it writes the contents of the buffer to the target file. A visual representation of these actions is pictured in the next topic.

Execute: This action executes a script created with the RIFL Script Editor. The scripts you write and execute perform the work of your transformation.

For additional documentation on using Event Handlers, read the Event Management Guide: http://docs.pervasive.com/products/integration/download/events.pdf.
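The relationship between events, actions, and the Default Event Handler can be modeled in a few lines. The following Python sketch is an illustration of the concepts above and not the engine's design; run_transformation, clear_map_put, and the handler dictionary are hypothetical names, and only the AfterEveryRecord event is modeled.

```python
def clear_map_put(record, target, mapping):
    """Build a fresh target record from the mapping expressions and write it."""
    target.append({field: expr(record) for field, expr in mapping.items()})


def run_transformation(source_records, mapping, handlers=None):
    """Fire the AfterEveryRecord actions for each source record and return the target rows."""
    target = []
    if not handlers:
        # No handlers were defined, so supply the default one:
        # AfterEveryRecord -> ClearMapPut.
        handlers = {"AfterEveryRecord": [lambda rec, tgt: clear_map_put(rec, tgt, mapping)]}
    for record in source_records:
        for action in handlers.get("AfterEveryRecord", []):
            action(record, target)
    return target


# Hypothetical mapping and source rows.
mapping = {"AccountNumber": lambda rec: rec["Account Number"]}
source = [
    {"Account Number": "01", "Status": "Active"},
    {"Account Number": "02", "Status": "Closed"},
]


def put_if_active(record, target):
    """A user-defined action that fires only when the record's Status is Active."""
    if record["Status"] == "Active":
        clear_map_put(record, target, mapping)


default_out = run_transformation(source, mapping)
custom_out = run_transformation(source, mapping, {"AfterEveryRecord": [put_if_active]})
```

Note that once any handler is supplied, the sketch adds no default, which mirrors the rule that the Default Event Handler appears only when no event handlers have been created.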

Source and Target Buffers - ClearMapPut Action

I. Initial buffer state before the transformation begins.

II. State of the buffers after n records are processed.

III. State of the buffers after a new source record is read and the data is stored in the source buffer.

IV. The first action of ClearMapPut causes the target buffer to be cleared.

V. The second action of ClearMapPut causes the data in the source buffer to be mapped to the target buffer.

VI. The third action of ClearMapPut causes the data to be written to the target file or table.
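The three stages shown in states IV through VI can be traced in a short model. This Python sketch makes the buffer explicit so the effect of the clear step is visible; the Target class and the stale value placed in the buffer are assumptions made purely for illustration.

```python
class Target:
    """A target with one record buffer and a list standing in for the file or table."""

    def __init__(self, field_names):
        self.buffer = dict.fromkeys(field_names)   # every field starts as None
        self.rows = []


def clear_map_put(source_buffer, target, expressions):
    # 1. Clear: reset every field in the target buffer.
    for name in target.buffer:
        target.buffer[name] = None
    # 2. Map: evaluate each target field expression against the source buffer.
    for name, expression in expressions.items():
        target.buffer[name] = expression(source_buffer)
    # 3. Put: write a copy of the buffer to the target.
    target.rows.append(dict(target.buffer))


target = Target(["AccountNumber", "Name"])
target.buffer["Name"] = "left over from an earlier record"   # simulated stale value

# Only AccountNumber has an expression, so Name must come out empty.
expressions = {"AccountNumber": lambda src: src["Account Number"]}
clear_map_put({"Account Number": "01", "Name": "George P Schell"}, target, expressions)
```

Without the clear step, the stale Name value would have been written along with the new account number.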


Event Sequence Issues


Objectives
At the end of this lesson you should understand how to define events, and have a general understanding of the rules governing the sequence in which events occur in a typical transformation.

Keywords: Event Precedence, Null Connector, Global Variables

Description
There are many events available in a transformation. Choosing the appropriate ones depends on the sequence in which they are activated. An Event Precedence Framework dictates the sequence in which events occur based on the map instructions provided. The events that are activated depend on the events you have chosen to use and on the data in the source file.

To derive the general rules of the Event Precedence Framework, you can perform your own tests. Create a transformation that uses the Null source connector and that produces a target file with two fields: (1) the record number from the source; (2) the contents of a global variable. Then choose the events you are interested in testing. For each one, set the global variable equal to the name of the event, and then write a target record. When you examine the target file, you will see the order in which the events were activated.

This exercise also introduces global variables. Using the Global Variables option, you can specify scalar variables, internal objects, or ActiveX objects at the Private or Public level in your transformations. Global variables are defined in the Map Properties dialog.


Exercise
1. Create a map based on the specifications given below. Save the map as m_Events_SequenceTest.map.xml.
2. Run the map and observe the results.

Most of our exercises make some attempt to mimic a real-world situation in a simplified fashion. This exercise, however, is pure classroom.

Map Summary:

Define the Source: Source Connector: Null Source Options: Record count = 5

Define the Target: Target Connector: Target Data: Target Options: ASCII(Delimited) $(FUN_DATA)EventNames.txt Header = True

Target OutputMode: Replace

Variables: Name Type Public Value

eventName Variant no

Define Events: Transformation Events

Event Name: Before Transformation
  Event Action: Execute
    Expression: eventName = "Before Transformation"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Event Name: After Transformation
  Event Action: Execute
    Expression: eventName = "After Transformation"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Define Events: Source R1 Events

Event Name: AfterEveryRecord
  Event Action: Execute
    Expression: eventName = "R1 AfterEveryRecord"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Define Events: Source General Events

Event Name: AfterEveryRecord
  Event Action: Execute
    Expression: eventName = "General AfterEveryRecord"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Event Name: BeforeFirstRecord
  Event Action: Execute
    Expression: eventName = "General BeforeFirstRecord"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Event Name: OnEOF
  Event Action: Execute
    Expression: eventName = "General OnEOF"
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Note: The following target fields should be created manually through the user interface.


Target Schema: Record R1
Name           Type   Length   Description
RecordNumber   Text   16
EventName      Text   30

Target Field Expressions
R1.RecordNumber   Fields("Record Number")
R1.EventName      eventName
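Once the map has run, the order of events can be read straight off the target file. The following is a minimal Python sketch of that inspection step, not part of the product. It assumes a comma delimiter and uses made-up placeholder event names (EventA, EventB, EventC); the real names and their order come from your own run.

```python
import csv
import io


def event_order(delimited_text, delimiter=","):
    """Return event names in the order they first appear in the target file."""
    reader = csv.DictReader(io.StringIO(delimited_text), delimiter=delimiter)
    seen = []
    for row in reader:
        name = row["EventName"]
        if name not in seen:
            seen.append(name)
    return seen


# Placeholder content only; it does not reflect the actual precedence rules.
sample = (
    "RecordNumber,EventName\n"
    "0,EventA\n"
    "1,EventB\n"
    "1,EventC\n"
    "2,EventB\n"
    "2,EventC\n"
)
order = event_order(sample)
print(order)
```

To use it on a real run, read the contents of EventNames.txt into a string and pass that string to event_order.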


Using Action Parameters Conditional Put


Objectives
At the end of this lesson you should be able to open the action list for an event, choose an action and add it to the list, supply mandatory and optional parameters for the action, and place the action in the correct sequence within the action list. For those actions that allow it, you should also know how to make the execution of an action conditional.

Keywords: AfterEveryRecord event, and ClearMapPut action

Description
Actions are controlled by setting Action Parameters. For example, changing the Target Record layout parameter determines what expressions are used and what kind of target record will be written.

Many actions can be performed conditionally. These actions have a Count parameter, a Counter Variable parameter, or both. The Count parameter accepts any expression, the result of which must be a numeric value. When this value is zero, the action is not performed; when it is one, the action is performed once. When the value is greater than one, the action is repeated that many times. By default the Count parameter has a value of 1. The Counter Variable parameter provides an index for the current repetition count.

This exercise uses the Count parameter to write an expression that checks for invalid birthdates in the source file. By returning 0 when we find a record with an invalid date, ClearMapPut will not fire and the source record will not appear in the target. By returning 1 when the date is valid, ClearMapPut will fire once, and we will see the record in the target.

To become more familiar with global variables, we will increment a variable that keeps track of the number of invalid dates we have. At the end of the Map we will display a message box to see how many birthdates are invalid.
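The Count semantics described above can be sketched in a few lines of Python. This is only an illustration of the rule (0 skips, 1 runs once, n repeats n times); the function name run_action and the 1-based repetition index standing in for the Counter Variable are assumptions of this sketch, not product behavior.

```python
def run_action(action, count=1):
    """Perform `action` `count` times, passing the current repetition index.

    The 1-based index plays the role of the Counter Variable; whether the
    product counts from 0 or 1 is not stated in the text, so this is assumed.
    """
    results = []
    for counter in range(1, int(count) + 1):
        results.append(action(counter))
    return results


written = []


def put(counter):
    # Stand-in for a ClearMapPut action writing one target record.
    written.append(f"record (repetition {counter})")


run_action(put, count=0)  # skipped
run_action(put, count=1)  # performed once
run_action(put, count=3)  # performed three times
print(len(written))
```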

Exercise
1. Create our map based on the specifications given below.
2. Save the map as m_Events_ConditionalPut.map.xml.
3. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True


Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblAccounts
Target Options: none

Target OutputMode: Clear File/Table contents and Append

Variables:
Name          Type      Public   Value
varBadDates   Variant   no       0

Target Field Expressions
R1.AccountNumber     Records("R1").Fields("Account Number")
R1.Name              Records("R1").Fields("Name")
R1.Company           Records("R1").Fields("Company")
R1.Street            Records("R1").Fields("Street")
R1.City              Records("R1").Fields("City")
R1.State             Records("R1").Fields("State")
R1.Zip               Records("R1").Fields("Zip")
R1.Email             Records("R1").Fields("Email")
R1.BirthDate         DateValMask(Records("R1").Fields("Birth Date"),"mm/dd/yyyy")
R1.Favorites         Records("R1").Fields("Favorites")
R1.StandardPayment   Records("R1").Fields("Standard Payment")
R1.LastPayment       Records("R1").Fields("Payments")
R1.Balance           Records("R1").Fields("Balance")


Define Events: Source R1 Events

Event Name: AfterEveryRecord
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1
    count:
      Dim d
      d = Records("R1").Fields("Birth Date")
      ' Use flow control to test for a valid date
      If IsDate(d) Then
        ' Enable the Put action by returning 1
        1
      Else
        ' Invalid date, log a message
        LogMessage("Error", "Account number: " & Records("R1").Fields("Account Number") & _
          " has an invalid date: " & d)
        ' Increment counter
        varBadDates = varBadDates + 1
        ' Suppress the Put action by setting to zero
        0
      End If
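For readers who want to trace the logic of the count expression outside Map Designer, here is a rough Python equivalent. It is a sketch under assumptions: strptime with the mm/dd/yyyy pattern stands in for IsDate (which may accept other formats), the sample rows are invented, and conditional_put is a hypothetical helper name.

```python
from datetime import datetime


def is_valid_birth_date(text):
    """Stand-in for IsDate: accept only real calendar dates in mm/dd/yyyy form."""
    try:
        datetime.strptime(text, "%m/%d/%Y")
        return True
    except ValueError:
        return False


def conditional_put(records):
    """Write each record `count` times, where count is 1 for valid dates and 0 otherwise."""
    target, bad_dates = [], 0
    for rec in records:
        count = 1 if is_valid_birth_date(rec["Birth Date"]) else 0
        if count == 0:
            bad_dates += 1  # mirrors varBadDates = varBadDates + 1
        target.extend([rec] * count)
    return target, bad_dates


# Invented sample rows for illustration.
rows = [
    {"Account Number": "A1", "Birth Date": "02/28/1970"},
    {"Account Number": "A2", "Birth Date": "02/30/1970"},
    {"Account Number": "A3", "Birth Date": "not a date"},
]
kept, bad = conditional_put(rows)
print(len(kept), bad)
```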


Using OnDataChange Events


Objectives
At the end of this lesson you should be able to use an OnDataChange event to execute certain actions whenever the value of the field by which the input file is ordered changes. You should also be able to manipulate the first and last data change events to achieve your desired results.

Keywords: OnDataChange, and Record Type Event Handlers

Description
When source files are sorted by one or more data items, Map Designer gives you the ability to monitor the value of one of those sort keys as source records are put into the source buffer. When the value changes from one record to the next, you have the ability to take whatever actions you wish. This is accomplished by using an OnDataChange Event and defining actions for that event.

Processing of this type is very common. There are three situations in which it is often used:
To produce summary information in the target file.
To optimize transformations performing lookups.
When the target has a hierarchical structure such as an XML file.

When using an OnDataChange Event, first specify the source field by which the source file is sorted. The transformation will then monitor the position that field occupies in the source buffer. Whenever a new source record is placed in the source buffer, the transformation compares the value of that field in the new record to the value of the field in the previous record. When the values are different, the OnDataChange Event Handler executes the list of actions specified.

For all of this to work, the source file should be in order by the value(s) being monitored. If it is not, you can either (1) physically sort it prior to its input into the transformation or (2) allow the transformation to dynamically sort it. For flat files, using the Source Keys and Sorting dialog will perform this dynamic sort. For an SQL source, you can use the Order By clause in your SQL query.

It is true that sorting the data at the beginning of your transformation increases execution time, but the reductions in execution time that are possible with the OnDataChange strategy will usually far outweigh the overhead of the sort itself. This is particularly true when IO-intensive operations, such as lookups, are involved.

You can monitor up to five different data items in a single transformation. There is also an event that is activated whenever any monitored field changes and an event that is only activated when all monitored fields change at the same time.

This exercise builds a map that sorts our Accounts.txt file by state. Our target file will have one record for every state in the source file. Each record will have three fields: the state, the number of accounts in that state, and the total balance of all records in that state.

Exercise
1. Create our map based on the specifications given below. Save the map as m_Events_OnDataChange.map.xml.
2. Run the map and observe the result.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Automatic Transformation Feature: Sort Fields on Source Data:
Fields("State") type=Text ascending=yes length=2

Define the Target:
Target Connector: Excel 2000 or Excel XP
Target Data: File: $(FUN_DATA)AccountSummariesByState.xls, Sheet: Sheet1
Target Options: Header Record Row: 1

Target OutputMode: Replace

Target Schema
Field Name                  Type   Length   Description
State                       Text   16
Number_of_Accounts          Text   16
Total_Balance_of_Accounts   Text   16
Total                              48

Variables:
Name         Type      Public   Value
varState     Variant   no
varCounter   Variant   no       0
varBalance   Variant   no       0

Target Field Expressions
R1.State                       varState
R1.Number_of_Accounts          varCounter
R1.Total_Balance_of_Accounts   varBalance

Define Events: Source R1 Event Handlers

Event Name: AfterEveryRecord
  Event Action: Execute
    Expression:
      ' Set the state value for the current record since it will change
      ' when we are ready to write the data to the target
      varState = Records("R1").Fields("State")
      ' Increment the counter
      varCounter = varCounter + 1
      ' Accumulate the balance
      varBalance = varBalance + Records("R1").Fields("Balance")

There are also two special situations that should be considered. When the very first record is placed in the buffer the value of the field being monitored will have changed since the source buffer is always filled with null values at the start of a transformation. So the OnDataChange Event will fire after the first record is read. Similarly, when the source buffer is cleared after the last record has been processed, the value of the field being monitored will change from some real value to a null value, and again the OnDataChange Event will be fired. However, these situations may or may not be useful in any given transformation. Therefore, you have the option of suppressing one or the other, or both of them. This is controlled in the Data Change Event Management Options.
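The two special situations can be made concrete with a small Python sketch that lists the positions at which a data-change event would fire for a sorted key column. The function name and the two flags are hypothetical; they only model the "suppress first" and "fire at EOF" choices described here, and the sample keys are invented.

```python
def data_change_points(keys, suppress_first, fire_at_eof):
    """Return positions at which a data-change event would fire.

    Position i means "fired when record i arrived"; len(keys) means
    "fired after the last record". The buffer starts out as None (null).
    """
    fired, previous = [], None
    for i, key in enumerate(keys):
        if key != previous and not (i == 0 and suppress_first):
            fired.append(i)
        previous = key
    if fire_at_eof and keys:
        fired.append(len(keys))
    return fired


states = ["AL", "AL", "CA", "TX", "TX"]
unsuppressed = data_change_points(states, suppress_first=False, fire_at_eof=True)
as_in_exercise = data_change_points(states, suppress_first=True, fire_at_eof=True)
print(unsuppressed)    # [0, 2, 3, 5]
print(as_in_exercise)  # [2, 3, 5]
```

With the first event suppressed and the end-of-file event kept, there is exactly one firing per group of equal keys, which is what a one-row-per-state summary needs.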


Define Events: Source R1 OnDataChangeEvent
Monitor: Records("R1").Fields("State")
Management: Suppress first ODC event, Fire Extra ODC event at EOF

Event Name: OnDataChange1
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1
  Event Action: Execute
    Expression:
      ' Reset the variables for the records belonging to the next state
      varCounter = 0
      varBalance = 0
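Taken together, the AfterEveryRecord and OnDataChange1 handlers implement a classic control-break summary. The Python sketch below traces the same bookkeeping. It assumes that the data-change handler runs before the AfterEveryRecord handler of the newly arrived record (which is what the comment in the AfterEveryRecord expression suggests), and it uses invented account rows.

```python
from decimal import Decimal


def summarize_by_state(records):
    """Summarize records that are already sorted by State."""
    target = []
    var_state, var_counter, var_balance = None, 0, Decimal("0")
    previous = None
    for i, rec in enumerate(records):
        # OnDataChange1 (first change suppressed): write the finished
        # group, then reset the accumulators for the next state.
        if i > 0 and rec["State"] != previous:
            target.append((var_state, var_counter, var_balance))
            var_counter, var_balance = 0, Decimal("0")
        # AfterEveryRecord: remember the state and accumulate.
        var_state = rec["State"]
        var_counter += 1
        var_balance += Decimal(rec["Balance"])
        previous = rec["State"]
    # Extra event at EOF: write the last group.
    if records:
        target.append((var_state, var_counter, var_balance))
    return target


# Invented rows for illustration.
accounts = [
    {"State": "AL", "Balance": "10.50"},
    {"State": "AL", "Balance": "4.25"},
    {"State": "TX", "Balance": "100.00"},
]
summary = summarize_by_state(accounts)
print(summary)
```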

After reviewing the target data, you may notice that the decimal places are not formatted correctly. The precision becomes distorted because the Balance field, which is stored as a text value, is converted to a numeric value, the values are added, and the result is converted back into text. To fix the precision, change the source data type to Decimal and set the number of decimal places to 2.
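The same kind of distortion is easy to reproduce in any language that adds text-derived values as binary floating point. The Python lines below are an analogy only (they say nothing about how the engine stores numbers internally): summing as float produces stray digits, while a decimal type keeps two places.

```python
from decimal import Decimal

balances = ["0.10", "0.20"]  # text values, as read from a delimited file

as_float = sum(float(b) for b in balances)
as_decimal = sum(Decimal(b) for b in balances)

print(str(as_float))    # 0.30000000000000004
print(str(as_decimal))  # 0.30
```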


Trapping Processing Errors with Events


Objectives
At the end of this lesson you should be able to use the OnError event handler to trap and handle processing errors yourself, including using file management functions to record information in a file independent of the Source, Target or Reject connections.

Keywords: Error Message Reference Chart, Error and Event Preferences, Chr, MacroExpand, FileAppend, and File Functions

Description
To handle errors, we need to be aware of the types of errors that can exist, the options we have in dealing with them, the list of specific error codes that we may wish to deal with, and some strategies for dealing with the records that may cause these errors.

In general, there are three types of errors that we may be concerned with:
Fatal errors are those that will cause a map to terminate.
General errors are those that are not fatal but may affect the transformation process. An example might be a read error for a specific source record.
Warnings are not necessarily errors, and include data truncation, field name changes, loss of precision, and so on.

In the Error Logging Preferences, we can set these errors (and other messages) to be logged or not. We also have some control over when the transformation terminates.

In other cases, we may wish to deal with certain errors, and it would be useful to know what all the errors are and how they can be identified. There are too many to list here, but you can find a complete list of the errors and error codes in the Help System in the series of pages entitled Errors.

To give you flexibility in handling these errors, there are a number of individual events that are tied to specific errors. When such an error occurs, Map Designer checks the matching event handler to see whether you have supplied actions to be performed. If so, they are executed. If not, then whatever would normally happen for that error (e.g., the transformation aborts) will happen.

When we wish, we can use the specialized error event handlers, such as the OnTruncateError Event. The transformation will automatically take care of identifying the error and will transfer control to that event handler instead of aborting the transformation. In the event handler, we can perform any tasks we wish to deal with the error. Once we have done so, we can either allow the transformation to terminate or cause the transformation to pick up where it left off by using the Resume action.

In this exercise, our source will be the Accounts.txt file. We will create a target file to show how many months each customer will take to pay off their balance if they continue to make payments of a certain amount. We will derive this value by dividing the Balance field by the Payments field. However, there will be a problem if a customer has a 0 value in the Payments field, since we will be attempting to divide by zero. In this case Map Designer will throw an error. We can catch this error using the error event handlers.

Another goal of this exercise is to write an additional file that contains the values of the payments and balances that cause the error. We will do this with the FileAppend function as well as other file manipulation functions.

Exercise
1. Create our map based on the specifications given below. Save the map as m_Events_OnError_Event.map.xml.
2. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: File: $(FUN_DATA)PaymentsRemaining.txt
Target Options: Header = True

Target OutputMode: Replace

Target Fields:
Name                 Type   Length   Description
Account Number       Text   9
Payments             Text   7
Balance              Text   6
Payments Remaining   Text   16

Target Field Expressions
R1.Account Number   Records("R1").Fields("Account Number")
R1.Payments         Records("R1").Fields("Payments")
R1.Balance          Records("R1").Fields("Balance")
R1.Payments Remaining:
  Dim pmt, bal
  pmt = Records("R1").Fields("Payments")
  bal = Records("R1").Fields("Balance")
  If Int(bal/pmt) == bal/pmt Then
    bal/pmt
  Else
    Int(bal/pmt) + 1
  End If
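The Payments Remaining expression rounds the quotient up to the next whole payment. A short Python sketch of the same arithmetic, with invented figures, shows both the rounding and the failure that the rest of this exercise traps. The function name is hypothetical, and the comparison with math.ceil holds for the positive amounts assumed here.

```python
import math


def payments_remaining(balance, payment):
    """Whole number of payments needed to clear `balance` (positive amounts assumed)."""
    quotient = balance / payment  # raises ZeroDivisionError when payment is 0
    if int(quotient) == quotient:
        return int(quotient)
    return int(quotient) + 1


print(payments_remaining(100, 25))  # 4, divides evenly
print(payments_remaining(100, 30))  # 4, 3.33 rounded up
print(payments_remaining(100, 30) == math.ceil(100 / 30))  # True
```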

Variables
Name            Type      Public   Value
flagFirstTime   Variant   no       0
errorFile       Variant   no

Define Events: Source General Events

Event Name: BeforeFirstRecord
  Event Action: Execute
    Expression:
      ' Set the value of the file variable
      errorFile = MacroExpand("$(FUN_DATA)DivideByZero.txt")
      If FileExists(errorFile) Then
        FileDelete(errorFile)
      End If

Note: This example shows the functionality of the MacroExpand, FileExists, and FileDelete functions, though similar results could be had by using:
FileWrite(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)
where sep = "|" and crlf = Chr(13) & Chr(10).

This would replace any existing file with a file that contains only the header. This would also make the flagFirstTime variable unnecessary.
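The alternative in the note amounts to "open for writing, which truncates, and put the header in immediately". As a hedged illustration in Python (start_error_file is a hypothetical helper and the temporary directory stands in for the data folder):

```python
import os
import tempfile

SEP = "|"
CRLF = "\r\n"


def start_error_file(path):
    """Create or replace the error file so that it contains only the header."""
    # Mode "w" truncates an existing file, so no exists/delete check
    # and no first-time flag are needed.
    with open(path, "w", newline="") as f:
        f.write("AcctNumber" + SEP + "Payt" + SEP + "Bal" + CRLF)


path = os.path.join(tempfile.mkdtemp(), "DivideByZero.txt")
start_error_file(path)
start_error_file(path)  # a second run replaces the file instead of appending

with open(path, newline="") as f:
    content = f.read()
print(repr(content))
```

One consequence of this design is that the error file exists, with just its header, even when no record triggers an error.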

Define Events: Source R1 Events

Event Name: AfterEveryRecord
  Event Action: ClearMapPut Record
    target name: Target
    record layout: R1

Define Events: Target General Events

Event Name: OnError
  Event Action: Execute
    Expression:
      Dim sep, crlf
      sep = "|"
      crlf = Chr(13) & Chr(10)
      If flagFirstTime == 0 Then
        ' Write the header for the error file
        FileAppend(errorFile, "AcctNumber" & sep & "Payt" & sep & "Bal" & crlf)
        ' Set flag to 1 so header will not be written next time
        flagFirstTime = 1
      End If
      FileAppend(errorFile, Records("R1").Fields("Account Number") & sep & _
        Records("R1").Fields("Payments") & sep & _
        Records("R1").Fields("Balance") & crlf)
  Event Action: Resume
    Parameters: none

Do not forget the Resume action. It is what causes the map to continue processing the remaining records after the error is handled.

3. Observe the DivideByZero.txt file that is created in the Data folder.
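The overall pattern of this exercise (clean up the side file, trap the error, write the header once, append the offending values, then resume) can be traced in Python as follows. This is a sketch, not the map itself: the function name and sample rows are invented, and it assumes the failing record is simply skipped in the target, which the text does not state.

```python
import math
import os
import tempfile

SEP, CRLF = "|", "\r\n"


def run_with_error_trap(records, error_path):
    """Compute payments remaining; record divide-by-zero rows and keep going."""
    # BeforeFirstRecord: start from a clean error file.
    if os.path.exists(error_path):
        os.remove(error_path)
    header_written = False  # plays the role of flagFirstTime
    target = []
    for rec in records:
        try:
            remaining = math.ceil(rec["Balance"] / rec["Payments"])
        except ZeroDivisionError:
            # OnError: write the header once, then the offending values.
            with open(error_path, "a", newline="") as f:
                if not header_written:
                    f.write(SEP.join(["AcctNumber", "Payt", "Bal"]) + CRLF)
                    header_written = True
                values = (rec["Account Number"], rec["Payments"], rec["Balance"])
                f.write(SEP.join(str(v) for v in values) + CRLF)
            continue  # Resume: carry on with the next record
        target.append((rec["Account Number"], remaining))
    return target


# Invented rows for illustration.
rows = [
    {"Account Number": "A1", "Payments": 25, "Balance": 100},
    {"Account Number": "A2", "Payments": 0, "Balance": 80},
    {"Account Number": "A3", "Payments": 20, "Balance": 50},
]
error_file = os.path.join(tempfile.mkdtemp(), "DivideByZero.txt")
result = run_with_error_trap(rows, error_file)
print(result)
```

Without the `continue` (the stand-in for Resume), processing would stop at the first zero payment and the remaining records would never be handled.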


Error and Exception Handling Review

Types of Errors and Log Messages
Errors occur during data integration at design time and run time.

Design-Time Errors
Design-time errors occur while you are using an application such as Map Designer and Process Designer.

Run-Time Errors
Run-time errors occur during the execution of a transformation or process. Because the designers can run transformations and processes through Integration Engine, run-time errors can also be displayed in the designer interfaces. Run-time errors are generated from the following places:
Designers
RIFL scripts
Integration Engine command line console
SDK code

Error Log Messages
All errors originate from the Integration Engine, but the log to which they are written depends on the interface being used. For instance, if you are using Map Designer, then errors are logged to an error file. If you are using Integration Engine, then error messages are displayed in the console and written to a log file. See the following topics for more information on error logging.

Map Designer
Errors that occur in Map Designer are displayed in a dialog box and logged to a TransformMap.log file.

Process Designer
Errors that occur in Process Designer are logged to a process log file named by the designer.

Integration Engine
In the Integration Engine, the last error message logged while loading, changing or running a transformation or process is logged to the command line interface console. All error messages are logged to a log file. The default name for the log file depends upon which interface you are calling. For instance, if you are running a transformation on the command line, errors are logged to the TransformMap.log file; if you are running a process, errors are logged to a process log file.

RIFL
The Rapid Integration Flow Language (RIFL) includes functions and statements that return information about errors to the error log files. For instance, you can use the LogMessage Function to write messages to an error log file, and the On Error GoTo Statement to trap run-time errors.


Comprehensive Review

To test our knowledge and review the introductory module for the Cosmos Integration Essentials courses, we want to design a Map to load data in the Accounts.txt file into a target database table.

Basic Map specifications:
Source Connector: ASCII (Delimited)
Source File: Accounts.txt
Header property: True
Target Connector: ODBC 3.x
Data Source Name: TrainingDB
Table: tblIllini
Output Mode: Replace Table

Exercise
1. Map the four target fields with the appropriate data from the source.
2. Use the appropriate Event and Action that will write all source records to the target. Hint: This is also the Default Event Handler.
3. In the Target BirthDate Field, use an appropriate Date/Time function to convert the formatted date strings into a real date-time data type.
4. Test for invalid dates using the IsDate function, and reject the invalid records to an ASCII Delimited file named Reject_Accounts.txt.
5. Reject all records from the state of Illinois (IL) into the Reject_Accounts.txt file as well. Hint: You will have to use a Target Event to write the rejected record to the file.
6. Aggregate the Balances from all rejected records using a global variable.
7. Report the aggregated balance (total balance) in the log file using the LogMessage function.

The solution to this review is in the Solutions folder. It is named m_Comprehensive_Review.map.xml. Open it and look only if you get stuck. It should be noted that the solution map shows only one way to complete this exercise. There are several.


Metadata Using the Schema Designers


Structured Schema Designer


The Structured Schema Designer provides a visual user interface for designing structural data files. The resulting metadata is stored as Structured Schema files with an .ss.xml extension. The .ss.xml files include schema, record recognition rule, and record validation rule information.

In the Structured Schema Designer, you can create or modify schemas that can be accessed in the Map Designer to provide structure for Source or Target files. You can also use the Data Parser to manually parse Binary, fixed-length ASCII, or any other files that do not have internal metadata. The Data Parser defines Source record length, defines Source field sizes and data types, defines Source data properties, assigns Source field names, and defines Schemas with multiple record types.


No Metadata Available (ASCII Fixed)


Objectives
At the end of this lesson you should be able to define and create a structured schema.

Keywords: Data Parser, Modify a Schema, and Hex Browser

Description
The first step is to tell the Structured Schema Designer what type of file is being defined, so pick the appropriate connector first. You can change the name of the default record type from R1 to something more meaningful for your own task. This is done by choosing Record Types in the hierarchy and overtyping the existing default name.

To enter the field information for your record layout, select the Fields entry in the navigation tree and enter the first field name. Tab through the grid to enter a description, if desired, select a data type from the dropdown, and enter the field length. This method assumes you have documentation that describes the structure of the file.

If you do not already know the structure of the file, you can use the visual parser to make educated guesses until the file is parsed correctly. When you're done, you can browse the data to ensure that your definitions were accurate and then save the structured schema using a name of your choosing.

Exercise
1. Start a New Structured Schema design and choose the ASCII Fixed connector.
2. Click the Visual Parser toolbar button (Red Knife).
3. Navigate to the file named Payments.txt.
4. Click in the current row (blue highlight) between the fields and name the fields by overtyping in the Field Name drop down list.
5. Save the Structured Schema as s_Payments.ss.xml.

Record Layouts
Record R1
Name            Type   Length
AccountNumber   Text   9
PaymentDate     Text   8
Amount          Text   10
Total                  27
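What a fixed-width schema like this one does to each line can be shown with a few lines of Python. The field names and lengths come from the layout above; the sample line and the parse_fixed helper are invented for illustration, since the real contents of Payments.txt are not shown here.

```python
# (name, length) pairs taken from the R1 layout above
SCHEMA = [("AccountNumber", 9), ("PaymentDate", 8), ("Amount", 10)]


def parse_fixed(line, schema=SCHEMA):
    """Slice one fixed-width line into named, whitespace-trimmed fields."""
    record, start = {}, 0
    for name, length in schema:
        record[name] = line[start:start + length].strip()
        start += length
    return record


# Hypothetical 27-character line: 9 + 8 + 10 characters.
line = "A00000001" + "20100115" + "    125.50"
parsed = parse_fixed(line)
print(parsed)
```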


External Metadata (Cobol Copybook)


Objectives
At the end of this lesson you should be able to use a Cobol Copybook to create a new schema.

Keywords: Copybook

Description
You can quickly import the structure from an external definition file. If you do this from a Map Design session, it will not create a Structured Schema for reuse later. If you do this from a Structured Schema Design session, you will have the ss.xml file for reuse.

If you decide to use a Cobol Copybook to create a new schema, you will define this schema in the Enter External Connection Info window.

External Connector Section
This displays the connector you choose from the Connection pull down. You cannot change the connector from this pull down but must make the change in the toolbar options pull down, Connections.

Layout/Record Name
When you choose the External File to duplicate, the data will populate the Layout/Record Name section. You will need to select the layouts required for the schema by clicking each item in the Add to Layouts column. There are two buttons to aid in selecting Layout/Record items:
Select all: Click Select all to choose all of the Layout/Record items.
Unselect all: If you need to make a change to the Layout/Record Name section, click Unselect all to start over.

Exercise
1. Start a New Structured Schema design session and choose the Binary connector.
2. Using the drop-down menu in the upper right hand of the window, choose Cobol 01.
3. Navigate to the file named Accounts_Cobol.cbk.
4. Click the Layout/Record Name(s) you want to import.
5. Click OK.
6. Review the structure in the grid view.
7. Save the Structured Schema as s_CobolCopyBook_Accounts.ss.xml.


Record Layouts
Record ACCOUNT_INFO
Name        Type                   Length
ACCTNUM     Display                9
NAME        Display                21
COMPANY     Display                31
STREET      Display                35
CITY        Display                16
STATE       Display                2
POSTCODE    Display                10
EMAIL       Display                25
BIRTHDATE   Display                10
FAVORITES   Display                11
STDPAYT     Display sign leading   6
LASTPAYT    Display sign leading   6
BALANCE     Display sign leading   6
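A quick way to sanity-check an imported layout is to turn the lengths into byte offsets and a total record length. The Python sketch below does that for the ACCOUNT_INFO layout. It assumes the Length column is the stored width of each field (including any sign position), which the text does not confirm.

```python
from itertools import accumulate

# (name, length) pairs taken from the ACCOUNT_INFO layout above
LAYOUT = [
    ("ACCTNUM", 9), ("NAME", 21), ("COMPANY", 31), ("STREET", 35),
    ("CITY", 16), ("STATE", 2), ("POSTCODE", 10), ("EMAIL", 25),
    ("BIRTHDATE", 10), ("FAVORITES", 11), ("STDPAYT", 6),
    ("LASTPAYT", 6), ("BALANCE", 6),
]

lengths = [length for _, length in LAYOUT]
# Each field starts where the previous ones end.
starts = [0] + list(accumulate(lengths))[:-1]
offsets = {name: (start, start + length)
           for (name, length), start in zip(LAYOUT, starts)}
record_length = sum(lengths)

print(record_length)       # 188
print(offsets["NAME"])     # (9, 30)
print(offsets["BALANCE"])  # (182, 188)
```

If the data file's size is not a multiple of the computed record length, that is a hint that the layout or the assumed widths do not match the file.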


Extract Schema Designer


The Extract Schema Designer is a software product that has the ability to read complex text files of many kinds. The amount of computer data is exploding all around us and grows vastly each year, and much of it is provided in raw text formats. Some examples of the many sources handled by the Extract Schema Designer follow:

Printouts from programs captured as disk files
Reports of any size or dimension
ASCII or any type of EBCDIC text files
Spooled print files
Fixed length sequential files
Complex multi-line files
Downloaded text files (e.g., news retrieval, financial, real estate...)
HTML and other structured documents
Internet text downloads
E-mail header and body
On-line textual databases
CD-ROM textbases
Files with tagged data fields
XML
HL7
Swift

Extract Schema Designer does NOT use the XML repository that all of our other Design Tools use. Extract Schema Designer saves extracts in two ways. The first is in a script file in Content Extractor Language with a .cxl extension. This file is only useful as part of a Source Connection in Map Designer. It cannot be imported into Extract Schema Designer to be edited. The second way that an Extract is saved is in an Access Database. The default path and filename for this database is C:\Program Files\Pervasive\Cosmos9\Common\extractor900.mdb. Extracts stored here can be reopened and edited.

Content Extractor Language is very rich and expressive, and provides many advanced data manipulation and formatting capabilities. CXL can be used to create or customize complex scripts necessary for text files whose patterns and rules may be beyond the functionality of the user interface supplied with the Extract Schema Designer. More information about this language is available in the Content Extraction Language Help file under the SDK Help Files. The default path and filename for this file is C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs\cxl_sdk.pdf.

Former users of Data Junction Content Extractor should be aware that the script files are no longer called DJP files. They are known as CXL files now.


There are several legacy names that may be used in place of the default name of the Extract Schema Designer's connector. This list includes: Cambio, Content Extractor, Extractor, and Report Reader. There are also two connectors that include a pre-designed script with the software that parses statistical information from the log file automatically. These are Data Junction Log File and Integration Log File.


Interface Fundamentals & CXL

Keywords: Extract Schema Designer Mechanics: Line Styles, Fields, Accept Record, Automatic Parsing

Description
The first file that we will be parsing is Purchases_Phone.txt. We should take a look at it first in a text viewer. Although it might be possible to use this report file as a direct input for a transformation, we would have to define it as a multiple-record-type file. With so many record types and so much processing involved with them, writing the transformation would be time consuming.

So what we will do is use the Extract Schema Designer to create an extract specification that will transform the report file into a more familiar row/column format, and then use that formatted data as input to the transformation that adds these purchases to the database table. We don't even need a two-step procedure, nor do we have to read the report file twice. Once the extract schema is defined, we can create a transformation, specify the report file as the Source, and apply the Extract Schema to it. The file will then be presented to the transformation in simple rows and columns, complete with headers.

Exercise Start Extract Schema Designer. 1. From the Repository Explorer, select New Object Extract Schema. 2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Phone.txt. 3. Choose OK to accept the Source Options defaults. 4. Highlight the word Category on one of the Category lines and right-click in the highlight. 5. Select Define Line Style New Line Style. 6. Verify that all defaults are acceptable and click Add. Weve now defined a Line Style for the Category field. 7. Highlight the Category code on one of the Category lines and right-click in the highlight. 8. Select Define Data Field New Data Field. 9. Change the field name to Category. 10. Verify that all other defaults are acceptable and click Add. Weve now defined the Category Data Field. 11. Highlight a ProductNumber and the rest of the spaces on the line and right-click in the highlight. 12. Select Define Data Field New Data Field. 13. Change the field name to ProductNumber. 14. Verify that all other defaults are acceptable and click Add.

112 Data Integrator Fundamentals Training

15. Highlight a Quantity and all but one of the spaces between the actual digits of the Quantity and the colon following the literal Quantity (if any).
16. Right-click in the highlight and select Define Data Field > New Data Field.
17. Change the field name to Quantity.
18. Verify that all other defaults are acceptable and click Add.
Now let's ensure that Source Options will allow parsing:
19. Select Source Options from the Menu bar.
20. On the Extract Design Choices tab, look in the Tag Separator dropdown to see if there is a character sequence that matches the sequences used in your data to separate Line Style tags from actual data. If there is, select it. If there is not, then automatic parsing is not available. Also on this tab, ensure that the Trim Leading and Trailing Spaces checkbox is selected.
21. On the Display Choices tab, ensure that the Pad Lines checkbox is selected.
22. Click OK to accept the selections.
Now let's define the UnitCost Line Style and Data Field simultaneously.
23. Highlight an entire UnitCost line in the data and right-click in the highlight.
24. Select Define Data Field > Parse Tagged Data.
Note: When Line Styles and Fields are defined in this way, the default name for the Field is exactly the same as that for the Line Style, so no change to the field name is usually necessary. If a change is desired, however, point your cursor to the actual field data in the display and double-click on the data. This will bring up the Field Definition dialog box, where you can change the name (or other characteristics).
Now we'll define the TotalCost and ShipmentMethodCode Line Styles and Data Fields simultaneously.
25. Highlight an entire TotalCost line and ShipmentMethodCode line in the data.
26. Right-click in the highlight and select Define Data Field > Parse Tagged Data.
The next thing is to define the Line Style that determines the end of a row of data for the Extract File.
27. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, ShipmentMethodCode).
28. Double-click on the Line Style name to bring up the Line Style Definition dialog.
29. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.
30. Click Update.
Test the Extract to ensure that your definitions are correct.
31. Click on the Browse Data Record button.
32. Choose OK to allow assignment of all Fields to the Extract File.
33. Examine the data to ensure that your Field definitions are correct.

34. Close the browser window.
35. Use the Parse Tagged Data functionality to define the Account Number, Purchase Order Number and PODate fields.
36. Double-click on a Purchase Order Number to access the Field Definition dialog.
Note: The options at this dialog determine how the Extract Schema Designer will process the data in this particular field from record to record. These options distinguish between the data fields and the contents of those fields. When the Extract Schema Designer is collecting data fields, it collects all the fields that have been defined on lines of text whose line action is either COLLECT Fields or ACCEPT Record and assembles those fields into a data record. The options at this dialog determine how data within a data field is handled.
37. On the Data Collection/Output tab, ensure that Propagate Field Contents has been selected.
38. Double-click on a PODate to access the Field Definition dialog.
39. On the Data Collection/Output tab, select Flush Field Contents.
40. Click Update.
41. Click on the Browse Data Record button.
42. Choose OK to allow assignment of all Fields to the Extract File.
43. Examine the data to see the effect of Propagate versus Flush.
44. Close the browser window.
45. Redefine the PODate field to propagate it as well.
46. Browse the data record again to ensure the data is being propagated.


Note: In this case we do want the data to propagate, but you will need to decide which behavior you want for any situation.
We can specify an order for the columns in your Extract File (if desired).
47. Choose Field > Export Field Layout from the menu bar.
48. To reposition a column, left-click and drag a column name up or down in the list, dropping it on top of another column name.
Note: When you drag upward, the column you are dragging will be placed before the column on which you drop it. When you drag downward, the column you are dragging will be placed after the column on which you drop it.
49. Put the six columns in the order they appear in the source file.
50. Click OK.
51. Exclude columns from the Extract File (if desired).
52. Select Record > Edit Accept Record from the menu bar.
53. Clear the check boxes for the columns that you do not wish to appear in the Extract File.
54. Click Update.
55. Save the Extract Schema Definition:
  If the Extract Schema Definition has already been saved before, click the Save Extract button to save it again under the same name. You may also choose File > Save Extract to perform the same function.
  If the Extract Schema Definition has not yet been saved, click the Save Extract button. In the Save Extract dialog, supply the name Purchases_Phone.cxl and verify the location where the Definition will be stored (changing it if necessary). You may also choose File > Save Extract to perform the same function.
  If the Extract Schema Definition has been saved before, but you have modified it and want to save it as a different Definition, then choose File > Save Extract As. In the Save Extract dialog, supply a name for the Definition and verify or supply the save location.
56. Close the Extract Schema Designer.
57. Open Map Designer and establish a source connection based on the information below.
58. Open the Source Data Browser and note the results. Note that this source could now be used in the same way that any other source would be in a transformation.
59. Close Map Designer without saving.
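The difference between Propagate and Flush seen in steps 37 to 46 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Extract Schema Designer's implementation: the function name assemble_records, the tuple layout, and the sample values are all hypothetical. It assumes that a propagated field keeps its last value after a record is accepted, while a flushed field is emptied.

```python
from typing import Dict, List, Set, Tuple

# Each parsed line is (field_name, value, is_accept_line). Hypothetical layout.
Line = Tuple[str, str, bool]


def assemble_records(lines: List[Line], propagate: Set[str]) -> List[Dict[str, str]]:
    """Collect fields line by line and emit a record on each ACCEPT line.

    Fields named in `propagate` keep their value for the next record;
    every other field is flushed once the record has been emitted.
    """
    current: Dict[str, str] = {}
    records: List[Dict[str, str]] = []
    for field, value, is_accept in lines:
        current[field] = value
        if is_accept:
            records.append(dict(current))
            # Flush everything that is not marked as propagating.
            current = {k: v for k, v in current.items() if k in propagate}
    return records


# Hypothetical sample: one order header followed by two purchase items.
sample = [
    ("PONumber", "1001", False),
    ("PODate", "01/15/2010", False),
    ("Category", "A", False),
    ("ShipmentMethodCode", "UPS", True),
    ("Category", "B", False),
    ("ShipmentMethodCode", "FDX", True),
]

flushed = assemble_records(sample, propagate={"PONumber"})
propagated = assemble_records(sample, propagate={"PONumber", "PODate"})
```

In this sketch the second record in flushed has no PODate at all, whereas the second record in propagated repeats the date from the order header. That mirrors what step 43 asks you to look for in the data browser.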


Data Collection/Output Options

Keywords: Data output properties: Flush or Propagate field contents
Description
Most of the time, all you will want to extract from files such as our report file is the actual data that describes the business objects, in this case the purchases. But sometimes there will be other information in the file that you would also like to capture. For example, there may be header or footer information that you would like to have available in the transformation. With the Extract Schema Designer we can define header and/or footer information and add it, as additional columns, to the row/column file specification.

Exercise
1. From the Repository Explorer, select New Object > Extract Schema.
2. At the file selection prompt, click Cancel.
3. Double-click on the Purchases_Phone.cxl script to open it.
4. Choose File > Save Extract As and save the extract again as Purchases_Phone2.cxl.
5. Highlight the first slash in the ReportDate.
6. Right-click in the highlight and select Define Line Style > New Line Style.
7. Change the proposed name to ReportDate.
8. Choose Add.
9. Highlight the second slash in the ReportDate.
10. Right-click in the highlight and choose Define Line Style > Append Line Pattern.
11. Double-click on the ReportDate line style name to view the results.
Note: This Line Style definition will be sufficient so long as there is no other line of information in the file that has slashes in positions 24 and 27 and which does not contain a Report Date. If there were, we could use the same procedure to add the spaces in front of and after the actual date. If that were still not sufficient, then we could use additional techniques that we will learn in later exercises to make the Line Style definition a unique one.
12. Highlight the Report Date.
13. Right-click in the highlight and select Define Data Field > New Data Field.
14. In the Field Definition dialog, change the name of the Field to ReportDate.
15. Click Add.
16. Use the Browse Data Record button to view the results.
17. Highlight the entire Order File Creator text line at the bottom of the file.
18. Right-click in the highlight and select Define Data Field > Parse Tagged Data.
19. Double-click on the Order_File_Creator Line Style to change its name (if desired).
20. Double-click on the actual email address to open the Field Definition dialog.
21. Change the Field Name to OrderFileCreatorEmailAddress.
22. Click Update.
23. Use the Browse Data Record button to view the results.
24. Close the browser, then double-click on the Order_File_Creator Line Style name to open the Line Style Definition dialog.
25. On the Line Action tab, change the action to ACCEPT Record.
26. Click Update.
27. Choose Record > Edit Accept Record from the menu bar.
28. Choose Order_File_Creator for the Current Accept Record.
29. Select the OrderFileCreatorEmailAddress checkbox.
30. Choose ShipmentMethodCode for the Current Accept Record.
31. De-select the Order_File_Creator checkbox.
32. Click Update.
33. Use the Browse Data Record button to view the results.
34. Save the Extract Schema Design as Purchases_Phone2.cxl and close the Extract Schema Designer.
Note: When an Extract Schema Design like this one is used as part of the Source specification for a transformation, the transformation Map tab will look as if the input file had been defined to have multiple record types. The email address will be in the last record read by the transformation, of course. If your requirements dictate that the email address be available as actual purchase records are processed, then you will have to use other techniques in a more complex transformation.


Extract Schema Designer: Extracting Variable Fixed Field Definitions

Keywords: Extract Schema Designer: Multiple Fields per Line Style (variable)
Description
The next file that we will be parsing is Purchases_Fax.txt. We can examine it in a text viewer. Notice that this file has fields with variable lengths, so any given field may not occupy the same column position as it did in the previous record. We plan to use the Extract Schema Designer to create an extract specification that will transform the report file into a more familiar row/column format, and then use that formatted data as input to the transformation that adds these purchases to the database table. As before, we don't require multiple passes of the input file. We will just create the extract schema and apply it to the input on the Source tab of our eventual transformation.

Exercise
1. From the Repository Explorer, select New Object > Extract Schema.
2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Fax.txt.
3. In the Source Options dialog, choose OK to accept the defaults.
4. Highlight the literal Order Header and right-click in the highlight.
5. Select Define Line Style > Auto New Line Style Action - Collect fields.
6. Highlight an Account Number and right-click in the highlight.
7. Select Define Data Field > New Data Field.
8. Change the Field Name to AccountNumber.
9. For the Start Rule, choose Floating Tag.
10. Enter the tag "Account Number(".
11. Use first tag starting at position 0.
12. For the End Rule, choose Floating Tag.
13. Enter the tag ")" (a single closing parenthesis).
14. Use first tag starting at position 0.
15. Choose Add.
16. Highlight a PO Number and right-click in the highlight.
17. Select Define Data Field > New Data Field.
18. Change the Field Name to PONumber.
19. For the Start Rule, select the first floating tag of "PO Number(" starting at position 0.
20. For the End Rule, select the first floating tag of ")" starting at position 0.
21. Choose Add.
Note: When working with Floating Tags, the starting position for the End Rule is relative to the beginning of the Field being defined, not the beginning of the record. So even though the closing parenthesis for the PONumber is the second one from the beginning of the file, it is only the first one from the beginning of the PONumber.
22. Highlight a PO Date, right-click and select Define Data Field > New Data Field.
23. Change the Field Name to PODate.
24. For the Start Rule, select the first floating tag of "PO Date: " starting at position 0. Please note that there is a space after the colon.
25. For the End Rule, choose End of Line.
26. Choose Add.
27. Highlight the literal Item and right-click in the highlight.
28. Select Define Line Style > Auto New Line Style Action - Collect fields.
29. Highlight a Category and right-click in the highlight.
30. Select Define Data Field > New Data Field.
31. Change the Field Name to Category.
32. Choose Add.
33. Highlight a Product Number and right-click in the highlight.
34. Select Define Data Field > New Data Field.
35. Change the Field Name to ProductNumber.
36. For the Start Rule, select the first floating tag of "/" starting at position 0.
37. For the End Rule, select the first floating tag of " " (a single space) starting at position 0.
38. Choose Add.
39. Highlight a Quantity, right-click and select Define Data Field > New Data Field.
40. Change the Field Name to Quantity.
41. For the Start Rule, select the third floating tag of " " (a single space) starting at position 0.
42. For the End Rule, select the first floating tag of "/" starting at position 0.
43. Choose Add.
44. Highlight a Unit Cost, right-click and select Define Data Field > New Data Field.
45. Change the Field Name to UnitCost.
46. For the Start Rule, select the second floating tag of "/" starting at position 0.
47. For the End Rule, select the first floating tag of "/" starting at position 0.
48. Choose Add.
49. Highlight a Shipment Method Code, right-click and select Define Data Field > New Data Field.
50. Change the Field Name to ShipmentMethodCode.
51. For the Start Rule, select the third floating tag of "/" starting at position 0.
52. For the End Rule, choose End of Line.
53. Choose Add.
54. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, Item).
55. Double-click on the Line Style name to bring up the Line Style Definition dialog.
56. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.
57. Click Update.
58. Click on the Browse Data Record button.
59. Choose OK to allow assignment of all Fields to the Extract File.
60. Examine the data to ensure that your Field definitions are correct.
61. Close the browser window.
62. Ensure that the Fields are in the order they appear in the input data.
63. Save the Extract Schema Design as Purchases_Fax.cxl.
64. Close the Extract Schema Designer.
65. Remember that this schema can be used as part of a source connection in Map Designer.
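The Floating Tag rules used in this exercise can be sketched in Python as follows. This is an illustrative sketch, not the product's parser: the helper names nth_index and extract_floating and the sample header line are hypothetical, since the actual layout of Purchases_Fax.txt is not reproduced here. It follows the rule in the note above: the End Rule search begins at the start of the field, not at the start of the line.

```python
from typing import Optional


def nth_index(text: str, tag: str, n: int, start: int = 0) -> int:
    """Return the index of the n-th (1-based) occurrence of tag at or after start, or -1."""
    idx = -1
    search_from = start
    for _ in range(n):
        idx = text.find(tag, search_from)
        if idx == -1:
            return -1
        search_from = idx + len(tag)
    return idx


def extract_floating(line: str, start_tag: str, end_tag: Optional[str] = None,
                     start_n: int = 1, end_n: int = 1) -> Optional[str]:
    """Extract the text between a floating start tag and a floating end tag.

    With end_tag=None the field runs to the end of the line (the End of Line rule).
    """
    tag_pos = nth_index(line, start_tag, start_n)
    if tag_pos == -1:
        return None
    field_start = tag_pos + len(start_tag)
    if end_tag is None:
        return line[field_start:].strip()
    # The end tag is counted from the start of the field, not from the line start.
    field_end = nth_index(line, end_tag, end_n, field_start)
    if field_end == -1:
        return None
    return line[field_start:field_end]


# Hypothetical header line in the spirit of the exercise.
header = "Order Header Account Number(01-000789) PO Number(PO-4471) PO Date: 03/14/2010"

account_number = extract_floating(header, "Account Number(", ")")
po_number = extract_floating(header, "PO Number(", ")")
po_date = extract_floating(header, "PO Date: ")
```

Note how po_number uses the first closing parenthesis after its own start tag, even though that parenthesis is the second one on the line.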


Process Designer for Data Integrator

Process Designer is a graphical data transformation management tool you can use to arrange your complete transformation project. With Process Designer, you can organize Map Designer Transformations with logical choices, SQL queries, global variables, Microsoft's DTS packages, and any other applications necessary to complete your data transformation. Once you have organized these Steps in the order of execution, you can run the entire workflow sequence as one unit. IntegrationArchitect_ProcessDesigner.ppt


Process Designer Fundamentals


The heart of the integration product tool set is Map Designer. The main function of Map Designer is to transform data from one format, layout, or application to another. Process Designer integrates the Transformations created by Map Designer with any other applications or processes that need to be done to complete an entire job. In order to create a Process, first consider what is necessary to accomplish the complete transformation of your data. You should form a general idea of the logical steps to reach your goal. This includes the applications you will need, and what decisions must be made during the Process. Once you have a good idea of what will be involved, open Process Designer (via the Start Menu or Repository Explorer) and begin. Remember that Process Steps can be re-arranged, deleted, added, or edited as you build your design.


Creating a Process
Objectives
At the end of this lesson you should be able to create a simple Process Design.
Keywords: Process Designer, Transformation Map, and Component
Description
Process Designer can be used from beginning to end to make your data transformation task simpler and more streamlined. Map Designer is one of the applications that can be called from within Process Designer. Process Designer allows you to create new Transformations, use existing Transformations, or use a copy of an original transformation file, leaving the original unchanged. Follow the steps below to create a simple process.

Exercise
1. Open Process Designer.
2. Add a Transformation step to the Process Design.
3. Right-click on the Transformation Map and choose Properties.
4. Click Browse and choose m_OutputModes_Clear_Append.map.xml from a previous exercise or from the solutions folder.
Note: A Process Designer SQL Session is a particular method of connecting to the given SQL application's API. We can use the same session in multiple steps or create new sessions wherever needed. We must have at least one session if any connection to a relational database is made during the process.


5. A SQL Session is created based upon the map's target connection. Accept the default session for the target and click OK.
6. Name this step Load_Accounts.
7. Add another Transformation step to the Process Design.
8. Right-click on the Transformation Map and choose Properties.
9. Click New to open the Map Designer.
10. Create a new map that loads the ASCII Delimited file Category.txt into the tblCategories table in the TrainingDB Database. Use the report below for specifications.

Map Summary:
Define the Source:
  Source Connector: ASCII(Delimited)
  Source Data: $(FUN_DATA)Category.txt
  Source Options: Header = False

Define the Target:
  Target Connector: ODBC 3.x
  Target Data: Database: TrainingDB, Table: tblCategories
  Target Options: None
  Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:
  R1.Code            Fields("Field1")
  R1.Category        Fields("Field2")
  R1.ProductManager  Fields("Field3")

Define Events: Source R1 Events
  Event Name: AfterEveryRecord
  Event Action: ClearMapPutRecord
  Event Parameters: target name = Target, record layout = R1

11. Accept the default for the Transformation Step dialog.
12. Choose Use an existing session for the target in the Sessions Dialog.
13. Name the step Load_Categories.
14. Create a new map that loads ShippingMethod.txt into the tblShippingMethod table in the TrainingDB Database. Use the report below for specifications.

Map Summary:
Define the Source:
  Source Connector: ASCII(Delimited)
  Source Data: $(FUN_DATA)ShippingMethod.txt
  Source Options: Header = True

Define the Target:
  Target Connector: ODBC 3.x
  Target Data: Database: TrainingDB, Table: tblShippingMethod
  Target Options: None
  Target OutputMode: Clear File/Table contents and Append

Target Field Expressions:
  R1.Shipping Method Code         Fields("Shipping Method Code")
  R1.Shipping Method Description  Fields("Shipping Method Description")

Define Events: Source R1 Events
  Event Name: AfterEveryRecord
  Event Action: ClearMapPutRecord
  Event Parameters: target name = Target, record layout = R1

15. Accept the default for the Transformation Step dialog.
16. Choose Use an existing session for the target in the Sessions Dialog.
17. Name the step Load_ShippingMethod.
18. Establish the Step Sequence as described below (use the corresponding image if necessary).
19. Start -> Load_Accounts -> Load_Categories -> Load_ShippingMethod -> Stop
20. Validate the Process Design.
21. Save the Process as p_Load_Tables.ip.xml.
22. Run the Process Design.
23. Examine the Target Tables.
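Both maps built in this exercise rely on the same pattern: a delimited source whose fields are named Field1, Field2, and so on when Header = False (or taken from the first row when Header = True), and an AfterEveryRecord event that runs ClearMapPutRecord. The Python sketch below illustrates that flow under stated assumptions. The functions read_delimited and run_map and the sample rows are hypothetical stand-ins, not part of the product.

```python
import csv
import io
from typing import Dict, List


def read_delimited(text: str, header: bool) -> List[Dict[str, str]]:
    """Parse delimited text; name fields Field1..FieldN when there is no header row."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    if header:
        names, data = rows[0], rows[1:]
    else:
        names = [f"Field{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(names, row)) for row in data]


def run_map(source_rows: List[Dict[str, str]],
            field_map: Dict[str, str]) -> List[Dict[str, str]]:
    """Apply a clear / map / put cycle once per source record."""
    target: List[Dict[str, str]] = []
    for row in source_rows:                      # one AfterEveryRecord firing per row
        buffer: Dict[str, str] = {}              # Clear: start with an empty target record
        for target_field, source_field in field_map.items():
            buffer[target_field] = row[source_field]   # Map: evaluate each expression
        target.append(buffer)                    # Put: write the record to the target
    return target


# Hypothetical sample rows; the real Category.txt contents are not shown in this workbook.
category_text = "10,Widgets,A. Smith\n20,Gadgets,B. Jones\n"
categories = run_map(
    read_delimited(category_text, header=False),
    {"Code": "Field1", "Category": "Field2", "ProductManager": "Field3"},
)
```

The target list here stands in for the database table; in the exercise, the Clear File/Table contents and Append output mode empties the table before the first record is written.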


Parallel vs. Sequential Processing


Objectives
At the end of this lesson you should be able to create a parallel process.
Keywords: Multi-threaded, Single-threaded
Description
The Integration Engine can execute single-threaded or multi-threaded processes, depending on your license. Process Designer uses the power and speed of multithreading when running a Process. If you own the multithreaded Integration Engine, you can allow the operating system to control the load balancing across CPUs for more efficient processing. This works even when multiple threads run on a single processor. If you set up the Maps within your Process to run in parallel, Process Designer will launch each Map in parallel on its own thread. There is no need to code anything within your Maps or Processes. It is all done for you behind the scenes as long as you set the Max Concurrent Execution Threads property in the Process Design. Multithreading allows parallel execution of Process Designer Steps, where several transformation steps can be executed simultaneously across multiple CPUs on a server.

Exercise
1. Open the p_Load_Tables process from the previous exercise or from the solutions folder.
2. Run the process and check the log file for the length of time the process took to run.
3. Change the format of linking the steps in the process as pictured in the figure below.
4. Create a separate SQL Session for the target in each map.
5. Open the Process Properties Dialog and set Max Concurrent Execution Threads to 3.
6. Validate the process, and then save it as p_ParallelProcessing.ip.xml.
7. Run the process and check the log file for the length of time it took to run.


There is a limit to the number of execution threads that can be run per license. The maximum number of execution threads allowed can be found on the Splash Screen. From the Toolbar choose Help > About Process Designer > Licensed Features. In the list of Features you will find the feature Max allowed threads.
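As a rough analogy for what the Max Concurrent Execution Threads setting does, the Python sketch below runs three independent load steps on a thread pool capped at three workers. It is only an analogy and does not call the Integration Engine. The function load_table is a hypothetical stand-in for a Transformation step, and the short sleep merely simulates work.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def load_table(step_name: str) -> str:
    """Stand-in for one Transformation step; sleeps briefly to simulate work."""
    time.sleep(0.05)
    return f"{step_name}: done"


def run_parallel(steps: List[str], max_threads: int) -> List[str]:
    """Run independent steps concurrently, with at most max_threads at a time."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map returns results in the order of the input list.
        return list(pool.map(load_table, steps))


steps = ["Load_Accounts", "Load_Categories", "Load_ShippingMethod"]
results = run_parallel(steps, max_threads=3)
```

With max_threads set to 1 the same call degrades to sequential execution, which is the comparison the exercise asks you to make by checking the log file timings before and after the change.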


Conditional Branching - The Step Result Wizard


Objectives
At the end of this lesson you should be able to add a conditional statement to the Decision Step.
Keywords: Error Handling; Conditional Branching; Metadata Execution Variables; Step Result Wizard; Boolean Expressions
Description
The Decision Step allows you to design a conditional expression to determine which work flow path the Process will follow. Generally, this is done with a Boolean expression. In this exercise we will use the Step Result Wizard to create a Boolean expression that will determine the work flow path for the steps in the process pictured below.

Exercise
1. Open Process Designer.
2. Add a Transformation step to the Process Design.
3. Right-click on the Transformation Map and choose Properties.
4. Click Browse and choose m_Reject_Connect_Info.map.xml from a previous exercise or from the solutions folder.
5. Accept the default for the Transformation Step and the Sessions dialog.
6. Name the step LoadAccounts_CheckDates.
7. Add a Decision step to the Process Design.
8. Right-click on the Decision icon and select Properties.
9. Name the step Eval_RejectRecordCount.
10. Using the Step Result Wizard, create and add the following code:
project("LoadAccounts_CheckDates").RejectRecordCount > 0

11. Click OK to close.
12. Add a Scripting Step to the Process Design.
13. Right-click on the Scripting icon and select Properties.
14. Name the step NotificationBadDates.
15. Use the Build button to build an expression that will display "There are STILL invalid dates!!" in a message box with a stop icon, an OK button, and the title Invalid Date Warning:
MsgBox("There are STILL invalid dates!!", 16, "Invalid Date Warning")

16. Click OK to close.
17. Link the steps as follows:
18. Start -> LoadAccounts_CheckDates -> Eval_RejectRecordCount -> (False) -> Stop
19. Link the remaining steps as follows:
20. Eval_RejectRecordCount -> (True) -> NotificationBadDates -> Stop
21. Validate the Process Design.
22. Save your Process Design as p_ConditionalBranching_StepResultWizard.ip.xml.
23. Run the Process and observe the results.


FileList - Batch Processing Multiple Files


Objectives
At the end of this lesson you should be able to build a FileList that gathers a list of file names and stores them in an array variable.
Keywords: Change Source Action, NUL: connection string, FileList Function, Array Variables, DefineMacro, and Looping
Description
The FileList Function builds a list of files of user-specified types. It returns a 'Type Mismatch' error if the results parameter is not an array. The FileList Function returns both file AND DIRECTORY names within a given directory. If you want to work only with file names, you will need to test the returned names using the IsFile Function to determine which files you want to use.
Note: You cannot use FileList to return a list of files via FTP.


Exercise
1. Create process variables as described below.

Variables
Name | Type | Public | Value | Description
files | Variant - Array | No | | Array. Elements contain names of files passed from the FileList function.
fileCounter | Variant | No | -1 | Counter. Index counter for the files() array.
filePath | Variant | No | | Variable. Stores the path of the Inbox directory where batch files are located.
currentFile | Variant | No | | Variable. Stores the name of the next file to be processed.

2. Add a Transformation step onto the Canvas.

3. Name the step LoadAccountsTable.
4. Click Browse to locate m_OutputModes_Clear_Append.map.xml from a previous exercise or from the solutions folder.
5. Accept the defaults in the Sessions dialog to Create a New Session for the target.
6. Add a scripting step as described below. Name the step BuildFileList:

Expression:
' Set directory for incoming files.
' Consider using lookup or user input for this value.
filePath = MacroExpand("$(FUN_DATA)") & "Inbox\"

' Gather list of file names. Use wildcards if needed.
FileList(filePath & "AddrChg*.*", files())

' Set array index counter (Zero based).
fileCounter = UBound(files)

7. Add a decision step as described below. Name the step GotFiles?:


Expression: fileCounter > -1


8. Add a scripting step as described below. Name the step Notification_NoFiles:


Expression: MsgBox("No Files to Process: Exiting")

9. Use the Line Builder to connect the steps created thus far.
10. Start -> LoadAccountsTable -> BuildFileList -> GotFiles? -> (False) -> Notification_NoFiles -> Stop
11. Add a scripting step as described below. Name the step SetCurrentFile:

Option Explicit

' Trap runtime errors (e.g., Array Index Out of Bounds)
ON ERROR GOTO ErrorScript

' Set variable for the current file. Define Macro for use within the map.
currentFile = filePath & files(fileCounter)
DefineMacro("SOURCE_FILE", currentFile)

' Verification...
Dim f
f = Ubound(files) - fileCounter
MsgBox("Processing File: " & files(fileCounter) & ". File " & f + 1 & " of " & Ubound(files)+1)

' Use the Return statement to exit this module
Return

' Error handler
ErrorScript:
' Get the error info and check variable values
LogMessage("ERROR","Err.Number = " & Err.Number & " " & _
"Err.Description = " & Err.Description & " " & "FileDirectory=" & filePath & " " & _
"fileCounter=" & fileCounter & " " & "CurrentFile=" & files(fileCounter))
Terminate()

12. Add a Transformation step and name the step UpdateAddresses.
13. Click New and build a map based on the specifications below:


Map Summary:
Define the Source:
  Source Connector: ASCII(Delimited)
  Source Data: $(SOURCE_FILE)
  Source Options: Header = True

Define the Target:
  Target Connector: ODBC 3.x
  Target Data: Database: TrainingDB, Table: tblAccounts
  Target Options: None
  Target OutputMode: Update

Source Schema
  Field Name      | Type | Length | Description
  Account Number  | Text | 9      |
  New Street      | Text | 34     |
  Total           |      | 43     |

Target Field Expressions:
  R1.AccountNumber  Records("R1").Fields("Account Number")
  R1.Address        Records("R1").Fields("New Street")

Define Events: Source R1 Events
  Event Name: AfterEveryRecord
  Event Action: ClearMapPutRecord
  Event Parameters: target name = Target, record layout = R1


14. Save the map as m_UpdateAddresses.map.xml and close Map Designer.
15. Use the SQL session that was created in the first transformation step.
16. Add a decision step and name the step SuccessCheck.
17. Use the Step Result Wizard to build the expression below:
Expression: Project("UpdateAddresses").ReturnCode == 0

18. Add a scripting step as described below. Name the step UpdateFileCounter:
Expression:
' Decrement the file counter variable
fileCounter = fileCounter - 1

19. Add a scripting step as described below. Name the step Notification_UpdateFailure:
Expression: MsgBox("Update Address Map Failed")

20. Connect the remaining steps as in the screen shot above the exercise instructions.
21. Validate the process.
22. Save the process as p_FileListLoop.ip.xml.
23. Run the process and observe the results.
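The control flow of this process (build a file list, check whether any files were found, then work through the array from the last index down to zero) can be sketched in Python. This is an illustration of the loop logic only and does not use the FileList function itself. The names build_file_list and process_batch are hypothetical, and the process_file callback stands in for the UpdateAddresses map.

```python
from pathlib import Path
from typing import Callable, List


def build_file_list(inbox: str, pattern: str = "AddrChg*.*") -> List[str]:
    """Return matching file names only; directories that match are filtered out."""
    return sorted(p.name for p in Path(inbox).glob(pattern) if p.is_file())


def process_batch(inbox: str, process_file: Callable[[Path], bool]) -> List[str]:
    """Process files from the last array index down to 0, stopping on the first failure."""
    files = build_file_list(inbox)
    file_counter = len(files) - 1          # plays the role of UBound(files); -1 means no files
    processed: List[str] = []
    while file_counter > -1:               # the GotFiles? decision
        current_file = Path(inbox) / files[file_counter]
        if not process_file(current_file): # the SuccessCheck decision
            break                          # the failure branch ends the loop
        processed.append(current_file.name)
        file_counter -= 1                  # the UpdateFileCounter step
    return processed
```

A call such as process_batch("Inbox", lambda path: True) would return the matching file names in reverse sorted order, or an empty list when the directory holds no matching files. The is_file test plays the part of the IsFile check recommended in the description above.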


Pervasive Integration Engine

Pervasive Integration Engine is an embedded data Transformation engine used to deploy runtime data replication, migration and Transformation jobs on Windows or UNIX-based platforms quickly and easily without costly custom programming. It fills the need for a low-cost, universal data transformation engine. The Integration Engine is a 32-bit data transformation engine written in C++, containing the core data driver modules that are the foundation for the transformation architecture. Because the Integration Engine is a pure execution engine with no user interface components, it can perform automatic, runtime data transformations quickly and easily, making it ideal for environments where regular data transformations need to be scheduled and launched on Windows or UNIX-based systems. Maps and Processes can be scheduled through the command line or invoked through an API. APIs are documented in the Integration Engine SDK.


Syntax: Version Information


Objectives
This lesson shows how to retrieve version and licensing information from the engine via the command line interface.
Keywords: djengine, Executable and Version Information

Exercise
1. Open a command window by typing cmd in the Windows Run dialog.
2. Use a cd command to navigate to the directory where the engine is installed.
3. The default directory is: C:\Program Files\Pervasive\Cosmos9\Common.
4. To get the current engine version information, type:
djengine -version


Options and Switches


Objectives
This lesson shows how to get the usage syntax and all options, or switches, available through the command line interface of the Integration Engine.
Keywords: Syntax and Option Overrides

Exercise
View the different options and parameters available for executing transformations and processes by using the -? switch. To see all the available options, at the command prompt type:
djengine -help


Execute a Transformation
Objectives
This lesson demonstrates how to execute a Transformation Map via the command line interface.
Keywords: Executing a Map
Description
At the command prompt type:
djengine MapName.tf.xml
Note: Be sure to use the file that has the extension .tf.xml. This is the transformation file. The transformation file contains all of the connection information that the engine needs to connect to the source and target. It also contains a link to the map file. If you provide the engine with the name of the map file (the file with a .map.xml extension), you will receive errors.
Tip: You can browse to the file in Windows Explorer and drag and drop the file onto the command line.
Add -verbose at the end of the command to print statistics to the console during runtime. At the command prompt type:
djengine C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTest.tf.xml -verbose


Using a -Macro_File Option


Objectives
At the end of this lesson you should be able to use a Macro Definition file for porting Maps and Processes from one Integration Engine installation to another.
Keywords: Macro Definition, Macro Manager and Macro File
Description
There are two ways to use a Macro to define a connection on the command line. The -Macro_File (-mf) option gives the path to the Macrodef.xml file that holds the values of all the macros that we have defined (the default location for that file is the Workspace1 folder). If you have multiple macros defined, this may be your preferred method. At the command prompt type:
djengine -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose

The -Define_Macro option allows you to define individual Macros on the command line. At the command prompt type:
djengine -Define_Macro FUN_DATA=C:\Cosmos9_Work\Fundamentals\Data\ C:\Cosmos9_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose


Executing a Process
Keywords: Using the Process Design Option

Command syntax is:
djengine -process_execute file name (include path)
At the command prompt type (the command is shown wrapped here but is entered on one line):
djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml
Notes
We are using the -Macro_File option because some of the Maps in the process use a Macro as part of the source connection.
Every process command should contain the -pe switch as the first switch. It should always be used, even though you may notice that a process sometimes runs without it.
The process name being called should always be the last item in the command.
Any extra switches should be entered after the -pe switch and before the path to the process.
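The ordering rules in the notes above (the -pe switch first, extra switches next, the process path last) can be captured in a small helper when commands are generated by a script. The Python sketch below only builds the argument list and does not run the engine. The function name build_process_command is hypothetical, and the switch spellings are assumed to be exactly those shown in this workbook.

```python
from typing import List, Optional


def build_process_command(process_path: str,
                          macro_file: Optional[str] = None,
                          verbose: bool = False,
                          engine: str = "djengine") -> List[str]:
    """Assemble a djengine process command: -pe first, process path last."""
    command = [engine, "-pe"]
    if verbose:
        command.append("-verbose")
    if macro_file:
        command += ["-Macro_File", macro_file]
    command.append(process_path)
    return command


command = build_process_command(
    r"C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml",
    macro_file=r"C:\Cosmos9_Work\Workspace1\macrodef.xml",
    verbose=True,
)
```

On a machine where the engine is installed and on the PATH, the resulting list could be passed to subprocess.run(command, check=True). Keeping the arguments as a list avoids quoting problems with paths that contain spaces.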


Additional Sample Exercises Integration Engine


Command Line Overrides Source Connection


Keywords: Dynamic Override for Source File

Let's substitute a different source file for the original file defined in the Transformation to show how overrides can be performed at execution time. The syntax of the command is:

djengine -Source_Connect_Info string (include path)

At the command prompt type (as a single line):

djengine -Source_Connect_Info C:\Cosmos_Work\Fundamentals\Data\AccountsSmall.txt C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithMacro.tf.xml -verbose

AccountsSmall.txt is a file that has the same format as Accounts.txt, but it has only 54 records.

Note that only 54 records were written. Note also that we did not need to define the Macro or the path to the Macro File. The Macro in the map was only used in the source connection and we defined a new source with a complete path. So the Macro was no longer relevant.


Ease of Use: Options File


Keywords: Using Text Editor for Command Line Options

Type the command from the previous exercise (leave out the first word, DJEngine) in a text editor and save the file as Options.bas in the Cosmos root directory. This is called an Optfile.

Note: If you did not save this file in the Cosmos root directory, you'll have to include the path of the file as well as the file name in the command.

At the command prompt type:

djengine @Options.bas

Including the DJEngine command in a batch file will allow you to use the batch file with third-party scheduling tools. Save the file Options.bas as Options.bat. Include the djengine call in Options.bat so that the entire text of the file reads (as a single line):

djengine -pe -verbose -Macro_File C:\Cosmos9_Work\Workspace1\macrodef.xml C:\Cosmos9_Work\Fundamentals\Solutions\ProcessDesigner_DataIntegrator\CreatingAProcess.ip.xml

If you are using a Windows machine, use Windows Task Scheduler or schtasks to schedule this process.

Tip: You may choose to add a pause command at the bottom of the script so that the command prompt remains open and you can verify that the process ran.


Checklist Integration Engine

Troubleshooting
Review the Integration Engine Command Line Interface Error Messages in the Help Files, as well as the Error Code Reference.
Check the command line syntax.
Verify that the .tf.xml file is being used for executing maps.
Be sure the map or process file is specified last in the command.
Check spelling.
Verify the license has not expired (run djengine -V from the command line).
Confirm the appropriate version is installed.
Does the process/map run from the Design Tools?
Are your environment variables set up correctly (e.g. PATH, connector-specific variables such as Oracle or Java paths)? Use the SET command to see a quick list of Windows environment variables.
Try a backup or previous copy of your file.
Are you using the correct case? The following can be case sensitive:
    Macro names
    Platforms (Unix)
    Switches (e.g. -V vs. -v)
Does it run on one platform and not on another? Check your file path slashes:
    Windows - back slashes: \
    Unix - forward slashes: /

Setting an Environment Variable in Windows
This setting allows the user to call the DJEngine command from any path and eliminates the need to include the full path to the command each time.
1. Right-click My Computer and choose the Properties option.
2. Click the Advanced tab.
3. Click the Environment Variables button.
4. Under System variables, scroll down to Path.
5. Double-click Path.
6. In Variable Value, insert the following path, followed by a semicolon, in front of the first path:

C:\Program Files\Pervasive\Cosmos9\Common;

Illustration: Setting Environment Variable in Windows

Engine Profiler
The Engine Profiler is a tool designed to fine-tune your Transformations and Processes. A document that describes the functionality and use of the Engine Profiler in detail is available at C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF\engine_profiler.pdf


Intermediate Mapping Techniques

This section explores the capabilities of Transformation Map Designer in more detail.


Multiple Record Type Structures


Multiple Record Type 1 One-to-Many


Objectives
At the end of this lesson you should be able to create a target file that has multiple record types from a source file that has a single record type. You will also become more familiar with the OnDataChange event.

Keywords: OnDataChange Event, Data Change Event, Parse and Format functions

Description
There are two possible scenarios for creating a multiple-record-type target file from a single-record-type source.

In the simplest, you want to break each source record down into n different target records. You might want to take fields 1-5 from the source and put them in target record A, and take fields 6-10 and put them in target record B. To perform this task, you define your target record types as you learned in an earlier lesson and then, in an AfterEveryRecord event, perform two ClearMapPut actions, one for each target record type.

The second and more complex situation occurs when you don't necessarily want to create both target records A and B from each source record. Let's assume that the source file contains customer information in fields 1-5 and sales information in fields 6-10. We'll assume that the source is in order by field 1, the customer number. In this case, we only want to write a target record A when we encounter a new customer, although we want to write a target record B for each source record. To perform this task, we can use the OnDataChange event. For our solution, we'll set up the event to monitor the customer number. Each time it changes (including the change from an empty source buffer to the valid value from the very first record), we will write out a target record A. We'll use our AfterEveryRecord event to write out target record B.

Keep in mind that this transformation assumes a single target file with multiple record types. The situation is much like writing to a single target database with two tables; there, different techniques would be used, though the events would be very similar.

In this exercise, the source file contains records that have employee demographic information and vehicle lease information. If an employee leases more than one car, there is one source record for every car, so the employee demographic information is repeated in these records. In our target file we will eliminate the redundancy by creating two distinct record types. We will write one target record for each employee, and we will write one child record of a different record type for each vehicle.

For example, our source data has the following format:

Employee1 Data, Auto1 Data
Employee1 Data, Auto2 Data
Employee1 Data, Auto3 Data
Employee2 Data, Auto1 Data
Employee2 Data, Auto2 Data

The resulting target file will have the following format:

Employee1 Data
Auto1 Data
Auto2 Data
Auto3 Data
Employee2 Data
Auto1 Data
Auto2 Data

To achieve this, you need to know which event handlers require a ClearMapPut action to write the target records. You can only make this decision by knowing what is contained in the source buffer or buffers. (The Source Buffer is the internal object that stores the values that have just been read in from a source record. There is one buffer for each source record type.) As a general rule, you will create at least one write action (usually a ClearMapPut) for every record type in the target.
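The control flow described above can be sketched outside the product. The Python below is only a model of the logic, assuming a source already sorted by Initials; it does not use any Map Designer API, and the sample values are made up for illustration.

```python
def one_to_many(source_rows):
    """Write one "E" record per new employee and one "A" record per row."""
    target = []
    previous = None  # models the empty source buffer before the first record
    for row in source_rows:
        # OnDataChange on Initials: fires on the first record and on each change.
        if row["Initials"] != previous:
            target.append(("E", row["Initials"], row["Phone"]))
            previous = row["Initials"]
        # AfterEveryRecord: always write the Auto record.
        target.append(("A", row["Initials"], row["Make"]))
    return target


# Hypothetical sample data, sorted by Initials.
rows = [
    {"Initials": "AB", "Phone": "5550100", "Make": "Ford"},
    {"Initials": "AB", "Phone": "5550100", "Make": "Honda"},
    {"Initials": "CD", "Phone": "5550199", "Make": "Toyota"},
]
records = one_to_many(rows)
for record in records:
    print(record)
```

If the source were not sorted on the monitored field, the same employee would trigger the change event more than once, which is why the exercise uses the sorted file.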

Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_One_to_Many.map.xml.
3. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data: $(FUN_DATA)Autos_Sorted.txt
Source Options: Header = True

Define the Target:
Target Connector: ASCII(Fixed)
Target Data: File: $(FUN_DATA)Autos_MultiRecType.txt
Target Options: None

Target OutputMode: Replace


Create 2 record types in the target through the Map Designer user interface. The layouts for both record types are described below:

Record Employee
Name       Type   Length   Description
RecordID   Text   1
Initials   Text   2
Phone      Text   10
City       Text   9
State      Text   2
Total             24

Record Auto
Name       Type   Length   Description
RecordID   Text   1
Initials   Text   2
Year       Text   4
Make       Text   10
Color      Text   5
Total             22

Target Field Expressions: Employee

Employee.RecordID    "E"
Employee.Initials    Records("R1").Fields("Initials")
Employee.Phone       Records("R1").Fields("Phone")
Employee.City        Records("R1").Fields("City")
Employee.State       Records("R1").Fields("State")


Target Field Expressions: Auto

Auto.RecordID    "A"
Auto.Initials    Records("R1").Fields("Initials")
Auto.Year        Records("R1").Fields("Year")
Auto.Make        Records("R1").Fields("Make")
Auto.Color       Records("R1").Fields("Color")

Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: Auto

Define Events: Source R1 OnDataChange Event
Monitor: Records("R1").Fields("Initials")
Management: Fire first ODC event, Suppress Extra ODC event at EOF
Event Name: OnDataChange1
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: Employee

Note: It would be a good idea to create recognition rules for each target record type. It would also be a good idea to save the schema that was created in the target through the Map Designer Interface. The trainer can walk you through these steps before moving on to the next exercise.


Multiple Record Type 2 Many-to-One


Objectives
At the end of this lesson you should be able to work with a multiple-record-type source file and create a single-record-type target file. You'll gain an understanding of the special nature of the source buffer for these transformations, and of the relationship of the various event handlers to their individual record types.

Keywords: Multi to one record type, and multiple record layouts

Description
When you specify a source file that contains multiple record types (either by applying an existing multiple-record-type structured schema to it or by creating a new one), the Source Hierarchy on the Map Tab displays the individual record types and gives you access to the individual fields for each record type. To create a single-record-type target file, simply map the fields from their record types within the Source to the target field list, just as you do in any other Map Design. The Map Designer takes care of identifying each field by its name and by the source record type it belongs to.

The key to mapping multiple-record-type source files is understanding how the source buffer is structured. When the source file specifies multiple record types, your transformation automatically sets up a large source buffer that contains a section for each record type. Each section has a holder for each field defined in that record type. As your transformation reads the source file, it uses the structured schema and the recognition rules to identify the record type of each record. Once the record type is identified, the record is placed into its proper section of the source buffer.

Another key to working with multiple-record-type source files is using the right event handler. Each source record type has its own set of event handlers. For example, you may perform a set of actions each time a record of a particular type is read, or after the first occurrence of each record type is read, and so on. (Using the General Event Handlers, you can also perform actions globally for all record types.)

If your target layout is going to contain fields from three different record types, you will not want to attempt to write a target record until the source buffer sections for all three of those record types have been filled. If we assume that the source file always contains all three record types for each object (e.g., customer, account, sale), and if we know that the order is always 1-2-3, then we can simply use the AfterEveryRecord event for record type 3 to write a target record.

The situation is a bit more complex if some record types might not exist for a given object, yet we want to write a target record with the data we do have. Assuming the same order restriction, if record type 2 is the one that is missing, the problem is that data from a previous record type 2 may still be present in the source buffer, and we may have to clear that section of the source buffer ourselves. If record type 3 is missing, we have a different problem: the AfterEveryRecord event will not be triggered and no target record will be written. The trainer can discuss methods for solving these problems.

Similar problems exist if the sequence of records in the source file can change from object to object. Again, these problems can be solved if we understand the operation of the source buffer, use the right event handlers, and manually clear sections of the source buffer when necessary.

In this exercise we'll take the file that we created in the last exercise and change it back into the format it had before.
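The source buffer behavior described above can be modeled with a small Python sketch. It is only an illustration under simplifying assumptions: records arrive as (record ID, fields) pairs, "E" and "A" stand for the Employee and Auto record types, and the sample values are made up. It does not use any Map Designer API.

```python
def many_to_one(source_records):
    """Combine Employee ("E") and Auto ("A") records into flat target rows."""
    # One buffer section per source record type; a section keeps its
    # values until the next record of that type overwrites it.
    source_buffer = {"E": {}, "A": {}}
    target = []
    for record_id, fields in source_records:
        source_buffer[record_id] = fields
        # Models AfterEveryRecord on the Auto record type only.
        if record_id == "A":
            target.append({**source_buffer["E"], **source_buffer["A"]})
    return target


# Hypothetical sample data in Employee-then-Autos order.
source = [
    ("E", {"Initials": "AB", "Phone": "5550100"}),
    ("A", {"Initials": "AB", "Make": "Ford"}),
    ("A", {"Initials": "AB", "Make": "Honda"}),
    ("E", {"Initials": "CD", "Phone": "5550199"}),
    ("A", {"Initials": "CD", "Make": "Toyota"}),
]
combined = many_to_one(source)
for row in combined:
    print(row)
```

Because the Employee section is not cleared between records, the second Auto row still sees the Employee values. The same persistence is what causes stale data when a record type is missing, as discussed above.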


Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_Many_to_One.map.xml.
3. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ASCII(Fixed)
Source Data: File: $(FUN_DATA)Autos_MultiRecType.txt
Source Options: none
Source Schema: s_Autos_MultiRecType.ss.xml

Define the Target:
Target Connector: ASCII(Delimited)
Target Data: $(FUN_DATA)Autos_Combined.txt
Target Options: Header = True

Target OutputMode: Replace

Target R1 Record Layout
Name       Type   Length   Description
Initials   Text   2
Phone      Text   10
City       Text   9
State      Text   2
Year       Text   4
Make       Text   10
Color      Text   5
Total             42


Target Field Expressions

R1.Initials    Records("Employee").Fields("Initials")
R1.Phone       Records("Employee").Fields("Phone")
R1.City        Records("Employee").Fields("City")
R1.State       Records("Employee").Fields("State")
R1.Year        Records("Auto").Fields("Year")
R1.Make        Records("Auto").Fields("Make")
R1.Color       Records("Auto").Fields("Color")

Define Events: Source Auto Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: R1


User Defined Functions


The Rapid Integration Flow Language (RIFL) encompasses functions, statements and keywords that are used in Source/Target Filters, Target Field Expressions, and Code Modules. You will recognize VBScript and Visual Basic functions and some SQL Statements. Map Designer also employs many unique functions, which were designed to help you get the most out of your data. One of the powerful features of this language is the ability to abstract and reuse scripts in the form of User-Defined Functions. These functions can be stored and edited in a text file (code module) in a centralized location so that all of your Maps have access to them.


Code Reuse Save/Open a RIFL script Code Modules


Objective
At the end of this lesson you should be able to save and reopen an extract script.

Keywords: RIFL Script Editor

Description
The first level of code reusability is simply to save a script to a file. You will need to make any necessary changes when you reopen it in a different map, but the script is still intact.

Exercise
1. Open any RIFL script in the Editor window and click the Save button on the toolbar. This saves a text file with a RIFL extension somewhere on your network.
2. To reuse the script, click the Open Folder toolbar button in another Script Editor window. You will need to manually change any parameters for use in the new Script window.
3. Next, we will show you how to make the functions more flexible by abstracting them into User Defined Functions and storing them in Code Modules.

Code Reuse - Code Modules


Objectives
At the end of this lesson you should be able to call a user-defined function from a code module.

Keywords: User Defined Functions, Code Modules, and RIFL Script Editor

Description
You may call user-defined functions from an external Code Module in Map Designer. Code modules are written in the RIFL (Rapid Integration Flow Language) expression language and saved as text-only files with a RIFL file extension. External code modules can be moved to any other machine that has Map Designer or the Integration Engine. This allows you to develop a user-defined "library" for use among different members of your team.

Exercise
1. Create a map based on the specifications given below. Save the map as m_CodeReuse.map.xml.
2. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data: File: $(FUN_DATA)Accounts.txt
Source Options: Header = True


Define the Target:
Target Connector: ASCII(Delimited)
Target Data: $(FUN_DATA)ZipReport.txt
Target Options: Header = True

Target OutputMode: Replace

Target R1 Record Layout
Name             Type   Length   Description
Account Number   Text   9
Zip              Text   10
ZipReport        Text   25
Total                   44

Define Code Modules: Code Modules: $(FUN_DATA)Scripts\ZipCodeLogic.rifl

Target Field Expressions

R1.Account Number    Records("R1").Fields("Account Number")
R1.Zip               Records("R1").Fields("Zip")
R1.ZipReport         zipTest(Records("R1").Fields("Zip"))

Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: R1


Lookup Wizards
Lookup Wizards automate the process of creating lookups for your Transformations. You select the data that needs to be looked up, browse to the files or tables to build connection strings automatically, and select the key and returned fields. After you use the Lookup Wizard, a reusable code module is created in your workspace containing the functions you need for performing lookups. The Code Module files generated by these wizards can then be reused in any Map you create.

There are three types of Lookup methodologies, and each has its advantages in certain situations:
1. Static Flat File Lookups are fast but not very portable or dynamic.
2. Dynamic SQL Lookups are portable and dynamic but not very fast.
3. Incore Table Lookups are extremely fast and can be made more dynamic with extra RIFL code, but they use core memory to store the data.


Incore Table Lookup

Keywords: Lookup Wizard, Incore Memory Table & Lookup, Count & Counter Variable parameters, One-to-Many records (unrolling occurrences), and referencing Target Field values

Description
An Incore memory table lookup can be used when speed is of the utmost importance. The primary method of creating the incore table is through a DJImport object. The memory table is then accessed to perform the lookup. For the purposes of this exercise we will use the Lookup Wizard to create a Code Module with the desired functions. These functions will create the incore table, allow us to reference values in the table, and clear the table from memory when we are finished using it.

Exercise 1. Create a map based on the specifications given below.

Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data: File: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB
             Table: tblFavoriteInfo
Target Options: none

Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. Follow the instructions below to use the Lookup Wizard.
Define Code Modules: Code Modules: $(FUN_DATA)Scripts\Categories.itable.rifl


2. From the menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Incore Table Lookup Wizard and click Next.
4. Create a new Incore Table Definition named Categories and click Next.
5. Click Build to build a new Connection String and then click Next.
6. Connect to the data source defined below for the lookup and click Next.

Define the Connection String:
Connector: Access 2000
File: C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb
Table: tblCategories
Properties: None

7. Choose the appropriate Key Field and the Fields that should be returned by the lookup.


8. Click Finish.
9. The Wizard will create several Incore Table lookup functions in a code module. Use the following functions in the appropriate event handlers as described below.

Categories_Init()
    Initializes the DJImport object, makes the connection to the data source as defined by the connection string, and builds the Incore table.
Categories_Category_Lookup(KeyValue, DefaultValue)
    Creates the SQL call needed to retrieve a value from the Category field based on a Key value.
Categories_ProductManager_Lookup(KeyValue, DefaultValue)
    Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.
Categories_Clear()
    Clears the Incore Table from memory.

Define Events: Transformation and Map Properties Events
Event Name: BeforeTransformation
Event Action: Execute
Event Parameters:
    Expression: Categories_Init()

Event Name: AfterTransformation
Event Action: Execute
Event Parameters:
    Expression: Categories_Clear()


Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: R1
    count: CharCount("|",Records("R1").Fields("Favorites")) + 1
    counter variable: cntr

Target Field Expressions

R1.FavoritesID        Serial()
R1.Account Number     Records("R1").Fields("Account Number")
R1.CategoryCode       parse(cntr, Records("R1").Fields("Favorites"), "|")
R1.CategoryLiteral    Categories_Category_Lookup _
                      (Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")
R1.ProductManager     Categories_ProductManager_Lookup _
                      (Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")
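How the count parameter, the counter variable, and the lookup work together can be modeled in Python. This is a sketch under stated assumptions, not RIFL: char_count and parse are stand-ins that assume the counter starts at 1 and that parse returns the nth delimited item, and the CATEGORIES dictionary is hypothetical data standing in for the in-memory table.

```python
# Hypothetical in-memory lookup table (stands in for the Incore table).
CATEGORIES = {"01": "Books", "02": "Music"}


def char_count(char, text):
    """Count occurrences of a character in a string."""
    return text.count(char)


def parse(n, text, delimiter):
    """Return the nth (1-based) delimited item, or "" if there is none."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def category_lookup(key, default):
    """Return the category for a key, or the default when no match exists."""
    return CATEGORIES.get(key, default)


def unroll_favorites(account_number, favorites):
    """Write one target row per pipe-delimited favorite."""
    count = char_count("|", favorites) + 1   # the count parameter
    rows = []
    for cntr in range(1, count + 1):          # the counter variable
        code = parse(cntr, favorites, "|")
        rows.append((account_number, code, category_lookup(code, "NoMatches")))
    return rows


rows = unroll_favorites("0100", "01|02|99")
for row in rows:
    print(row)
```

The dictionary access is what makes the in-memory approach fast: no query is sent to the database for each record.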


Relational Database Management System (RDBMS) Mapping


Select Statements SQL Passthrough


Keywords: SQL Select Statements

The Transformation Map Designer source connectors allow you to pass Select statements through to a database server to obtain a row set. The row set returned by the query then becomes the source data for your Map. Alternatively, you can keep the Select statement in a SQL script file and use the SQL File connection option to point to that file; a matching SQL script file is in the Data folder. This exercise creates a simple transformation that takes only the records from Texas and puts them into our target.

Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_SQL_Passthrough.map.xml.
3. Run the map and observe the results.

Map Summary:
Define the Source:
Source Connector: ODBC 3.x
Source Data: Database: TrainingDB
             SQL Statement: SELECT * FROM tblAccounts WHERE State = 'TX'
Source Options: none

Define the Target:
Target Connector: ASCII(Delimited)
Target Data: $(FUN_DATA)TXAccounts.txt
Target Options: Header = True

Target OutputMode: Replace

Target Field Expressions

R1.AccountNumber      Fields("AccountNumber")
R1.Name               Fields("Name")
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          Fields("BirthDate")
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("StandardPayment")
R1.LastPayment        Fields("LastPayment")
R1.Balance            Fields("Balance")

Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: R1


DJX in Select Statements Dynamic Row sets

Keywords: Integration Query Builder, DJX Syntax, Dynamic Row Sets via User Interaction, and InputBox

Description
DJX is used to escape into the RIFL expression language to design SQL statements dynamically. This allows you to use variables and macros in a SQL statement. This exercise selects records from the tblAccounts table that are from a particular state. That state is determined at runtime.

Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_SQL_DynamicRowset.map.xml.
3. Run the map and observe the results.
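The idea of a row set whose filter value is only known at runtime can be tried outside the product. The Python sketch below uses the standard sqlite3 module with a made-up, two-column tblAccounts table. Note that it binds the value as a query parameter, which is a different mechanism from DJX; it only shows the same effect of one statement returning different rows for different values of a variable.

```python
import sqlite3


def accounts_for_state(conn, state):
    """Return the accounts whose State matches a value chosen at runtime."""
    # The "?" placeholder plays the role that DJX(varState) plays in the map.
    query = (
        "SELECT AccountNumber, State FROM tblAccounts "
        "WHERE State = ? ORDER BY AccountNumber"
    )
    return conn.execute(query, (state,)).fetchall()


# Hypothetical sample table; the real tblAccounts has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblAccounts (AccountNumber TEXT, State TEXT)")
conn.executemany(
    "INSERT INTO tblAccounts VALUES (?, ?)",
    [("0101", "TX"), ("0202", "CA"), ("0203", "TX")],
)

var_state = "TX"  # in the exercise this value comes from an InputBox prompt
rows = accounts_for_state(conn, var_state)
print(rows)
```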

Map Summary:
Define the Source:
Source Connector: ODBC 3.x
Source Data: Database: TrainingDB
             SQL Statement: SELECT * FROM tblAccounts WHERE State = DJX(varState)

Define the Target:
Target Connector: HTML
Target Data: $(FUN_DATA)AccountsByState.html
Target Options: index = false; mode = table; table border = true

Target OutputMode: Replace

Variables
Name       Type      Public   Value
varState   Variant   no


Target Field Expressions

R1.AccountNumber      Fields("AccountNumber")
R1.Name               Fields("Name")
R1.Company            Fields("Company")
R1.Street             Fields("Street")
R1.City               Fields("City")
R1.State              Fields("State")
R1.Zip                Fields("Zip")
R1.Email              Fields("Email")
R1.BirthDate          Fields("BirthDate")
R1.Favorites          Fields("Favorites")
R1.StandardPayment    Fields("StandardPayment")
R1.LastPayment        Fields("LastPayment")
R1.Balance            Fields("Balance")

Define Events: Transformation and Map Properties Events
Event Name: BeforeTransformation
Event Action: Execute
Event Parameters:
    Expression: varState = InputBox("Enter the two letter code for the State:", "State Input", "TX")

Define Events: Source R1 Events
Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
    target name: Target
    record layout: R1


Multimode Introduction
Keywords: Multimode Functionality, Insert Action, and Count Parameter

Multimode is functionality that allows us to write to more than one table in the same database within the same Transformation. The Multimode connector gives us greater capabilities when mapping to a database. Because we can now map to multiple tables within a database, there isn't an option to set a single output mode such as Replace, Append, Clear and Append, Update, or Delete. We may want to append records to one table but delete records from another. This functionality therefore exists as Actions that are taken with specific record layouts and table names.

The Account Numbers in the Accounts.txt file all start with either 01 or 02. The ones that start with 01 are trading partners. We want to create a Transformation that inserts those records into the tblTradingPartners table in the TrainingDB database. The records that start with 02 are individual customers, and we will insert them into the tblIndividuals table.

Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_Multimode_Intro.map.xml.
3. Run the map and observe the result.

Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data: Database: TrainingDB
             Tables: tblIndividuals, tblTradingPartners
Target Options: none

To remove any previous data residing in these tables, we can use a SQL Statement Action to write literal SQL that deletes all records in these tables.

Define Events: Transformation Events
Event Name: BeforeTransformation
Event Action: SQL Statement
Event Parameters:
    target name: Target
    statement: Delete from tblIndividuals; Delete from tblTradingPartners

Define Events: Source R1 Events
Event Name: AfterEveryRecord

Event Action: ClearMapInsert Record
Event Parameters:
    target name: Target
    record layout: tblIndividuals
    count: If Left(Records("R1").Fields("Account Number"), 2) == "02" Then 1 Else 0 End if

Event Action: ClearMapInsert Record
Event Parameters:
    target name: Target
    record layout: tblTradingPartners
    count: If Left(Records("R1").Fields("Account Number"), 2) == "01" Then 1 Else 0 End if
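The two count expressions act as a per-record switch: a count of 1 performs the insert once and a count of 0 skips it. A minimal Python model of that routing follows; the function names and sample account numbers are made up for illustration, and lists stand in for the two database tables.

```python
def insert_count(account_number, prefix):
    """Return 1 when the account number starts with the prefix, else 0."""
    return 1 if account_number[:2] == prefix else 0


def route(records):
    """Apply both insert actions to every record, as AfterEveryRecord does."""
    tables = {"tblIndividuals": [], "tblTradingPartners": []}
    for record in records:
        account = record["Account Number"]
        # Each action runs "count" times, so a count of 0 writes nothing.
        for _ in range(insert_count(account, "02")):
            tables["tblIndividuals"].append(record)
        for _ in range(insert_count(account, "01")):
            tables["tblTradingPartners"].append(record)
    return tables


# Hypothetical sample records.
sample = [
    {"Account Number": "01-1001", "Name": "Acme"},
    {"Account Number": "02-2001", "Name": "Pat Lee"},
    {"Account Number": "02-2002", "Name": "Sam Roy"},
]
tables = route(sample)
print({name: len(rows) for name, rows in tables.items()})
```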

Target Field Expressions: tblIndividuals

tblIndividuals.Account Number      Records("R1").Fields("Account Number")
tblIndividuals.Name                Records("R1").Fields("Name")
tblIndividuals.Street              Records("R1").Fields("Street")
tblIndividuals.City                Records("R1").Fields("City")
tblIndividuals.State               Records("R1").Fields("State")
tblIndividuals.Zip                 Records("R1").Fields("Zip")
tblIndividuals.Email               Records("R1").Fields("Email")
tblIndividuals.Birth Date          DatevalMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy")
tblIndividuals.Favorites           Records("R1").Fields("Favorites")
tblIndividuals.Standard Payment    Records("R1").Fields("Standard Payment")
tblIndividuals.Payments            Records("R1").Fields("Payments")
tblIndividuals.Balance             Records("R1").Fields("Balance")

Target Field Expressions: tblTradingPartners

tblTradingPartners.Account Number      Records("R1").Fields("Account Number")
tblTradingPartners.Name                Records("R1").Fields("Name")
tblTradingPartners.Company             Records("R1").Fields("Company")
tblTradingPartners.Street              Records("R1").Fields("Street")
tblTradingPartners.City                Records("R1").Fields("City")
tblTradingPartners.State               Records("R1").Fields("State")
tblTradingPartners.Zip                 Records("R1").Fields("Zip")
tblTradingPartners.Email               Records("R1").Fields("Email")
tblTradingPartners.Standard Payment    Records("R1").Fields("Standard Payment")
tblTradingPartners.Payments            Records("R1").Fields("Payments")
tblTradingPartners.Balance             Records("R1").Fields("Balance")


Multimode Data Normalization


Keywords: Comprehensive exercise, Create Unique Indexes (Action Keys), Primary & Surrogate keys, On Error & On Constraint Error event handling

The Map Designer has a rich set of Event Handlers with predefined Actions that can make quick work of complex mapping problems. In this exercise we will normalize data from Accounts.txt as we load it directly into the target database. A single record will be written to three different target tables, and in the case of the Favorites column we will write one-to-many records again.

As we map to the three different tables, we need to map foreign keys and generate primary keys so that we can relate the data downstream. We can also de-dupe the data by placing unique indexes on the load tables and checking constraints as we insert rows.

Finally, we will use more of the target Event Handlers to catch exception records. In this case, however, we will not use the Reject Connection Info functionality. We will insert exception records into a reject table in the target database and add our own text for the reject reason.

Exercise
1. Create a map based on the specifications given below.
2. Run the map and observe the result.
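The de-duplication and reject handling described above can be modeled in Python. This is a sketch of the idea only: a dictionary keyed on Account Number stands in for a unique index, a list stands in for the reject table, and the reject reason text and sample records are made up. In the map itself, the database raises the constraint error and a target Event Handler reacts to it.

```python
def load_entities(records):
    """Insert each record once; send duplicates to a reject list with a reason."""
    entity = {}    # stands in for a table with a unique index on Account Number
    rejects = []   # stands in for the reject table
    for record in records:
        key = record["Account Number"]
        if key in entity:
            # Models the constraint error: keep the row, add our own reason text.
            rejects.append({**record, "RejectReason": "Duplicate Account Number"})
        else:
            entity[key] = record
    return entity, rejects


# Hypothetical sample records with one duplicate key.
sample = [
    {"Account Number": "01-1001", "Name": "Acme"},
    {"Account Number": "02-2001", "Name": "Pat Lee"},
    {"Account Number": "01-1001", "Name": "Acme Corp"},
]
entity, rejects = load_entities(sample)
print(len(entity), "loaded;", len(rejects), "rejected")
```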

Map Summary:
Define the Source:
Source Connector: ASCII(Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data: Database: TrainingDB
             Tables: tblEntity, tblFavorites, tblPayments, tblRejects
Target Options: none

Variables
Name           Type      Public   Value
rejectReason   Variant   no       "NoReason"


Don't forget to set Action Keys!

Target Field Expressions: tblEntity


Target Field | Expression
tblEntity.Account Number | Records("R1").Fields("Account Number")
tblEntity.Name | Records("R1").Fields("Name")
tblEntity.Company | Records("R1").Fields("Company")
tblEntity.Street | Records("R1").Fields("Street")
tblEntity.City | Records("R1").Fields("City")
tblEntity.State | Records("R1").Fields("State")
tblEntity.Zip | Records("R1").Fields("Zip")
tblEntity.Email | Records("R1").Fields("Email")
tblEntity.Birth Date | DateValMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy")

Target Field Expressions: tblFavorites


Target Field | Expression
tblFavorites.Account Number | Records("R1").Fields("Account Number")
tblFavorites.FavoritesID | Serial(0) ' Starts at 1 each execution. Consider using a lookup to get Max Value first.
tblFavorites.Favorites | Parse(cntFavorites, Records("R1").Fields("Favorites"), "|")

Target Field Expressions: tblPayments


Target Field | Expression
tblPayments.Account Number | Records("R1").Fields("Account Number")
tblPayments.PaymentID | Serial(0) ' Starts at 1 each execution. Consider using a lookup to get Max Value first.
tblPayments.Payments | Records("R1").Fields("Payments")
tblPayments.Balance | Records("R1").Fields("Balance")

Target Field Expressions: tblRejects

Target Field | Expression
tblRejects.Account Number | Records("R1").Fields("Account Number")
tblRejects.RejectID | Serial(0) ' Starts at 1 each execution. Consider using a lookup to get Max Value first.
tblRejects.RejectReason | rejectReason
tblRejects.Name | Records("R1").Fields("Name")
tblRejects.Company | Records("R1").Fields("Company")
tblRejects.Street | Records("R1").Fields("Street")
tblRejects.City | Records("R1").Fields("City")
tblRejects.State | Records("R1").Fields("State")
tblRejects.Zip | Records("R1").Fields("Zip")
tblRejects.Email | Records("R1").Fields("Email")
tblRejects.Birth Date | Records("R1").Fields("Birth Date")
tblRejects.Favorites | Records("R1").Fields("Favorites")
tblRejects.Standard Payment | Records("R1").Fields("Standard Payment")
tblRejects.Payments | Records("R1").Fields("Payments")
tblRejects.Balance | Records("R1").Fields("Balance")
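The comment on the Serial(0) expressions suggests looking up the current maximum key before numbering new rows, so that keys do not restart at 1 on every run. The Python sketch below shows that idea against an in-memory SQLite table. It is an analogy under stated assumptions, not RIFL: the helper name key_generator, the simplified table layout, and the sample rows are all invented for illustration.

```python
import itertools
import sqlite3


def key_generator(conn, table, key_column):
    """Return an iterator of surrogate keys that continues after the current maximum."""
    # Identifiers are interpolated directly; acceptable here only because they
    # are fixed names in this sketch, never user input.
    row = conn.execute(
        f"SELECT COALESCE(MAX({key_column}), 0) FROM {table}"
    ).fetchone()
    return itertools.count(row[0] + 1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblPayments (PaymentID INTEGER, Balance REAL)")
# Pretend a previous run already loaded two rows.
conn.executemany("INSERT INTO tblPayments VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

next_id = key_generator(conn, "tblPayments", "PaymentID")
first_new, second_new = next(next_id), next(next_id)
```

On an empty table the COALESCE falls back to 0, so numbering starts at 1, which matches what the comment says Serial(0) does on each execution.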

Define Events: Transformation Events
Event Name | Event Action | Event Parameters
BeforeTransformation | DropTable | target name: Target, table name: tblEntity
BeforeTransformation | DropTable | target name: Target, table name: tblFavorites
BeforeTransformation | DropTable | target name: Target, table name: tblPayments
BeforeTransformation | DropTable | target name: Target, table name: tblRejects
BeforeTransformation | CreateTable | target name: Target, record layout: tblEntity, table name: tblEntity
BeforeTransformation | CreateTable | target name: Target, record layout: tblFavorites, table name: tblFavorites
BeforeTransformation | CreateTable | target name: Target, record layout: tblPayments, table name: tblPayments
BeforeTransformation | CreateTable | target name: Target, record layout: tblRejects, table name: tblRejects
BeforeTransformation | CreateIndex | target name: Target, record layout: tblEntity, table name: tblEntity, index name: idxEntity, unique: True
BeforeTransformation | CreateIndex | target name: Target, record layout: tblFavorites, table name: tblFavorites, index name: idxFavorites, unique: True
BeforeTransformation | CreateIndex | target name: Target, record layout: tblPayments, table name: tblPayments, index name: idxPayments, unique: False
BeforeTransformation | CreateIndex | target name: Target, record layout: tblRejects, table name: tblRejects, index name: idxRejects, unique: False

Define Events: Source R1 Events
Event Name | Event Action | Event Parameters
AfterEveryRecord | ClearMapInsert Record | target name: Target, record layout: tblEntity, table name: tblEntity
AfterEveryRecord | ClearMapInsert Record | target name: Target, record layout: tblFavorites, table name: tblFavorites, count: Charcount("|", Records("R1").Fields("Favorites")) + 1, counter variable: cntFavorites
AfterEveryRecord | ClearMapInsert Record | target name: Target, record layout: tblPayments, table name: tblPayments
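The count and counter variable parameters on the tblFavorites action work together: the count expression says how many rows to write for the current source record, and the counter variable tells the Parse expression which item to pick on each pass. The Python sketch below mimics that interplay. It assumes Parse returns the n-th delimited item, as the mapping implies; the function name favorites_rows and the sample values are invented.

```python
def favorites_rows(account_number, favorites, delimiter="|"):
    """Produce one (account, position, item) row per delimited value."""
    # Same arithmetic as the count expression: number of delimiters plus one.
    count = favorites.count(delimiter) + 1
    rows = []
    # 'position' plays the role of the cntFavorites counter variable (1-based).
    for position in range(1, count + 1):
        item = favorites.split(delimiter)[position - 1]  # analogue of Parse(position, ...)
        rows.append((account_number, position, item))
    return rows


rows = favorites_rows("01-0001", "Golf|Tennis|Chess")
```

Note that an empty Favorites value still yields a count of 1 under this arithmetic, so one row with an empty item would be written unless it is filtered out.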

Define Events: Target Events
Event Name | Event Action | Event Parameters
OnConstraintError | Execute | Expression: rejectReason = "General-OnConstraintErr"
OnConstraintError | ClearMapInsert Record | target name: Target, record layout: tblRejects, table name: tblRejects
OnConstraintError | Resume | none
OnError | Execute | Expression: rejectReason = "General-OnError"
OnError | ClearMapInsert Record | target name: Target, record layout: tblRejects, table name: tblRejects
OnError | Resume | none

Do not forget the Resume Action. The Resume Action is what causes the map to continue processing the remaining records after the error is handled.
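The pattern in the target events (set a reason, write the record to the reject table, then resume) has a close analogue in ordinary database code. The following Python sketch uses an in-memory SQLite database with a unique index to trigger constraint errors. It is a sketch of the pattern only: the column names are simplified (no spaces), the sample rows are invented, and SQLite's exception classes stand in for the OnConstraintError and OnError events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEntity (AccountNumber TEXT, Name TEXT)")
# The unique index is what turns a duplicate account into a constraint error.
conn.execute("CREATE UNIQUE INDEX idxEntity ON tblEntity (AccountNumber)")
conn.execute(
    "CREATE TABLE tblRejects (AccountNumber TEXT, Name TEXT, RejectReason TEXT)"
)


def load(rows):
    """Insert rows into tblEntity, diverting failures to tblRejects."""
    for account, name in rows:
        try:
            conn.execute("INSERT INTO tblEntity VALUES (?, ?)", (account, name))
        except sqlite3.IntegrityError:
            reason = "General-OnConstraintErr"  # analogue of OnConstraintError
        except sqlite3.Error:
            reason = "General-OnError"  # analogue of OnError
        else:
            continue  # row loaded cleanly
        conn.execute(
            "INSERT INTO tblRejects VALUES (?, ?, ?)", (account, name, reason)
        )
        # Falling through to the next iteration is the analogue of Resume.


load([("01-0001", "Ann"), ("01-0002", "Bob"), ("01-0001", "Ann again")])
entity_count = conn.execute("SELECT COUNT(*) FROM tblEntity").fetchone()[0]
rejects = conn.execute("SELECT AccountNumber, RejectReason FROM tblRejects").fetchall()
```

Without the loop continuing after the reject insert, the first duplicate would stop the load, which is the behavior the Resume Action prevents in the map.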


Multimode Implementation with Upsert Action


Objectives
At the end of this lesson you should be able to use the Upsert Action.

Keywords: Multimode, Change Source, Upsert

Description
The Upsert Action is used by Multimode Connectors only. This Action updates records where there is a key match and inserts records where there is not. This map uses Multimode to load two tables and then uses a Change Source Action to load a second file into the same tables. It uses the Upsert Action to either insert or update the records in the target tables.
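The update-or-insert rule can be stated in a few lines of Python. This is a conceptual sketch, not the connector's implementation: the table is modeled as a dictionary keyed on account number, the helper name upsert is invented, and merging the new values over the existing row is an assumption about how an update behaves.

```python
def upsert(table, key, row):
    """Update the row stored under 'key' if it exists, otherwise insert it."""
    action = "update" if key in table else "insert"
    # Merge new values over any existing ones (assumed update semantics).
    table[key] = {**table.get(key, {}), **row}
    return action


tbl_individuals = {}
first = upsert(tbl_individuals, "02-0001", {"Name": "Ann", "Balance": 100.0})
second = upsert(tbl_individuals, "02-0001", {"Balance": 75.0})
```

The second call finds a key match, so it changes the balance and leaves the name from the first load in place.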

Exercise
1. Create a map based on the specifications given below.
2. Save the map as m_Multimode_Upsert.map.xml.
3. Run the map and observe the result.

Map Summary:
Define the Source:
Source Connector: ASCII (Delimited)
Source Data: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x Multimode
Target Data: Database: TrainingDB, Tables: tblIndividuals, tblTradingPartners
Target Options: none

Variables
Name | Type | Public | Value
varChngSrc | Variant | no | 0


Define Events: Transformation Events
Event Name | Event Action | Event Parameters
BeforeTransformation | SQL Statement | target name: Target, statement: Delete from tblIndividuals; Delete from tblTradingPartners

Target Field Expressions: tblIndividuals

Target Field | Expression
tblIndividuals.Account Number | Records("R1").Fields("Account Number")
tblIndividuals.Name | Records("R1").Fields("Name")
tblIndividuals.Street | Records("R1").Fields("Street")
tblIndividuals.City | Records("R1").Fields("City")
tblIndividuals.State | Records("R1").Fields("State")
tblIndividuals.Zip | Records("R1").Fields("Zip")
tblIndividuals.Email | Records("R1").Fields("Email")
tblIndividuals.Birth Date | DateValMask(Records("R1").Fields("Birth Date"), "mm/dd/yyyy")
tblIndividuals.Favorites | Records("R1").Fields("Favorites")
tblIndividuals.Standard Payment | Records("R1").Fields("Standard Payment")
tblIndividuals.Payments | Records("R1").Fields("Payments")
tblIndividuals.Balance | Records("R1").Fields("Balance")

Target Field Expressions: tblTradingPartners

Target Field | Expression
tblTradingPartners.Account Number | Records("R1").Fields("Account Number")
tblTradingPartners.Name | Records("R1").Fields("Name")
tblTradingPartners.Company | Records("R1").Fields("Company")
tblTradingPartners.Street | Records("R1").Fields("Street")
tblTradingPartners.City | Records("R1").Fields("City")
tblTradingPartners.State | Records("R1").Fields("State")
tblTradingPartners.Zip | Records("R1").Fields("Zip")
tblTradingPartners.Email | Records("R1").Fields("Email")
tblTradingPartners.Standard Payment | Records("R1").Fields("Standard Payment")
tblTradingPartners.Payments | Records("R1").Fields("Payments")
tblTradingPartners.Balance | Records("R1").Fields("Balance")

Define Events: Source R1 Events
Event Name | Event Action | Event Parameters
AfterEveryRecord | ClearMap | target name: Target, record layout: tblIndividuals, count: If Left(Records("R1").Fields("Account Number"), 2) == "02" Then 1 Else 0 End if
AfterEveryRecord | Upsert Record | target name: Target, record layout: tblIndividuals, table name: tblIndividuals, count: If Left(Records("R1").Fields("Account Number"), 2) == "02" Then 1 Else 0 End if
AfterEveryRecord | ClearMap | target name: Target, record layout: tblTradingPartners, count: If Left(Records("R1").Fields("Account Number"), 2) == "01" Then 1 Else 0 End if
AfterEveryRecord | Upsert Record | target name: Target, record layout: tblTradingPartners, table name: tblTradingPartners, count: If Left(Records("R1").Fields("Account Number"), 2) == "01" Then 1 Else 0 End if

Consider creating variables to use as flags for the count parameter. The variables can be set in an Execute Action in the AfterEveryRecord Event. This method keeps the logic for writing to each table in one location, which is easier to update in the long run.
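The flag approach suggested above amounts to computing all routing decisions once per record and then letting each action read its own flag. The Python sketch below shows the shape of that logic. The prefixes "02" and "01" come from the count expressions in this exercise; the function name routing_flags and the sample account numbers are invented.

```python
def routing_flags(account_number):
    """Return a 1/0 write flag per target table for one source record."""
    prefix = account_number[:2]  # same test as Left(..., 2) in the count expressions
    return {
        "tblIndividuals": 1 if prefix == "02" else 0,
        "tblTradingPartners": 1 if prefix == "01" else 0,
    }


individual = routing_flags("02-0007")
partner = routing_flags("01-0003")
neither = routing_flags("03-0001")
```

If the routing rule changes later, only this one function (in the map, the one Execute expression) has to be edited, instead of four separate count parameters.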

Define Events: Source Events
Event Name | Event Action | Event Parameters
OnEOF | ChangeSource | the expression below

If varChngSrc == 0 Then
    varChngSrc = 1
    "+File=$(FUN_DATA)AccountsUpdate.txt"
End if
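The varChngSrc variable is there so that the source is switched only once: the first end-of-file triggers the change to the second file, and the second end-of-file lets the run finish. A Python generator can illustrate the control flow. This is an analogy with invented names and in-memory lists standing in for the two files, not a description of the engine.

```python
def read_all(first_source, second_source):
    """Yield every record from the first source, then from the second, once."""
    changed = 0  # plays the role of varChngSrc
    current = iter(first_source)
    while True:
        for record in current:
            yield record
        # Reaching this point is the analogue of the OnEOF event.
        if changed == 0:
            changed = 1
            current = iter(second_source)  # analogue of ChangeSource
        else:
            break  # second EOF: no further source, so the run ends


records = list(read_all(["a1", "a2"], ["u1"]))
```

Without the flag, each end-of-file would switch the source again and the second file would be read repeatedly.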


Reference


Checklist Starting Your Integration Project


Scoping Your Project
Before you start working with the integration tools, it is worthwhile to do some initial planning and preparation. When scoping a new data integration project, review the following checklist.

Preparing for the Initial Project


Integration Type
o Is your project primarily a migration, extract, transform and load (ETL), application integration, business to business transactions (B2B), data profiling, or other?
Data Objects
o What is the number of transformations, processes, source files or tables, target files or tables that you plan to use?
o Will this number vary between transformations and processes, or will it always be the same?
Direction
o Is the direction of your data connection one-way or bidirectional?
Volume
o What is the volume of your data?
o How many records do you plan to process?
o Find a unit of measure to count.
Frequency
o Do you want to run your processes and transformations for a single impromptu purpose, in scheduled engine runs, or continuously in real time?

Project Planning Integration Design


Connectivity options
o What source or target connector or component do you plan to use?
Server address, User IDs, Passwords
o Gather information on server addresses, and get user IDs and passwords ready.
Shared Data Objects
o Do you plan to use any code modules, RIFL scripts, SQL queries, or statements that were used in a previously designed transformation or process?
Shared Transformations or Processes
o Are there any existing transformations or processes that you can use in the project?
Data
o Make a list of any data files, tables, or entities that you plan to use.
o Will the data require special handling, such as encoding or Unicode?
Results Management
o Do you need to build notifications of results or events, such as e-mail notification, custom log files, or data archival?
o Gather e-mail information and confirm checkpoints that require special logging.
Identify Platform and Software Needs
o What operating system platforms do you plan to use? Is client software needed for connectivity?
o Do you need special expertise to set up and configure the software, which would require a database administrator or Pervasive professional services?
Naming Conventions for Specification Files, Variables, and Objects
Invoke Integration
o How will you call your maps and processes (batch, real-time with Integration Server)?

Integration Design
Performance (Lookups, Parallel and Multithreaded)
How To (record x to filter or use multiple record type; error handling)

Reference:
See Best Practices: http://docs.pervasive.com/products/integration/download/best_practices.pdf


Upgrading from 8.x to 9.x


You cannot install 9.x in the same folder as 8.x. You may install 9.x in another folder on the same machine, or on another machine. After installing 9.x, follow the steps below to bring 8.x files forward to 9.x:
1. In the 8.x installation, back up all of the repositories, the Cosmos_Work directory, and any other directory where files related to the repositories may exist.
2. In the 8.x installation, back up InstallDir\Common800\extractor800.mdb.
3. After installing version 9.x, you will be prompted to accept a Cosmos9_Work directory as the 9.x workspace root directory location. If this directory does not exist, it will be created for you. Do one of the following:
a. Accept this default location and copy all of your files from the 8.x Cosmos_Work folder to the new 9.x Cosmos9_Work folder.
b. Change the workspace location to the 8.x workspace location. Choosing this option allows you to use both versions to run the same files in a single workspace.
4. If you are using Extract Schema Designer or Extract Scripts, also do one of the following:
a. Run the 9.x version of Extract Schema Designer. Choose File > Change Database to point to Install\Common800\extractor800.mdb instead of Install\Common\extractor900.mdb.
b. Change the name of extractor800.mdb to extractor900.mdb, and use it to replace the extractor900.mdb in the InstallDir\Common folder.


Cosmos.ini Settings
The cosmos.ini file contains the startup information required to launch the integration products. This file is available in InstallDir\Common, where InstallDir is the installation directory for the integration tool set. For more information, see Windows Default Installation Locations, which follows, and the Installation Locations topic in the release notes.


Windows Default Installation Locations


The following tables provide a brief description of what is stored in each default installation folder on Windows XP, Windows Server 2003, and Windows Vista. The Pervasive and the Cosmos9 folder names may be overridden by setting option values in the setup.ini file prior to installation.

Table 1-1 Windows XP/Server 2003 Default Installation Locations

Component | Default Installation Directory and Path
cosmos.ini | C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common
Integration Platform Designers and Other Executables | C:\Program Files\Pervasive\Cosmos9\Common
Integration Server | C:\Program Files\Pervasive\Cosmos9\IntegrationServer
Integration Server | C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\IntegrationServer
Repository Manager | C:\Program Files\Pervasive\Cosmos9\RepositoryManager
Component SDK | C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK
Target and Source Connectors | C:\Program Files\Pervasive\Cosmos9\Common\connections
Product Documentation (PDFs and Help) | C:\Program Files\Pervasive\Cosmos9\Common\Help
Product Documentation (PDFs and Help) | C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF
SDK Documentation | C:\Program Files\Pervasive\Cosmos9\Common\Help\SDKs
License Files | C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\License
Components (Plug Ins) | C:\Documents and Settings\All Users\Application Data\Pervasive\Cosmos9\Common\Plug-Ins
Samples | C:\Program Files\Pervasive\Cosmos9\Common\Samples
.msi file |
SDKs for Content eXtraction Language and Engine SDK | C:\Program Files\Pervasive\Cosmos9\Common\SDKs

Table 1-2 Windows Vista and Windows Server 2008 Default Installation Locations

Component | Default Installation Directory and Path
cosmos.ini | C:\ProgramData\Pervasive\Cosmos9\Common
Integration Platform Designers and Other Executables | C:\Program Files\Pervasive\Cosmos9\Common
Integration Server | C:\Program Files\Pervasive\Cosmos9\IntegrationServer
Integration Server | C:\ProgramData\Pervasive\Cosmos9\IntegrationServer
Repository Manager | C:\Program Files\Pervasive\Cosmos9\RepositoryManager
Component SDK | C:\Program Files\Pervasive\Cosmos9\Common\ComponentSDK
Target and Source Connectors | C:\Program Files\Pervasive\Cosmos9\Common\connections
Product Documentation (PDFs and Help) | C:\Program Files\Pervasive\Cosmos9\Common\Help\
Product Documentation (PDFs and Help) | C:\Program Files\Pervasive\Cosmos9\Common\Help\PDF
SDK Documentation | C:\Program Files\Pervasive\Cosmos9\Common
License Files | C:\ProgramData\Pervasive\Cosmos9\Common\License
Components (Plug Ins) | C:\ProgramData\Pervasive\Cosmos9\Common\Plug-Ins
Samples | C:\Program Files\Pervasive\Cosmos9\Common\Samples
.msi file |
SDKs for Content eXtraction Language and Engine SDK | C:\Program Files\Pervasive\Cosmos9\Common\SDKs


Design Tool User Interfaces


Map Designer


Process Designer


Setting Properties

Setting Map Properties
To set the Map tab to show the Navigation Tree with Events:
1. From the menu, choose View > Preferences.
2. Click the General tab.
3. Check Always show Map All view.

Setting RIFL Script Properties
To set the editor to show a line number for each line of a script:
1. On the menu bar, choose View > Editor Properties.
2. Click the Misc tab.
3. Under Line numbering (lower left), choose Decimal in the Style dropdown.
4. Change Start at to 1.
5. Click OK.


Reading a Log File


The Log File Browser allows you to view the contents of log files from Map Designer and Process Designer.

To view the error and event log:
1. In the main toolbar, click the View >> TransformationMap.log icon.
2. The Log File Browser displays the contents of the log file. Each of the designers creates its own log file. For instance, the Map Designer creates a TransformMap.log, and Process Designer creates a process log. The following screenshot shows a TransformMap.log file generated by the Map Designer user interface:

Note: The Log File Browser displays a maximum of 32,000 lines. If your log file is very long, you will be able to see only the last 32,000 lines of it in the browser. If you want to see the rest, open it in a text editor, such as WordPad.

3. Click Search to display the Find Text dialog box. It allows you to search the error and event log file for a particular string of text.
4. Click Clear Log to delete the log file.

To change names of log files:
You can set the name of the .log file in three places.
o In Map Designer, open Transformation and Map Properties, click Error Logging, and type a new log file name.
o In Process Designer, select File > Process Properties, click the Logging tab, type a name for the log, and click OK.
o For the Engine, type the following at a command prompt: -logfile newlogfilename

Transformation Log Codes
The transformation log file displays the following information, in this order: Date, Time, Error Type, Internal Code, Direction Code, Source.
Example values: Date 08/25/2006, Time 14:08:10, Source Global.

Error Type
1 Informative
2 Warning
4 General Error
8 Fatal Error
16 Debug Message

Internal Code
This code is related to 255xx codes.

Direction Code
I Import
C Component
E Export
M Message component
O Other
U Unknown component

Source
The source of the log message can be global, the name of a connector, the name of a component, or some other indication of the origin of the message.
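When scanning a long log, it can help to translate the numeric and single-letter codes back into their meanings. The Python sketch below does only that lookup, using the code tables listed above. It deliberately does not parse log lines, because the exact line layout is not specified here; the function name describe is invented.

```python
# Code tables copied from the lists above.
ERROR_TYPES = {
    1: "Informative",
    2: "Warning",
    4: "General Error",
    8: "Fatal Error",
    16: "Debug Message",
}
DIRECTION_CODES = {
    "I": "Import",
    "C": "Component",
    "E": "Export",
    "M": "Message component",
    "O": "Other",
    "U": "Unknown component",
}


def describe(error_type, direction_code):
    """Translate an error type number and a direction letter into text."""
    return (
        ERROR_TYPES.get(error_type, "Unrecognized"),
        DIRECTION_CODES.get(direction_code, "Unrecognized"),
    )


warning_on_export = describe(2, "E")
```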


Examples of Complex Process Layouts


Additional Documentation Resources


Best Practices:

http://docs.pervasive.com/products/integration/download/best_practices.pdf
Product Documentation:

http://docs.pervasive.com/products/integration/di/wwhelp/wwhimpl/js/html/wwhelp.htm#href=contact/contact.html

Integration Support:

http://www.pervasiveintegration.com/support/Pages/submit_a_support_ticket.aspx
Integration Forums:

http://cs.pervasive.com/forums/16.aspx
Documentation and Downloadable Samples:

http://www.pervasiveintegration.com/support/documentation/Pages/documentation_and_samples.aspx
Event Management Guide:

http://docs.pervasive.com/products/integration/download/events.pdf

Product Updates and Connectivity Packs:

http://www.pervasiveintegration.com/support/Pages/product_downloads.aspx
Integration Manager Pages:

http://www.pervasiveintegration.com/products/Pages/integration_manager.aspx


Glossary

Glossary of Integration Product Terminology


A
Action
One of the options in Event Handlers (Map tab, upper left quadrant in Map Designer). For example, ClearMapPut Record is the default Action automatically set when you do not override the option. Some other Actions in the drop-down list include: Execute, MapPut Record, Map, Put Record, Insert Record, Clear, and ClearInitialize.

Array
In programming, a series of objects, all of which are the same size and type. Each object in an array is called an array element. For example, you could have an array of integers or an array of characters or an array of anything that has a defined data type. The important characteristics of an array are:
o Each element has the same data type (although they may have different values).
o The entire array is stored contiguously in memory (that is, there are no gaps between elements).
Arrays can have more than one dimension. A one-dimensional array is called a vector; a two-dimensional array is called a matrix.

Arithmetic operators
The +, -, *, /, and ( ) are operators used to construct arithmetic expressions.

ASCII
The most common format for text files. In an ASCII file, each alphabetic, numeric, or special character is represented with a 7-bit number (a string of seven 0s or 1s), with 128 characters defined. Unix and older DOS-based operating systems use ASCII for text files. Newer Windows systems use an encoding standard called Unicode. IBM System 390 servers use a proprietary 8-bit code called EBCDIC. Transformation programs allow different operating systems to change a file from one encoding standard to another. The American National Standards Institute (ANSI) oversees ASCII Standards.

B
Binary File
A computer file that contains machine-readable information that must be read by an application; the characters use all 8 bits of each byte.

Boolean logic
The type of an expression with two possible values, True and False. Also, a variable of Boolean type or a function with Boolean arguments or result. The most common Boolean functions are And, Or and Not. In computer operation with binary values, Boolean logic can be used to describe electromagnetically charged memory locations or circuit states that are either charged (1 or true) or not charged (0 or false). The computer can use an AND gate or an OR gate operation to obtain a result that can be used for further processing.

C
Comma-delimited
A data format in which each piece of data is separated by a comma. This is a popular format for transferring data from one application to another, because most database systems are able to import and export comma-delimited data.

Concatenate
To merge the records from two or more files into a single file. Also, to add a string of data to other data that already exists in a field. In Map Designer, you can concatenate fields into a single field by using an expression.

Connection String
A list of key=value pairs. The keywords are either names of connection information fields, or Connector property names. The key=value pairs are separated by a semicolon (;).

Connector
Name of the type of connection at the Source or Target tab. ASCII (Delimited), MySQL, and Oracle 9i are some examples of Connectors. In early versions of Map Designer, the term for connector was spoke.

Constraint
An object used to place rules on data in a relational database. Constraints are used to control the allowed data in a column, are created at the column level, and are used to enforce referential integrity (parent and child table relationships).

Conversion
Called a transformation in more recent versions of Map Designer, and the basic unit for all data transfer and manipulation. A conversion (transformation) is one set of source, target, and mapping specifications. When these specifications are set, the data transformation process can be run.
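As a small illustration of the Connection String entry, the following Python sketch splits a semicolon-separated list of key=value pairs into a dictionary. The sample string and the function name are invented, and the sketch assumes that values never contain a semicolon, which a real connection string might not guarantee.

```python
def parse_connection_string(text):
    """Split 'key=value;key=value' text into a dictionary."""
    pairs = {}
    for part in text.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon or empty segment
        key, _, value = part.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical example string, not taken from the product documentation.
settings = parse_connection_string("Database=TrainingDB; Header=True;")
```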

D
Data
(1) Distinct pieces of information, usually formatted in a special way. Software is divided into two general categories: data and programs. Programs are collections of instructions for manipulating data.
(2) The term data is often used to distinguish binary machine-readable information from textual human-readable information. For example, some applications make a distinction between data files (files that contain binary data) and text files (files that contain ASCII data).
(3) In database management systems, data files are the files that store the database information, whereas other files, such as index files and data dictionaries, store administrative information, known as Metadata.

Data integrity
Refers to the validity of data. Data integrity can be compromised in a number of ways:
o Human errors when data is entered
o Errors that occur when data is transmitted from one computer to another
o Software bugs or viruses
o Hardware malfunctions, such as disk crashes
o Natural disasters, such as fires and floods
There are many ways to minimize these threats to data integrity. These include:
o Backing up data regularly
o Controlling access to data via security mechanisms
o Designing user interfaces that prevent the input of invalid data (such as Input Boxes for user input)
o Using error detection and correction software when transmitting data (error trapping, reject tables)

Data structure
A Schema (Map tab in Map Designer). In previous versions of Map Designer, data structures were also called Record Layouts. A data structure is the arrangement of fields in a record within a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as Decimal and Scale.

Data type
The classification of data a field can contain. Some data types include text, numeric, datetime, float, packed decimal, Boolean, and 16-bit binary.

Database
An organized collection of information, stored systematically in tables or files.

Default
What the integration product automatically does in the absence of an overriding command. For example, if no After Every Record events are selected in Map Designer, the ClearMapPut Record Action is automatically invoked when a transformation is run.

Delimited ASCII data
ASCII data that has fields separated by some character, often a comma. Field entries frequently begin and end with double quotation marks ("), and records are often separated by a carriage return-line feed (CR-LF). Records and fields are not usually a fixed length.

Delimiter
A character or combination of characters used to separate one item or set of data from another. For example, in comma-delimited records, a comma is used to separate each field of data. In the Map Designer ASCII Delimited connector, the source and target Property default setting is comma-delimited.

Design time
Activities performed when designing a transformation or process. It includes specifying source and target connection information, reading and applying metadata, specifying transformation events, options, execution paths, errors, defining mapping expressions, and exception handling.

Discriminator
A discriminator is the data within a file that indicates record type.

DJAR
Data Junction Archive (DJAR) is a package that contains processes and dependents of the processes such as Maps, Functions, Executables, etc.

DJImport Object
An internal object designed to provide a generic interface to Map Designer Connectors. It is used to read data to be utilized as a source.

E
EBCDIC
An IBM code for representing characters as numbers. Although it is common on large IBM computers, most other computers, including PCs and Macintoshes, use ASCII codes.

Expression
An Expression (also called a Script) is a combination of Operator, literal values, field names, Statement, Variable, and Function. They are used to perform calculations, enter a specific value, concatenate data, or otherwise modify data in a particular field.

Expression Builder
Now called RIFL Script Editor, this is the functional area of Map Designer where you can write your own scripts to include with your transformations. RIFL Script Editor includes a list of all of the functions and operators available to you in RIFL (Rapid Integration Flow Language).

F
Field
A labeled or unlabeled column of information in a data file or table; a field contains the same kind of information for each record in the data file or table.

File format
A format for encoding information in a file. Each different type of file has a different file format. The file format specifies first whether the file is binary or ASCII, and second, how the information is organized.

Filter
A set of criteria applied to a range of records. In the Map Designer, both the source and target filters sift through data and return a subset of records specified in the filter options. The number of records processed can also be specified in these filters.

Fixed ASCII file
An ASCII data file that has fixed field and record sizes, but no delimiter (except possibly a record separator).

Fixed length
Having a set length that never varies. In database systems, a field can have a fixed or a variable length. A variable-length field is one whose length can be different in each record, depending on what data is stored in the field. The terms fixed length and variable length can also refer to the entire record. A fixed-length record is one in which every field has a fixed length. A variable-length record has at least one variable-length field.

Flow Control
Management of data flow between computers, devices, or network nodes to maintain efficient use of data.

Function
A small section of a program designed to perform a specific task. Many functions return a value based on the results of a calculation or other operation. Some functions operate as a procedure and return no value. In Map Designer, functions can be used to map and manipulate data. A list of available functions is in the RIFL Script Editor interface. For a list of functions, see All Functions in the Help Files.
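To make the Fixed ASCII file and Fixed length entries concrete, the Python sketch below cuts one fixed-width record into fields by position. The field widths and sample values are invented for illustration; a real layout would come from the schema.

```python
def parse_fixed(line, widths):
    """Slice a fixed-width record into fields using a list of field widths."""
    fields, start = [], 0
    for width in widths:
        # Fields are padded to their full width, so trailing spaces are trimmed.
        fields.append(line[start:start + width].rstrip())
        start += width
    return fields


# Build a sample record: 7-character account, 10-character name, 2-character state.
sample_line = "01-0001" + "Smith".ljust(10) + "TX"
parsed = parse_fixed(sample_line, [7, 10, 2])
```

No delimiter is needed because each field's position is known in advance, which is the difference from delimited ASCII data.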

G
GUI
(Graphical User Interface) A graphics-based user interface that incorporates icons, menus, and a mouse. The interface has become the standard way users interact with a computer. In a client-server environment, the GUI resides in your client machine.

H
Header
Information that appears at the beginning of a data file, but is not a part of the actual data.

I
Integration
In reference to data, the combining or movement of data from different sources to provide end users with a unified view of this data. The data movement may also involve transforming the data through computations, or modifying the data format.

K
Key
In database management systems, a key is a field that you use to sort data. It can also be called a key field, sort key, index, or key word. Most database management systems allow you to have more than one key so that you can sort records in different ways. One of the keys is designated the primary key, and must hold a unique value for each record. A key field that identifies records in a different table is called a foreign key.

L
Lookup Table
An array or matrix of data that contains values that can be searched.

M
Mask
A pattern of tokens used to accept or reject patterns in another set of data. For example, a date mask that looks for two numbers followed by a slash followed by two more numbers, another slash and two more numbers (##/##/##) can be used to match a string of source data. When the specified pattern appears in both the mask and the data, the source data will be written to the target.

Metadata
Data about data. Metadata describes how and when and by whom a particular set of data was collected, and how the data is formatted. Metadata is essential for understanding information stored in data warehouses.

Multimode
Specific connector types that have been designated to allow writes to multiple tables. When a user selects one of these connector types, the Output mode will automatically be "Multiple Output Mode". This cannot be changed to regular output mode. SQL Script and ODBC 3.x are two of the Multimode Connectors available.
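The ##/##/## example in the Mask entry can be tried out with a regular expression. The Python sketch below treats each # as "one digit" and every other character as a literal. That reading of the mask syntax is an assumption made for illustration; it is not a description of how the product evaluates masks.

```python
import re


def mask_to_regex(mask):
    """Turn a '#'-style mask into a compiled regular expression."""
    # '#' stands for one digit; everything else must match literally.
    pattern = "".join(r"\d" if ch == "#" else re.escape(ch) for ch in mask)
    return re.compile(pattern)


def matches(mask, value):
    """Return True when the whole value fits the mask."""
    return mask_to_regex(mask).fullmatch(value) is not None


short_date_ok = matches("##/##/##", "08/25/06")
long_year_ok = matches("##/##/##", "08/25/2006")
```

The second value fails because the year has four digits where the mask allows exactly two.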

N
Null
A value that indicates missing or unknown data in a field. Null characters are placeholders with a hex value 00. These values can be entered in fields for which information is unknown and can be used in expressions. Some fields, such as those identified with primary keys, cannot contain Null values.

O
Object A mechanism that binds data to methods that operate on it. In object-oriented programming, an object is a self-contained entity that consists of both data and procedures to manipulate the data.

ODBC (Open Data Base Connectivity) A database programming interface introduced by Microsoft in 1992 that provides a common language for applications to access databases on a network. ODBC is made up of the function calls programmers write into their applications and the ODBC drivers themselves. For client/server database systems such as Oracle and SQL Server, the ODBC driver provides links to their database engines to access the database. For desktop database systems such as dBASE and FoxPro, the ODBC drivers actually manipulate the data. ODBC supports SQL and non-SQL databases. Although the application always uses SQL to communicate with ODBC, ODBC will communicate with non-SQL databases in its native language. Map Designer supports ODBC 2.x, ODBC 3.x, ODBC 3.5 and ODBC 3.x multimode connectivity.

OLE OLE (Object Linking and Embedding) is a compound document technology and part of Microsoft ActiveX technologies. A compound document can contain visual and information objects of all kinds. Each object is an independent program entity that can interact with a user and also communicate with other objects. OLE utilizes the Component Object Model (COM) and its distributed version (DCOM). An OLE object is also, by default, a component (or COM object).

OnEOF A source schema event (upper left, Map tab in Map Designer). Executed when the end of the file (EOF) is reached.

Operator A symbol that represents an operation to be performed on a value or values. For example, the + operator represents addition, and the * operator represents multiplication.

Output A mode which represents the transfer of data from the source to the target (Map tab in Map Designer). Some selections include: Replace File/Table, Append to File/Table, Update File/Table and Clear/Append. Connectors that write to multiple tables use the Multimode Output mode.

R
RDBMS Relational Database Management System. RDBMS includes a wide variety of SQL and relational database systems, such as SQL Server and Oracle. Data is stored in multiple tables, many of which are linked by the use of primary key fields.

Record (1) In database management systems, a complete set of information. Records are composed of fields, each of which contains one item of information. A set of records constitutes a file. For example, a personnel file might contain records that have three fields: a name field, an address field, and a phone number field. (2) Some programming languages allow you to define a special data structure called a record. Generally, a record is a combination of other data objects. For example, a record might contain three integers, a floating-point number, and a character string.

Record layout The term for a data structure used in Map Designer. The alternative term is schema. The arrangement of fields in a record in a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as decimal and scale.

Record number A unique number that identifies each record in a data file or table.

Record type A set of field options within the source and target schemas (Map tab in Map Designer). These options include layout name, length, lock, schema origin, and description.

Regular expression A string of characters that defines a set of rules for matching character strings found in fields.

Relative path An implied path. When a command is expressed that references files, the current working directory is the implied, or relative, path if the full path is not explicitly stated.

Repository A physical location on your local system and on the network. It stores maps, connections, structured schemas and join view files.

RIFL Rapid Integration Flow Language (RIFL) is a custom expression language for the integration products. RIFL includes functions, statements, operators, events, scripts, and objects unique to the integration platform. Some RIFL functions are similar to, but not the same as, those in Visual Basic. RIFL scripts can be run on both Windows and Unix systems. Use the .rifl extension for script files.

Run Time The events that occur during transformation and process execution. These include connecting to data sources and targets, reading and writing data, compiling and evaluating expressions, transformation events, and exception processing.

S
Scale A Field Property Value option (Map tab in Map Designer). Designates where a decimal is positioned in a number.

Schema The term for a data structure (Map tab in Map Designer). The arrangement of fields in a record in a particular data file, either source or target. This includes field length, record length, field data types, and other field properties such as decimal and scale. You can create and modify schemas in Document Schema Designer and in Structured Schema Designer. These schemas can then be validated in Process Designer, and used as structural metadata in Map Designer.

Scope In programming, the visibility of variables within a program. For example, whether or not one function can use a variable created in another function.

Script A Script or Expression is a grammatically correct combination of operators, literal values, field names, variables and functions used to perform calculations, enter a specific value, concatenate data, or otherwise modify data in a particular field.

Server The application that responds to the calling application or client in a DDE or OLE conversation. The server usually sends data to the client.

SQL Structured Query Language (abbreviated SQL and commonly pronounced "sequel") is the standard language for storing and manipulating data in relational databases.

Statement A descriptive phrase that generates one or more instructions in the computer.

String An alphanumeric value or an expression consisting of alphanumeric characters.

Syntax Grammar, structure, or order of elements in a language statement.

Syntax Error An error caused by an incorrectly expressed statement written in the RIFL Script Editor or in a transformation event in Map Designer.

T
Table (1) In programming, a collection of adjacent fields. Also called an array. A table contains data that is either constant within the program or is called when the program runs. (2) In a relational database, the same as a file; a collection of records. A structure made up of rows (records) and columns (fields) that contain information. A table is the primary object used to store data. When data is queried and accessed for modification, it is usually found in a table.

Transformation Called a conversion in previous versions of Map Designer, a transformation is the basic unit for all data transfer and manipulation. A transformation is one set of source connection, target connection, mapping, event, and property specifications. When these specifications are set, the data transformation process can be run.

Truncate To remove leading or trailing digits or characters from an item of data without regard to the accuracy of the remaining characters. Truncation occurs when data is converted into a new record with smaller field lengths than the original.

U
Unicode A character encoding scheme that uses two bytes to represent every character, regardless of whether it's an ASCII character. This scheme is capable of encoding all known characters and is used as a worldwide character-encoding standard.

V
Validation A process that ensures that the user has provided sufficient information in the design phase. In Process Designer, for example, it verifies that the Steps and links have certain fundamental requirements.

Variable (Public, Global, Dim) A named storage location that can be modified during program execution. Each variable has a name that uniquely identifies it within its level of scope. A Public variable can be used throughout a project, while a Global variable can be used throughout a transformation. Dim variables are specific to a module or an expression.

View A virtual table that looks like and acts like a table in a relational database. A view is defined based on the structure and data of a table. A view can be queried and sometimes updated.

W
Where Clause The part of a SQL statement that specifies which records to retrieve. In the Map Designer, the statement is an option in source properties in several SQL database applications, such as Access, Oracle, and SQL Server.

Workspace A collection of Repositories. Each Workspace directory contains a macro definitions file called "macrodef.xml".

Appendix

This section contains additional exercises and information that may be of use.


Additional Exercises


Extract Schema Designer: Extracting Fixed Field Definitions

Keywords: Extract Schema Designer: Multiple Fields per Line Style (fixed)

Description
The next file that we will be parsing is Purchases_Mail.txt. We should take a look at it in a text viewer. Although it might be possible to use this report file as a direct input for a transformation, we would have to define it as a multiple-record-type file. Although there are fewer record types than with the phone purchases we dealt with earlier, there are still enough that, when combined with the extra processing logic involved, the job would become tedious. So, again, we plan to use the Extract Schema Designer to create an extract specification that will transform the report file into a more familiar row/column format, and then use that formatted data as input to the transformation that adds these purchases to the database table. As before, we don't require multiple passes of the input file. We will just create the extract schema and apply it to the input on the Source tab of our eventual transformation.

Exercise
1. From the Repository Explorer, select New Object > Extract Schema.
2. At the prompt, navigate to the file you will be working with, in this case, Purchases_Mail.txt.
3. In the Source Options dialog, on the Extract Design Choices tab, set the Tag Separator to Colon:Space (: ). Also on this tab, ensure that the Trim Leading and Trailing Spaces checkbox is selected.
4. On the Display Choices tab, ensure that the Pad Lines checkbox is selected.
5. Choose OK to accept the selections.
6. Highlight the entire Account Number line in the data.
7. Right-click in the highlight and select Define Data Field > Parse Tagged Data.
8. Highlight the label Purchase Order Number.
9. Right-click in the highlight.
10. Select Define Line Style > New Line Style.
11. Change the Line Style Name to PONumber.
12. Choose Add.
13. Highlight the Purchase Order Number tag and the data following it.
14. Right-click in the highlight.
15. Select Define Data Field > Parse Tagged Data.
16. Define the PO_Date Field using the same technique.
17. Define the Category Line Style and the three Fields on it using the same technique.
18. Define the Unit Cost Line Style and the three Fields on it using the same technique.
19. Define the Line Style that determines the end of a row of data for the Extract File.
20. Locate the Line Style that contains the Field that will be the last column in each row in the eventual extract file (in this case, Unit_Cost).
21. Double-click on the Line Style name to bring up the Line Style Definition dialog.
22. On the Line Action tab, choose ACCEPT Record, and accept the remaining defaults.
23. Choose Update.
24. Click on the Browse Data Record button.
25. Choose OK to allow assignment of all Fields to the Extract File.
26. Examine the data to ensure that your Field definitions are correct.
27. Close the browser window.
28. Ensure that the Fields are in the order they appear in the input data.
29. Save the Extract Schema Design as Purchases_Mail.cxl.
30. Close the Extract Schema Designer.
31. Remember that this schema can be used as part of a source connection in Map Designer.
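The idea behind the exercise, tagged lines collected into one row that is emitted when the accepting line is reached, can be sketched in Python. This is only an illustration of the concept, not how the Extract Schema Designer works internally. The sample report text, the tag names, and the choice to carry earlier values forward are all assumptions, since the real layout of Purchases_Mail.txt is not reproduced here.

```python
# Hypothetical report lines; the real layout of Purchases_Mail.txt is not shown.
SAMPLE_REPORT = """\
Account Number: 01-000123
Purchase Order Number: PO-7781
PO Date: 01/15/2010
Unit Cost: 12.50
Purchase Order Number: PO-7782
PO Date: 01/16/2010
Unit Cost: 9.99
"""

TAG_SEPARATOR = ": "      # the Colon:Space tag separator from step 3
ACCEPT_TAG = "Unit Cost"  # the line that ends a row (ACCEPT Record)


def extract_rows(report_text):
    """Collect tagged values line by line and emit a row at the accept tag."""
    rows, current = [], {}
    for line in report_text.splitlines():
        tag, separator, value = line.partition(TAG_SEPARATOR)
        if not separator:
            continue  # not a tagged line
        # Trim leading and trailing spaces, as the exercise option does.
        current[tag.strip()] = value.strip()
        if tag.strip() == ACCEPT_TAG:
            # Assumption: earlier values are kept, so a header field such as
            # the account number applies to every row that follows it.
            rows.append(dict(current))
    return rows


rows = extract_rows(SAMPLE_REPORT)
for row in rows:
    print(row)
```

Each emitted row holds one value per tag, which corresponds to the row/column format the extract file is meant to provide to the transformation.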


Integration Engine: Using the -Set Variable Option


Objectives
At the end of this lesson you should be able to give a variable a value from the command line.

Keywords: -Set

Description
In the Solutions\MapDesigner_TransformationFundamentals folder there is a transformation that has a MsgBox that displays the value of a variable. First, let's run the map without changing the value of the variable. At the command prompt type:

djengine C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\m_EngineTestwithVar.tf.xml

Click OK on the MsgBox pop-up.

Note that without the Verbose option, the only command-line indication that the Map ran correctly is a single line: Return Code: 0

Now let's change the value of the variable. For a string with a single word, type at the command prompt:

djengine -se myVar=\"NewValue\" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml

Click OK on the MsgBox pop-up.

For a string with multiple words, type at the command prompt:

djengine -se myVar=\"New Value\" C:\Cosmos_Work\Fundamentals\Solutions\IntegrationEngine_CommandLine\EngineTestwithVar.tf.xml

Click OK on the MsgBox pop-up.

Additional notes: Aside from the normal command-line quoting and escaping sequences for the given operating system, whatever is to the right of the equals sign is used verbatim in an expression to set the variable. On Windows, the only command-line quote character is the double quote, and it is escaped using a backslash.

By using -se gblsStartDate='07-09-1976' you cause the expression gblsStartDate = '07-09-1976' to be executed, which does nothing, since the single quote indicates the start of a comment.

By using -se gblsStartDate=07-09-1976 you cause the expression gblsStartDate = (07 - 09) - 1976 to be executed.

If you use -se gblsStartDate="07-09-1976" you will get the same result as in the previous case (as if the quotes weren't present).

However, if you use -se gblsStartDate=\"07-09-1976\" the expression gblsStartDate = "07-09-1976" will be executed, which is what you want.

Note that this also means you can do something like -se gblsStartDate=now() and have gblsStartDate = now() executed.
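The four quoting cases in the notes can be checked with a small Python model. It is a deliberately simplified sketch of the rule as the notes state it (a bare double quote is consumed, a backslash-escaped double quote survives), not a reimplementation of the Windows command-line parser or of the engine. The function name is hypothetical.

```python
import re


def text_seen_by_expression(typed_argument):
    """Simplified model: drop bare double quotes, keep backslash-escaped ones."""
    def replace(match):
        # An escaped quote (backslash + quote) becomes a literal quote;
        # a bare quote is removed by the command line.
        return '"' if match.group(0) == '\\"' else ""

    return re.sub(r'\\"|"', replace, typed_argument)


typed_examples = [
    "gblsStartDate='07-09-1976'",
    "gblsStartDate=07-09-1976",
    'gblsStartDate="07-09-1976"',
    'gblsStartDate=\\"07-09-1976\\"',
]
for typed in typed_examples:
    print(typed, "->", text_seen_by_expression(typed))
```

Only the last form leaves the double quotes in place, so only that form assigns a string value to the variable.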


Integration Engine: Scheduling Executions


Keywords: Scheduler

Pervasive's Integration Manager product provides scheduling capabilities, but many users may just want to use schedulers they already have at hand. There are many schedulers available that can be used to call the DJEngine.exe command and execute a process or map. Some are third-party tools and some are native to the operating systems themselves. For example, you can schedule a batch file containing DJEngine commands using the following:

Windows: Schtasks (command-line only); Task Scheduler
Unix: Cron
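A scheduler usually calls a small wrapper rather than the engine directly, so that each run leaves a trace. The Python sketch below shows one possible wrapper. The function name and log location are hypothetical, and a placeholder command stands in for the DJEngine command line so that the sketch runs anywhere. Whether DJEngine reports success through the process exit status is an assumption that this workbook does not confirm.

```python
import os
import subprocess
import sys
import tempfile
from datetime import datetime


def run_job(command, log_path):
    """Run one command, append a timestamped line to a log, return the exit status."""
    completed = subprocess.run(command, capture_output=True, text=True)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat()} exit={completed.returncode} {command}\n")
    return completed.returncode


# Placeholder command; a scheduled job would pass the DJEngine command here.
placeholder = [sys.executable, "-c", "print('job ran')"]
log_file = os.path.join(tempfile.gettempdir(), "integration_job.log")
status = run_job(placeholder, log_file)
print("exit status:", status)
```

The wrapper script itself is what would be registered with Schtasks, Task Scheduler, or Cron.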


Lookup Wizard: Flat File Lookup

Keywords: Lookup Wizard, Count & Counter Variable parameters, One-to-Many records (unrolling occurrences), and referencing Target Field values

Description
Flat File Lookups allow us to look up data from a file that is not our source. We reference this data with a key value that comes from the source, and the lookup returns the matching data, or a default value if no match is found. The Lookup Function Wizard allows us to build these customized functions and store them in a code module. We will also be unrolling a data field that contains multiple values. The Favorites categories are all stored in one field, with a pipe delimiter separating them. We will create a unique target record for each of the values stored in a single source record. The Count and Counter Variable parameters of the ClearMapPut action can be used to parse this field and unroll the records dynamically.

Exercise
1. Create a Map based on the specifications below.

Map Summary:

Define the Source:
Source Connector: ASCII (Delimited)
Source Data: File: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblFavoriteInfo
Target Options: none
Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. The steps for creating the code module are specified below.

Define Code Modules:
Code Modules: $(FUN_DATA)Scripts\Categories.flatfile.rifl

2. From the menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Flat File Lookup Wizard and click Next.
4. Create a new Flat File Definition named Categories and click Next.
5. Specify the Lookup File as C:\Cosmos9_Work\Fundamentals\Data\Category.txt.
6. Click Next.
7. Choose the appropriate Key Field, and the Fields that should be returned by the lookup.
8. Click Finish. The Wizard will create the Flat File Lookup Functions in a code module. Use the functions in the appropriate event handlers as described below.

Categories_Field2_Lookup(KeyValue, DefaultValue)
Categories_Field3_Lookup(KeyValue, DefaultValue)

Define Events: Source R1 Events

Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
target name = Target
record layout = R1
count = CharCount("|", Records("R1").Fields("Favorites")) + 1
counter variable = cntr

Target Field Expressions

R1.FavoritesID        Serial()
R1.Account Number     Records("R1").Fields("Account Number")
R1.CategoryCode       parse(cntr, Records("R1").Fields("Favorites"), "|")
R1.CategoryLiteral    Categories_Field2_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")
R1.ProductManager     Categories_Field3_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")
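The unrolling logic of this exercise can be sketched outside the product. The Python below mirrors the count expression (number of pipes plus one), a counter that selects one value per pass, and a lookup that falls back to a default. The lookup rows are invented stand-ins for Category.txt, and the sketch assumes the counter starts at 1 and that parse selects the value at that position.

```python
# Hypothetical lookup rows standing in for Category.txt: key, literal, manager.
CATEGORY_ROWS = [
    ("BK", "Books", "A. Reader"),
    ("MU", "Music", "B. Sharp"),
]
LITERALS = {key: literal for key, literal, _ in CATEGORY_ROWS}
MANAGERS = {key: manager for key, _, manager in CATEGORY_ROWS}


def lookup(table, key_value, default_value):
    """Return the matching value, or the default when the key is absent."""
    return table.get(key_value, default_value)


def unroll(account_number, favorites):
    """Emit one target row per pipe-delimited value in the Favorites field."""
    count = favorites.count("|") + 1            # CharCount("|", ...) + 1
    rows = []
    for cntr in range(1, count + 1):            # counter variable, assumed 1-based
        code = favorites.split("|")[cntr - 1]   # parse(cntr, favorites, "|")
        rows.append({
            "FavoritesID": len(rows) + 1,       # stands in for Serial()
            "Account Number": account_number,
            "CategoryCode": code,
            "CategoryLiteral": lookup(LITERALS, code, "NoMatches"),
            "ProductManager": lookup(MANAGERS, code, "NoManagers"),
        })
    return rows


target_rows = unroll("01-000123", "BK|MU|ZZ")
for row in target_rows:
    print(row)
```

One source record with three favorites produces three target rows, and the unknown code ZZ receives the default values.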

Lookup Wizard: Dynamic SQL Lookup

Keywords: Lookup Wizard, Dynamic SQL Lookup, Count & Counter Variable parameters, One-to-Many records (unrolling occurrences), and referencing Target Field values

Description
Dynamic SQL Lookups allow us to look up values from other sources when that source is a relational table or view. Again we will use the Lookup Function Wizard to create User Defined Functions that are stored in a code module.

Exercise
1. Create a Map based on the specifications below.

Map Summary:

Define the Source:
Source Connector: ASCII (Delimited)
Source Data: File: $(FUN_DATA)Accounts.txt
Source Options: Header = True

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB, Table: tblFavoriteInfo
Target Options: none
Target OutputMode: Clear File/Table contents and Append

Note: The following code module should be built through the Lookup Wizard. The steps for creating the code module are specified below.

Define Code Modules:
Code Modules: $(FUN_DATA)Scripts\Categories.dynsql.rifl

2. From the menu, click Tools > Define Lookup Functions to open the Lookup Wizard.
3. Choose the Dynamic SQL Lookup and click Next.
4. Create a new Dynamic SQL Definition named Categories and click Next.
5. Create a name for the DJImport Object that will make the connection to the table or file. Choose the name category and click Next.
6. Click Build to build a new Connection String and then click Next.
7. Connect to the data source defined below for the lookup and click Next.

Define the Connection String:
Connector: Access 2000
File: C:\Cosmos9_Work\Fundamentals\Data\TrainingDB.mdb
Table: tblCategories
Properties: none

8. Choose the appropriate Key Field, and the Fields that should be returned by the lookup.
9. Click Finish. The Wizard will create the following Dynamic SQL functions in a code module. Use the functions in the appropriate event handlers as described below.

Categories_Init()
Initializes the DJImport object and makes the connection to the data source as defined by the connection string.

Categories_Category_Lookup(KeyValue, DefaultValue)
Creates the SQL call needed to retrieve a value from the Category field based on a Key value.

Categories_ProductManager_Lookup(KeyValue, DefaultValue)
Creates the SQL call needed to retrieve a value from the ProductManager field based on a Key value.

Categories_Terminate()
Terminates the connection to the data source by destroying the DJImport Object.

Define Events: Transformation and Map Properties Events

Event Name              Event Action    Event Parameters
BeforeTransformation    Execute         Expression: Categories_Init()
AfterTransformation     Execute         Expression: Categories_Terminate()

Define Events: Source R1 Events

Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
target name = Target
record layout = R1
count = CharCount("|", Records("R1").Fields("Favorites")) + 1
counter variable = cntr

Target Field Expressions

R1.FavoritesID        Serial()
R1.Account Number     Records("R1").Fields("Account Number")
R1.CategoryCode       parse(cntr, Records("R1").Fields("Favorites"), "|")
R1.CategoryLiteral    Categories_Category_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoMatches")
R1.ProductManager     Categories_ProductManager_Lookup(Targets(0).Records("R1").Fields("CategoryCode"), "NoManagers")


RDBMS: Integration Querybuilder


Objectives
At the end of this lesson you should be able to extract data from one or more tables in the same database by using a SQL Passthrough statement.

Keywords: Integration Query Builder, SQL Passthrough Statements

Description
The Transformation Map Designer source connectors allow Select statements to be passed through to a database server to obtain a row set. The row set returned by the query then becomes the source data for your Map. Use the Integration Query Builder to generate the source record set. Alternatively, you can use the SQL script that generates this source record set by using the SQL File connection option and pointing to the matching SQL Script file in the Scripts folder. When you choose an RDBMS source connector, there are three choices for selecting Source Data. You can point directly to a table or view, pass a SQL statement through, or point to a SQL script file that contains a SQL statement. We will construct a SQL statement using the query builder.

Exercise
Once you have connected to a data source (described below), your connection is displayed in the upper-right pane. You can set up and save as many data source connections as you need. Integration Querybuilder stores all connections you create unless you explicitly delete them.
1. Double-click the connection you want to use. The DB Browser in the lower-right pane will display the database.
2. Click the database icon to display the icons for tables, views and procedures for this database. Clicking on these will display their contents. Click on the individual tables to list their columns, or right-click and select Get Details from the shortcut menu to see the SQL representation of column values such as length, data types and whether they are used as primary or secondary keys.
3. To create a query, select New Query from the Query menu. A new query icon will be opened beneath the connection icon in the upper-right pane. You can rename this now or later by right-clicking on the icon.
4. Drag the tables and views you want to use into the upper-left pane. This is called the Relations pane. As you drag tables into this pane, you will see that SELECT... FROM statements are created in the SQL pane. If tables are already linked in the database, these links will be displayed, although they can be changed or removed for the purpose of this particular query. If you are using a table more than once, the second and further copies will be renamed. For example, if you already have a Customer table in the Relations pane and you drag across another copy, it will be automatically renamed Customer1.


The Select statement that is generated becomes part of the connection string and is passed through to the database server. We can now map this data into any target type and format we desire. The following is information taken from reports generated by Repository Manager from the RDBMS_SelectStatements transformation in the Solutions folder:

Source (ODBC 3.x)
Database: TrainingDB
SQLStatement:
SELECT srcAccounts.[Account Number], srcAccounts.Name, srcAccounts.Company, srcAccounts.Street, srcAccounts.City, srcAccounts.State, srcAccounts.Zip, srcPurchases.PONumber, srcPurchases.Category, srcPurchases.ProductNumber, srcPurchases.ShipmentMethodCode
FROM (srcAccounts RIGHT JOIN srcPurchases ON srcAccounts.[Account Number] = srcPurchases.AccountNumber)
ORDER BY srcPurchases.ShipmentMethodCode, srcAccounts.City

Target (ASCII (Delimited))
location: $(FUN_DATA)Purchases_SQLSelect.txt

TargetOptions
header: True

Outputmode: Replace

Source R1 Events
AfterEveryRecord    ClearMapPut Record
target name: Target
record layout: R1

Map Expressions

R1.Account Number        Fields("Account Number")
R1.Name                  Fields("Name")
R1.Company               Fields("Company")
R1.Street                Fields("Street")
R1.City                  Fields("City")
R1.State                 Fields("State")
R1.Zip                   Fields("Zip")
R1.PONumber              Fields("PONumber")
R1.Category              Fields("Category")
R1.ProductNumber         Fields("ProductNumber")
R1.ShipmentMethodCode    Fields("ShipmentMethodCode")
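To see what row set a statement like the one in this report returns, the join can be tried against a tiny in-memory SQLite database from Python. The sample rows and the reduced column list are invented for illustration. Because older SQLite versions do not support RIGHT JOIN, the sketch swaps the table order and uses a LEFT JOIN, which keeps the same rows: every purchase, whether or not an account matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE srcAccounts ([Account Number] TEXT, Name TEXT, City TEXT);
CREATE TABLE srcPurchases (AccountNumber TEXT, PONumber TEXT, ShipmentMethodCode TEXT);
INSERT INTO srcAccounts VALUES ('A1', 'Ann', 'Austin'), ('A2', 'Bob', 'Boston');
INSERT INTO srcPurchases VALUES ('A1', 'PO-1', 'GND'), ('A9', 'PO-2', 'AIR');
""")

# srcPurchases LEFT JOIN srcAccounts is equivalent to
# srcAccounts RIGHT JOIN srcPurchases in the report's statement.
query = """
SELECT srcAccounts.[Account Number], srcAccounts.Name, srcAccounts.City,
       srcPurchases.PONumber, srcPurchases.ShipmentMethodCode
FROM srcPurchases
LEFT JOIN srcAccounts
  ON srcAccounts.[Account Number] = srcPurchases.AccountNumber
ORDER BY srcPurchases.ShipmentMethodCode, srcAccounts.City
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

The purchase with no matching account still appears, with empty account columns, and the account with no purchases does not appear at all.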


Structured Schema Designer: Binary Data and Code Pages


Objectives
At the end of this lesson you should be able to create a Structured Schema for a Binary, EBCDIC File.

Keywords: Binary, EBCDIC

Description
Creating a Structured Schema for a Binary, EBCDIC File

When working with binary files, we will usually need to tell the Structured Schema Designer that the file should be displayed and accessed using a coding structure other than ANSI. The most common binary coding structure is EBCDIC. To change this property, we will work with the SSD connection specification and specifically its Property Sheet. We can change the Code Page property to match the coding structure of the file we are working with.

Another issue with binary files is that the records are often some arbitrary length (e.g., 500 bytes) even though the logical records might be longer or shorter than that. As a result, when the data is displayed in the Visual Parser, it does not appear as if the data is structured. There is no automatic solution to this problem, but you can adjust the record length that the Visual Parser will use until you see the data lining up properly. Then you can parse normally, and the SSD will remember the record length you have set, and break the file apart properly when you use the schema in a Map Design.

Exercise
1. Start a New Structured Schema Design.
2. Click the Visual Parser button (red knife).
3. Change the Code Page property to 37 US EBCDIC (click the Apply button!).
4. Navigate to the file named Accounts_Binary.bin.
5. Determine the record length by looking for patterns in the file.
6. Overtype the Length and hit the Enter key (try 180, what happens?).
7. After you have the columns lined up, parse the fields, select data types and field properties until you have defined the structure.
8. Save the Structured Schema as s_BinaryDataCodePages.ss.xml for reuse.

Record Layouts
Record R1

Name               Type              Length
AccountNumber      Text              9
Name               Text              21
Company            Text              31
Address            Text              35
City               Text              16
State              Text              2
ZipCode            Text              10
Email              Text              25
BirthDate          Date              4
Favorites          Text              11
StandardPayment    Packed decimal    6
Payments           Packed decimal    7
Balance            Packed decimal    6
Total                                183
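The record layout explains why the record length matters: the field lengths add up to 183 bytes, so any other length makes the columns drift. The Python sketch below slices a fixed-length record by those lengths, decodes text with the cp037 (US EBCDIC) codec, and unpacks a packed decimal field. The sample record is synthetic, and the packed decimal format (two digits per byte with a trailing sign nibble) is an assumption about how the file stores these fields.

```python
# Field lengths in bytes, copied from the R1 record layout.
FIELD_LENGTHS = [
    ("AccountNumber", 9), ("Name", 21), ("Company", 31), ("Address", 35),
    ("City", 16), ("State", 2), ("ZipCode", 10), ("Email", 25),
    ("BirthDate", 4), ("Favorites", 11), ("StandardPayment", 6),
    ("Payments", 7), ("Balance", 6),
]
PACKED_FIELDS = ("StandardPayment", "Payments", "Balance")
RECORD_LENGTH = sum(length for _, length in FIELD_LENGTHS)


def split_record(record):
    """Slice one fixed-length record into raw byte fields by position."""
    fields, offset = {}, 0
    for name, length in FIELD_LENGTHS:
        fields[name] = record[offset:offset + length]
        offset += length
    return fields


def unpack_decimal(raw):
    """Decode packed decimal, assuming a final sign nibble (0xD = negative)."""
    digits = "".join(f"{byte >> 4}{byte & 0x0F}" for byte in raw[:-1])
    digits += str(raw[-1] >> 4)
    sign = -1 if raw[-1] & 0x0F == 0x0D else 1
    return sign * int(digits)


# Build one synthetic record: EBCDIC text fields plus packed numbers (+123).
# BirthDate is filled with text here only to keep the sample simple.
sample = b""
for name, length in FIELD_LENGTHS:
    if name in PACKED_FIELDS:
        sample += bytes(length - 2) + bytes([0x12, 0x3C])
    else:
        sample += name[:length].ljust(length).encode("cp037")

fields = split_record(sample)
print(RECORD_LENGTH, fields["Company"].decode("cp037").strip(),
      unpack_decimal(fields["Balance"]))
```

Reading the same bytes with a length of 180 would start every record after the first one three bytes too early, which is the misalignment the exercise asks you to observe.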


Structured Schema Designer: Reuse Metadata (Reusing a Structured Schema)


Objectives
At the end of this lesson you will know the steps involved in applying a pre-developed Structured Schema to a new file that is supposed to follow the structure defined in that schema.

Keywords: Structured Schema

Description
This example transforms a Binary file into an ASCII Delimited file. When you activate the Structured Schema Designer from the Source Tab or Target Tab, and have saved the schema, it is automatically attached to the current Transformation. If you wish to use a pre-defined schema, both the Source and Target Tabs have a dropdown from which an existing schema can be selected. As soon as the schema is attached, the Source or Target information (hierarchy and field list) will be filled in on the Map Tab. You may change field names, lengths and data types, but only if you first unlock the schema.

Exercise
1. Start a New Map design session and choose the Binary connector.
2. Select the Structured Schema named s_BinaryDataCodePages.ss.xml.
3. Select the file named Accounts_Binary.bin.
4. Change the source property Code Page to 37 US EBCDIC (click the Apply button!).
5. Browse the file to confirm the structure has been applied.
6. If desired, you can complete the map based on the specifications below. The lesson, though, is intended to demonstrate that a file can be parsed in Structured Schema Designer and used as input for Map Designer. In this exercise we use it as a source connection. Structured Schemas can also be used as part of a target connection.

Map Summary:

Define the Source:
Source Connector: Binary
Source Data: $(FUN_DATA)Accounts_Binary.bin
Source Schema: s_BinaryDataCodePages.ss.xml
Source Options: codepage = 0037 US (EBCDIC)

Define the Target:
Target Connector: ASCII (Delimited)
Target Data: File: $(FUN_DATA)AccountsOut.txt
Target Options: Header = True
Target OutputMode: Replace

Target Field Expressions

R1.AccountNumber      Records("R1").Fields("AccountNumber")
R1.Name               Records("R1").Fields("Name")
R1.Company            Records("R1").Fields("Company")
R1.Address            Records("R1").Fields("Address")
R1.City               Records("R1").Fields("City")
R1.State              Records("R1").Fields("State")
R1.ZipCode            Records("R1").Fields("ZipCode")
R1.Email              Records("R1").Fields("Email")
R1.BirthDate          Records("R1").Fields("BirthDate")
R1.Favorites          Records("R1").Fields("Favorites")
R1.StandardPayment    Records("R1").Fields("StandardPayment")
R1.Payments           Records("R1").Fields("Payments")
R1.Balance            Records("R1").Fields("Balance")

Define Events: Source R1 Events

Event Name: AfterEveryRecord
Event Action: ClearMapPut Record
Event Parameters:
target name = Target
record layout = R1


Structured Schema Designer: Multiple Record Type Support in Structured Schema Designer
Objectives At the end of this lesson you should be able to discuss the differences between files that have multiple record types and those that dont. You should be able to describe the tasks that will have to be performed to work with source files that have multiple record types. You should also be able to describe the actions you will have to take should you wish to create a target file with multiple record types. Keywords: Record Types, Record Layouts, Discriminator, and Recognition Rules Description Files can be grouped into two main classifications relative to the records they contain. The first classification is comprised of those files all of whose records are of the same type. This means that each record will contain the same fields, in the same order and with the same properties. The second classification is comprised of those files that contain records that have different formats. One record might contain ten fields while another might contain only six or perhaps twelve. One record type might describe a Customer while another describes a payment he made on his account. Certainly these two records would be different. The critical issue for record type files is not the definition of the records themselves. These can be defined in the Structured Schema Designer with the Visual Parser (by parsing one of them, adding another, parsing it, adding another, and so on). They can also be defined using the grid interface within the SSD (where you simply enter record type names and then enter the field lists for each). You might also be able to import the record layouts, perhaps from a COBOL copybook or some other readable file. The critical issue is how the Map Designer will be able to distinguish one record from another. For any application to be able to work with a file of this type there must be some way to tell the records apart. There should be one common field in each record type, the value of which must identify the record type itself. 
If this were not true, no software application would be able to deal with the file, Map Designer included. This field is called the discriminator field, as it enables us to discriminate between record types.

Once the discriminator field has been identified, the remaining task is to define the values that it can have and associate these values with individual record types. For example, if the value of the field were CUS, we might know we have a Customer record type. If the value of the field were PAY, we might know we are dealing with a record that describes a payment on an account. Rules of this kind are called recognition rules, and we must define at least one such rule for each record type. Rules might not always be so simple, but fortunately the Structured Schema Designer can work with very complex ones.

To create a structured schema for a source file that contains multiple record types, there are three possible strategies you can follow. The strategy you choose depends on what information you already have describing the file. The three strategies are:
1. You have record layout definitions available in a file: Import the record layout definition file into the SSD. Use the ALL Record Type Rules Recognition dialog to define at least one rule for each record type.

2. You have record layout definitions available in a printed document: Select the connector type in the SSD. Use the Grid layout to define each record type and its fields. Use the ALL Record Type Rules Recognition dialog to define at least one rule for each record type.
3. You have no definitions available, only the data file: Activate the SSD Visual Parser for your file. Name and parse each record type. Find and select the discriminator field. Use the Recognition Rules button to activate the Recognition Rules dialog and define at least one rule for each record type.

The common element in these strategies is the definition of the recognition rules. These are defined in the Recognition Rules dialog, which is activated from either the ALL Record Type Rules Recognition hierarchy item or the individual R1 Rules R1 Recognition items on the grid layout in the SSD. First, you'll identify the discriminator, the field whose contents will be used to tell the record types apart. Next, you can use the Generate Rules button to automatically generate skeleton rules for each record type. Finally, you can add the actual value that the discriminator field will contain for each record type (and adjust other properties of the rules as you wish). When you're done, the structured schema for the file can be saved.

Scenario
A source file (Payments_MultiRecType.txt) contains multiple record types, and there is no documentation of the file's records or fields. We know that the file contains payment records followed by a summary record. We also know that the payment records are supposed to contain an account number, payment date and payment amount, and that the summary records will contain a payment count and a payment total. However, we do not know where in the records each field begins and ends. We need to define a structured schema for this file by visually determining where each field starts and stops for both record types.
We will use the parse data tool to accomplish the task.
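
The idea of a discriminator field and its recognition rules can be sketched outside the product. The following Python fragment is only an illustration, not how the Structured Schema Designer is implemented. It assumes the discriminator occupies the first three characters of each record and reuses the CUS and PAY values from the description; the sample lines and the function name recognize are hypothetical.

```python
# Hypothetical recognition rules: discriminator value -> record type name.
# CUS and PAY are the example values from the lesson description.
RECOGNITION_RULES = {"CUS": "Customer", "PAY": "Payment"}

def recognize(line, start=0, length=3):
    """Return the record type of a line based on its discriminator field.

    The discriminator position (start, length) is an assumption made
    for this sketch.
    """
    value = line[start:start + length]
    if value not in RECOGNITION_RULES:
        raise ValueError(f"no recognition rule matches discriminator {value!r}")
    return RECOGNITION_RULES[value]

# Made-up sample lines, not taken from any course data file.
for line in ["CUS0001 Smith", "PAY0001 0000012550"]:
    print(recognize(line), "<-", line)
```

A record whose discriminator matches no rule raises an error here. That is the situation the lesson warns about: without at least one rule per record type, the records cannot be told apart.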

Exercise
1. Begin a new Map Design.
2. Point the source to the ASCII Fixed file Payments_MultiRecType.txt.
3. Browse the source file and determine whether record types exist. Close the browser.
4. Click the Build Schema... button for the Structured Schema.
5. Click the Parse Data icon.
6. Rename the Record to Payment and parse a payment record according to the record layout given below.


Record: Payment
Name             Type  Length
RecordIndicator  Text       1
AccountNumber    Text       9
PaymentDate      Text       8
Amount           Text      11
Total                      29

7. Click the Add Record button and name the new record type CheckSum.
8. Scroll down until you find the next different structured record (row 30).
9. Parse this record type with its fields as described below.
Record: CheckSum
Name             Type  Length
RecordIndicator  Text       1
EmptiedDate      Text       8
Action           Text       3
TotalAmount      Text       9
PaymentCount     Text       4
ClerkID          Text       4
Total                      29

10. Select the Payment record from the Record dropdown and ensure that the RecordIndicator field is displayed in the Field Name box.
11. Check the Discriminator check box.
12. Click the Recognition Rules... button.
13. Click the Generate Rules button.
14. Define PaymentRule1 so that the discriminator field must be equal to P.
15. Define CheckSumRule1 so that the discriminator field must be equal to E.
16. Return to the Structured Schema Designer dialog.
17. Save the structured schema as s_Payments_MultiRecType.ss.xml.
18. Close the Structured Schema Designer.

19. Browse the source file again and note how the structured schema information has been applied to it. Look at both kinds of records and see how the browser changes.
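
What the finished schema describes can be sketched in a few lines of Python. This is an illustration of fixed-width parsing with a discriminator, not the product's implementation. The field names, lengths, and the P and E discriminator values come from the exercise; the function name parse_record and the two sample lines are made up for the sketch.

```python
# Layouts as (field name, length) pairs, keyed by the discriminator value.
LAYOUTS = {
    "P": ("Payment", [("RecordIndicator", 1), ("AccountNumber", 9),
                      ("PaymentDate", 8), ("Amount", 11)]),
    "E": ("CheckSum", [("RecordIndicator", 1), ("EmptiedDate", 8),
                       ("Action", 3), ("TotalAmount", 9),
                       ("PaymentCount", 4), ("ClerkID", 4)]),
}

def parse_record(line):
    """Choose a layout from the first character, then slice fields by length."""
    record_type, fields = LAYOUTS[line[0]]
    record, pos = {}, 0
    for name, length in fields:
        record[name] = line[pos:pos + length]
        pos += length
    return record_type, record

# Hypothetical 29-character sample lines, not taken from the course data file.
payment = "P" + "123456789" + "01152010" + "00000012550"
checksum = "E" + "01152010" + "DEP" + "000012550" + "0001" + "C001"

print(parse_record(payment))
print(parse_record(checksum))
```

Both layouts add up to 29 characters, so every line in the file has the same length and only the first character decides how the rest is split.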


Structured Schema Designer: Conflict Resolution


Objectives
At the end of this lesson you should be able to use a Structured Schema to set up a Map that uses one source record type to verify the data in the other record type.

Keywords: Schema Mismatch Handling, Record Specific Event Handlers, and Validation

Description
Our newly defined payment file structure gives us more robust data validation opportunities as we load the Payments table. The additional record layout (CheckSum) in our payments file holds a payment count and a payment total against which we can check the values we aggregate from the payment records. We can use the record specific Event Handlers to perform these evaluations at the appropriate time.

When you create a map with multiple record types, the Default Event Handler may not be set for you automatically, and it would not be sufficient in any case. Therefore, you will need to define the Event Handlers and Actions needed to perform the transformation.

Exercise Build a map based on the specifications in the report below.

Map Summary:
Define the Source:
Source Connector: ASCII (Fixed)
Source Data: $(FUN_DATA)Payments_MultiRecType.txt
Source Schema: s_PaymentsMultiRecType.ss.xml
Source Options: None

Define the Target:
Target Connector: ODBC 3.x
Target Data: Database: TrainingDB Table: tblPaymentsVerified
Target Options: None
Target OutputMode: Clear File/Table contents and Append


Target Field Expressions


R1.AccountNumber   Records("Payment").Fields("AccountNumber")
R1.PaymentDate     Datevalmask(Trim(Records("Payment").Fields("PaymentDate")), "mmddyyyy")
R1.PaymentAmount   Records("Payment").Fields("Amount") / 100
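
For readers more familiar with general-purpose languages, the two non-trivial target expressions can be approximated in Python. This is a rough sketch under stated assumptions: that the "mmddyyyy" mask means a two-digit month, two-digit day and four-digit year, and that dividing Amount by 100 converts a value stored in whole cents. The function names and sample values are hypothetical.

```python
from datetime import datetime

def payment_date(raw):
    """Approximate Datevalmask(Trim(raw), "mmddyyyy"): trim, then parse the date."""
    return datetime.strptime(raw.strip(), "%m%d%Y").date()

def payment_amount(raw):
    """Approximate Amount / 100, assuming the field holds whole cents as text."""
    return int(raw) / 100

# Made-up field values, not taken from the course data file.
print(payment_date("01152010"))
print(payment_amount("00000012550"))
```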

Variables
Name             Type     Public  Value
paymentCounter   Variant  no
paymentSubtotal  Variant  no

Define Events: Source Payment Events
Event Name: AfterEveryRecord
Event Action: Execute
    Event Parameters: Expression:
        paymentSubtotal = paymentSubtotal + Records("Payment").Fields("Amount")
        paymentCounter = paymentCounter + 1
Event Action: ClearMapPut Record
    Event Parameters:
        target name: Target
        record layout: R1

Define Events: Source CheckSum Events
Event Name: AfterEveryRecord
Event Action: Execute
    Event Parameters: Expression:

        ' This code can be imported from the menu: File > Open Script File > ChecksumTest.rifl
        ' declare temp variables used for better readability
        Dim crlf, realTotal, realCount
        crlf = Chr(13) & Chr(10)
        realTotal = Records("CheckSum").Fields("TotalAmount")
        realCount = Records("CheckSum").Fields("PaymentCount")

        ' display current count and payment sub-total for each clerk
        MsgBox("---New Checksum---" & crlf & _
            "PaymentCounter= " & paymentCounter & " : Should be = " & realCount & crlf & _
            "Paymt Amt= " & paymentSubtotal & " : Should be = " & realTotal)

        ' evaluate count and sub-total for inconsistencies
        If paymentSubtotal <> Trim(realTotal) Then
            MsgBox("Total payment amount for this clerk does not match checksum amount!!!", 48)
        End If
        If paymentCounter <> Trim(realCount) Then
            MsgBox("Payment Count for this clerk does not match checksum amount!!!", 48)
        End If

        ' reset global variables for next clerk
        paymentCounter = 0
        paymentSubtotal = 0
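
The interplay of the two event handlers (accumulate on every Payment record, compare and reset on every CheckSum record) may be easier to follow as a single loop. The Python sketch below is an approximation, not the map's actual execution model. It assumes records arrive in file order as (record type, fields) pairs and converts the text fields to integers before comparing; the function name verify_batches and the sample records are hypothetical.

```python
def verify_batches(records):
    """Return (clerk_id, count_ok, total_ok) for each CheckSum record.

    `records` is an iterable of (record_type, fields) pairs in file order.
    """
    results = []
    payment_counter = 0
    payment_subtotal = 0
    for record_type, fields in records:
        if record_type == "Payment":
            # Mirrors the Payment AfterEveryRecord handler.
            payment_subtotal += int(fields["Amount"])
            payment_counter += 1
        elif record_type == "CheckSum":
            # Mirrors the CheckSum AfterEveryRecord handler.
            count_ok = payment_counter == int(fields["PaymentCount"])
            total_ok = payment_subtotal == int(fields["TotalAmount"])
            results.append((fields["ClerkID"], count_ok, total_ok))
            # Reset the running values for the next clerk.
            payment_counter = 0
            payment_subtotal = 0
    return results

# Made-up records; the second batch has a deliberately wrong total.
sample = [
    ("Payment", {"Amount": "00000010000"}),
    ("Payment", {"Amount": "00000002550"}),
    ("CheckSum", {"TotalAmount": "000012550", "PaymentCount": "0002", "ClerkID": "C001"}),
    ("Payment", {"Amount": "00000000500"}),
    ("CheckSum", {"TotalAmount": "000000900", "PaymentCount": "0001", "ClerkID": "C002"}),
]
print(verify_batches(sample))
```

Comparing integers rather than trimmed text avoids mismatches caused by leading zeros or padding. Whether the RIFL comparison in the workbook needs a similar conversion depends on how Variant values are compared, which this sketch does not model.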
