Prepared for:
Prepared by:
October 9, 2002
Table of Contents
1. INTRODUCTION
   1.1. Purpose of Document
   1.2. Background
2. INFORMATICA ARCHITECTURE
   2.1. Development Repository (etl_repository)
   2.2. QA Repository (Proposed)
   2.3. Production Repository (dwprod1_repo)
   2.4. Naming Conventions
3. MIGRATION
   3.1. Version Control
4. INFORMATICA INSTALLATION AND CONFIGURATION
   4.2. Overview
   4.3. Detailed Installation and Configuration
      4.3.1. Installing the Informatica Server
      4.3.2. Configuring the Informatica Server
      4.3.3. Connecting to Databases
      4.3.4. Registering the Informatica Server
      4.3.5. Starting and Stopping the Informatica Server
A.1. Overview
A.2. Pilot Project Informatica Architecture
1. Introduction
1.1. Purpose of Document
The purpose of this document is to provide OSU with a proposed Informatica architecture, guidelines for migrating Informatica objects into production, and installation procedures.
1.2. Background
The Ohio State University engaged Covansys to manage the Course Analytics Data Warehouse Pilot project. Before this project began, Ohio State selected Informatica PowerMart as its Extract, Transformation, and Load (ETL) product to support its data warehousing and data conversion needs. Covansys installed Informatica PowerMart and PowerPlug in May 2002. An overview of the installation was documented during the initial project. This information is included as a reference in Appendix A.
2. Informatica Architecture
During the Course Analytics Data Warehouse Pilot project, Informatica was installed on the DWDEV server. This environment was used to support the development, testing, and data conversion requirements of the Course Analytics data mart. The development architecture included Informatica PowerMart 5.1.2 and PowerPlug 5.1 on a Windows 2000 Server platform, using Microsoft SQL Server 2000 to support the Informatica repositories. Covansys performed the installation of the production Informatica environment (dwprod1_repo) as part of the current project. Ohio State wants to establish an Informatica architecture to support its future data warehousing projects. This section proposes an Informatica architecture that will support the development, quality assurance, and production needs of OSU. The diagram in Figure 1 proposes a repository structure that would support an enterprise implementation of Informatica.
[Figure 1: repository diagram. Labels: etl_repository; ETL1 (proposed); ETL2 (proposed); ETL3 (proposed); Shared Objects; Developer 1; Developer 2; dwprod1_repo]
Figure 1 Proposed Repository Structure for OSU
The proposed Informatica environment will have three repositories: development (etl_repository), QA (proposed), and production (dwprod1_repo). Ideally, each repository should exist on a different machine with its own Informatica server. The benefit of this architecture is that it decreases development time and provides the controls needed to manage a production environment: developers can continue to develop while QA testing is being performed. However, because of the high cost of three Informatica server licenses, OSU could configure the development and QA repositories on one server. The risk of this approach is that development and QA testing performance may suffer if both activities run at the same time. The development and QA environments should each have their own databases for unit testing and system testing. It is recommended that each developer have a separate database schema or file area against which to test mappings. This ensures that developers can work independently without conflicting with one another.
Informatica Architecture, Migration, and Installation
A program is considered complete when:
o It has been unit tested against the business rules.
o A control balancing test has been performed on the output data (i.e., the number of rows in the source and target match).
o A referential integrity test has been done.
o A data quality test has been done (i.e., bad data, if any, has been identified).
o The mapping has been tested for performance.
o All of the above has been documented and all exceptions have been accounted for.
It is recommended that after a program is moved from the Development folder to the QA folder, or from the QA folder to the Production folder, its version in the previous folder be deleted.
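As an illustration only, the control balancing and referential integrity tests listed above could be sketched as simple SQL checks. The sketch below uses Python with an in-memory SQLite database so that it is self-contained; the table and column names (src_course, tgt_course, tgt_dept, dept_id) are hypothetical, and a real check would run against the actual source and target databases.

```python
import sqlite3


def row_count(conn, table):
    """Return the number of rows in a table (table name is trusted input)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def control_balance(conn, source, target):
    """Control balancing test: source and target row counts must match."""
    src, tgt = row_count(conn, source), row_count(conn, target)
    return {"source": src, "target": tgt, "balanced": src == tgt}


def orphan_count(conn, child, fk, parent, pk):
    """Referential integrity test: count child rows whose key has no parent."""
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


def demo():
    """Build a tiny in-memory example (hypothetical tables) and run both checks."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_course (course_id INTEGER, dept_id INTEGER);
        CREATE TABLE tgt_course (course_id INTEGER, dept_id INTEGER);
        CREATE TABLE tgt_dept (dept_id INTEGER PRIMARY KEY);
        INSERT INTO src_course VALUES (1, 10), (2, 10), (3, 20);
        INSERT INTO tgt_course VALUES (1, 10), (2, 10), (3, 20);
        INSERT INTO tgt_dept VALUES (10);
        """
    )
    balance = control_balance(conn, "src_course", "tgt_course")
    orphans = orphan_count(conn, "tgt_course", "dept_id", "tgt_dept", "dept_id")
    conn.close()
    return balance, orphans
```

In this example the row counts balance, but one target row refers to a department that does not exist in tgt_dept, so it would be reported as an exception to be documented and accounted for.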
The production programs should be set to read-only access. Production ODBC connections should be set up with administrative access only and should not be accessible to developers. Developers should synchronize the completed programs between the QA and Production repositories.
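The recommendation that a program's version in the previous folder be deleted once it has been moved can be pictured with a small model. The following Python sketch is not an Informatica API; it only models folders as sets of program names, and the folder names and the promote function are assumptions made for illustration.

```python
# Promotion order assumed from the text: Development -> QA -> Production.
STAGES = ["Development", "QA", "Production"]


def promote(folders, program):
    """Move a program to the next stage and delete it from the previous one.

    `folders` maps each stage name to a set of program names.
    Returns the name of the stage the program was moved to.
    """
    for i, stage in enumerate(STAGES[:-1]):
        if program in folders[stage]:
            folders[stage].remove(program)        # delete the earlier copy
            folders[STAGES[i + 1]].add(program)   # keep a single current copy
            return STAGES[i + 1]
    raise ValueError(f"{program!r} is not in a promotable folder")
```

Keeping exactly one copy of each program at any time avoids the question of which folder holds the current version.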
2.4. Naming Conventions
The following naming conventions apply to Informatica mappings and objects.
Mappings
All mappings should be named m_<target name> or m_<business name>.
Sources
All sources should be named <table name>_<db name>.
The table below provides a few examples of abbreviations for input sources.
Input source            Abbreviation
Financial_assistance    fin_assist
Ccourse_current         ccourse_curr
Targets
Targets should have the same names as the target tables.
Transformations
Transformations created for global business rules should be named <transformation type>_global_<description>.
For example, a global transformation to filter bad dates should be named:
filter_global_baddates
A transformation local to a specific input source should be named:
expression_<input_source>
Sessions/Batches
S_<mapping name>
B_<functionality name>
For example, if a set of three mappings that load student information is to be executed in a batch, the batch can be named:
b_student_information
ODBC Connections
To better manage and control ODBC connections, a naming convention should be adopted for them.
For source ODBC connections: SRC_<database name>
For target ODBC connections: TGT_<database name>
This identifies whether a connection is used to extract or to load data, and which database the data is extracted from or loaded into.
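One way to keep names consistent is to generate them rather than type them by hand. The Python sketch below covers the mapping, source, global transformation, and ODBC connection patterns described above. The helper names, the lower-casing rule in _slug, and the sample database names are assumptions for illustration and are not part of Informatica.

```python
import re


def _slug(text):
    """Lower-case a name and replace runs of non-alphanumerics with '_'.

    Lower-casing is an assumption of this sketch, not a rule from the text.
    """
    return re.sub(r"[^a-z0-9]+", "_", text.strip().lower()).strip("_")


def mapping_name(target_or_business_name):
    """m_<target name> or m_<business name>"""
    return f"m_{_slug(target_or_business_name)}"


def source_name(table, db):
    """<table name>_<db name>"""
    return f"{_slug(table)}_{_slug(db)}"


def global_transformation_name(transformation_type, description):
    """<transformation type>_global_<description>"""
    return f"{_slug(transformation_type)}_global_{_slug(description)}"


def odbc_name(database, role):
    """SRC_<database name> for sources, TGT_<database name> for targets."""
    prefixes = {"source": "SRC", "target": "TGT"}
    return f"{prefixes[role]}_{database}"
```

For instance, global_transformation_name("filter", "baddates") yields filter_global_baddates, matching the example in the text.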
3. Migration
3.1. Version Control
Informatica's version control features are very basic. It is recommended to use them only if no other version control tools are available. A mapping or a folder can be copied or moved from one repository or folder to another using Informatica's Repository Manager tool. A single mapping cannot be versioned in Informatica; Informatica versions the entire folder in which the mapping is stored.
4.2. Detailed Installation and Configuration
4.2.1. Installing the Informatica Server
1. Log on to the Windows machine using an account with administrator rights.
2. Run SETUP.EXE.
3. Choose the directory where the program is to be installed.
4. When the Edit Service Account dialog box appears, enter the required information:
   Domain (optional): . [for development], . [for production]
   User: <xxxx> [for development]*, <xxxx> [for production]*
   Password: <xxxx> [for development], <xxxx> [for production]
   * The user should have the rights to run a service.
Repository name: Name of the repository to be created
Database type: MS SQL Server
Repository user: Administrator [Informatica default, can be changed if needed]
Repository password: Administrator [Informatica default, can be changed if needed]
Database user: <xxxx> [account for the database containing the repository]
Database password: <xxxx> [password for the above account]
Connect string: <servername@dbname> [the native connect string that the Informatica server uses to access the database]
Domain: (optional) For an MS SQL Server repository only, the NT domain of the database user specified above.
Use Trusted Connection: (optional) For an MS SQL Server repository only; if selected, the repository uses NT integrated security.
3. Compatibility and Database: Default settings were selected for this tab.
4. Miscellaneous: Default settings were selected for this tab.
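The repository settings above can be collected in a small record before they are entered in the configuration dialog, which makes it easy to check that nothing required has been left blank. This Python sketch is a documentation aid only: the class, its field names, and the sample values are hypothetical, it does not call any Informatica interface, and passwords are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class RepositorySettings:
    """Non-secret repository settings (hypothetical record for illustration)."""
    repository_name: str
    database_user: str
    server_name: str
    database_name: str
    database_type: str = "MS SQL Server"
    domain: str = ""                  # optional, MS SQL Server only
    trusted_connection: bool = False  # optional, MS SQL Server only

    def connect_string(self):
        """Build the native connect string in the servername@dbname form."""
        return f"{self.server_name}@{self.database_name}"

    def missing_fields(self):
        """Return the names of required settings that are still blank."""
        required = {
            "repository_name": self.repository_name,
            "database_user": self.database_user,
            "server_name": self.server_name,
            "database_name": self.database_name,
        }
        return [name for name, value in required.items() if not value.strip()]
```

A record with server name DWDEV and a hypothetical database name etl_repo_db would produce the connect string DWDEV@etl_repo_db.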
6. If the service fails to start, go to Administrative Tools > Event Viewer > Log > Application. Look for the source PowerMart, select the latest event, and view the description of why the service failed to start.
To stop the server:
1. Log on as a user that has rights to run a service.
2. Go to Control Panel > Services > Informatica.
3. Click Stop.
The Informatica server can also be stopped using the Informatica Server Manager. However, to do so you must be an administrator or super user in Informatica.
A.1. Overview
The following provides a summary of the Informatica installation in support of the Course Analytics Data Warehouse Pilot project:
Informatica was installed on server DWDEV, which will be used for the development of the pilot project. The version of Informatica installed was PowerMart 5.1.2 for Windows 2000. The database for the Informatica repository is SQL Server 2000.
PowerPlug 5.1 will be used to import metadata from ERwin into the Informatica repository. The information in the Informatica repository will be used by BRIO to publish metadata to the data warehouse users.
ERwin, PowerPlug, Informatica, and the Data Warehouse Pilot database all reside on the DWDEV server. PowerPlug 5.1 was installed on the DWDEV server so that it can be accessed remotely using Terminal Services Client, which is widely used at The Ohio State University.
The Informatica PowerMart clients were installed on several machines that will be used by developers, data modelers, and database administrators.
Since source systems had not yet been identified during the installation process, a copy of the operational data store was used to test the connection between a source database and Informatica.
To ensure connectivity between ERwin and Informatica, the physical data model of the operational data store in ERwin was imported into the Informatica repository using PowerPlug 5.1.
A trial run was conducted in which Informatica extracted data from the operational data store and loaded it into the target database. This test confirmed Informatica's connectivity to the source and target systems.
Due to security concerns, usernames, passwords, and other connectivity settings are not stated in this document. For further information regarding these settings, contact the Office of Information Technology.
A.2. Pilot Project Informatica Architecture
[Figure: pilot project architecture diagram. Labels: Informatica Developer/Admin; ERwin Developer/Admin; Brio Developer/Admin; ERwin 4.0; PowerPlug 5.1; Informatica Server; Brio; GEORGE (Windows 2000); Informatica Repository (SQL Server 2000); DW Pilot Database (SQL Server 2000)]