Identifying Bottlenecks

General Informatica Best Practices
Performance and Tuning Overview
- Identifying ETL Bottlenecks
- Target Bottlenecks
- Source Bottlenecks
- Mapping Bottlenecks
- Session Bottlenecks
- System Bottlenecks
- Partitioning
- How to make it fly
Identifying Bottlenecks
Target Bottleneck
Common sources of problems:
- indexes or key constraints
- database checkpoints
- small database network packet size
- too many target instances in your mapping
- target table is too wide

Common solutions:
- drop indexes and key constraints before loading, rebuild them after loading
- use bulk loading or external loaders when practical
- increase the database network packet size
- decrease the frequency of database checkpoints
- optimize target database disk allocation
- when using partitions, consider partitioning your target table as well
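
The "drop indexes before loading, rebuild after loading" advice can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the target database, and the table name sales, the index name idx_sales_customer and the helper load_with_index_rebuild are hypothetical. In a real session the same statements would typically go into pre- and post-load SQL on the target.

```python
import sqlite3


def load_with_index_rebuild(conn, rows):
    """Drop the index, bulk-insert the rows, then rebuild the index once."""
    cur = conn.cursor()
    # Without the index, each insert avoids the index maintenance cost.
    cur.execute("DROP INDEX IF EXISTS idx_sales_customer")
    cur.executemany(
        "INSERT INTO sales (customer_id, amount) VALUES (?, ?)", rows
    )
    # Rebuild the index in one pass over the fully loaded table.
    cur.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")
    conn.commit()


# Hypothetical target table with one index, created before the load.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
conn.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")

load_with_index_rebuild(conn, [(i % 100, i * 1.5) for i in range(10_000)])

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
index_names = [r[1] for r in conn.execute("PRAGMA index_list('sales')")]
```

Whether this pays off depends on the load size relative to the table: for small incremental loads into a large table, rebuilding the index may cost more than maintaining it.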
Source Bottleneck
Common sources of problems:
- slow query
- small database network packet size
- wide source tables

Common solutions:
- analyze the query issued by the Source Qualifier; it appears in the session log
- consider using database optimizer hints when joining several tables in a Source Qualifier
- consider indexing tables when you have order by or group by clauses
- try database parallel queries if supported
- try partitioning the session; if appropriate, try partitioning your source database as well
- test the Source Qualifier conditional filter versus filtering at the database level
- increase the database network packet size
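
When testing where a filter should run, one useful thing to measure is how many rows leave the database in each variant. The sketch below is a generic illustration, not a model of the Source Qualifier itself: it uses an in-memory SQLite database and a hypothetical orders table, and compares reading everything and filtering downstream with putting the condition in the source query.

```python
import sqlite3

# Hypothetical source table: one order in ten is OPEN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, "OPEN" if i % 10 == 0 else "CLOSED") for i in range(1_000)],
)

# Variant A: read every row, then filter after the rows have been fetched.
all_rows = conn.execute("SELECT order_id, status FROM orders").fetchall()
filtered_downstream = [r for r in all_rows if r[1] == "OPEN"]

# Variant B: put the condition in the source query itself.
filtered_in_db = conn.execute(
    "SELECT order_id, status FROM orders WHERE status = ?", ("OPEN",)
).fetchall()

rows_moved_a = len(all_rows)        # rows fetched in variant A
rows_moved_b = len(filtered_in_db)  # rows fetched in variant B
```

Both variants return the same rows; the difference is how much data has to be read and moved before the filter applies. On a real source, timing both variants is what tells you which one wins.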
Mapping Bottleneck
Common sources of problems:
- too many transforms
- unused links between ports
- too many input/output or output ports in aggregator or ranking transformations
- unnecessary data type conversions

Common solutions:
- eliminate transformation errors
- if several mappings read from the same source, try single pass reading
- optimize data types; use integers for comparisons
- don't convert back and forth between data types
- optimize lookups and lookup tables, using cache and indexing tables
- put your filters early in the data flow; use a simple filter condition
- for aggregators, use sorted input, integer columns to group by and simplified expressions
- use reusable sequence generators; increase the number of cached values
- if you use the same logic in different data streams, apply it before the streams branch off
- optimize expressions: isolate slow and complex expressions
- reduce or simplify aggregate functions
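
The effect of putting filters early in the data flow can be shown with a small sketch. The function names, the region column and the tax calculation are all hypothetical; the point is only to count how often a costly transformation runs when the filter sits after it versus before it.

```python
def expensive_transform(row, counter):
    """Stand-in for a costly expression; counts how often it is invoked."""
    counter["calls"] += 1
    return {**row, "amount_with_tax": round(row["amount"] * 1.2, 2)}


def filter_late(rows, counter):
    # Transform every row first, then discard the ones that are not needed.
    transformed = (expensive_transform(r, counter) for r in rows)
    return [r for r in transformed if r["region"] == "EU"]


def filter_early(rows, counter):
    # Discard unneeded rows first, then transform only the survivors.
    kept = (r for r in rows if r["region"] == "EU")
    return [expensive_transform(r, counter) for r in kept]


# Hypothetical input: one row in four belongs to the EU region.
rows = [
    {"region": "EU" if i % 4 == 0 else "US", "amount": float(i)}
    for i in range(1_000)
]
late_counter, early_counter = {"calls": 0}, {"calls": 0}
late_result = filter_late(rows, late_counter)
early_result = filter_early(rows, early_counter)
```

Both pipelines produce the same output, but the early filter runs the costly step on a quarter of the rows. This only works when the filter condition does not depend on the transformed value.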
Session Bottleneck
Common sources of problems:
- inappropriate memory allocation settings
- running in series rather than in parallel
- error tracing override set to a high level

Common solutions:
- calculate the DTM buffer pool and buffer block size
- make sure to keep data and index caches in memory; paging to disk is very slow
- if your mapping allows it, use partitioning
- run sessions in parallel, within concurrent batches, whenever possible
- increase the database commit interval
- turn off recovery and decimal arithmetic (they're off by default)
- use the debugger rather than high error tracing; always reduce your tracing level for production runs
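
To see why a larger commit interval helps, it is enough to count commits for the same load. The sketch below is an assumption-laden illustration: SQLite stands in for the target database, and load_rows, fresh_target and the target table are hypothetical names, not Informatica settings.

```python
import sqlite3


def load_rows(conn, rows, commit_interval):
    """Insert rows, committing every commit_interval rows; return commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target (id, value) VALUES (?, ?)", row)
        pending += 1
        if pending == commit_interval:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit whatever is left after the last full interval.
        conn.commit()
        commits += 1
    return commits


def fresh_target():
    """Create an empty hypothetical target table in a new in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER, value TEXT)")
    return conn


rows = [(i, f"row-{i}") for i in range(10_000)]
commits_small = load_rows(fresh_target(), rows, commit_interval=100)
commits_large = load_rows(fresh_target(), rows, commit_interval=5_000)
```

With these numbers the same 10,000 rows need 100 commits at an interval of 100 and 2 commits at an interval of 5,000. The trade-off is that a larger interval means more work to redo if the load fails between commits.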
System Bottleneck
Common sources of problems:
- slow network connections
- overloaded or under-powered servers
- slow disk performance

Common solutions:
- get the best machines to run your server; better yet, use several servers against the same repository (PowerCenter only)
- use multiple CPUs and session partitioning
- make sure Informatica servers and database servers are closely located in your network
- if you have several CPUs, several disk drives and gobs of RAM, consider having the Informatica server and the database server on the same machine
- shut down unneeded processes or network services on your servers
- use 7-bit ASCII data movement (the default) if you don't need Unicode
- evaluate hard disk performance; try locating sources and targets on different drives
- get as much RAM as you can for your servers
Partitioning
- A partition is a pipeline stage that executes in a single thread
- Partition points mark the thread boundaries in a pipeline and divide the pipeline process into stages
- The partition strategy can be different at each partition point in the pipeline process
- Adding partitions increases the number of threads created by Informatica
- PowerCenter allows for up to 16 partitions at each partition point
- Increasing the number of partition points increases the number of threads, which can improve performance
- HOWEVER, the load on the server also increases, so if the server is undersized, partitioning is of no value and can actually decrease performance
Partitioning continued
Partition Types
- Round Robin
- Key Range
- Hash Key
- Pass Through

Performance can be increased by changing the partitioning strategy at different partition points:
- Source Qualifier: Key Range or Hash Auto
- Expression or Filter: Round Robin
- Sorter and Aggregator: Hash Auto Keys
- Target: Key Range
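
The partition types differ in how rows are assigned to partitions. The sketch below illustrates the general idea behind round robin, hash key and key range routing. It does not reproduce PowerCenter's internal algorithms, and the function names, the customer_id key and the range boundaries are hypothetical. Pass through is omitted because it leaves each row in the partition it is already in.

```python
from bisect import bisect_right
from zlib import crc32


def round_robin(rows, n):
    """Deal rows out in turn so every partition gets an even share."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_key(rows, n, key):
    """Send rows with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[crc32(str(row[key]).encode()) % n].append(row)
    return parts


def key_range(rows, boundaries, key):
    """Route by value range; boundaries are sorted, exclusive upper bounds.

    The last partition takes every key at or above the final boundary.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, row[key])].append(row)
    return parts


# Hypothetical rows: 300 rows over 50 distinct customer ids.
rows = [{"customer_id": i % 50, "amount": i * 10} for i in range(300)]

rr_parts = round_robin(rows, 3)
hk_parts = hash_key(rows, 3, "customer_id")
kr_parts = key_range(rows, [20, 40], "customer_id")
```

Round robin balances row counts but scatters equal keys across partitions, which is why the hash-based strategies are suggested above for Sorter and Aggregator: all rows of one group must land in the same partition to be grouped correctly.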
