Overview of Document
This document gives average sizing figures for V9 installs at three levels. The first is the size of
the installed elements on the file system. The second is the runtime footprint of the general V9
services shared by all users. The third is the additional memory and disk overhead of individual
users running mappings.
The mappings' contribution to disk and memory usage is usually the most critical figure and the
most difficult to average without specific details. The details below can be used as a basis for scaling
calculations based on the number of concurrent mappings submitted to the server, the transforms
used in each mapping, and the size of the input data in rows and columns.
3.2 GB
600 MB
4 GB
3 GB
This gives a rounded base figure of about 12 GB. It does not include additional customer reference
data, or growth in Address Reference data, which is amended as country postal authorities add
data.
   Service Name      Virtual Set   Working Set
1. Admin Console     773K          133K
2. MRS               1288K         407K
3. Mapping Service   978K          254K
4. Analyst Tool      702K          79K
This table shows the average sizes of the four V9 services in a typical configuration. The Virtual Set
is the total virtual memory allocated to the process, and the Working Set is the portion that is
physically resident in memory.
United States   Batch/Interactive   533 MB
The process memory size can be expected to grow by approximately 533 MB. This is a one-off cost
that lasts for the lifetime of the server and is shared by all mappings run during that lifetime. For
performance reasons, the loaded Address Validation data is not unloaded even when there are no
current users.
Standard DQ Transformations
Comparison Transformation
Decision Transformation
Merge Transformation
None of these transforms has dynamic memory or disk usage that varies with the size of the
data being processed. These components are referred to as passive because they process data rows
in small batches and send them to the next component in the mapping immediately.
Parser Transformation
Standardiser Transformation
These transforms are all based on reference data. They are passive in that they process data
immediately, but they have initialisation costs that increase memory usage depending on
configuration. Their memory usage is therefore dynamic with respect to the transform's
configuration, but not with respect to the number of rows presented for processing.
While the reference data is managed in a database for editing, at runtime it is held in
memory for performance. To optimise throughput, this in-memory storage is designed for speed
rather than space efficiency. Around 3.5K reference tables are currently available, so a list of
tables and in-memory sizes is not included. Each transform has its own copy of the in-memory
reference data. To size a reference table, the customer should multiply the average number of bytes
per column by the number of columns and by the number of rows. Multiplying this result by 1.3
gives an approximate guide to the in-memory footprint.
For example, a reference table with 10K rows and 6 columns, with an average of 25 bytes per
column, gives 10,000 * 6 * 25 * 1.3 = 1,950,000 bytes, or approximately 2 MB of runtime memory.
This memory cost lasts for the lifetime of the mapping. All in-memory reference tables are freed
when the mapping finishes.
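The reference table guideline above can be written as a small calculation. This is a minimal sketch, not part of the product: the function name and its parameters are hypothetical, and the `copies` parameter is an assumption that reflects the statement that each transform holds its own copy of the reference data.

```python
def reference_table_memory_bytes(rows, columns, avg_bytes_per_column,
                                 copies=1, overhead_factor=1.3):
    """Approximate in-memory footprint of a reference table, in bytes.

    rows * columns * avg_bytes_per_column is the raw data volume; the 1.3
    factor is the guideline multiplier from the text. `copies` (hypothetical)
    is the number of transforms in the mapping that load the same table.
    """
    raw_bytes = rows * columns * avg_bytes_per_column
    return raw_bytes * overhead_factor * copies


# The worked example from the text: 10K rows, 6 columns, 25 bytes per column.
estimate = reference_table_memory_bytes(rows=10_000, columns=6,
                                        avg_bytes_per_column=25)
print(f"{estimate / 1_000_000:.2f} MB")
```

If two transforms in the same mapping use the table, passing `copies=2` doubles the estimate.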
Dynamic DQ Transformations
All the following components have dynamic memory and disk usage. These components are referred
to as active. In general they store large numbers of rows internally for block processing, and their
memory and disk requirements increase in line with the volume of input rows and the number of
columns per row.
Address Validator Transformation
This component is treated in the General Runtime memory sizing section as it affects all
users as soon as the first mapping is run.
Association Transformation
This component makes extensive use of B-tree file-based storage. Each column used in the
association has its own B-tree, and a general B-tree stores all the input data rows. The
Informatica B-tree is space efficient but not compressed. The general sizing guidelines are as
follows:
Each association column: the total volume of data for that column * 20 bytes per input row.
The general storage cache: the size of the input data set * 10 bytes per row. This is the on-disk
runtime cost.
The internal memory map of association ids and rows: no larger than 20 bytes * the number of
rows.
Sorter Based Transforms
Consolidation Transformation
The assumption here is that a mapping without disk- or memory-sensitive components adds little
beyond the standard footprint. This does not hold for very complex mappings.
User 1: running a matching mapping
Dual Source Identity, with Source 1 containing 1M rows and Source 2 containing 100K rows, 6
columns with 25 bytes per column, and 20 columns of pass-through data with 25 bytes per column.
This mapping will have 2 sorters from the key generation phase, 1 B-tree from matching, 1 B-tree
from Identity, and internal memory usage for Identity and clustering.
Disk Usage
B-tree 1 Identity = 1,100,000 * 6 * 25 = 165 MB
B-tree 2 Pass-through = 1,100,000 * 20 * 25 = 550 MB
Memory Usage = internal storage for the large number of transforms used for matching, 10 MB
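The disk figures for this example can be reproduced with a short calculation. This is only a sketch of the arithmetic shown above: the function name is hypothetical, 1 MB is taken as 1,000,000 bytes to match the figures in the text, and the two sorters are not included because no sorter sizing is given here.

```python
def btree_disk_bytes(total_rows, columns, bytes_per_column):
    """Uncompressed B-tree size as rows * columns * bytes per column."""
    return total_rows * columns * bytes_per_column


# Both sources are stored, so the row count is the sum of the two inputs.
source1_rows = 1_000_000
source2_rows = 100_000
total_rows = source1_rows + source2_rows

identity_btree = btree_disk_bytes(total_rows, columns=6, bytes_per_column=25)
passthrough_btree = btree_disk_bytes(total_rows, columns=20, bytes_per_column=25)
total = identity_btree + passthrough_btree

for label, size in [("B-tree 1 Identity", identity_btree),
                    ("B-tree 2 Pass-through", passthrough_btree),
                    ("B-tree total", total)]:
    print(f"{label}: {size / 1_000_000:.0f} MB")
```

Changing the row counts or column widths shows how quickly the on-disk cost scales with input volume.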
United States   Batch/Interactive   533 MB
United States   GeoCoding           422 MB
United States   FastCompletion      380 MB
Summary
The data in this document estimates the standard disk and memory footprint of the V9
server. In addition, the two tables at the end of the document (Appendix 1 and Appendix 2) allow a
user to minimise the on-disk footprint of the install if required. The example sizing shows how to
estimate a mapping's contribution to disk and memory usage by analysing the composition of the
mapping and each transform's contribution. The example also shows the importance of factoring
in the number of concurrent users and their likely usage when defining the total peak
requirements of an individual installation.
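One way to combine per-mapping estimates into a peak figure is sketched below. The simple additive model, the class and function names, and the choice of three concurrent users are all assumptions made for illustration. The input values come from this document: 165 MB + 550 MB = 715 MB of B-tree disk and 10 MB of memory for the matching mapping, and the one-off 533 MB memory cost of loaded Address Validation data.

```python
from dataclasses import dataclass


@dataclass
class MappingFootprint:
    """Estimated runtime cost of one running mapping (hypothetical type)."""
    name: str
    disk_mb: int
    memory_mb: int


def peak_requirements(mappings, one_off_memory_mb=0):
    """Sum concurrent mapping costs; one-off server costs are counted once."""
    disk = sum(m.disk_mb for m in mappings)
    memory = one_off_memory_mb + sum(m.memory_mb for m in mappings)
    return disk, memory


# Assumption: three users run the matching mapping at the same time.
matching = MappingFootprint("matching mapping", disk_mb=715, memory_mb=10)
concurrent = [matching] * 3

disk_mb, memory_mb = peak_requirements(concurrent, one_off_memory_mb=533)
print(f"Peak disk: {disk_mb} MB, peak additional memory: {memory_mb} MB")
```

The one-off cost is passed separately because it is paid once per server lifetime rather than once per mapping.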
Appendix 1
Address Validation Reference Data with On Disk size
Largest 50 files
Country              Data type           Size
United States        Batch/Interactive   533 MB
United Kingdom       FastCompletion      501 MB
United States        GeoCoding           422 MB
United States        FastCompletion      380 MB
United Kingdom       Batch/Interactive   306 MB
France               FastCompletion      210 MB
France               Batch/Interactive   153 MB
Argentina            FastCompletion      120 MB
Brazil               FastCompletion      104 MB
Germany              FastCompletion      102 MB
Germany              Batch/Interactive   99 MB
United Kingdom       Supplementary       94.5 MB
Italy                FastCompletion      92.9 MB
Argentina            Batch/Interactive   90 MB
Canada               FastCompletion      83.1 MB
India                FastCompletion      83.1 MB
India                Batch/Interactive   80 MB
Germany              GeoCoding           73.5 MB
Brazil               Batch/Interactive   73.3 MB
Italy                Batch/Interactive   66 MB
Canada               Batch/Interactive   61.8 MB
United Kingdom       GeoCoding           51.8 MB
Sweden               FastCompletion      49 MB
Mexico               FastCompletion      48.5 MB
Australia            FastCompletion      44.6 MB
Russian Federation   FastCompletion      44.3 MB
Mexico               Batch/Interactive   42.8 MB
Australia            Batch/Interactive   40.9 MB
Russian Federation   Batch/Interactive   40.5 MB
France               GeoCoding           39.7 MB
Portugal             FastCompletion      38.8 MB
Italy                GeoCoding           36.6 MB
Netherlands          FastCompletion      35.5 MB
Canada               GeoCoding           32.7 MB
China                FastCompletion      28.4 MB
Netherlands          Batch/Interactive   27.8 MB
Sweden               Batch/Interactive   27.4 MB
Spain                GeoCoding           25.6 MB
Australia            GeoCoding           25.4 MB
Spain                FastCompletion      23.7 MB
Chile                FastCompletion      23.4 MB
Netherlands          GeoCoding           22.7 MB
Portugal             Batch/Interactive   22.5 MB
China                Batch/Interactive   21.4 MB
Finland              GeoCoding           18.8 MB
Switzerland          FastCompletion      18.2 MB
Sweden               GeoCoding           17.8 MB
Chile                Batch/Interactive   16.8 MB
Belgium              FastCompletion      16.1 MB
Spain                Batch/Interactive   15.4 MB
Appendix 2
Identity Based Matching Reference Data with On Disk Size
File                   On Disk Size
IM_japan_i.zip         86,222,167
IM_japan.zip           86,222,153
IM_japan_r.zip         15,754,935
IM_gaelic.zip          9,237,372
IM_canada.zip          8,933,319
IM_international.zip   5,303,974
IM_chinese_s.zip       4,955,588
IM_south_africa.zip    4,260,152
IM_uk.zip              4,241,637
IM_ireland.zip         4,241,357
IM_new_zealand.zip     4,200,805
IM_australia.zip       4,153,252
IM_usa.zip             4,134,750
IM_arabic_m.zip        3,893,388
IM_indonesia.zip       3,494,046
IM_cyrillic.zip        3,022,104
IM_arabic_r.zip        2,980,176
IM_singapore.zip       2,505,578
IM_india.zip           2,321,418
IM_chinese_t.zip       2,189,993
IM_aml.zip             2,083,153
IM_greek_l.zip         2,057,442
IM_switzerland.zip     2,028,497
IM_france.zip          1,950,898
IM_philippines.zip     1,896,332
IM_luxembourg.zip      1,812,614
IM_belgium.zip         1,696,864
IM_germany.zip         1,604,137
IM_brasil.zip          1,596,925
IM_portugal.zip        1,596,786
IM_korean_r.zip        1,588,819
IM_italy.zip           1,554,842
IM_turkey.zip          1,552,887
IM_hk_r.zip            1,542,915
IM_sweden.zip          1,528,272
IM_czech.zip           1,525,846
IM_netherlands.zip     1,476,954
IM_taiwan_r.zip        1,473,532
IM_denmark.zip         1,473,231
IM_slovakia.zip        1,458,393
IM_malaysia.zip        1,447,577
IM_thai_r.zip          1,443,929
IM_spain.zip           1,438,526
IM_chinese_r.zip       1,431,129
IM_colombia.zip        1,414,047
IM_argentina.zip       1,413,962
IM_indo_chin_r.zip     1,410,620
IM_chile.zip           1,400,965
IM_peru.zip            1,389,800
IM_vietnam_r.zip       1,379,744
IM_puerto_rico.zip     1,372,143
IM_mexico.zip          1,344,656
IM_thai.zip            1,279,607
IM_finland.zip         1,273,884
IM_norway.zip          1,273,795
IM_poland.zip          1,261,906
IM_greek.zip           1,247,548
IM_hungary.zip         1,205,908
IM_estonia.zip         1,092,791
IM_korean.zip          821,290
IM_ofac.zip            759,006
IM_hebrew.zip          754,978
IM_chinese_i.zip       544,844
IM_arabic.zip          297,401