PC Client Performance
The Oracle client recommendations state that when evaluating CPU speed and
amount of memory, consider Microsoft's minimum requirements for the operating
system, other software that runs concurrently, and external factors such as network
characteristics. You should check that the desktop configuration is sufficient to
achieve the throughput necessary to sustain your business model; your requirements
may be higher than the minimum specification. Even if you have what would
normally be considered a high-specification machine, you may encounter
performance problems when running several applications simultaneously.
It is important to note that minimum specifications are exactly that, and will only
provide minimum performance. A minimum specification client may be adequate
for casual home use, but would probably not be considered sufficient when running
several applications together as would typically be the case in a business setting. For
example, displaying thousands of tasks and shifts that span several weeks on a Gantt
chart in Advanced Supply Chain Planning will cause severe paging on minimum
specification machines, and the PC client will appear unresponsive for a considerable
time.
Each browser is an application in its own right, and each has different characteristics
with a different memory footprint and behavior. The amount of memory used by a
browser when it is started with a blank “Home Page” is known as the browser’s
memory footprint and referred to as the browser offset. This needs to be investigated
and taken into consideration when analyzing results and making recommendations.
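One simple way to take the offset into consideration is to subtract it from each measurement. The sketch below assumes that straightforward subtraction; the function name and the figures are hypothetical and are not results from the tests.

```python
def net_of_offset(measured_mb: float, browser_offset_mb: float) -> float:
    """Return the memory left after removing the browser's start-up footprint."""
    if measured_mb < 0 or browser_offset_mb < 0:
        raise ValueError("memory figures cannot be negative")
    return measured_mb - browser_offset_mb


# Hypothetical figures for illustration only; they are not results from the tests.
measured_mb = 85.0        # browser with an application screen open
browser_offset_mb = 30.0  # same browser started with a blank home page
print(net_of_offset(measured_mb, browser_offset_mb))  # 55.0
```

Subtraction is only the simplest treatment; whether it is appropriate depends on how the results are going to be compared.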
Testing Accuracy
If your simulation and testing methodology is consistent, repeated tests on the same
PC client should be within a tolerance of 2.5%, even though measuring Windows memory
is an inexact science. If the variation is greater, then you should establish the
cause, as it could be contributing to problems with the end-user experience.
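A repeatability check of this kind can be sketched as follows. The paper does not define how the 2.5% is calculated, so this example assumes it means that every repeat lies within 2.5% of the mean of the repeats; the sample values are hypothetical.

```python
from statistics import mean


def within_tolerance(samples_mb, tolerance=0.025):
    """True if every repeat lies within `tolerance` of the mean of the repeats."""
    if not samples_mb:
        raise ValueError("at least one measurement is required")
    average = mean(samples_mb)
    if average <= 0:
        raise ValueError("measurements must be positive")
    worst_deviation = max(abs(sample - average) for sample in samples_mb)
    return worst_deviation / average <= tolerance


# Hypothetical repeat measurements (MB) of the same test on the same client.
print(within_tolerance([100.0, 101.0, 99.5]))   # True
print(within_tolerance([100.0, 110.0, 95.0]))   # False
```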
The amount of shareable memory will vary over time, but is not really a concern, as
it does not limit the actual amount of shared memory pages. As Internet Explorer is
a Microsoft product, you would expect it to share more memory pages with the
Windows Kernel and other Microsoft applications, and this is indeed evident in the
results. The amount of shared memory used by Internet Explorer increases when
also using other Windows applications such as Microsoft Outlook. Firefox is an
open source browser, so, not surprisingly, it shares a relatively small amount of
memory with Windows components. Even so, the 3.5MB difference between Firefox and
IE7 is only 35%, which in practical terms is small. Several other
factors need to be considered prior to choosing a particular browser.
Process Explorer provides the Working Set (WS) and Private Bytes (PB) used for the
measurements throughout this paper. The Peak Working Set specifies the maximum
amount of physical memory that a process has been allocated since it started. During
testing, the peaks only lasted a short time even on the slowest client, and there was
no correlation with any major performance-related Windows events. The peaks were
always close to the Working Set and can generally be ignored. WS
Private shows the amount of memory exclusive to the process. The WS Shareable
and WS Shared columns break down the process’s non-private pages into what can
potentially be shared, and what is actually being shared by at least one other process.
As a point of interest, on a PC with plenty of memory, Working Set is usually larger
than Private Bytes. If Private Bytes significantly exceed Working Set for the majority
of applications that have never been minimized, you will benefit from running fewer
applications simultaneously or from additional memory. This introduces further
important concepts that are discussed at length in the Process Trimming section.
Notice that Figure 2 shows a very small Working Set for the Palm Desktop; this
indicates that the application was minimized and the process has been trimmed.
While the process only had 5MB of physical memory allocated, it was still using
20MB of private virtual memory (Private Bytes).
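The rule of thumb above can be expressed as a small check. This is a sketch under stated assumptions: "significantly" is taken to mean more than 10% (the `margin` parameter is hypothetical), "majority" means more than half, and the class and field names are illustrative. Only the Palm Desktop figures come from the text; the other samples are invented.

```python
from dataclasses import dataclass


@dataclass
class ProcessSample:
    name: str
    working_set_mb: float
    private_bytes_mb: float
    ever_minimized: bool


def memory_pressure_suspected(samples, margin=1.10):
    """True if Private Bytes exceed Working Set (by `margin`) for most
    processes that have never been minimized."""
    candidates = [s for s in samples if not s.ever_minimized]
    if not candidates:
        return False
    exceeding = [
        s for s in candidates
        if s.private_bytes_mb > margin * s.working_set_mb
    ]
    return len(exceeding) > len(candidates) / 2


samples = [
    # From the text: minimized and trimmed, so it is excluded from the check.
    ProcessSample("Palm Desktop", 5.0, 20.0, ever_minimized=True),
    # Hypothetical processes for illustration.
    ProcessSample("browser", 40.0, 30.0, ever_minimized=False),
    ProcessSample("mail client", 25.0, 20.0, ever_minimized=False),
]
print(memory_pressure_suspected(samples))  # False
```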
Double-click a process (or right-click and select Properties from the context menu)
to show additional process information. The Performance Chart in Figure 3 provides
a graph of the Private Bytes history and, although they are not the focus of this
paper, of CPU and I/O activity. The Performance tab in Figure 4 shows significantly
more detail about the process. For example, it is possible to calculate the precise
amount of memory that is currently backed by the paging file rather than resident in
physical RAM. This may be interesting if you were trying to size a paging file
accurately, instead of simply using the standard approach of setting minimum =
maximum = 2 x physical memory. The columns displayed on the main screen
provide adequate analytical information for the purposes of this paper.
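The two calculations mentioned here can be sketched as follows. The first uses the relationship stated in Table 3 (the difference between Private Bytes and WS Private is the amount backed by the page file); the second is the standard 2 x physical memory rule. Function names and input values are hypothetical.

```python
def paged_out_private_kb(private_bytes_kb: int, ws_private_kb: int) -> int:
    """Private memory that is not resident and is therefore backed by the page file."""
    return max(private_bytes_kb - ws_private_kb, 0)


def conventional_page_file_mb(physical_memory_mb: int, factor: int = 2) -> int:
    """Standard sizing approach: minimum = maximum = factor x physical memory."""
    return factor * physical_memory_mb


# Hypothetical values for illustration.
print(paged_out_private_kb(20_480, 4_096))  # 16384
print(conventional_page_file_mb(1_024))     # 2048
```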
As you will appreciate, there are several different ways to measure memory. Peak
Working Set, Working Set, Private Bytes, WS Private and WS Shared are compared
graphically in Figure 5. To be pedantic, the measurements include the initial amount
of memory used by the browser, which is a pivotal topic discussed in The Browser
Offset Memory section. As such, the chart should only be used to compare the
different approaches to memory measurement. In order to make the comparison
easier, the terminology is summarized in Table 3.
Process Explorer: Virtual Size
Task Manager: N/A
Performance Monitor: Virtual Bytes
Description: A process’s virtual address space consists of free, reserved, and
committed memory pages. Reserving address space costs nothing in terms of memory
consumption but allows for future allocations from a contiguous area of memory.
Comments: This is far higher than the physical requirements. It does not represent
the real amount of memory that the program is using.

Process Explorer: Private Bytes
Task Manager: VM Size
Performance Monitor: Process | Private Bytes
Description: This is the private virtual memory that the process has been allocated.
It is committed memory that can reside in physical memory or page file and is
protected and inaccessible by any other process. The difference between Private Bytes
and WS Private is the total that is currently not in memory but is backed by the
page file.
Comments: This may undercount memory as it omits the shared memory that the process
is using. However, it is accurate when shared memory is minimal. It is the best value
to use once a process has been minimized, or on a machine with low memory where the
process may be immediately trimmed.

Process Explorer: Working Set
Task Manager: Mem Usage
Performance Monitor: Process | Working Set
Description: These pages are currently resident in physical memory and dedicated to
a specific process. It is effectively the sum of Process Explorer WS Private + WS
Shareable. The peak Working Set is the maximum amount of physical memory used since
the process was started.
Comments: This value can be larger than the minimum number of bytes actually needed
by the process, as it includes the shared memory for each application. It may
artificially reduce as you open more programs (due to trimming). The accuracy
decreases when several processes are involved in the total.

Process Explorer: WS Private
Task Manager: N/A
Performance Monitor: N/A
Description: This is the amount of physical mapped memory used by, and only
accessible by, that process. Other processes cannot share it. This is the same as
Private Bytes except all referenced pages are currently resident in physical memory
as opposed to both physical memory and page file.
Comments: This cannot be used in isolation, as it does not include the shared memory
component.

Process Explorer: WS Shareable and WS Shared
Task Manager: N/A
Performance Monitor: N/A
Description: WS Shareable is the amount of memory in a process’s Working Set that
can be shared with other processes. WS Shared is the subset of the shareable memory
that is shared with at least one other process.
Comments: Use this to determine the amount of shared memory that is in use/may be
used by the process Working Set.
Table 3: Mapping Task Manager, Performance Monitor, and Process Explorer
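For anyone scripting the comparison, the mapping in Table 3 can be held as a simple lookup, together with a consistency check based on the relationships the table states (Working Set is the sum of WS Private and WS Shareable, and WS Shared is a subset of WS Shareable). This is an illustrative sketch; the dictionary layout, function names, and sample values are assumptions.

```python
# Process Explorer column -> equivalent name in the other tools (None means N/A).
COUNTER_NAMES = {
    "Virtual Size": {"Task Manager": None, "Performance Monitor": "Virtual Bytes"},
    "Private Bytes": {"Task Manager": "VM Size",
                      "Performance Monitor": "Process | Private Bytes"},
    "Working Set": {"Task Manager": "Mem Usage",
                    "Performance Monitor": "Process | Working Set"},
    "WS Private": {"Task Manager": None, "Performance Monitor": None},
    "WS Shareable": {"Task Manager": None, "Performance Monitor": None},
    "WS Shared": {"Task Manager": None, "Performance Monitor": None},
}


def equivalent_name(process_explorer_column, tool):
    """Return the matching counter name in `tool`, or None if there is none."""
    return COUNTER_NAMES[process_explorer_column][tool]


def working_set_is_consistent(ws_kb, ws_private_kb, ws_shareable_kb, ws_shared_kb):
    """Check the relationships between the Working Set columns."""
    return (ws_kb == ws_private_kb + ws_shareable_kb
            and ws_shared_kb <= ws_shareable_kb)


print(equivalent_name("Private Bytes", "Task Manager"))        # VM Size
print(working_set_is_consistent(30_000, 20_000, 10_000, 6_000))  # True
```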
[Figure: chart with the stage labels Purchasing Menu, Login Screen, Run further Queries, Start Browser, Maximize all Explorer Windows, Minimize all Windows, Maximize the Form Window, Run an Open Query, Restore the Form]
Figure 7: Browser Memory As Measured For Oracle E-Business Suite 12 Using JRE
There is a lot of detail in this chart. Notice that the browser footprint on the first
two low-specification machines is very small; this may be due to a more radical
approach to memory management, effectively resulting in immediate process
trimming. When IE7 is installed over IE6, it integrates all the same browser
extensions used by its predecessor, and the larger footprint relates to an increase in
functionality.
A first glance at these results shows that Firefox has the highest memory footprint
(on the 1.83GHz Core Duo), but this is misleading as this particular browser
includes several plug-ins and helper applications that you may not find in a normal
business environment. A more accurate comparison for the basic Firefox browser is
the 23.6MB measured on a fresh install on the Intel 930 Core Duo, which is
noticeably less than Internet Explorer 7 (30.4MB).
Adobe Acrobat Reader, Skype, and the Google Toolbar were installed in each of the
browsers, and their memory profiles are compared in Figure 9. The Internet Explorer 7
memory utilization increased much more than that of the other browsers, even though
the version and size of the utilities were identical.
Table 4 shows an abbreviated export from Process Explorer for Internet Explorer 7
with Adobe Reader, Skype, and the Google Toolbar. The largest components are the
Google Toolbar and the Microsoft Phishing Filter data file. Note that some utilities
will not appear in the DLL listing until they have been used at least once.
Total 24,588KB
Table 4 identifies the Microsoft Phishing option as one of the largest components.
This can be disabled and the results for the Internet Explorer 7 memory footprint
are shown in Table 5. Mapping add-ons and plug-ins can be difficult; it may be easier
to measure the browser memory footprint before and after you integrate the features.
On 39,900KB 28,612KB
Disabled 36,416KB 24,736KB
Difference 3,484KB 3,876KB
Table 5: Exploring the Microsoft Phishing Option
Given that the Phishing data file (ieapfltr.dat) has a WS Total of 2,392KB, disabling
this option shows a Working Set reduction of 3,484KB, which indicates that some
other associated components were not loaded. While disabling this function on an
Intranet may be useful, this highlights a possible requirement to review proposed
changes with a wider audience (such as network security in this case).
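The arithmetic behind this observation can be checked directly from the Working Set figures in Table 5 and the size of the data file. The function names are illustrative.

```python
def working_set_reduction_kb(enabled_kb: int, disabled_kb: int) -> int:
    """Working Set saved by disabling a feature."""
    return enabled_kb - disabled_kb


def unattributed_reduction_kb(reduction_kb: int, data_file_kb: int) -> int:
    """Part of the saving that the data file alone does not account for."""
    return reduction_kb - data_file_kb


# Working Set with the Phishing option on and disabled (Table 5), in KB.
reduction = working_set_reduction_kb(39_900, 36_416)
print(reduction)                                    # 3484
# ieapfltr.dat has a WS Total of 2,392KB.
print(unattributed_reduction_kb(reduction, 2_392))  # 1092
```

The remaining 1,092KB is the portion attributed in the text to other associated components that were not loaded.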
Disabling every add-on resulted in an unexpectedly small reduction of approximately
7MB in the browser’s Working Set. For IT professionals wanting to take the
analysis to the next stage, the browser startup events can be captured using Process
Monitor (also free from Microsoft). These can be filtered using the browser process
identifier (PID), which is shown in Process Explorer. There are in excess of 30,000
events for Internet Explorer 7, but again, correlation to the DLLs, drivers, and data
files is not easy. Trying to normalize the browser footprints was abandoned after
extensive analysis.
Referring to the discussion around Figure 7, the most reliable measurements for the
browser footprints are the fresh installs on the Intel 930 Core Duo PC. Rounding
these figures keeps the calculations simple without introducing significant errors.
Table 6 shows the resulting offsets that have been used as a basis to scale all the
results.
Figure 10 provides a graphical comparison, and a visual check, of the average memory
used under each method for the full range of tests using Oracle E-Business Suite 12
with JRE. Firefox included several additional components, and including it shows
how well each method worked.
This chart shows that excluding the offset reduces the differences between the
browsers, but this is misleading as this incorrectly indicates that Firefox has the
highest footprint. Scaling using a static ratio dramatically changes the ratio of Private
Bytes and Working Set for IE7 and Firefox. The scaled results using a variable ratio
follow the same pattern as the extrapolated averages in Table 6; the Firefox browser
with the additional components has been fully normalized and the ratio of Private
Bytes and Working Set for all three browsers is verifiable.
Figure 12: Average Memory Utilization for Oracle E-Business Suite 12 Using JRE
Grouped by Type
Figure 13: The Relationship between Private Bytes and Working Set
If you think it unlikely that Windows was trimming the process, you need to
consider other factors that may cause this behavior. For example, the amount of
shared memory, or some other memory structure, could vary between the different
types of forms and screens. As all browsers exhibit the same step changes, any
browser will show this effect. The amount of shared memory is determined by the
set of applications that are running simultaneously, and measurements from Internet
Explorer 7 are shown in Table 7. This shows that the difference between the
functions is minimal, in which case shared memory cannot be the reason.
Browser 9708
Login Screen 12344
OAF/HTML Screens 12444
Menu 12998
Oracle Forms 13400
Gantt Charts 13588
Table 7: Shared Memory by Product Group
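To put a number on "minimal", the spread between the product groups in Table 7 can be calculated as below. The table does not state its units, so the values are treated as unitless; the variable and function names are illustrative.

```python
# Shared memory by product group, as listed in Table 7 (units not stated).
SHARED_MEMORY = {
    "Browser": 9708,
    "Login Screen": 12344,
    "OAF/HTML Screens": 12444,
    "Menu": 12998,
    "Oracle Forms": 13400,
    "Gantt Charts": 13588,
}


def spread(values):
    """Difference between the largest and smallest value."""
    values = list(values)
    return max(values) - min(values)


# Exclude the bare browser so that only the application functions are compared.
functions = {name: value for name, value in SHARED_MEMORY.items() if name != "Browser"}
print(spread(functions.values()))  # 1244
```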
Trimming a large menu results in a reduction of the memory utilization from 63MB
to 27MB; this represents a saving of 36MB, or about 57%. With menus in particular,
bandwidth is the primary network constraint; latency has a much smaller effect. For
example, for all latencies from 50ms to 300ms, the Manufacturing and Distribution
Manager Menu takes almost a minute to display on a network with only 256Kbps
available bandwidth, and almost two minutes with 128Kbps available.
The callout on the left shows a sample of the Working Set and Private Bytes for the
1.6GHz 1GB client. The ratio between these measurements appears exceptionally
consistent. Although they fluctuate through the remainder of the tests, they follow
the expected pattern (Working Set greater than Private Bytes) for all clients, until the
final stage of the tests containing the Gantt Charts. At this point, Private Bytes
(shown in red) exceed Working Set for all the clients with less than 2GB. Each of
the callouts on the right of the chart shows the Private Bytes and Working Set for
the WIP Discrete Job Workbench.
This represents a fundamental change in the expected profile, and is typically one
area where inadequate client memory results in performance problems and
consequently complaints from unhappy users. The worst profile and greatest
difference between the sets appears on the 700MHz 256MB client, though to be fair, the
128MB client was really struggling for memory and so you would not expect
anything sensible from its memory figures. The 366MHz 128MB client also uses old
technology, and so paging and other disk activity would
be slower than on the other clients.
Windows was trimming processes on a 1GB machine even when all running
applications were using less than 160MB. It is worth stressing again that if Private
Bytes significantly exceed Working Set for the majority of applications that have
never been minimized, either the client is running too many applications, or (if all
the applications are needed) you need more memory.
Comparing Load Times across the Range of Clients
Unlike OAF/HTML screens and menus, E-Business Suite 11i and 12 forms and
Gantt Charts will download the necessary JAR files to the local client when a form is
accessed the first time. For example, using JRE 1.5.0_12 the Purchase Order form
downloads 10.3MB of JAR files, whereas a Gantt chart may require 19.3MB of JAR
files (the exact figures vary with the functions in use). Once JAR files have been
downloaded, they remain valid for the browser session. They are revalidated if you
restart the session, but not if you switch forms or responsibilities.
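As a rough guide to what these volumes mean on a constrained link, the first-use download time can be estimated from the JAR size and the available bandwidth. This is a simplified sketch: it assumes 1MB = 1,000,000 bytes and 1Kbps = 1,000 bits per second, and it ignores latency, protocol overhead, and any proxy caching. The bandwidth figures are the 256Kbps and 128Kbps examples quoted earlier for menus.

```python
def transfer_seconds(size_mb: float, bandwidth_kbps: float) -> float:
    """Idealized transfer time for `size_mb` megabytes at `bandwidth_kbps`."""
    if bandwidth_kbps <= 0:
        raise ValueError("bandwidth must be positive")
    bits = size_mb * 8_000_000
    return bits / (bandwidth_kbps * 1_000)


# JAR volumes quoted in the text for JRE 1.5.0_12.
for label, size_mb in (("Purchase Order form", 10.3), ("Gantt chart", 19.3)):
    for kbps in (256, 128):
        seconds = transfer_seconds(size_mb, kbps)
        print(f"{label}: {size_mb}MB at {kbps}Kbps takes about {seconds:.0f}s")
```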
The Oracle E-Business Suite load times were compared for the same four clients
used in Figure 15, for the full range of tests excluding the measurement for the WIP
Discrete Job Workbench with no data. Again, a single graph of all the data would be
cluttered, so several graphs have been used. When reviewing the
charts, note that the Y-axis scale has been changed in order to highlight key
elements.
Figure 16 compares the initial and subsequent load times for the two fastest clients.
Remember that JAR files need to be downloaded the first time they are used, and the
download time will be influenced by network characteristics such as latency and
available bandwidth. In high-latency situations, using a local proxy server will help to
alleviate network problems and improve download times. The client specification is
listed in Table 2. Although CPU benchmarks are not very representative, the
1.8GHz Mobile Core is allegedly equal to a 3.43GHz Pentium (non Intel sources),
though this has not been verified.
Accepting a small margin of error, the times for the initial login and menu are
comparable across the clients (menus do not use JAR files).
The times to load the large menus are relatively long, even though these are the
fastest clients in the test and they are being loaded over a low latency LAN.
The difference in client speed is very clear in the forms and Gantt chart sections. As
expected, the fastest machine (which usually commands a price premium) provides
the best response times. Downloading the JAR files is generally a one-off operation
and most of the time you will be working with locally cached (downloaded) JAR
files. The relative performance of the two fastest clients is very similar, bearing in
mind that both machines are very lightly loaded in terms of the number of
applications that are running.
Figure 17: Comparing Load Times for Forms Previously Used in the Same Session
Both of the smaller clients perform very poorly when rendering large menus and the
graphics-rich dashboards. While this chart provides an overall picture of what is
happening, expanding the Y-axis scale to 30 seconds, as in Figure 18, really highlights
the differences.
While technology does not generally scale linearly, the 366MHz client is approximately
twice as slow as the 700MHz client. The 366MHz client appears to work almost as well
as the rest of the machines for small OAF/HTML forms, there is little difference
between the 700MHz and 1.6GHz clients, and there is not as much difference as you
might expect between the 1.83GHz Mobile Core (with a dual processor) and the 1.6GHz
Pentium.
Figure 18: Comparing Load Times for Forms Previously Used in the Same Session
(expanded)
Although the first use times (without JAR files on the client) have not been included
for the 700MHz client, it is reasonable to conclude that they are substantially longer
than for the 1.6GHz client. When you compare the measurements across Figures 17 and
19, you can see that once the forms have been used in the session, the
times to reopen them are similar. One approach to extend the working life of
older technology is to have call centre staff start the necessary Oracle E-Business
Suite forms prior to starting their work shift, so that the JAR files will be validated
before using the forms in earnest when interacting with customers.
This chart only includes the forms and Gantt charts, as these are the only
components that require JAR files. Although these components represent a subset
of the Oracle E-Business Suite, the sample should be reasonably representative. The
three lines representing the download time, validation time, and volume of JAR
files (either downloaded or validated) follow a similar pattern. It is evident that the
form opening times are roughly proportional to the volume of JAR files that need to
be downloaded. As mentioned previously, converging time lines indicate that the
JAR files have already been downloaded and validated.
The speed of JAR file revalidation depends on a number of factors including the
application tier (or proxy server if applicable), network capabilities, and (to a lesser
extent) the speed of the client. On the 1.83GHz test machine, revalidating the JAR
files occurs at between 1 and 2MB per second. As already mentioned, several forms
share the same set of JAR files and so the first time usage timings (including the JAR
file download or validation) will only apply to whichever form is used first in that
particular set. You can easily compare the JAR files for each of the forms by
reviewing the contents of the JAR file cache.
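Both points can be illustrated with a short sketch: the revalidation time implied by the 1 to 2MB per second rate, and the way shared JAR files mean that only the first form in a set pays the first-use cost. The class is a toy model, not how the JRE cache is implemented, and the JAR file names are hypothetical.

```python
def revalidation_seconds(jar_mb, slow_mb_per_s=1.0, fast_mb_per_s=2.0):
    """Return (best case, worst case) seconds to revalidate `jar_mb` of JAR files."""
    return jar_mb / fast_mb_per_s, jar_mb / slow_mb_per_s


class SessionJarCache:
    """Toy model of which JAR files have already been handled in this session."""

    def __init__(self):
        self._ready = set()

    def open_form(self, jar_files):
        """Return the JAR files that still need first-use work for this form."""
        pending = set(jar_files) - self._ready
        self._ready |= pending
        return pending


best, worst = revalidation_seconds(10.3)  # Purchase Order form volume from the text
print(f"between {best:.1f}s and {worst:.1f}s")

cache = SessionJarCache()
shared_set = {"forms_a.jar", "forms_b.jar"}   # hypothetical JAR file names
print(sorted(cache.open_form(shared_set)))    # first form pays for both files
print(sorted(cache.open_form(shared_set)))    # second form in the set pays nothing
```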
Figure 22: Comparing Working Set for Oracle E-Business Suite 11i and 12
Figure 23: Memory Profiles for Oracle E-Business Suite 11i and 12
The amount of shareable memory is the same for both of the technologies, which is
what you would expect as they have a common base. For completeness, you need to
consider the effect of the shared memory component (as shown in Figure 1). You
would reduce the browser memory utilization by the following amounts: FF2 6.2MB,
IE7 9.6MB, and IE6 8MB. However, this fails to explain the difference. Bear
in mind that, even though the tests span a wide range of Oracle E-Business Suite
forms and screens, this is still just a subset of tests. Nevertheless, all the evidence
suggests that the best combination for low memory clients is IE6 with JInitiator for
Oracle E-Business Suite 11i and IE6 with JRE for Oracle E-Business Suite 12.
Memory Optimizers
CONCLUSION
This paper is aimed at customers who are trying to extend the useful life of their
existing Windows clients, or trying to ensure that they specify an adequate client at
the optimum price/performance point. A rigorous approach is essential when
reviewing or planning for hundreds or thousands of existing or new clients, with the
first half of the paper providing a theoretical background for the testing and
terminology described in the second half.
Memory is far more important than CPU for performance.
Windows appears overly aggressive when reclaiming physical memory, and while you
cannot control the process, you can compensate for it to some extent by adopting
the good working practices outlined throughout this last section.
As stated, the best combination for low memory clients is Internet Explorer 6 with
JInitiator for Oracle E-Business Suite 11i and Internet Explorer 6 with JRE for
Oracle E-Business Suite 12. Oracle and Microsoft state that adding memory makes a
significant difference to performance. This has been conclusively proven for the
Oracle E-Business Suite. There is a very small difference in performance and
response times (about 2 seconds) between the two fastest clients, even though they
are a generation apart in technology terms and one is double the cost of the other.
The importance of the Holistic Approach can be proven through performance
auditing. Not only does it encompass modeling all the salient factors in your test
environment (for example, users who have large amounts of data on-screen at remote
locations), but it also extends to auditing the system performance of your
key business transactions over time. In one exceptional case, the users at one
site complained only after performance had reduced by 50% over the period of a
year, which is something a performance audit would have highlighted much sooner.
A rule of thumb is that users will generally not notice any single change (positive or
This appendix lists the forms and screens used during the tests. In order to be able
to compare results you always need to follow the same path, to avoid any deviation
that might affect the measurements.
Login Screen
Main Menu
User: operations/welcome
Payables Menu
User: operations/welcome
Responsibility: Payables, Vision Operations (USA)
Sales dashboard
User: operations/welcome
Function: Sales User
View Payslip
User: operations/welcome
Responsibility: Employee Self-Service
Function: Employee Self-Service: Payslip
Absence Management
User: operations/welcome
Responsibility: Employee Self-Service
Function: Employee Self-Service: Absence Management
Recall that committed memory pages are backed by the page file or reside in physical
memory. To complicate the question of how to measure memory, committed
memory can be private or shared. These are defined as follows:
• Private memory represents the private virtual memory allocated to the
process and is committed memory that is protected and inaccessible by
any other process. It can reside in physical memory or the page file and
can be considered as the amount of page file space that would be
occupied if the entire process was paged out of physical memory. The
Performance Monitor counters Process/Private Bytes and Process/Page
File Bytes are the same as VM Size in Task Manager and Private Bytes in
Process Explorer (a free tool from Microsoft).
• Shared memory can be shared with one or more other processes. For example,
this could be an image, in which case the address where the image is
loaded is mapped to the address space of the process in question, or a
mapped file, such as a dynamic link library (DLL). As the name implies, a
mapped file is not loaded into memory in its entirety; each process uses
addressing to map (load) only the sections that it is using.
Shared memory reduces the overall physical memory requirements of the system.
However, not all the shareable memory is necessarily shared with other processes, and
what is shared by the application being measured may actually be mapped from another
process that was already running. Therefore, this analysis needs to measure the
amount of shareable and shared memory and consider whether this level of
complexity is relevant.
Each program has an allocated range of physical memory (its Working Set). If the
memory requirements of an existing process increase, or you start another
application and there is enough physical memory available, it is simply assigned from
the available pool of free memory without affecting other processes. However, if
insufficient free memory is available, Windows will trim (displace) memory pages
from other processes.
When Windows trims memory from existing processes, it places unmodified pages
on the Standby List, and modified pages on the Modified List (as shown in Figure C1).
Modified pages have to be written to the paging file before they can be transitioned
to the standby list. Importantly, these pages still contain the original data; if they are
subsequently referenced, a soft page fault occurs and the requisite pages are moved
back to the process’s Working Set in a memory-to-memory operation. If the
program references pages that have since been moved to one of the other lists and
overwritten, it will read original program code (such as a DLL) from the
original source on disk, or modified pages from the page file. The standby, free, and
zero lists are added together and the sum classified as available bytes.
Standby list pages are transitioned to the free list. Once the free list has a predefined
number of pages, the kernel zero page thread executes to zero-fill the memory and
pages are transitioned to the Zero List, where they remain until they are allocated to a
new program or a process that needs more memory.
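The transitions between the lists can be followed more easily in a toy model. The sketch below is a deliberately simplified illustration of the description above, not of the actual Windows memory manager: the class, its methods, and the zeroing threshold are all hypothetical, and pages are represented by simple labels.

```python
from collections import deque


class PageLists:
    """Toy model of the Standby, Modified, Free and Zero lists."""

    def __init__(self, zero_threshold=2):
        self.standby = deque()
        self.modified = deque()
        self.free = deque()
        self.zero = deque()
        self.page_file = set()
        self.zero_threshold = zero_threshold  # hypothetical value

    def trim(self, page, dirty):
        """Remove a page from a Working Set: dirty pages go to Modified, clean to Standby."""
        (self.modified if dirty else self.standby).append(page)

    def write_modified(self):
        """Write modified pages to the page file, then move them to Standby."""
        while self.modified:
            page = self.modified.popleft()
            self.page_file.add(page)
            self.standby.append(page)

    def soft_fault(self, page):
        """Return True if the page can be handed back memory-to-memory."""
        for page_list in (self.standby, self.modified):
            if page in page_list:
                page_list.remove(page)
                return True
        return False  # contents no longer in memory: must be read from disk

    def reclaim_standby(self):
        """Move the oldest Standby page to Free; zero-fill once the threshold is reached."""
        if self.standby:
            self.free.append(self.standby.popleft())
        if len(self.free) >= self.zero_threshold:
            while self.free:
                self.zero.append(self.free.popleft())

    def available_pages(self):
        """Available = standby + free + zero."""
        return len(self.standby) + len(self.free) + len(self.zero)


lists = PageLists()
lists.trim("a", dirty=False)
lists.trim("b", dirty=True)
lists.write_modified()
print(lists.available_pages())  # 2
print(lists.soft_fault("a"))    # True
```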
Windows kernel memory consists of two sub-pools: the paged pool, which can be
paged to disk and the non-paged pool. The latter includes essential processes such as
drivers that cannot be paged out of memory. All processes have defined maximum
and minimum values for working set size. These can either be set programmatically
or more usually set to a default value, but the minimum is a non-zero value meaning
that an application can never be completely paged out of memory.
Windows will dynamically adjust the rate at which it examines Working Sets based
on memory utilization and the age of pages in the process Working Set. In order to
decide which processes are candidates to be trimmed, the Windows memory
manager examines each process. It compares the current memory allocation to its
minimum Working Set, the age of the page, whether it has been accessed since the
last Working Set trim, and other factors such as the number of hard page faults
(physical disk reads) that would be incurred.
It is difficult to monitor and interpret all the Windows memory lists and therefore
the best indicators that you have are Private Bytes and Working Set, and specifically
the ratio between them.
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com