Aas White Paper

hp-ux / virtual memory
May 2003
Adaptive Address Space

technical white paper
table of contents
introduction
executive summary
background
limitations of traditional HP-UX address space models
what does MPAS provide?
    larger, more homogeneous address space
    mmap independence
    multiple mmap()s of the same data
    preferred address for shmat()
how to get MPAS?
usage details
alternatives to MPAS
performance
compatibility/interoperability
configuration
glossary (simplified)
for more information
introduction
The Adaptive Address Space (AAS) feature is introduced in HP-UX 11i V2. AAS provides a new address space layout, MPAS (Mostly Private Address Space), for applications targeted at HP-UX running on Intel Itanium Processor Family (IPF) machines. MPAS can also be used to aid application portability to HP-UX. This paper describes the features AAS provides, how applications can use them, and the impact of doing so.
executive summary
The memory model traditionally used by HP-UX trades off an application's flexibility in using its address space for performance. This means that 32-bit applications coded for HP-UX that share a lot of data among processes enjoy certain performance advantages that they would not have on other operating systems. However, these applications are restricted by certain limitations of HP-UX's memory model. AAS removes all such limitations from HP-UX for applications that choose to use the feature. The removal of these limitations means that applications have greater control of their own address space and gain memory-management features that did not exist on HP-UX but were provided by several other operating systems (e.g. Solaris and Linux). This aids portability of applications to HP-UX from these other operating systems. However, it may cause the application to lose some of the performance advantages typically enjoyed by HP-UX applications. Application vendors coding for HP-UX can also use the flexibility and features offered by AAS to simplify their design. This applies only to HP-UX applications compiled and linked for IPF platforms.
background
HP-UX provides various address space layouts for applications to choose from. An address space layout represents a process's accessible virtual address space and how the operating system divides it into different regions into which different types of memory objects can be attached. The default address space layout for 32-bit processes is known as SHARE-MAGIC. In this layout, the first 1GB of the process's address space is reserved exclusively for text. This 1GB is shared by all processes executing the same text (hence the name SHARE-MAGIC). The memory layout for a 32-bit SHARE-MAGIC process looks like:
Quadrant   Address range                Contents
0          0x0000 0000 - 0x3FFF FFFF    text
1          0x4000 0000 - 0x7FFF FFFF    initialized data, bss, heap
                                        private mmaps
                                        register stack (up to the maximum RSE stack size)
                                        Red Zone
                                        memory stack (up to the maximum memory stack size)
2,3        0x8000 0000 - 0xFFFF FFFF    shared data (space shared by all processes)
(The diagram shows a SHARE-MAGIC process on IPF. Note, the figure is not drawn to scale. The maximum stack sizes are decided by system tunables and process resource limits at the time of process start-up).
Notice that the address space of the process is split into 4 quadrants, numbered 0-3. Each quadrant is 1GB in size. Quadrant 0 is reserved for text, quadrant 1 for process private data, and quadrants 2 and 3 for data that can be shared between processes. The 2GB of shared data space between virtual addresses 0x8000 0000 and 0xFFFF FFFF (quadrants 2 and 3) is consumed by most 32-bit processes in the system. For example, if processes A and B want to share 1MB of data, they will use this space to share it (the kernel will pick a virtual address in this range). This 1MB of data may not be visible to process C, but it will still consume 1MB of virtual address space that process C could have used. Processes A and B access the data using the same virtual address, hence they read and write the same data. This is how HP-UX allows multiple processes to share data: by giving them the same virtual address (i.e. aliasing of different virtual addresses to the same physical page is not needed).
Also note that this layout reserves 1GB for text. This 1GB may not be used by the process for any other purpose. All processes using the same binary with the default layout share the text using the same virtual address. This method of sharing costs virtual address space, but is very efficient because of the lack of aliasing. The advantages of not having aliasing are:
1. Fewer faults are needed by the processes to access the data.
2. The sharing processes incur fewer TLB misses when accessing the data.
3. Less space is needed by the kernel for its data structures.
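As a small illustration of the quadrant split described above, the following C sketch computes which quadrant a 32-bit virtual address falls in, and whether that quadrant is part of the globally shared range of the default SHARE-MAGIC layout. The helper names quadrant_of and is_share_magic_shared are hypothetical and are not part of any HP-UX interface; the sketch only restates the 1GB-per-quadrant arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Each quadrant spans 1GB (2^30 bytes), so the top two bits of a
   32-bit virtual address select the quadrant. */
enum { QUADRANT_SHIFT = 30 };

/* Hypothetical helper: returns the quadrant (0-3) containing vaddr. */
unsigned quadrant_of(uint32_t vaddr)
{
    unsigned quadrant = vaddr >> QUADRANT_SHIFT;
    assert(quadrant <= 3);  /* a 32-bit address always lands in quadrants 0-3 */
    return quadrant;
}

/* Hypothetical helper: returns 1 if vaddr lies in quadrants 2 or 3, the
   range shared by all processes in the default SHARE-MAGIC layout. */
int is_share_magic_shared(uint32_t vaddr)
{
    return quadrant_of(vaddr) >= 2;
}
```

For example, quadrant_of(0x40000000) is 1 (process private data), while any address from 0x8000 0000 upwards is reported as shared.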
This particular distribution of the quadrants may not suit every application. For instance, an application may need more than 1GB of process private address space (for example, because it has a large heap), or it may want more than 2GB of shared address space. HP-UX therefore provides other address space layouts. The 32-bit layouts of HP-UX are summarized in the table below:
table 1a. 32-bit address space layouts.

Layout                    Text              Process private    Shared address    Limitations on usage
                                            address space      space
Default (Share-Magic)     Quadrant 0        Quadrant 1         Quadrants 2,3     None
Exec-Magic                Quadrants 0,1 (text and private)     Quadrants 2,3     None
Shmem-Magic               Quadrant 0 (text and private)        Quadrants 1,2,3   None
Q3 Private Share-Magic    Quadrant 0        Quadrants 1,2      Quadrant 3        PA-RISC platforms only
Q4 Private Share-Magic    Quadrant 0        Quadrants 1,2,3    None              PA-RISC platforms only
Q3 Private Exec-Magic     Quadrants 0,1,2 (text and private)   Quadrant 3        PA-RISC platforms only
Q4 Private Exec-Magic     Quadrants 0,1,2,3 (text and private) None              PA-RISC platforms only
MPAS                      Quadrants 0,1,2,3 (text, private and shared)           Available only with AAS
64-bit processes are treated differently on IPF and PA-RISC. On IPF, the address space for a 64-bit process is divided into 8 octants. The layout for the default 64-bit process is shown below. This type of layout is called MGAS (for Mostly Global Address Space).
The layout is:

Octant   Start address            Contents
0,1      0x0000 0000 0000 0000    shared data (space shared by all 64-bit processes)
2        0x4000 0000 0000 0000    text
3,4      0x6000 0000 0000 0000    initialized data, bss, heap
                                  register stack (fixed size)
                                  Red Zone
                                  memory stack (fixed size)
5        0xa000 0000 0000 0000    (no contents shown)
6        0xc000 0000 0000 0000    shared data (space shared by all 64-bit processes)
7        0xe000 0000 0000 0000    kernel's address space (up to 0xFFFF FFFF FFFF FFFF)
(The diagram shows a 64-bit MGAS process on IPF. Note, the figure is not drawn to scale. The maximum stack sizes are decided by system tunables and process resource limits at the time of process start-up).
For a 64-bit process, text can occupy up to 1 octant. This entire octant is not available to the application for any other use. 64-bit MGAS applications share data among themselves by using the same mechanism as 32-bit applications: the data is attached to the shared virtual address space, and all sharing processes access it using the same virtual address. This eliminates the need for aliasing of virtual addresses, but shared virtual address space consumed by one process impacts other processes. Shared virtual address space for 64-bit MGAS processes lies in octants 0, 1 and 6. The layout for a 64-bit process on PA-RISC is significantly different and is not shown here.
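The octant split follows the same arithmetic as the 32-bit quadrants: a 64-bit address space divided into 8 octants gives 2^61 bytes per octant, so the top three address bits select the octant. The following C sketch illustrates this; the helper names octant_of and is_mgas_shared are hypothetical, and the shared octants (0, 1 and 6) are taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* 2^64 bytes split into 8 octants gives 2^61 bytes per octant, so the
   top three bits of a 64-bit virtual address select the octant. */
enum { OCTANT_SHIFT = 61 };

/* Hypothetical helper: returns the octant (0-7) containing vaddr. */
unsigned octant_of(uint64_t vaddr)
{
    unsigned octant = (unsigned)(vaddr >> OCTANT_SHIFT);
    assert(octant <= 7);  /* three bits can only encode octants 0-7 */
    return octant;
}

/* Hypothetical helper: returns 1 if vaddr lies in an octant that the text
   describes as shared for 64-bit MGAS processes (octants 0, 1 and 6). */
int is_mgas_shared(uint64_t vaddr)
{
    unsigned octant = octant_of(vaddr);
    return octant == 0 || octant == 1 || octant == 6;
}
```

With these helpers, the boundary 0x4000 0000 0000 0000 shown in the layout maps to octant 2 and 0xc000 0000 0000 0000 maps to octant 6.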
limitations of traditional HP-UX address space models
The traditional address space layouts of HP-UX suffer from the following limitations:
1. On HP-UX, a process's virtual address space has a fixed distribution into shared address space and private address space (i.e. virtual address space in which processes can share data among themselves versus virtual address space in which a process can attach data that is meant for its own use alone). This distribution is fixed at compile / link time. This means that a process cannot dynamically decide how much data in memory it wants private and how much it wants to share. This restriction mainly impacts 32-bit processes.
2. A 32-bit process cannot get a single data object greater than 1GB in size if it wants to share it with other processes. This restriction applies even if the process has more than 1GB of shared address space available.
3. The shared address space available to 32-bit processes is consumed by all processes that read / write shared data. This means that a process's pool of available shared address space can be consumed by other processes, even by processes that are not sharing data with this process.
4. The mmap(2) system call cannot be used to map a portion of a file multiple times. A number of applications and libraries use the mmap() system call to read / write data from files. The inability to map pieces of the file multiple times complicates application logic. This applies to both 32-bit and 64-bit applications, and only if the MAP_SHARED flag is specified in the mmap() call.
5. Some complex sequences of mmap(2) can fail. For instance, if process A maps page 1 of a file using the MAP_SHARED flag of mmap(2), and process B then tries to map pages 1 and 2, the call made by process B could fail. (The workaround in this case is to have process B map first, or to have process A map both pages 1 and 2 even though it needs only 1 page; process A does not lose any virtual address space in doing this.) This applies to both 32-bit and 64-bit applications.
The Adaptive Address Space project provides the user with a new type of address space layout, called Mostly Private Address Space (or MPAS for short), that overcomes these limitations. This makes it easier to port applications to HP-UX from other operating systems that provide these features. It also simplifies the design of applications written for HP-UX.
what does MPAS provide?
The MPAS layout provides the following features:

larger, more homogeneous address space
The address space layout for 32-bit MPAS processes looks like:
Address        Contents (from low to high address)
0x0000 0000    text
               initialized data, bss, heap
               mmap and shmat objects
               register stack (up to the maximum RSE stack size)
               Red Zone
               memory stack (up to the maximum memory stack size)
0xFFFF FFFF
(Note that the figure is not drawn to scale. The maximum stack sizes are decided based on system tunables and process resource limits at the time of process start up).
In this layout the entire 4GB, i.e. all 4 quadrants, is available for the process to consume in any manner it chooses. No other process consumes any part of this process's address space. Private and shared data can both be attached at any location, and a process can attach objects greater than 1GB in size. This gives the process more flexibility in how it consumes its address space. However, this scheme implies that data can be shared between two processes only by aliasing their virtual addresses to the same physical page. This leads to some performance inefficiencies.
The layout for 64-bit MPAS processes is shown below:
Octant   Start address            Contents
0,1      0x0000 0000 0000 0000    mmap and shmat objects
2        0x4000 0000 0000 0000    text
3,4      0x6000 0000 0000 0000    initialized data, bss, heap
                                  register stack (fixed size)
                                  Red Zone
                                  memory stack (fixed size)
5        0xa000 0000 0000 0000    (no contents shown)
6        0xc000 0000 0000 0000    shared data (space shared by all 64-bit processes)
7        0xe000 0000 0000 0000    kernel's address space (up to 0xFFFF FFFF FFFF FFFF)
(Note that the figure is not drawn to scale. The maximum stack sizes are decided based on system tunables and process resource limits at the time of process start up).
For 64-bit MPAS processes, text can occupy one entire octant of address space. This octant is not available to the process for any other use. All processes running the same binary share the text using the same virtual address, without the need for aliasing virtual addresses. A 64-bit MPAS process can consume address space from octant 6 only by providing special instructions to the operating system; this address space is not consumed by MPAS applications by default.
mmap independence
For MPAS processes, mmap() calls made by a process are independent of calls made by any other process. In the case cited earlier in the section limitations of traditional HP-UX address space models, item number 5, process B could not mmap() pages 1 and 2 of a file because another process A had mmap()ed page 1. This, and all such interprocess limitations, are removed for MPAS processes.
multiple mmap()s of the same data
As described in limitation number 4 in the section limitations of traditional HP-UX address space models, traditional HP-UX processes (those not using MPAS) cannot mmap the same portion of a file more than once using the MAP_SHARED flag. For instance, if a process has mapped page 1 of a file, then it cannot map page 1 again (or map pages 1 and 2) without first unmapping the previous mapping. For an MPAS process, each call to mmap is independent of other mappings made by the same process. Apart from the above-mentioned ability to map the same portion of the file multiple times, this also means that each mapping can be mprotect(2)ed individually.
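The behavior described for MPAS processes can be sketched with standard POSIX calls. The following C function maps the first page of a scratch file twice with MAP_SHARED, makes the first mapping read-only with mprotect(), writes through the second mapping, and reads the data back through the first. The function name map_page_twice and the scratch file name are hypothetical. Going by the description above, the second mmap() call would be expected to fail for a non-MPAS HP-UX process; this sketch has not been verified on HP-UX.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the same file page could be mapped
   twice, protected individually, and seen consistently through both
   mappings; returns 0 on any failure. */
int map_page_twice(void)
{
    char path[] = "/tmp/aas_demo_XXXXXX";  /* hypothetical scratch file name */
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 0;
    size_t len = (size_t)page;

    int fd = mkstemp(path);
    if (fd < 0)
        return 0;
    unlink(path);                          /* file stays alive while it is open or mapped */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return 0;
    }

    /* Two independent MAP_SHARED mappings of the same file page. */
    char *first = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *second = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                             /* mappings remain valid after close */

    int ok = 0;
    if (first != MAP_FAILED && second != MAP_FAILED) {
        assert(first != second);           /* two live mappings cannot share an address */
        /* Restrict only the first mapping; the second stays writable. */
        if (mprotect(first, len, PROT_READ) == 0) {
            strcpy(second, "shared");
            ok = (strcmp(first, "shared") == 0);
        }
    }
    if (first != MAP_FAILED)
        munmap(first, len);
    if (second != MAP_FAILED)
        munmap(second, len);
    return ok;
}
```

The two mappings are aliases of the same physical page, which is the mechanism MPAS relies on for sharing.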
preferred address for shmat()
Traditionally, HP-UX applications have had a very limited ability to specify a preferred address in the call to shmat(2) that attaches a System V shared memory segment. A process either specified a NULL address, in which case the kernel was free to choose any address, or it had to specify the very same address at which other processes had attached the segment. Any other address would fail. This restriction is lifted for MPAS processes, which can specify any preferred address with shmat().
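A minimal C sketch of attaching a segment at a caller-chosen address is shown below. To obtain an address that is known to be free and suitably aligned, the sketch first lets the kernel choose one, detaches, and then passes that address back to shmat() as an explicit preference. The function name attach_at_preferred_address and the 4096-byte segment size are assumptions made for illustration. An MPAS process could pass other suitably aligned free addresses as well, according to the description above.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: returns 1 if a System V segment could be attached
   at an explicitly requested address, 0 otherwise. */
int attach_at_preferred_address(void)
{
    /* Private segment; the size of 4096 bytes is an arbitrary choice. */
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0)
        return 0;

    int honored = 0;
    /* Step 1: NULL address, the kernel chooses where to attach. */
    void *chosen = shmat(id, NULL, 0);
    if (chosen != (void *)-1 && shmdt(chosen) == 0) {
        /* Step 2: request that specific address as the preferred one. */
        void *again = shmat(id, chosen, 0);
        if (again != (void *)-1) {
            assert(again == chosen);  /* without SHM_RND the address is used as given */
            honored = 1;
            shmdt(again);
        }
    }
    shmctl(id, IPC_RMID, NULL);       /* remove the segment */
    return honored;
}
```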
how to get MPAS?
To use the features provided by AAS, the binary has to be converted to use the MPAS layout. This can be done under the following rules:
o The binary has to be linked with a linker provided with HP-UX 11i V2 or later.
o Provide the following ld(1) / chatr(1) option to change an executable to the MPAS model: +as mpas
Other +as options: share_magic, exec_magic, shmem_magic, mgas (same as share_magic for 32-bit).
usage details
(This section mentions technical details that can be skipped by a reader interested only in an overview of AAS).
While designing / running applications that use MPAS processes, the following points need to be considered:
1. A 32-bit MPAS process can get up to 4GB of address space. To actually succeed in large allocation attempts, the user needs to set tunables, process resource limits, etc. to get all the memory desired. E.g. to get a 4GB heap for a 32-bit MPAS process, set the RLIMIT_AS limit and the maxdsiz tunable appropriately, and ensure that enough swap is available. Similar constraints apply if other types of large objects are desired, e.g. if a large stack is needed, then the maxssiz tunable and the RLIMIT_STACK limit have to be adjusted instead of the maxdsiz tunable and the RLIMIT_AS limit respectively. The man page for setrlimit(2) is a good starting point for information on resource limits. The man pages for the tunables maxssiz(5) and maxdsiz(5) are good starting points for information on some of the address space related tunables.
2. HP-UX on PA-RISC provides two types of layouts for binaries: q3private and q4private. HP-UX PA-RISC binaries using these layouts were previously not supported on IPF machines running HP-UX. From HP-UX 11i V2 onwards they are supported, but are treated as MPAS binaries on IPF platforms.
3. HP-UX provides the memory windows functionality to 32-bit applications that need more control over the usage of their shared address space. Since 32-bit MPAS processes have the maximum possible limit of 4GB of address space that can be used for shared objects, MPAS processes should not need the memory windows functionality. In fact, combining MPAS processes and memory windows is fraught with complications and is not recommended. If such a combination is unavoidable, then the following points should be borne in mind:
   o MPAS processes can map objects not present in their memory windows.
   o However, MPAS processes create objects in their own memory windows. I.e. if process A (MPAS) and process B (non-MPAS, say, share magic) want to share an object, then either:
      i. Process B should create the object (e.g. shmget() with IPC_CREAT), or
      ii. Processes A and B should be in the same memory window, or
      iii. Process A should do the shmget() with the IPC_GLOBAL flag.
   More information on memory windows can be obtained from the link provided in the section for more information.
4. The MAP_GLOBAL and MAP_ADDR32 flags of mmap(2) behave differently for MPAS and non-MPAS processes.

   Process type       MAP_GLOBAL              MAP_ADDR32                      MAP_ADDR32|MAP_GLOBAL
   32-bit MPAS        No effect               No effect                       No effect
   32-bit non-MPAS    Goes to 4th quadrant    No effect                       Same as MAP_GLOBAL
   64-bit MPAS        Goes to 6th octant      No effect                       Same as MAP_GLOBAL
   64-bit non-MPAS    Goes to 6th octant      Goes to virtual address <4GB    Goes to virtual address between 3GB and 4GB

5. MPAS processes attach shared library text in private address space. However, the text is still shared with other processes. Hence, breakpoints cannot be placed in them. To place breakpoints in shared library text, use chatr +dbg enable.
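Point 1 above mentions the RLIMIT_AS and RLIMIT_STACK resource limits. As a hedged illustration, the following C sketch raises the soft value of a resource limit to its hard value using getrlimit(2) and setrlimit(2). The function names are hypothetical. The sketch covers only the per-process limits; the maxdsiz and maxssiz tunables and the available swap are system settings and are not touched here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: raises the soft limit of `resource` (for example
   RLIMIT_AS or RLIMIT_STACK) to its hard limit.
   Returns 0 on success and -1 on failure. */
int raise_soft_limit_to_hard(int resource)
{
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0)
        return -1;
    assert(lim.rlim_cur <= lim.rlim_max);  /* soft limit never exceeds hard limit */
    lim.rlim_cur = lim.rlim_max;           /* raising the soft limit needs no privilege */
    return setrlimit(resource, &lim);
}

/* Hypothetical helper: returns 1 if the soft limit equals the hard limit,
   0 if it does not, and -1 if the limit could not be read. */
int soft_equals_hard(int resource)
{
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0)
        return -1;
    return lim.rlim_cur == lim.rlim_max;
}
```

A process that wants a large heap would call raise_soft_limit_to_hard(RLIMIT_AS) early in start-up; a process that wants a large stack would pass RLIMIT_STACK instead. Raising the hard limit itself requires appropriate privilege.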
alternatives to MPAS
Reasons for not using the MPAS layout could include:
1. The application is targeted for the PA-RISC architecture.
2. The application is targeted for a version of HP-UX preceding HP-UX 11i V2.
3. The application vendor wants to retain the performance advantages provided to non-MPAS applications.
Applications that cannot use MPAS can get some of the advantages that MPAS layouts provide by choosing one of the several address space layouts provided by HP-UX, as described earlier. Limitation number 3 in the section limitations of traditional HP-UX address space models can be addressed by using the memory windows functionality.
performance
Using MPAS layouts may incur costs in terms of performance. The performance cost comes mainly from using aliases, and becomes a factor if the application has a large amount of data that is shared between processes. On the other hand, 32-bit processes using MPAS layouts gain the advantage of a flexible virtual address space. This flexibility translates to a performance advantage for processes that can use a larger address space to do more work, faster. For example, the Java Virtual Machine is very sensitive to the amount of heap space it has. Since MPAS allows processes to have up to 4GB of address space for the heap, this can translate to a performance advantage. The performance difference thus seen will vary from application to application. The following factors can aid in deciding whether or not to use the feature.
In terms of performance, MPAS should be a good alternative for:
o processes whose performance is sensitive to the amount of private data space available
o processes that do a lot of mprotect(2) operations on shared data
MPAS could lower performance for applications that:
o share a lot of data between processes
compatibility/interoperability
AAS provides binary and API compatibility for applications that do not use the feature.
configuration
An application that wants to use MPAS layouts needs to link with a special flag. Details are given in the section how to get MPAS?
glossary (simplified)
AAS: Adaptive Address Space. The feature being discussed in this paper.
Aliasing: In this paper, aliasing refers to the condition when two or more unique virtual addresses translate to the same physical address. All aliased virtual addresses can be used to access the same data.
BSS: Block Started by Symbol. The section of a program's data which is used to store global data that is not initialized explicitly by the programmer. (Implicitly initialized to 0).
MGAS: Mostly Global Address Space. This is the default address space layout on HP-UX.
MPAS: Mostly Private Address Space. This is the new type of address space layout that is introduced by the AAS project.
RISC: Reduced Instruction Set Computer. A computer architecture that reduces chip complexity by using simpler instructions.
RSE: Register Stack Engine. Traditional processor architectures require spilling and filling of registers during function call / return. On newer RISC architectures, a register stack engine avoids this via compiler-controlled renaming of general registers. For details, refer to the IA-64 Architecture Software Developer's Manual.
TLB: Translation Look-aside Buffer. A small table in the processor's Memory Management Unit that contains translations from virtual addresses to physical addresses.
for more information
For more information on memory windows, go to http://docs.hp.com/hpux/onlinedocs/os/memwn1_4.pdf
For more information on the linker and other developer tools, go to http://docs.hp.com/hpux/dev/index.html#Developer%20Tools%20and%20Libraries
For more information on the IPF architecture, see the Intel IA-64 Architecture Software Developer's Manual.
For more information on the following, see the relevant HP-UX manual pages: mmap(2), mprotect(2), shmat(2), shmget(2), ld(1), cc(1), chatr(1), setrlimit(2), maxdsiz(5), maxssiz(5)
Intel Itanium Processor Family is a trademark of Intel Corporation in the US and other countries and is used under license. Solaris is a registered trademark of Sun Microsystems, Inc. Linux is a registered trademark of Linus Torvalds. HP-UX and PA-RISC are registered trademarks of Hewlett-Packard company. The information in this document is subject to change without notice. Copyright Hewlett-Packard Company 2003 05/2003 Publication number 1.0