SVC GLOBAL MIRROR
A practical review of important parameters and their impact on SVC.
When a Global Mirror is started between two VDisks, there is an initial period of copying from the Primary to the Secondary to get the two VDisks in synch. This is often referred to as Initial Synching. If a mirror is paused for some reason (say, due to a network outage), changes to either the Primary or the Secondary are recorded in bitmaps during this time to keep track of what has changed. When the mirror is restarted, these bitmaps are used to determine which parts of the Primary need to be read and copied to the Secondary. This is typically called a Re-synch. During an initial synch, the Primary is read sequentially, block by block, and the data is sent to the Secondary to be written there. When re-synching, the bitmap is used to determine which parts of the VDisk have changed and need to be recopied.
Copyright IBM Corp. 2010. All rights reserved.
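The bitmap mechanism described above can be sketched in a few lines. This is a minimal illustration of the idea, not SVC internals: the grain size, class names, and method names here are all assumptions chosen for clarity.

```python
# Sketch of bitmap-based change tracking for re-synch.
# GRAIN_SIZE is an illustrative value, not an SVC constant.
GRAIN_SIZE = 256 * 1024  # bytes of VDisk space tracked per bitmap bit

class ChangeBitmap:
    """Tracks which grains of a VDisk changed while a mirror was paused."""

    def __init__(self, vdisk_size):
        self.num_grains = (vdisk_size + GRAIN_SIZE - 1) // GRAIN_SIZE
        self.dirty = [False] * self.num_grains

    def record_write(self, offset, length):
        # Mark every grain the write touches as dirty.
        first = offset // GRAIN_SIZE
        last = (offset + length - 1) // GRAIN_SIZE
        for g in range(first, last + 1):
            self.dirty[g] = True

    def grains_to_copy(self):
        # On restart, only these grains are read from the Primary
        # and rewritten on the Secondary -- a re-synch, not a full copy.
        return [g for g, d in enumerate(self.dirty) if d]

bitmap = ChangeBitmap(vdisk_size=1024 * 1024 * 1024)  # 1 GiB VDisk
bitmap.record_write(offset=0, length=4096)            # touches grain 0
bitmap.record_write(offset=300 * 1024, length=8192)   # touches grain 1
print(bitmap.grains_to_copy())  # -> [0, 1]
```

The point of the grain-level bitmap is that a re-synch after a short outage copies only the dirty grains, which is why it is so much cheaper than an initial synch.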
Once a mirror is fully synched, all Writes to the Primary are replicated to the Secondary. No other traffic (except a heartbeat and some handshaking) is necessary between the clusters. However, this Write traffic generated by applications can be significant.
Bandwidth Considerations for Global Mirror
To run Global Mirror reliably, you must ensure that the bandwidth available between sites is not overrun except for short bursts. Without any control, synching operations might run at disk speed, which is typically much higher than the available bandwidth. Once mirrors are in synch, write traffic must not be so heavy as to overrun the link.

What you normally see when a Global Mirror implementation overdrives the bandwidth between sites is that processing slows down for a little while, and then the SVC pauses a mirror to try to reduce traffic. This may be enough. If not, the SVC will pause another mirror, and another, until the traffic is low enough for the link(s) to handle it.

Finally, the links between sites may not be dedicated to mirror traffic. There may be other SAN traffic on the links, or there may be networking (IP) traffic between sites, all sharing the same bandwidth. In this case, the amount of bandwidth deemed to be available for mirroring must be decreased to reflect the bandwidth the other traffic is using.
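The sizing logic above can be captured in a simple calculation. This is a back-of-the-envelope sketch under stated assumptions (the 20% burst headroom and the traffic figures are illustrative, not SVC defaults or recommendations):

```python
# Rough sizing sketch: how much of a shared inter-site link is realistically
# available to mirroring once other traffic and burst headroom are subtracted.

def mirror_bandwidth_budget(link_mbps, other_traffic_mbps, headroom=0.2):
    """Return the MB/s that can be dedicated to mirror traffic.

    headroom reserves a fraction of the remaining link for bursts, so
    sustained mirror traffic never runs the link at 100%.  All inputs
    are in Mbit/s; the result is in MB/s.
    """
    usable_mbps = (link_mbps - other_traffic_mbps) * (1.0 - headroom)
    if usable_mbps <= 0:
        raise ValueError("other traffic already consumes the whole link")
    return usable_mbps / 8.0  # Mbit/s -> MB/s

# A 1 Gb/s link sharing ~200 Mb/s of IP traffic leaves roughly:
print(mirror_bandwidth_budget(1000, 200))  # -> 80.0 MB/s
```

The useful takeaway is that the number configured for mirroring should be derived from what is left over, not from the raw link speed.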
The information, tools, and documentation ("Materials") are being provided to IBM customers to assist them with customer installations. Such Materials are provided by IBM on an as-is basis. IBM makes no representations or warranties regarding these Materials and does not provide any guarantee or assurance that the use of such Materials will result in a successful customer installation.
We had a link between SVC5 (Local) and SVC6 (Remote) of 1 Gb (approx. 100 MB/s). Notice on the Cisco Fabric Manager display that the FCIP link is totally saturated. (We are looking at the switch ports at the Remote Site; hence all the traffic is under the Receive column. The top row is the actual GigE port on the switch's IP blade that the Empirix is attached to. The second row is the internal TE-port used by that IP blade.)
It took approximately 18 minutes to synchronize 4 VDisks (25 GB each): 102,400 MB / 1098 sec = 93.3 MB/s, about as much as we can expect from a 1 Gb Ethernet link.
This time, the 1 Gb link showed around 68% utilization. Synchronization took approximately 28 minutes: 102,400 MB / 1658 sec = 61.8 MB/s.
This time, the 1 Gb link showed only 18% utilization. Synchronization took approximately 113 minutes: 102,400 MB / 6780 sec = 15.1 MB/s.
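The throughput figures quoted for the three synchronization runs all come from the same simple formula; this helper just reproduces that arithmetic (sizes in MB, durations in seconds, as in the text):

```python
# Average synchronization rate for the test runs above:
# total data moved divided by elapsed time.

def sync_throughput(total_mb, seconds):
    """Average synchronization rate in MB/s."""
    return total_mb / seconds

total = 4 * 25 * 1024  # four 25 GB VDisks = 102,400 MB

print(round(sync_throughput(total, 1098), 1))  # first run  -> 93.3 MB/s
print(round(sync_throughput(total, 1658), 1))  # second run -> 61.8 MB/s
print(round(sync_throughput(total, 6780), 1))  # third run  -> 15.1 MB/s
```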
To establish a baseline, we deleted all the mirror definitions and ran IOMETER with just normal local processing. Using one Worker per VDisk, we see about 227 MB/s of Write activity to the 4 VDisks, with response times of less than a millisecond.
Then we stopped IOMETER, started the mirrors, and waited until they synchronized.
The Cisco Fabric Manager shows about 19% utilization on the GigE link (restricted by the Empirix to OC3 speed); in other words, the OC3 WAN link is full.
When the mirror stopped, host throughput got significantly better: the IOMETER throughput had increased substantially by 11:55 AM (approximately 5 minutes after the mirror stopped).
At 12:00, traffic had increased from 121 MB/s to 156 MB/s, still lower than the baseline of 224 MB/s. This is probably due to time spent de-staging cache, as the SVC and the backend storage catch up with the heavy IOMETER workload. We did observe the traffic increasing steadily over time.
After 2 minutes, response time increased to 3.4 ms, and Write traffic decreased.
Since the IOMETER write rate is only a bit higher than the WAN capacity, it took longer to stop the mirror: around 14 minutes for the link to become oversaturated and the mirrors to stop.
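Why the overload takes longer to trip the mirror when the excess is small can be shown with a back-of-the-envelope estimate: the write backlog grows at the difference between the host write rate and the WAN rate. The backlog limit and the rate figures below are purely illustrative assumptions, not SVC internals; the sketch only shows the shape of the effect.

```python
# Backlog buildup sketch: queued-but-unsent writes accumulate at
# (write rate - WAN rate).  All figures are illustrative assumptions.

def minutes_until_backlog(write_mbs, wan_mbs, backlog_limit_mb):
    """Minutes until queued writes reach backlog_limit_mb."""
    excess = write_mbs - wan_mbs
    if excess <= 0:
        return float("inf")  # the link keeps up; no backlog builds
    return backlog_limit_mb / excess / 60.0

# Writes slightly above an OC3-class link vs. far above it
# (the 2000 MB backlog limit is an assumed value):
print(round(minutes_until_backlog(21.5, 19.0, 2000), 1))   # -> 13.3 minutes
print(round(minutes_until_backlog(120.0, 19.0, 2000), 1))  # -> 0.3 minutes
```

With only a couple of MB/s of excess, the backlog takes on the order of ten-plus minutes to build, consistent with the roughly 14 minutes observed; a grossly overdriven link trips almost immediately.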
These examples illustrate the effects of the Bandwidth parameter and the behavior to expect when the link(s) between sites are overdriven. While important, these are only a few of the issues that should be reviewed when planning a Global Mirror implementation.