Credits
A great deal of credit goes out to Daniel Guevara and Amjad Jabali,
who authored previous versions of this document. Daniel Guevara's
document is linked above, but it appears Amjad Jabali's is offline.
While I have added a great deal to this document and made many
changes, a great deal of work was done by these other authors, to
the point where this document would not exist without them.
Thanks for the great work, guys!
INDEX
Operational Overview
What Is DRBD
What Heartbeat does
Equipment Overview
DRBD Install and Configuration
Heartbeat Configuration
Credits
References
Appendix A (changes made)
Appendix B (Flash op panel fix)
Appendix C (Update Elastix w/ DRBD)
Appendix D (Resize DRBD partition)
Appendix E (DRBD w/ 2 NICs)
Appendix F (TFTPBOOT on DRBD)
Appendix G (Three Nodes)
Appendix H (IP Sourcing)
Operational Overview
What is DRBD?
DRBD refers to block devices designed as a building block to form high availability (HA)
clusters. This is done by mirroring a whole block device via an assigned network. DRBD
can be understood as network based raid-1.
In the illustration above, the two orange boxes represent two servers that form an HA
cluster. The boxes contain the usual components of a Linux kernel: file system, buffer
cache, disk scheduler, disk drivers, TCP/IP stack and network interface card (NIC) driver.
The black arrows illustrate the flow of data between these components.
The orange arrows show the flow of data, as DRBD mirrors the data of a highly available
service from the active node of the HA cluster to the standby node of the HA cluster.
In our implementation we will be creating a DRBD synchronized partition on /dev/sda3
called /replica. This partition will contain only those directories and files we want
synchronized between our primary and secondary server, namely the important Asterisk
and Elastix related directories and files.
What Heartbeat does
The upper part of this picture shows a cluster where the left node is currently active, i.e.,
the service's IP address that the client machines are talking to is currently on the left node.
The service, including its IP address, can be migrated to the other node at any time, either
due to a failure of the active node or as an administrative action. The lower part of the
illustration shows a degraded cluster. In HA speak the migration of a service is called
failover, the reverse process is called failback, and when the migration is triggered by an
administrator it is called switchover.
In our implementation we will utilize Heartbeat to monitor the state of two servers and,
during a failover, mount our synchronized partition on the secondary server and start up
the following resources/applications: asterisk, mysql and http. During failover our floating
IP address will move from the primary to the secondary server. This IP address should be
used to register SIP and other VoIP endpoints.
Equipment Overview
This installation scenario assumes two servers, each with one Ethernet interface and a
single SATA hard drive. You may have a different type of hard drive (IDE, SCSI, etc.) and
therefore some of these steps may need to be modified to better reflect your environment.
yum -y update
Primary (p)
Partition number (3)
Press enter until returned to fdisk command prompt
NOTE: if your servers have two different sized hard drives it is imperative that the
third partition is identical in size or they will never synchronize over DRBD. Do this
by accepting the default first cylinder and then specifying the Last cylinder with the
+sizeM option. Ex. +6048M. Make these same specifications on both servers.
Press t to change the partition system ID
Press 3 to choose partition number
Choose HEX 83 for type
Press w to save changes
RESTART SERVER
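One way to check the identical-size requirement from the note above is to compare the byte count of the third partition on both servers after the restart. This is only a sketch: it assumes you run `blockdev --getsize64 /dev/sda3` on each server and pass the two numbers in, and the function name `same_partition_size` is made up for illustration.

```shell
# Hypothetical helper: succeeds only when both byte counts are equal.
# Obtain each value by running this on the respective server:
#   blockdev --getsize64 /dev/sda3
same_partition_size() {
    if [ "$1" -eq "$2" ]; then
        echo "sizes match: $1 bytes"
    else
        echo "size mismatch: $1 vs $2 bytes" >&2
        return 1
    fi
}
```

Call it as `same_partition_size <bytes-on-voipserver> <bytes-on-voipbackup>` before moving on to the format step.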
7. Format newly made partition
mke2fs -j /dev/sda3
8. Now we delete the file system from the disk we just created
dd if=/dev/zero bs=1M count=500 of=/dev/sda3; sync
9.
10. To ensure proper host name to IP resolution it is recommended that you manually
update the /etc/hosts file to reflect proper host-to-IP mapping. Add the following:
192.168.1.242 voipserver.drbd
192.168.1.243 voipbackup.drbd
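If you prefer not to edit /etc/hosts by hand on every server, the entries can be appended with a small helper that skips names already present. This is a sketch, not part of the original procedure; the function name `add_host_entry` is hypothetical.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already listed.
add_host_entry() {
    hosts_file=$1; ip=$2; name=$3
    # -w matches NAME as a whole word, -F treats it as a literal string
    if grep -qwF "$name" "$hosts_file"; then
        echo "$name already present in $hosts_file"
    else
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example (run as root on both servers):
# add_host_entry /etc/hosts 192.168.1.242 voipserver.drbd
# add_host_entry /etc/hosts 192.168.1.243 voipbackup.drbd
```

Running it twice leaves a single entry, so it is safe to repeat on a server that was already prepared.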
11. Edit /etc/drbd.conf on voipserver.drbd. Modify this sample to meet your particular
needs.
global { usage-count no; }
resource r0 {
  protocol C;
  startup { wfc-timeout 10; degr-wfc-timeout 30; } #change timers to your need
  disk { on-io-error detach; } # or panic, ...
  net {
    after-sb-0pri discard-least-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri call-pri-lost-after-sb;
    cram-hmac-alg "sha1";
    shared-secret "Super Secret Password!";
  }
  syncer { rate 5M; }
  on voipserver.drbd {
    device /dev/drbd0;
    disk /dev/sda3;
    address 192.168.1.242:7788;
    meta-disk internal;
  }
  on voipbackup.drbd {
    device /dev/drbd0;
    disk /dev/sda3;
    address 192.168.1.243:7788;
    meta-disk internal;
  }
}
Note: The following lines tell the servers how to recover from split brain. Split
brain is when both servers are in primary mode at the same time and must decide which one
should assume the primary role and which the secondary role (that is, whose changes are
kept and whose are discarded).
after-sb-0pri discard-least-changes;
after-sb-1pri discard-secondary;
after-sb-2pri call-pri-lost-after-sb;
Reference: http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html
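The policies above only cover automatic recovery. If the nodes still refuse to reconnect after a split brain, recovery can be done by hand. The sketch below assumes the DRBD 8.3 command syntax; the function names are made up for illustration, and `DRBDADM` is an assumed override that lets you print the commands instead of running them.

```shell
# DRBDADM can be overridden (e.g. DRBDADM="echo drbdadm") for a dry run.

# Run on the node whose changes are to be thrown away (the "victim").
sb_discard_local() {
    res=${1:-r0}
    ${DRBDADM:-drbdadm} secondary "$res" &&
    ${DRBDADM:-drbdadm} -- --discard-my-data connect "$res"
}

# Run on the node whose data is kept (the "survivor") if it is disconnected.
sb_reconnect_survivor() {
    res=${1:-r0}
    ${DRBDADM:-drbdadm} connect "$res"
}
```

Be certain which node holds the data you want before running `sb_discard_local`, since everything written on that node since the split is lost.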
13. Initialize the meta-data area on disk before starting drbd (on BOTH servers!)
drbdadm create-md r0
16. As you can see, both nodes are secondary, which is normal. We need to decide
which node will act as primary now (voipserver.drbd). That node will initiate the first 'full
sync' between the two nodes:
drbdadm -- --overwrite-data-of-peer primary r0
17. Launch the command and wait until it finishes synchronizing
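A simple way to wait for the initial sync is to poll /proc/drbd until both disk states read UpToDate. This is a sketch that assumes the DRBD 8.x status format (a line containing `ds:UpToDate/UpToDate`) and a single resource; `drbd_in_sync` is a hypothetical name.

```shell
# Hypothetical helper: succeeds once the status file shows both disks UpToDate.
# Reads /proc/drbd by default; pass another file for testing.
# With more than one resource it succeeds as soon as any of them is in sync.
drbd_in_sync() {
    grep -q 'ds:UpToDate/UpToDate' "${1:-/proc/drbd}"
}

# Example: poll every 5 seconds until the initial sync has finished.
# until drbd_in_sync; do sleep 5; done; echo "sync complete"
```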
20. Now we will copy all of the directories we want synchronized between the two
servers to our new partition, remove the original directories and then create
symbolic links to replace them on voipserver.drbd
cd /replica
amportal chown
tar -zcvf etc-asterisk.tgz /etc/asterisk
tar -zxvf etc-asterisk.tgz
tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
tar -zxvf var-lib-asterisk.tgz
tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
tar -zxvf usr-lib-asterisk.tgz
tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
tar -zxvf var-spool-asterisk.tgz
tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
tar -zxvf var-lib-mysql.tgz
tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
tar -zxvf var-log-asterisk.tgz
tar -zcvf var-www.tgz /var/www/
tar -zxvf var-www.tgz
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/www
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
cd /
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
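Step 20 repeats the same copy, remove and link sequence for every directory, so it can also be written as one helper plus a loop. This is a sketch and not the original method: it swaps the tar round trip for `cp -a`, the name `relocate_dir` is hypothetical, and it assumes the target under /replica does not exist yet (otherwise `cp` would nest the directory).

```shell
# Hypothetical helper: move directory $1 under replica root $2 and
# leave a symbolic link behind at the original location.
relocate_dir() (
    src=${1%/}                     # strip a trailing slash, if any
    replica=${2:-/replica}
    mkdir -p "$replica$(dirname "$src")" || exit 1
    cp -a "$src" "$replica$src" || exit 1   # copy first, preserving ownership
    rm -rf "$src"
    ln -s "$replica$src" "$src"
)

# Example, on the primary with /replica mounted:
# for d in /etc/asterisk /var/lib/asterisk /usr/lib/asterisk \
#          /var/spool/asterisk /var/lib/mysql /var/log/asterisk /var/www; do
#     relocate_dir "$d"
# done
```

Because the original is removed only after the copy succeeds, a failed copy leaves the directory where it was.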
22. Verify services are down and proceed to switch manually to the second server:
[root@voipSERVER.drbd /]# umount /replica ; drbdadm secondary r0
Note: Executing this same command on voipbackup.drbd while it is in secondary mode should
not display the /dev/drbd0 partition unless it is assuming primary mode.
25. Now we will remove and link on voipbackup.drbd
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
rm -rf /var/www
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
28. Drbd is working ... let's be sure that it will always be started:
chkconfig drbd on
Heartbeat Configuration
29. Remember to stop any boot up services on both servers that should be controlled
by heartbeat. These services will be controlled by heartbeat on the server that is in
control.
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
chkconfig elastix-updaterd off
chkconfig elastix-portknock off
service mysqld stop
service asterisk stop
service httpd stop
service elastix-portknock stop
service elastix-updaterd stop
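The same hand-over can be scripted so that both servers get an identical treatment. This is a sketch: `release_to_heartbeat` and the `RUN` dry-run switch are hypothetical, and the loop disables and stops each service in turn rather than in the exact order listed above.

```shell
# Services that Heartbeat will control, as listed in the step above.
HA_SERVICES="asterisk mysqld httpd elastix-updaterd elastix-portknock"

# Disable each service at boot and stop it now.
# Set RUN=echo to print the commands instead of executing them.
release_to_heartbeat() {
    for svc in $HA_SERVICES; do
        ${RUN:-} chkconfig "$svc" off
        ${RUN:-} service "$svc" stop
    done
}
```

Try `RUN=echo release_to_heartbeat` first to review the commands, then run it for real as root on both servers.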
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback off
node voipserver.drbd
node voipbackup.drbd
NOTE: I've set auto_failback to off. This seems more appropriate to me.
Use the following command on the current secondary to switch back:
sh /usr/lib/heartbeat/hb_takeover
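Step 35 copies an authkeys file along with ha.cf and haresources. As a reminder of what that file usually looks like, here is a sketch that writes one in the common Heartbeat layout (an `auth` line naming a key number, then the key itself). The helper name `write_authkeys` and the example secret are assumptions; choose your own secret.

```shell
# Hypothetical helper: write a Heartbeat authkeys file with one sha1 key.
write_authkeys() {
    keyfile=$1; secret=$2
    ( umask 077    # create the file without group/other access
      printf 'auth 1\n1 sha1 %s\n' "$secret" > "$keyfile" ) &&
    chmod 600 "$keyfile"    # Heartbeat rejects looser permissions
}

# Example (as root): write_authkeys /etc/ha.d/authkeys "SomeLongRandomString"
```

Both servers must end up with the same file, which is what the scp in step 35 takes care of.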
35. Now replicate ha.cf, authkeys and haresources to voipbackup.drbd and start
heartbeat
[root@voipserver.drbd ha.d]# scp /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/ha.d/haresources root@voipbackup.drbd:/etc/ha.d/
[root@voipbackup.drbd ha.d]# service heartbeat start
38. Execute df -h on the primary to confirm that our /dev/drbd0 partition is
mounted and in use.
[root@voipserver ~]# df -h
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda...  ...G  ...G  ...G   ...%  /
tmpfs        ...G  ...   ...G   ...%  /dev/shm
/dev/drbd0   ...G  ...M  ...G   ...%  /replica
39. Test your work by creating a SIP extension or making any other change in the Elastix web
interface, then shut down your primary server while running a continuous ping to
192.168.1.245 (the floating IP address) to verify it doesn't lose connectivity. Make
another change on the secondary server, turn your primary back on, and all
changes should be kept intact.
Special Note: Any changes made to asterisk files should be done via the web interface
ONLY. Do not attempt to upgrade the Elastix version once the cluster is finished, or else it will
write its own files again, discarding the links to the /replica directory.
Troubleshooting:
tcpdump -i eth0:0 -s 1500 -w captura.pcap #capture traffic
mv captura.pcap /var/www/html #move file to web for download
References
http://wiki.centos.org/HowTos/Ha-Drbd
http://support.red-fone.com/downloads/elastix/Elastix HA Cluster.pdf
http://danielaliaman.com/blog/files/phonecube/cluster/AsteriskCluster.pdf
http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html
Author: Nick Ross (based on the work of Daniel Guevara and Amjad Jabali)
APPENDIX A
I changed a few things within the original 2.3 document. Below is my description and
reasoning for all of these changes.
CHANGE 1 - I edited the instructions for only one network card. While it's best in practice to use a second network
card for DRBD, virtual machines make this irrelevant. To implement this, I lowered the rate at which DRBD replicates
from 100M to 5M (40 Mbps). This keeps replication from congesting VoIP traffic. After the initial sync,
very little data will be transferred, so this is unlikely to be an issue. Additionally, DRBD now uses the same
IP addresses (192.168.1.242 & .243), rather than a separate subnet. Remember to change the IP addresses to those
of your servers. The "/etc/ha.d/haresources" file contains the IP address for the cluster, which should also be set by you.
IF YOU WISH TO USE A SECOND NIC, you can easily do so by going to STEP 11 to edit drbd.conf and changing
the IP addresses in this file to match those of your dedicated NICs. You will also want to go to STEP 30, and change
ha.cf to state "bcast eth1" instead of eth0.
CHANGE 2 - In STEP 11 after-sb-0pri is set to "discard-least-changes". I felt this was a better option for asterisk. In
the event that a split brain occurs, it really doesn't matter who was primary last. What matters is who was the functioning
server for outgoing calls. The functioning server is likely to have had more changes made, and therefore to have the most
current configuration. This is important, because the after-sb-2pri setting refers back to the after-sb-0pri setting.
CHANGE 3 - Being logged into the physical console/terminal caused Step 22 to fail. I made an entry in the notes.
CHANGE 4 - Step 28 was changed to "chkconfig drbd on". It previously had "chkconfig drbd83 on". That command
did not work for me on 2.4, but "chkconfig drbd on" did.
CHANGE 5 - Step 30 was changed to set "auto_failback" to off. Every time the server fails over or fails back, service is interrupted.
Personally, I'd like to investigate the reason for the failover before I switch back. If the backup server fails,
you will still be able to automatically fail over to the original primary server, so there is little risk to this. On the other hand, automatically
failing BACK to the original server could cause constant interruptions if it had a bouncing NIC or the server
keeps restarting due to a hardware issue.
Manual failback can be completed by the command "sh /usr/lib/heartbeat/hb_takeover" on
whichever server is secondary, or "sh /usr/lib/heartbeat/hb_standby" on whichever server is
currently the primary.
CHANGE 6 - Step 36 was edited to add "chkconfig heartbeat on" . Heartbeat did not automatically
start until this command was added.
CHANGE 7 - Step 33 added "elastix-updaterd elastix-portknock" to haresources file. The services were also added to
Steps 21,26 and 29. They rely on /var/www resources, so they can't be loaded at boot.
CHANGE 8 - Step 20 & 25 added "rm -rf /var/www" as it was missing from previous documents
CHANGE 9 - Step 33, added a line for email notifications.
CHANGE 10 - Step 20, added in "amportal chown". This is our last chance to do this and ensure ownership, as it does not work
once the files are moved to the DRBD partition. Otherwise, we could be stuck without ownership, which can happen after an update.
- Appendix and Changes by Nick R.
APPENDIX B
Flash Operator Panel Fix
The Flash Operator Panel depends on a process in order to receive updates made in FreePBX/Elastix.
This process is normally loaded on boot, and called within the /etc/init.d/rc.local file. It requires files in
/var/www/ to successfully launch. Unfortunately, since we are using DRBD, /var/www lives on the clustered
partition and is not available on boot. We must configure heartbeat to launch this process, but first we must make
a Linux Standard Base service script. YOU MUST DO THIS ON BOTH THE PRIMARY AND THE SECONDARY
SERVERS!
This only matters if you actually use the flash operator panel. If you don't, then there is little reason to do this.
Step 1 - Type the following command on VOIPSERVER (which I'll assume is in the PRIMARY role)
nano /etc/init.d/fop_start
Step 2 - In the editor, copy the fop_start script, then save & exit (CTRL+O then CTRL+X). The fop_start script is found
on the following page (sorry, it didn't fit on this page!).
Step 3 - Type the command to set the appropriate permissions
chmod 755 /etc/init.d/fop_start
Step 4 - Edit the /etc/ha.d/haresources file ( you can type "nano /etc/ha.d/haresources") and add "fop_start" to the end
of the first line. It should now look like this:
voipserver.drbd drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 IPaddr::192.168.1.245/24/eth0/192.168.1.255 mysqld asterisk httpd elastix-updaterd elastix-portknock fop_start
voipserver.drbd MailTo::your@emailgoeshere.com,your@emailgoeshere.com::DRBD/HA-ALERT
Step 5- You can test to make sure the fop_start service is working by typing:
service fop_start start
service fop_start status
service fop_start stop
This should produce a bunch of messages about amportal.conf. It then should end with an "OK" for the stop and the
start command. If it does not, then your current server might be the secondary (in which case a "FAILED" result is
to be expected). To check if the current server is the primary, type "drbdadm role r0".
Step 6 - REPEAT STEPS 1-4 ON THE OTHER SERVER (VOIPBACKUP)! Step 5 will result in a "FAILED" message
until it assumes the primary role. Running "sh /usr/lib/heartbeat/hb_takeover" will make you assume the primary role.
APPENDIX C
Updating Asterisk With DRBD
Step 1 - Keep data on both servers by shutting down one server. We want to make sure
only one server is on at a time. I shut down "voipbackup", and leave "voipserver" on.
Step 2 - Stop Services on the active machine, and make sure they always stay off (heartbeat manages them)
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
service fop_start stop
Step 3 - Remove links from the /replica folder to the file system
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
Step 4 - Copy all the files in the /replica/ folder back to the root partition
mkdir /etc/asterisk
mkdir /var/lib/asterisk
mkdir /usr/lib/asterisk
mkdir /var/spool/asterisk
mkdir /var/lib/mysql
mkdir /var/log/asterisk
mkdir /var/www
cp -arf /replica/etc/asterisk/* /etc/asterisk/
cp -arf /replica/var/lib/asterisk/* /var/lib/asterisk/
cp -arf /replica/usr/lib/asterisk/* /usr/lib/asterisk/
cp -arf /replica/var/spool/asterisk/* /var/spool/asterisk/
cp -arf /replica/var/lib/mysql/* /var/lib/mysql/
cp -arf /replica/var/log/asterisk/* /var/log/asterisk/
yes | cp -arf /replica/var/www/* /var/www/
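Steps 3 and 4 can be folded into one helper that undoes the link and copies the data back. This is a sketch rather than the original commands: `restore_dir` is a hypothetical name, and copying `dir/.` instead of `dir/*` is my substitution so that hidden files come along too.

```shell
# Hypothetical helper: replace the symlink at $1 with a real directory
# filled from the copy under replica root $2.
restore_dir() (
    dst=${1%/}
    replica=${2:-/replica}
    [ -L "$dst" ] && rm "$dst"       # remove only the link, never the data
    mkdir -p "$dst" || exit 1
    cp -a "$replica$dst/." "$dst/"   # "/." also copies hidden files
)

# Example, with /replica still mounted:
# for d in /etc/asterisk /var/lib/asterisk /usr/lib/asterisk \
#          /var/spool/asterisk /var/lib/mysql /var/log/asterisk /var/www; do
#     restore_dir "$d"
# done
```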
APPENDIX D
RESIZING OR MOVING the DRBD Partition
This is a very long and drawn out process, and is not for the faint of heart. The concept is similar to the previous section,
where we move everything off the DRBD partition, and then back on. If you've changed the partition, or want to specify a
different size than what I have, then make the changes accordingly. Please ensure you perform the steps on the correct
server.
%%%%% ON VOIPBACKUP (#2) %%%%%
STEP 1 - We have to make sure that VOIPBACKUP will not start DRBD or HEARTBEAT on reboot.
chkconfig heartbeat off
chkconfig drbd off
<Shut down the VOIP BACKUP server>
@@@@ ON VOIPSERVER (#1) @@@@
Step 2 - We must turn all the services off, unlink the folders in replica, then copy the data back to the root partition
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
service fop_start stop
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
mkdir /etc/asterisk
mkdir /var/lib/asterisk
mkdir /usr/lib/asterisk
mkdir /var/spool/asterisk
mkdir /var/lib/mysql
mkdir /var/log/asterisk
mkdir /var/www
cp -arf /replica/etc/asterisk/* /etc/asterisk/
cp -arf /replica/var/lib/asterisk/* /var/lib/asterisk/
cp -arf /replica/usr/lib/asterisk/* /usr/lib/asterisk/
cp -arf /replica/var/spool/asterisk/* /var/spool/asterisk/
cp -arf /replica/var/lib/mysql/* /var/lib/mysql/
cp -arf /replica/var/log/asterisk/* /var/log/asterisk/
yes | cp -arf /replica/var/www/* /var/www/
amportal chown
t
3
83
w
STEP 4- Reboot VOIPSERVER. Start up VOIPBACKUP.
%%%%% ON VOIPBACKUP (#2) %%%%%
STEP 5- We must now force the backup server to a DRBD primary role, and remount the replica folder
service drbd start
drbdadm primary r0 ; mount /dev/drbd0 /replica
STEP 6 - We must perform the copy from the replica folder, back to the root partition, as we did on the other server.
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
(STEP 6 is continued on the next page)
APPENDIX E
Using Two Network Interface Cards
The original guide that this was based on used a dual NIC configuration. Using a separate NIC for DRBD is usually a
good idea. You want the replication to work through an independent channel, typically where bandwidth for your
services will not be starved by DRBD replication traffic. There are many situations, however, where this is not
necessarily desirable. For instance, if you use virtual machines, there wouldn't be much point in adding a second
virtual NIC. There is also the case of having an elastix based appliance, with only a single NIC built in, and a second
NIC is unavailable.
Asterisk servers are usually not very disk intensive, so it's unlikely DRBD will be sending much data. If you are worried
about DRBD using too much bandwidth, then you can lower the replication rate limit below 5 MB/s (which is 40 Mbit/s).
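The syncer rate is given in bytes while network links are usually rated in bits, so a quick multiplication by eight helps when choosing a limit. A small sketch, with `mbytes_to_mbits` as a hypothetical helper name:

```shell
# Convert a DRBD syncer rate in megabytes per second to megabits per second.
mbytes_to_mbits() {
    echo $(( $1 * 8 ))
}

mbytes_to_mbits 5     # prints 40
mbytes_to_mbits 100   # prints 800
```

The second call shows why the earlier 100M setting only makes sense on a dedicated gigabit link.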
If you wish to use two network cards, the changes to the original implementation are very simple. In this scenario, we
will assume VOIPSERVER has a secondary NIC eth1, with an IP address of 10.1.1.1 . We will assume VOIPBACKUP
has a secondary NIC eth1, with an IP address of 10.1.1.2 .
Note: the IP addresses do not need to be in a different subnet. They can be whatever you assign them to. Just make sure they
are static addresses, or that you have set a DHCP reservation for these NICs.
!!!!!!!!! MAKE THE CHANGES ON BOTH SERVERS!!!!!
Step 1- Change the following section of the /etc/drbd.conf:
syncer { rate 5M; }
on voipserver.drbd {
device /dev/drbd0;
disk /dev/sda3;
address 10.1.1.1:7788;
meta-disk internal;
}
on voipbackup.drbd {
device /dev/drbd0;
disk /dev/sda3;
address 10.1.1.2:7788;
meta-disk internal;
}
}
Step 2- Change the following section of the /etc/ha.d/ha.cf file
warntime 10
initdead 120
udpport 694
bcast eth1
auto_failback off
node voipserver.drbd
node voipbackup.drbd
Step 3 - Reboot both servers and you are done.
APPENDIX F
Moving TFTPBOOT
The TFTP boot can be easily moved by copying it to the replica folder, deleting the original folder, and then creating a symbolic
link to the replica folder. As you can see, it is very similar to the procedure we've used for all other folders.
Step 1- Go on the active/primary computer and type the following commands
cd /replica
tar -zcvf tftpboot.tgz /tftpboot/
tar -zxvf tftpboot.tgz
rm -rf /tftpboot
ln -s /replica/tftpboot /tftpboot
Step 2- Go on the secondary computer and type these commands
rm -rf /tftpboot
ln -s /replica/tftpboot /tftpboot
That is all there is to it!
APPENDIX G
Inserting a passive third node (no HA)
While DRBD and Heartbeat do not offer the ability to fail over to a third node, you can set up a third node for passive replication. It
will copy everything on the current DRBD cluster, and always be a secondary node. There are many complications with this process,
and the only real practical use is for disaster recovery. That being said, this is an order of magnitude more difficult than the two node
cluster, and this documentation proved too time intensive to write thoroughly. Hopefully, this will at least guide you.
Just a note - in order to do this, you basically have to recreate the original DRBD setup on your clustered partitions. If you already
are using DRBD, follow the guide in Appendix D, to the point where you fdisk your partitions. This will copy your asterisk
configs back to both servers so you do not lose your copies.
We have 3 servers. VOIPSERVER, VOIPBACKUP and VOIPDR (our passive node for disaster recovery).
VOIPDR will be 192.168.1.244.
Step 1- Follow the main guide for steps 1-9 on ALL servers.
Step 2- For step 10 on the original guide, on all servers, you will need to add this line in addition to the others in step 10:
192.168.1.244 voipdr.drbd
Step 3- For Step 11 on the original guide, on all servers, add the following to drbd.conf AFTER the very last bracket:
resource r1 {
  protocol A;
  net
  {
  }
  syncer { rate 5M; }
  stacked-on-top-of r0 {
    device /dev/drbd10;
    address 192.168.1.245:7788;
  }
  on voipdr.drbd {
    device /dev/drbd10;
    disk /dev/sda3;
    address 192.168.1.244:7788;
    meta-disk internal;
  }
}
Step 4- Continue with the original guide, on only voipserver and voipbackup. STOP at step 17. DO NOT perform these steps on VoipDR.
- (skip from step 17 to 28 on the original guide)
Step 5- Perform step 28 & 29 in the original guide, on ALL SERVERS.
Step 6 - We now must configure heartbeat temporarily. This is much earlier than in the original guide, but it is required
to even get the third node up. Perform steps 30,31 and 32 on VOIPSERVER and VOIPBACKUP. There will be
NO HEARTBEAT on VOIPDR... EVER. If you try to do this on VOIPDR you will break everything.
Step 7- Instead of step 33, we must provide a very different /etc/ha.d/haresources file. The haresources file must
now have only the single following line and NO OTHERS:
voipserver.drbd IPaddr::192.168.1.245/24/eth0
Step 8 - You may now follow steps 34-36 in the original guide. Remember, VOIPDR should NOT have heartbeat running,
so you do not need to replicate these steps to VOIPDR.
Step 9 - Now is where we really deviate. We must make resource r1 on top of r0. Basically, we are putting a second
layer on our existing DRBD, which we will replicate with VOIPDR.
Type the following. We will start on the primary, which should be VOIPSERVER.
#####VOIPSERVER#####
drbdadm --stacked create-md r1
drbdadm --stacked up r1
drbdadm --stacked adjust r1
Step 10 - Now we prepare VOIPDR, but in a slightly different way
%%%%%% ON VOIPDR %%%%%%
drbdadm create-md r1
service drbd start
drbdadm adjust r1
Step 11- Now we must have VOIPSERVER assume the primary role for stacked resource r1, and force a replication
####### ON VOIPSERVER ########
drbdadm --stacked -- --overwrite-data-of-peer primary r1
You should wait until this has finished before continuing. Use "service drbd status" to check the replication progress.
Please note, we have not touched VOIPBACKUP at all in this process... and that is ok. It is not necessary. All of this
data will automatically replicate over to VOIPBACKUP if VOIPSERVER fails (or it will after we finalize the heartbeat settings).
Step 12 - Once this is completed, we can mount the resource on VOIPSERVER. Run the following commands:
####### ON VOIPSERVER #########
mkfs.ext3 /dev/drbd10
mkdir /replica
mount /dev/drbd10 /replica
Notice that we are using /dev/drbd10 instead of drbd0. drbd10 is the device that represents stacked resource r1, which is on top
of resource r0 and /dev/drbd0. All three devices will be using the stacked resource r1 & /dev/drbd10 to mount the data; this way
the data is accessible to all three nodes. If we used r0 & drbd0, VOIPDR would never receive any replicated data.
Step 13- We now must move the asterisk/elastix configuration onto our mounted volume. Still on ##VOIPSERVER##,
follow *only* STEP 20 from the original guide. Do not proceed past step 20.
Step 14- Now we are going to put in the real heartbeat configuration. The configuration we had before was only temporary,
and it was needed simply to bring VOIPDR into the replication. The following is a copy of the haresources file you should
have. It needs to match EXACTLY. This must be done on BOTH VOIPSERVER and VOIPBACKUP. Do NOT do this on
VOIPDR.
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 drbdupper::r1 Filesystem::/dev/drbd10::/replica::ext3 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailhere.com,your@emailhere.com::DRBD/HA_MAIN-NODE
Please note, if you have implemented the flash operator panel fix (Appendix B), please make the appropriate changes.
Step 15- At this point, we need to start preparing VOIPBACKUP, which is a little tricky. With a stacked resource it's a little
more complicated to unmount and switch primaries... but first things first. Let's put a /replica/ folder on VOIPBACKUP.
$$$$$$$$ ON VOIPBACKUP $$$$$$$$$$$
mkdir /replica
Step 16- At this point, the easiest way to finish up the process is to reboot and have VOIPBACKUP take the primary role
via heartbeat. Reboot both ##VOIPSERVER## and $$VOIPBACKUP$$, but give $$VOIPBACKUP$$ a 30 second headstart.
If VOIPBACKUP does not come up in the primary role, you can run "sh /usr/lib/heartbeat/hb_takeover" on it.
Step 17- The /replica/ folder should be populated now on $$VOIPBACKUP$$. If so, you can complete the transfer by
following step 25 on the original guide. Once you are done, you can give ##VOIPSERVER## control by typing in
"sh /usr/lib/heartbeat/hb_takeover" on that server.
THE GUIDE IS OVER. STOP HERE.
THE INSTRUCTIONS BELOW ARE FOR DISASTER RECOVERY ONLY!!!!!
DO NOT TRY TO FOLLOW THESE STEPS UNLESS YOU HAVE TO.
You may be wondering why you've done so little on the third node, VOIPDR. The short answer is that VOIPDR doesn't do much.
Once you set up DRBD on VOIPDR, it just sits there and replicates. There is no heartbeat, no failover, etc. If something ever
were to happen to the other two nodes, VOIPDR would have to be activated manually.
Assuming some disaster, where the other two nodes caught fire and burned, here is a quick rundown of what you'd do.
%%%%% ON VOIPDR - BUT ONLY IN THE EVENT OF A DISASTER %%%%%
Step 1)
mkdir /replica
drbdadm primary r1
mount /dev/drbd10 /replica
Step 2) FOLLOW STEP 25 ON THE ORIGINAL GUIDE to prepare the links and remove the old asterisk files.
Step 3) issue the following commands:
service mysqld start
service asterisk start
service httpd start
service elastix-updaterd start
service elastix-portknock start
fxotune -s
amportal start_fop
Now, edit the phone configurations so that they connect to VOIPDR. It's not nearly as elegant, but it will work. Alternatively,
instead of STEP 25, you can just delete the original folders, then copy them from the /replica/ folder back to their "real"
locations. After you do this, you can stop drbd ("service drbd stop"), and change the ip address of the server to that of the
old cluster (192.168.1.245). The phones will now automatically find the server and resume operations.
APPENDIX H
IP Sourcing
One of the problems I've encountered in a clustered environment is traffic originating from the original IP address of a device,
instead of the cluster's IP address. This can cause problems with static NAT, firewall rules and port forwarding (among other
things). Ideally, all of our heartbeat monitored services will send traffic from the cluster IP address of 192.168.1.245 . Luckily,
the heartbeat service can do just that, with the command IPsrcaddr::192.168.1.245 . This command should be added after
declaring the cluster IP address in /etc/ha.d/haresources .
It would look something like the following:
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 IPsrcaddr::192.168.1.245 drbddisk::r0 ......
Below are examples for our 2-way and 3-way replication guides
Two way replication:
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 IPsrcaddr::192.168.1.245 drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailhere.com,your@emailhere.com::DRBD/HA_MAIN-NODE
If you use the Flash operator panel, and use the fop_start guide in the appendix, remember to add the corresponding entry to
the end of your /etc/ha.d/haresources file.
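To check whether outgoing traffic really leaves with the cluster address, you can ask the kernel which source address it would pick for an outside destination. This sketch assumes IPsrcaddr works by setting the preferred source on the default route, so that `ip route get` reports it after the word `src`; the helper name `route_src` is hypothetical.

```shell
# Hypothetical helper: extract the address after "src" from `ip route get` output.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Example on the active node (8.8.8.8 is just an arbitrary outside address):
# ip route get 8.8.8.8 | route_src
```

On the active node the result should be 192.168.1.245; on the standby node it should still be that server's own address.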