
Building Elastix-2.4 High Availability Clusters with
DRBD and Heartbeat (using a single NIC)
This information has been modified and updated by Nick Ross.
Please refer to the original document found at:
http://www.elastixbrasil.com.br/downloads/Elastix_2.3_HA_Cluster.pdf

Changes made to this document will be explained at the very end
in Appendix A.
Document Last updated May 3rd, 2014.


Credits

A great deal of credit goes out to Daniel Guevara and Amjad Jabali,
who authored previous versions of this document. Daniel Guevara's
document is linked above, but it appears Amjad Jabali's is offline.

While I have added a great deal to this document and made many
changes, a great deal of work was done by these other authors, to
the point where this document would not exist without them.

Thanks for the great work, guys!







INDEX




Operational Overview
What Is DRBD
What Heartbeat does
Equipment Overview
DRBD Install and Configuration
Heartbeat Configuration
Credits
References
Appendix A (changes made)
Appendix B (Flash op panel fix)
Appendix C (Update Elastix w/ DRBD)
Appendix D (Resize DRBD partition)
Appendix E (DRBD w/ 2 NICs)
Appendix F (TFTPBOOT on DRBD)
Appendix G (Three Nodes)
Appendix H (IP Sourcing)



Operational Overview

What is DRBD?

DRBD refers to block devices designed as a building block to form high availability (HA)
clusters. This is done by mirroring a whole block device via an assigned network. DRBD
can be understood as network based RAID-1.
In the illustration above, the two orange boxes represent two servers that form an HA
cluster. The boxes contain the usual components of a Linux kernel: file system, buffer
cache, disk scheduler, disk drivers, TCP/IP stack and network interface card (NIC) driver.
The black arrows illustrate the flow of data between these components.
The orange arrows show the flow of data, as DRBD mirrors the data of a highly available
service from the active node of the HA cluster to the standby node of the HA cluster.
In our implementation we will be creating a DRBD synchronized partition on /dev/sda3
called /replica. This partition will contain only those directories and files we want
synchronized between our primary and secondary server, namely the important Asterisk
and Elastix related directories and files.








What Heartbeat does

The upper part of this picture shows a cluster where the left node is currently active, i.e.,
the service's IP address that the client machines are talking to is currently on the left node.
The service, including its IP address, can be migrated to the other node at any time, either
due to a failure of the active node or as an administrative action. The lower part of the
illustration shows a degraded cluster. In HA speak the migration of a service is called
failover, the reverse process is called failback, and when the migration is triggered by an
administrator it is called switchover.
In our implementation we will utilize Heartbeat to monitor the state of two servers and,
during a failover, mount our synchronized partition on the secondary server and start up
the following resources/applications: asterisk, mysql and http. During failover our floating
IP address will move from the primary to the secondary server. This IP address should be
used to register SIP and other VoIP endpoints.


Equipment Overview

This installation scenario assumes two servers, each with one Ethernet interface and a
single SATA hard drive. You may have a different type of hard drive (IDE, SCSI, etc.) and
therefore some of these steps may need to be modified to better reflect your environment.

yum -y update

Primary (p)
Partition number (3)
Press enter until returned to fdisk command prompt
NOTE: if your servers have two different sized hard drives it is imperative that the
third partition is identical in size or they will never synchronize over DRBD. Do this
by accepting the default first cylinder and then specifying the Last cylinder with the
+sizeM option. Ex. +6048M. Make these same specifications on both servers.
Press t to change the partition system ID
Press 3 to choose partition number
Choose HEX 83 for type
Press w to save changes

RESTART SERVER
7. Format newly made partition
mke2fs -j /dev/sda3

8. Now we delete the file system from the disk we just created
dd if=/dev/zero bs=1M count=500 of=/dev/sda3; sync

9. Install DRBD, Heartbeat and dependencies with yum.
yum install heartbeat drbd83 kmod-drbd83
Note: If you experience problems with drbd83, use the drbd82 version (64 bit
versions).

10. To ensure proper host name to IP resolution it is recommended that you manually
update the /etc/hosts file to reflect proper host-to-IP mapping. Add the following:
192.168.1.242 voipserver.drbd
192.168.1.243 voipbackup.drbd
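
A minimal sketch for adding these entries without creating duplicates when the step is repeated on either server. The add_host function and the HOSTS_FILE variable are names made up for this example and are not part of Elastix or DRBD; pointing HOSTS_FILE at a copy lets you try it before touching /etc/hosts.

```shell
#!/bin/bash
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME is missing.
# HOSTS_FILE is an assumed variable so the function can be tried on a copy first.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"

add_host() {
    local ip="$1" name="$2"
    # -F treats the dots in the name literally, -w matches it as a whole word
    if grep -qwF -- "$name" "$HOSTS_FILE" 2>/dev/null; then
        echo "$name already present in $HOSTS_FILE"
    else
        echo "$ip $name" >> "$HOSTS_FILE"
        echo "added $ip $name"
    fi
}
```

Usage would be "add_host 192.168.1.242 voipserver.drbd" followed by "add_host 192.168.1.243 voipbackup.drbd", run as root on both servers.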

11. Edit /etc/drbd.conf on voipserver.drbd. Modify this sample to meet your particular
needs.
global { usage-count no; }
resource r0 {
protocol C;
startup { wfc-timeout 10; degr-wfc-timeout 30; } #change timers to your need
disk { on-io-error detach; } # or panic, ...
net {

after-sb-0pri discard-least-changes;
after-sb-1pri discard-secondary;
after-sb-2pri call-pri-lost-after-sb;
cram-hmac-alg "sha1";
shared-secret "Super Secret Password!";
}
syncer { rate 5M; }
on voipserver.drbd {
device /dev/drbd0;
disk /dev/sda3;
address 192.168.1.242:7788;
meta-disk internal;
}
on voipbackup.drbd {
device /dev/drbd0;
disk /dev/sda3;
address 192.168.1.243:7788;
meta-disk internal;
}
}


Note: The following lines help the servers recover from split brain. Split brain occurs
when both servers have been in primary mode at the same time, so DRBD needs a policy
for deciding which server assumes the primary/secondary role (and whose changes are
kept or discarded).
after-sb-0pri discard-least-changes;
after-sb-1pri discard-secondary;
after-sb-2pri call-pri-lost-after-sb;

Reference: http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

12. Replicate this config file (/etc/drbd.conf) to the second server


scp /etc/drbd.conf root@voipbackup.drbd:/etc/

13. Initialize the meta-data area on disk before starting drbd (on both servers!)
drbdadm create-md r0


14. Start drbd on both nodes (service drbd start)


service drbd start

15. Verify that both servers are secondary


cat /proc/drbd


16. As you can see, both nodes are secondary, which is normal. We need to decide
which node will act as the primary now (voipserver.drbd). Running the following on that
node will initiate the first 'full sync' between the two nodes:
drbdadm -- --overwrite-data-of-peer primary r0

17. Launch the following command and wait until synchronization has finished

watch -n 1 cat /proc/drbd
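
Instead of watching the output by eye, the sync state can be tested in a script. This is a sketch that assumes the DRBD 8.3 layout of /proc/drbd, where the status line for minor 0 contains "cs:" (connection state) and "ds:" (disk states). The function name drbd_in_sync is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: succeed only when DRBD minor 0 is connected and both
# disks are UpToDate. Assumes a DRBD 8.3 status line such as:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
drbd_in_sync() {
    local status_file="${1:-/proc/drbd}"
    local line
    # return 2 when the file is missing or has no line for minor 0
    line=$(grep -E '^ *0: ' "$status_file") || return 2
    case "$line" in
        *cs:Connected*ds:UpToDate/UpToDate*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, "until drbd_in_sync; do sleep 5; done" would block until the initial sync has completed. Note that it would also keep looping if DRBD is not loaded at all, so check "service drbd status" first.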

18. We can now format /dev/drbd0 and mount it on voipserver.drbd:


mkfs.ext3 /dev/drbd0
mkdir /replica


mount /dev/drbd0 /replica

19. We can determine the role of a server by executing the following;


drbdadm role r0
The primary server should return:

Primary/Secondary
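
The role string is always written as local role, slash, peer role, which makes it easy to guard later steps in a script. A small sketch (the function name is_primary is an assumption for illustration):

```shell
#!/bin/bash
# Hypothetical helper: succeed when the local node is the DRBD primary.
# Takes a role string such as "Primary/Secondary"; when no argument is given
# it asks drbdadm for the role of resource r0.
is_primary() {
    local roles="${1:-$(drbdadm role r0)}"
    # keep only the part before the first slash, which is the local role
    [ "${roles%%/*}" = "Primary" ]
}
```

For example, "is_primary && mount /dev/drbd0 /replica" would attempt the mount only on the node that currently holds the primary role.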

20. Now we will copy all of the directories we want synchronized between the two
servers to our new partition, remove the original directories and then create
symbolic links to replace them on voipserver.drbd
cd /replica
amportal chown
tar -zcvf etc-asterisk.tgz /etc/asterisk
tar -zxvf etc-asterisk.tgz
tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
tar -zxvf var-lib-asterisk.tgz
tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
tar -zxvf usr-lib-asterisk.tgz
tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
tar -zxvf var-spool-asterisk.tgz
tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
tar -zxvf var-lib-mysql.tgz
tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
tar -zxvf var-log-asterisk.tgz
tar -zcvf var-www.tgz /var/www/
tar -zxvf var-www.tgz
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/www

rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
cd /
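
The copy, remove and link sequence above follows one pattern per directory, so it can be sketched as a single function. This is an illustration rather than the method used in this guide: it copies with "cp -a" instead of the tar round trip, and the names relocate_dir and REPLICA are made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: move one directory onto the replicated partition and
# leave a symbolic link behind. REPLICA is an assumed variable naming the
# DRBD mount point.
REPLICA="${REPLICA:-/replica}"

relocate_dir() {
    local src="${1%/}"                # drop a trailing slash: /var/www/ -> /var/www
    local dest="$REPLICA$src"
    # refuse anything that is not a real directory (for example an existing link)
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "skip $src: not a real directory" >&2; return 1; }
    # refuse to copy on top of data that is already on the replica
    [ ! -e "$dest" ] || { echo "skip $src: $dest already exists" >&2; return 1; }
    mkdir -p "$(dirname "$dest")"
    cp -a "$src" "$dest" || return 1  # -a keeps ownership, modes and timestamps
    rm -rf "$src"
    ln -s "$dest" "$src"
}
```

A loop such as "for d in /etc/asterisk /var/lib/asterisk /usr/lib/asterisk /var/spool/asterisk /var/lib/mysql /var/log/asterisk /var/www; do relocate_dir "$d"; done" would then cover the same directories. Make sure /replica is actually mounted before running it, otherwise the data lands on the root partition.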

21. Stop mysqld, asterisk and httpd services on voipserver.drbd


service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop

22. Verify services are down and proceed to switch manually to the second server:
[root@voipSERVER.drbd /]# umount /replica ; drbdadm secondary r0


Now switch to the VOIPBACKUP server

[root@voipBACKUP.drbd /]#  mkdir /replica ; drbdadm primary r0 ; mount /dev/drbd0 /replica


[root@voipBACKUP.drbd /]# ls /replica/
Note: This is used to check if you are replicating information on both servers. You should
see all data replicated in the secondary server just like data in the primary.
* DO NOT perform this action with the physical terminal logged in. Use SSH. Otherwise, it will fail to
unmount the /replica folder for some reason! Also make sure you are not IN the replica folder. Type "cd /" .
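
Because the unmount fails when your shell is sitting inside the replica folder, a script can check for that first. A small sketch, with outside_mount as a made-up function name:

```shell
#!/bin/bash
# Hypothetical helper: succeed only when the current directory is NOT the
# given mount point or anything below it.
outside_mount() {
    local mnt="${1:-/replica}"
    mnt="${mnt%/}"                    # tolerate a trailing slash
    case "$PWD/" in
        "$mnt"/*) return 1 ;;         # inside the mount point
        *) return 0 ;;
    esac
}
```

For example, "outside_mount /replica && umount /replica && drbdadm secondary r0" would skip the switch when run from inside /replica. It only checks the current shell, so other processes holding files open on /replica can still block the unmount.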
23. Verify voipserver.drbd status (Primary/Secondary)
drbdadm role r0

24. Execute df -h on the primary to confirm that our /dev/drbd0 partition is
mounted and in use.
[root@voipserver ~]# df -h
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/sda...   ...G  ...G  ...G   ...   ...
tmpfs         ...G  ...   ...G   ...   /dev/shm
/dev/drbd0    ...G  ...M  ...G   ...   /replica

Note: Executing this same command on voipbackup.drbd while it is in secondary mode should
not display the /dev/drbd0 partition. It only appears once that server assumes primary mode.
25. Now we will remove and link on voipbackup.drbd
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/

rm -rf /var/spool/asterisk
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
rm -rf /var/www
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www

26. Stop mysqld, asterisk and httpd services on voipbackup.drbd


service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop

27. Now switch back to the first server :


[root@voipBACKUP.drbd /]# umount /replica/ ; drbdadm secondary r0



Now switch to the VOIPSERVER server

[root@voipSERVER.drbd /]# drbdadm primary r0 ; mount /dev/drbd0 /replica

28. Drbd is working ... let's be sure that it will always be started:
chkconfig drbd on

Heartbeat Configuration
29. Remember to stop any boot up services on both servers that should be controlled
by heartbeat. These services will be controlled by heartbeat on the server that is in
control.
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
chkconfig elastix-updaterd off
chkconfig elastix-portknock off
service mysqld stop
service asterisk stop
service httpd stop
service elastix-portknock stop
service elastix-updaterd stop

30. Let's configure a simple /etc/ha.d/ha.cf file on voipserver.drbd :


debugfile /var/log/ha-debug
logfile /var/log/ha-log


logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback off
node voipserver.drbd
node voipbackup.drbd

NOTE: I've set auto_failback to off. This seems more appropriate to me.
use the following command on the current secondary to switch back:
sh /usr/lib/heartbeat/hb_takeover
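
The timer values in ha.cf depend on each other, so it can help to sanity-check them after editing. The sketch below encodes two rules of thumb as assumptions: the timers should be ordered keepalive < warntime < deadtime, and initdead should be at least twice deadtime. It also assumes plain whole-second values (no "ms" suffix). The function names are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the first value set for a directive in ha.cf.
ha_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Hypothetical check: succeed when keepalive < warntime < deadtime and
# initdead >= 2 * deadtime. Fails if any of the four directives is missing.
check_ha_timers() {
    local cf="${1:-/etc/ha.d/ha.cf}"
    local keepalive warntime deadtime initdead
    keepalive=$(ha_value keepalive "$cf")
    warntime=$(ha_value warntime "$cf")
    deadtime=$(ha_value deadtime "$cf")
    initdead=$(ha_value initdead "$cf")
    [ "$keepalive" -lt "$warntime" ] &&
    [ "$warntime" -lt "$deadtime" ] &&
    [ "$initdead" -ge $((2 * deadtime)) ]
}
```

With the values from step 30 (keepalive 2, warntime 10, deadtime 30, initdead 120) "check_ha_timers" succeeds.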


31. Create also the /etc/ha.d/authkeys on voipserver.drbd:


auth 1
1 sha1 MySecret

32. Change permissions on the /etc/ha.d/authkeys file on voipserver.drbd:


chmod 600 /etc/ha.d/authkeys
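
Rather than typing a secret such as "MySecret" by hand, the file can be generated with a random key. This is a sketch under the assumption that any sufficiently random string is acceptable as the sha1 key; write_authkeys is a made-up name.

```shell
#!/bin/bash
# Hypothetical helper: write a Heartbeat authkeys file with a random sha1 key
# and restrict it to the owner, as step 32 requires.
write_authkeys() {
    local path="${1:-/etc/ha.d/authkeys}"
    local secret
    # 32 random bytes hashed into 40 hex characters
    secret=$(head -c 32 /dev/urandom | sha1sum | cut -d' ' -f1)
    # the subshell keeps the restrictive umask from leaking into the caller
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$secret" > "$path" )
    chmod 600 "$path"
}
```

Generate the file once on voipserver.drbd and copy it to voipbackup.drbd (as in step 35); both nodes must hold the same key.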

33. Edit /etc/ha.d/haresources on voipserver.drbd. (It is two lines! Formatting is
important.) Replace the email addresses on the second line with your own.
voipserver.drbd drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 IPaddr::192.168.1.245/24/eth0/192.168.1.255 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailgoeshere.com,your@emailgoeshere.com::DRBD/HA-ALERT
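
Since an editor or a copy and paste can silently wrap the long first line, a quick check that the file really consists of two resource lines can save some debugging. A sketch, with check_haresources as a made-up name:

```shell
#!/bin/bash
# Hypothetical check: succeed when the haresources file has exactly two
# non-comment, non-blank lines and both start with the preferred node name.
check_haresources() {
    local file="${1:-/etc/ha.d/haresources}"
    local node="${2:-voipserver.drbd}"
    awk -v node="$node" '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        { total++; if ($1 == node) good++ }
        END { exit !(total == 2 && good == 2) }
    ' "$file"
}
```

Running "check_haresources && echo OK" on both servers after step 35 also confirms that the copied file kept its two-line layout.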

34. Start the heartbeat service on voipserver.drbd :


service heartbeat start

35. Replicate now the ha.cf, authkeys and haresources to voipbackup.drbd and start
heartbeat
[root@voipserver.drbd ha.d]# scp /etc/ha.d/ha.cf /etc/ha.d/authkeys /etc/ha.d/haresources root@voipbackup.drbd:/etc/ha.d/
[root@voipbackup.drbd ha.d]# service heartbeat start

36. Configure heartbeat to initialize at boot on both servers




chkconfig --add heartbeat


chkconfig heartbeat on

37. Verify voipserver.drbd status (Primary/Secondary)


drbdadm role r0


38. Execute df -h on the primary to confirm that our /dev/drbd0 partition is
mounted and in use.
[root@voipserver ~]# df -h
Filesystem    Size  Used  Avail  Use%  Mounted on
/dev/sda...   ...G  ...G  ...G   ...   ...
tmpfs         ...G  ...   ...G   ...   /dev/shm
/dev/drbd0    ...G  ...M  ...G   ...   /replica


39. Test your work by creating a SIP extension (or anything else) in the Elastix web
interface, then shut down your primary server while running a continuous ping to
192.168.1.245 (the floating IP address) to verify that it doesn't lose connectivity. Make
another change on the secondary server, turn your primary back on, and all
changes should be kept intact.
Special Note: Any changes to Asterisk files should be made via the web interface
ONLY. Do not attempt to upgrade the Elastix version once the cluster is finished, or else it will
write its own files again, discarding the links to the /replica directory (see Appendix C for
an update procedure).
Troubleshooting:
tcpdump -i eth0:0 -s 1500 -w captura.pcap #capture traffic
mv captura.pcap /var/www/html #move file to web for download

References
http://wiki.centos.org/HowTos/Ha-Drbd
http://support.red-fone.com/downloads/elastix/Elastix HA Cluster.pdf
http://danielaliaman.com/blog/files/phonecube/cluster/AsteriskCluster.pdf
http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html

Author: Nick Ross (based on the work of Daniel Guevara and Amjad Jabali)


APPENDIX A
I changed a few things within the original 2.3 document. Below is my description and
reasoning for all of these changes.
CHANGE 1 - I edited the instructions for only one network card. While it's best practice to use a second network
card for DRBD, virtual machines make this irrelevant. To implement this, I lowered the rate at which DRBD replicates
from 100M to 5M (40 Mbps). This is to keep replication traffic from congesting VoIP traffic. After the initial sync,
very little data will be transferred, so this is not likely to be an issue. Additionally, DRBD now uses the servers' main
IP addresses (192.168.1.242 & .243), rather than a separate subnet. Remember to change the IP addresses to those
of your servers. The "/etc/ha.d/haresources" file contains the IP address for the cluster, which should also be set by you.
IF YOU WISH TO USE A SECOND NIC, you can easily do so by going to STEP 11 to edit drbd.conf and changing
the IP addresses in this file to match those of your dedicated NICs. You will also want to go to STEP 30, and change
ha.cf to state "bcast eth1" instead of eth0.
CHANGE 2 - In STEP 11 after-sb-0pri is set to "discard-least-changes". I felt this was a better option for Asterisk. In
the event that a split brain occurs, it really doesn't matter who was primary last. What matters is who was the functioning
server for outgoing calls. The functioning server is likely to have had more changes made, and therefore to have the most
current configuration. This is important, because the after-sb-2pri setting refers back to the after-sb-0pri setting.
CHANGE 3 - Being logged into the physical console/terminal caused Step 22 to fail. I made an entry in the notes.
CHANGE 4 - Step 28 was changed to "chkconfig drbd on". It previously had "chkconfig drbd83 on". That command
did not work for me on 2.4, but my command did work.
CHANGE 5 - Step 30 was changed to set "auto_failback" to off. Every time the server fails over or fails back, service is interrupted.
Personally, I'd like to investigate the reason for the failover before I switch back. If the backup server fails,
you will still be able to automatically fail over to the original primary server, so there is little risk to this. On the other hand, automatically
failing BACK to the original server could cause constant interruptions if it had a bouncing NIC or the server
keeps restarting due to a hardware issue.
Manual failback can be completed by the command "sh /usr/lib/heartbeat/hb_takeover" on
whichever server is secondary, or "sh /usr/lib/heartbeat/hb_standby" on whichever server is
currently the primary.
CHANGE 6 - Step 36 was edited to add "chkconfig heartbeat on" . Heartbeat did not automatically
start until this command was added.
CHANGE 7 - Step 33 added "elastix-updaterd elastix-portknock" to haresources file. The services were also added to
Steps 21,26 and 29. They rely on /var/www resources, so they can't be loaded at boot.
CHANGE 8 - Step 20 & 25 added "rm -rf /var/www" as it was missing from previous documents
CHANGE 9 - Step 33, added a line for email notifications.
CHANGE 10 - Step 20, added "amportal chown". This is our last chance to do this and ensure ownership, as it does not work
once the files are moved to the DRBD partition. Otherwise, we could be stuck without ownership, which can happen after an update.
- Appendix and Changes by Nick R.

APPENDIX B
Flash Operator Panel Fix
The Flash Operator Panel depends on a process in order to receive updates made in FreePBX/Elastix.
This process is normally loaded on boot, and called within the /etc/init.d/rc.local file. It requires files in
/var/www/ to launch successfully. Unfortunately, since we are using DRBD, the clustered data holds /var/www
and is not available on boot. We must configure heartbeat to launch this process, but first we must make
a Linux Standard Base service script. YOU MUST DO THIS ON BOTH THE PRIMARY AND THE SECONDARY
SERVERS!
This only matters if you actually use the Flash Operator Panel. If you don't, then there is little reason to do this.
Step 1 - Type the following command on VOIPSERVER (which I'll assume is in the PRIMARY role)
nano /etc/init.d/fop_start
Step 2 - In the editor, copy the fop_start script, then save & exit (CTRL+O then CTRL+X). The fop_start script is found
on the following page (sorry, it didn't fit on this page!).
Step 3 - Type the command to set the appropriate permissions
chmod 755 /etc/init.d/fop_start
Step 4 - Edit the /etc/ha.d/haresources file ( you can type "nano /etc/ha.d/haresources") and add "fop_start" to the end
of the first line. It should now look like this:
voipserver.drbd drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 IPaddr::192.168.1.245/24/eth0/192.168.1.255 mysqld asterisk httpd elastix-updaterd elastix-portknock fop_start
voipserver.drbd MailTo::your@emailgoeshere.com,your@emailgoeshere.com::DRBD/HA-ALERT

Step 5- You can test to make sure the fop_start service is working by typing:
service fop_start start
service fop_start status
service fop_start stop
This should produce a bunch of messages about amportal.conf. It then should end with an "OK" for the stop and the
start command. If it does not, then your current server might be the secondary (in which case a "FAILED" result is
to be expected). To check if the current server is the primary, type "drbdadm role r0".
Step 6 - REPEAT STEPS 1-4 ON THE OTHER SERVER (VOIPBACKUP)! Step 5 will result in a "FAILED" message
until it assumes the primary role. Running "sh /usr/lib/heartbeat/hb_takeover" will make you assume the primary role.

Script for /etc/init.d/fop_start


#!/bin/bash
# description: FOP daemon
# process name: fop_start
# Author: Nick Ross
. /etc/init.d/functions
[ -f /usr/sbin/amportal ] || exit 0
FOP="/usr/sbin/amportal"
FXO="/usr/sbin/fxotune"
RETVAL=0
getpid() {
pid=`ps -eo pid,comm | grep "op_server.pl" | awk '{ print $1 }'`
}
start() {
echo -n $"Starting FOP: "
getpid
if [ -z "$pid" ]; then
$FXO -s > /dev/null
$FOP start_fop > /dev/null
RETVAL=$?
fi
if [ $RETVAL -eq 0 ]; then
touch /var/lock/subsys/fop_start
echo_success
else
echo_failure
fi
echo
return $RETVAL
}
stop() {
echo -n $"Stopping FOP: "
getpid
RETVAL=$?
if [ -n "$pid" ]; then
$FOP stop_fop > /dev/null
sleep 1
getpid
if [ -z "$pid" ]; then
rm -f /var/lock/subsys/fop_start
echo_success
else
echo_failure
fi
else
echo_failure
fi
echo
return $RETVAL
}
# See how we were called.
case "$1" in
start)
start
;;
stop)
stop
;;
status)
getpid
if [ -n "$pid" ]; then
echo "FOP (pid $pid) is running..."
else
RETVAL=1
echo "FOP is stopped"
fi
;;
restart)
stop
start
;;
*)
echo $"Usage: $0 {start|stop|status|restart}"
exit 1
;;
esac
exit $RETVAL

APPENDIX C
Updating Asterisk With DRBD
Step 1 - Keep data on both servers by shutting down one server. We want to make sure
only one server is on at a time. I shut down "voipbackup", and leave "voipserver" on.
Step 2 - Stop Services on the active machine, and make sure they always stay off (heartbeat manages them)
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
service fop_start stop
Step 3 - Remove links from the /replica folder to the file system
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
Step 4 - Copy all the files in the /replica/ folder back to the root partition
mkdir /etc/asterisk
mkdir /var/lib/asterisk
mkdir /usr/lib/asterisk
mkdir /var/spool/asterisk
mkdir /var/lib/mysql
mkdir /var/log/asterisk
mkdir /var/www
cp -arf /replica/etc/asterisk/* /etc/asterisk/
cp -arf /replica/var/lib/asterisk/* /var/lib/asterisk/
cp -arf /replica/usr/lib/asterisk/* /usr/lib/asterisk/
cp -arf /replica/var/spool/asterisk/* /var/spool/asterisk/
cp -arf /replica/var/lib/mysql/* /var/lib/mysql/
cp -arf /replica/var/log/asterisk/* /var/log/asterisk/
yes | cp -arf /replica/var/www/* /var/www/
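
Steps 3 and 4 undo the linking one directory at a time, which can be sketched as the reverse of the original relocation. The function name restore_dir is made up for this example, and it copies the whole directory in one go rather than creating it first and copying its contents.

```shell
#!/bin/bash
# Hypothetical helper: replace a symbolic link with a real copy of the
# directory it points to, leaving the data on the replica untouched.
restore_dir() {
    local link="${1%/}"
    local target
    [ -L "$link" ] || { echo "skip $link: not a symbolic link" >&2; return 1; }
    target=$(readlink -f "$link") || return 1
    [ -d "$target" ] || { echo "skip $link: target is not a directory" >&2; return 1; }
    rm "$link"                  # removes only the link, same effect as unlink
    cp -a "$target" "$link"     # -a keeps ownership, modes and timestamps
}
```

For example, "restore_dir /etc/asterisk" would leave a real /etc/asterisk directory on the root partition. Run it only while /replica is mounted, since the link target must be readable.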

STEP 5 - Remove everything in the /replica folder.
rm -rf /replica/*
STEP 6 - Run the yum update for elastix, then "amportal chown" to ensure proper ownership after the update:
yum -y update elastix*
amportal chown
STEP 7- Recopy and link everything to the /replica folder.
cd /replica
tar -zcvf etc-asterisk.tgz /etc/asterisk
tar -zxvf etc-asterisk.tgz
tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
tar -zxvf var-lib-asterisk.tgz
tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
tar -zcvf var-www.tgz /var/www/
tar -zxvf usr-lib-asterisk.tgz
tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
tar -zxvf var-spool-asterisk.tgz
tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
tar -zxvf var-lib-mysql.tgz
tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
tar -zxvf var-log-asterisk.tgz
tar -zxvf var-www.tgz
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
rm -rf /var/www
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
cd /

Step 8 - Shut down the Primary server (voipserver)


Step 9- Start up secondary (voipbackup), run the command
sh /usr/lib/heartbeat/hb_takeover
Step 10 - Repeat Steps 1-7 on voipbackup
Step 11 - Boot up "voipserver", then once it has been up for a few minutes and synced,
restart "voipbackup". Use the command "service drbd status" to check syncing. If you
did everything right, you will have an updated Elastix/Asterisk system, and the DRBD
data will reconcile.

APPENDIX D
RESIZING OR MOVING the DRBD Partition
This is a very long and drawn out process, and is not for the weak of heart. The concept is similar to the previous section,
where we move everything off the DRBD partition, and then back on. If you've changed the partition, or want to specify a
different size than what I have, then make the changes accordingly. Please ensure you perform the steps on the correct
server.
%%%%% ON VOIPBACKUP (#2) %%%%%
STEP 1 - We have to make sure that VOIPBACKUP will not start DRBD or HEARTBEAT on reboot.
chkconfig heartbeat off
chkconfig drbd off
<Shut down the VOIP BACKUP server>
@@@@ ON VOIPSERVER (#1) @@@@
Step 2 - We must turn all the services off, unlink the folders in replica, then copy the data back to the root partition
chkconfig asterisk off
chkconfig mysqld off
chkconfig httpd off
service mysqld restart
service mysqld stop
service asterisk stop
service httpd stop
service elastix-updaterd stop
service elastix-portknock stop
service fop_start stop
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
mkdir /etc/asterisk
mkdir /var/lib/asterisk
mkdir /usr/lib/asterisk
mkdir /var/spool/asterisk
mkdir /var/lib/mysql
mkdir /var/log/asterisk
mkdir /var/www
cp -arf /replica/etc/asterisk/* /etc/asterisk/
cp -arf /replica/var/lib/asterisk/* /var/lib/asterisk/
cp -arf /replica/usr/lib/asterisk/* /usr/lib/asterisk/
cp -arf /replica/var/spool/asterisk/* /var/spool/asterisk/
cp -arf /replica/var/lib/mysql/* /var/lib/mysql/
cp -arf /replica/var/log/asterisk/* /var/log/asterisk/
yes | cp -arf /replica/var/www/* /var/www/
amportal chown

@@@@ Still on VOIPSERVER #1 @@@@


Step 3 - We must now turn the services off, temporarily prevent them from starting up, and remove & recreate the
DRBD partition. When partitioning, please type in the commands manually. They will not copy and paste in, and you
need to be sure you are using the right partitions. Make sure you've read the information and are answering correctly.
If you are changing the partition or drive, you will need to make modifications accordingly.
chkconfig heartbeat off
chkconfig drbd off
service heartbeat stop
service drbd stop
fdisk /dev/sda
p
d
3
p
n
p
<press enter>
+10000M (or whatever size you desire)

t
3
83
w
STEP 4- Reboot VOIPSERVER. Start up VOIPBACKUP.
%%%%% ON VOIPBACKUP (#2) %%%%%
STEP 5- We must now force the backup server to a DRBD primary role, and remount the replica folder
service drbd start
drbdadm primary r0 ; mount /dev/drbd0 /replica
STEP 6 - We must perform the copy from the replica folder, back to the root partition, as we did on the other server.
unlink /etc/asterisk
unlink /var/lib/asterisk
unlink /usr/lib/asterisk
unlink /var/spool/asterisk
unlink /var/lib/mysql
unlink /var/log/asterisk
unlink /var/www/www
unlink /var/www
mkdir /etc/asterisk
mkdir /var/lib/asterisk
mkdir /usr/lib/asterisk
mkdir /var/spool/asterisk
mkdir /var/lib/mysql
mkdir /var/log/asterisk
mkdir /var/www
cp -arf /replica/etc/asterisk/* /etc/asterisk/
cp -arf /replica/var/lib/asterisk/* /var/lib/asterisk/
cp -arf /replica/usr/lib/asterisk/* /usr/lib/asterisk/
cp -arf /replica/var/spool/asterisk/* /var/spool/asterisk/
cp -arf /replica/var/lib/mysql/* /var/lib/mysql/
cp -arf /replica/var/log/asterisk/* /var/log/asterisk/
yes | cp -arf /replica/var/www/* /var/www/
amportal chown
STEP 7 - We must now end the DRBD service, and repartition VOIPBACKUP, as we did on the previous server
service drbd stop
fdisk /dev/sda
p
d
3
p
n
p
<press enter>
+10000M (or whatever size you desire - just make sure it's the SAME SIZE ON BOTH SERVERS!)
t
3
83
w
STEP 8 - Reboot VOIPBACKUP. VOIPSERVER should also be on and running.
!!!!!!!! ON BOTH VOIPSERVER AND VOIPBACKUP !!!!!!
STEP 9 - Perform these commands on VOIPSERVER first, then perform them again on VOIPBACKUP. You must
do this on BOTH machines. We run "drbdadm create-md r0" to ensure that the DRBD metadata is freshly written and not
the old metadata. This ensures it uses the full size of the new partition. Also, do not cut and paste "<enter yes>".
mke2fs -j /dev/sda3
dd if=/dev/zero bs=1M count=500 of=/dev/sda3; sync
drbdadm create-md r0
<enter yes>
drbdadm create-md r0
service drbd start

@@@@ ON VOIPSERVER (#1) @@@@


STEP 10 - On voipserver only, issue the following commands. Do not copy and paste the second line. It is an
instruction for you to wait. Use "service drbd status" to check if you are 100% synced.
drbdadm -- --overwrite-data-of-peer primary r0
(wait for the systems to sync 100% - check with service drbd status)
mkfs.ext3 /dev/drbd0
mkdir /replica
mount /dev/drbd0 /replica
STEP 11 - We are now going to take the data on VOIPSERVER, and use it to fill the /replica folder. We will then
unmount the replica, put VOIPSERVER in a secondary status, and then set the services to be active when we reboot.
DO NOT REBOOT VOIPSERVER YET.
cd /replica
tar -zcvf etc-asterisk.tgz /etc/asterisk
tar -zxvf etc-asterisk.tgz
tar -zcvf var-lib-asterisk.tgz /var/lib/asterisk
tar -zxvf var-lib-asterisk.tgz
tar -zcvf usr-lib-asterisk.tgz /usr/lib/asterisk/
tar -zcvf var-www.tgz /var/www/
tar -zxvf usr-lib-asterisk.tgz
tar -zcvf var-spool-asterisk.tgz /var/spool/asterisk/
tar -zxvf var-spool-asterisk.tgz
tar -zcvf var-lib-mysql.tgz /var/lib/mysql/
tar -zxvf var-lib-mysql.tgz
tar -zcvf var-log-asterisk.tgz /var/log/asterisk/
tar -zxvf var-log-asterisk.tgz
tar -zxvf var-www.tgz
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
rm -rf /var/www
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
cd /
umount /replica ; drbdadm secondary r0
chkconfig heartbeat on
chkconfig drbd on

%%%% ON VOIPBACKUP (#2) %%%%%


STEP 12 - On VOIPBACKUP, we are now going to assume the primary role and mount the replica partition.
Once this is done we remove the original folders, create the soft links, and then set the services to turn on after a
reboot.
mkdir /replica ; drbdadm primary r0 ; mount /dev/drbd0 /replica
rm -rf /etc/asterisk
rm -rf /var/lib/asterisk
rm -rf /usr/lib/asterisk/
rm -rf /var/spool/asterisk
rm -rf /var/lib/mysql/
rm -rf /var/log/asterisk/
rm -rf /var/www
ln -s /replica/etc/asterisk/ /etc/asterisk
ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
ln -s /replica/var/lib/mysql/ /var/lib/mysql
ln -s /replica/var/log/asterisk/ /var/log/asterisk
ln -s /replica/var/www /var/www
chkconfig heartbeat on
chkconfig drbd on
STEP 13 (FINAL STEP) - Reboot VOIPBACKUP first, and then reboot VOIPSERVER. At this point, the servers will
come back up, and one should assume the PRIMARY role, and begin replicating correctly. Your new DRBD partition
should be sized correctly ( run "df -h" to check this on the primary ).
If you have auto_failback set to off, and you do not like which server assumed the primary role, then switch to
whichever server is secondary and type:
sh /usr/lib/heartbeat/hb_takeover
This will set that server as the DRBD primary node. Whew. That was a lot of work. I hope you never have to do this.
It was a complete pain in the ass to both do and to document. The better strategy is to give yourself plenty of space
in the beginning.

APPENDIX E
Using Two Network Interface Cards
The original guide that this was based on used a dual NIC configuration. Using a separate NIC for DRBD is usually a
good idea. You want the replication to work through an independent channel, typically where bandwidth for your
services will not be starved by DRBD replication traffic. There are many situations, however, where this is not
necessarily desirable. For instance, if you use virtual machines, there wouldn't be much point in adding a second
virtual NIC. There is also the case of having an elastix based appliance, with only a single NIC built in, and a second
NIC is unavailable.
Asterisk servers are usually not very disk intensive, so it's unlikely DRBD will be sending much data. If you are worried
about DRBD using too much bandwidth, then you can lower the replication rate limit below 5 MB/s (which is 40 Mbit/s).
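The rate in drbd.conf is given in bytes per second, while network links are usually quoted in bits per second, so the conversion is just a factor of 8. A small sketch of that arithmetic follows; both function names are hypothetical helpers, and the division rounds down because shell arithmetic is integer only.

```shell
#!/bin/sh
# Sketch: convert between a DRBD syncer rate (MB/s) and link usage (Mbit/s).
# Both helpers are illustrative; 1 byte = 8 bits is the only fact used here.
mbytes_to_mbits() {
    echo $(( $1 * 8 ))
}

mbits_to_mbytes() {
    echo $(( $1 / 8 ))    # integer division, rounds down
}
```

For example, "mbytes_to_mbits 5" prints 40, matching the figure above. If you only want DRBD to use roughly 16 Mbit/s, "mbits_to_mbytes 16" prints 2, so the line would become "syncer { rate 2M; }".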
If you wish to use two network cards, the changes to the original implementation are very simple. In this scenario, we
will assume VOIPSERVER has a secondary NIC eth1, with an IP address of 10.1.1.1 . We will assume VOIPBACKUP
has a secondary NIC eth1, with an IP address of 10.1.1.2 .
Note: the IP addresses do not need to be in a different subnet. They can be whatever you assign them to. Just make sure they
are static addresses, or that you have set a DHCP reservation for these NICs.
!!!!!!!!! MAKE THE CHANGES ON BOTH SERVERS!!!!!
Step 1- Change the following section of the /etc/drbd.conf:
syncer { rate 5M; }
on voipserver.drbd {
device /dev/drbd0;
disk /dev/sda3;

address 10.1.1.1:7788;
meta-disk internal;
}
on voipbackup.drbd {
device /dev/drbd0;
disk /dev/sda3;

address 10.1.1.2:7788;
meta-disk internal;
}
}
Step 2- Change the following section of the /etc/ha.d/ha.cf file
warntime 10
initdead 120
udpport 694

bcast eth1
auto_failback off
node voipserver.drbd
node voipbackup.drbd
Step 3 - Reboot both servers, and you are done.
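Since the same edits have to be made on both servers, a quick way to compare them is to print just the values that changed. This is a rough sketch: the two function names are invented for this example, and it assumes the files use the one-setting-per-line layout shown above.

```shell
#!/bin/sh
# Sketch: show the replication addresses and the heartbeat broadcast NIC.
# drbd_conf_addresses and hb_bcast_iface are illustrative helper names.
drbd_conf_addresses() {
    # prints every "address IP:PORT;" value without the trailing semicolon
    awk '$1 == "address" { sub(/;$/, "", $2); print $2 }' "${1:-/etc/drbd.conf}"
}

hb_bcast_iface() {
    # prints the interface named on the "bcast" line of ha.cf
    awk '$1 == "bcast" { print $2 }' "${1:-/etc/ha.d/ha.cf}"
}
```

With the two-NIC changes in place, "drbd_conf_addresses" should list 10.1.1.1:7788 and 10.1.1.2:7788 on both servers, and "hb_bcast_iface" should print eth1.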

APPENDIX F
Moving TFTPBOOT
The TFTP boot can be easily moved by copying it to the replica folder, deleting the original folder, and then creating a symbolic
link to the replica folder. As you can see, it is very similar to the procedure we've used for all other folders.
Step 1- Go on the active/primary computer and type the following commands
cd /replica
tar -zcvf tftpboot.tgz /tftpboot/
tar -zxvf tftpboot.tgz
rm -rf /tftpboot
ln -s /replica/tftpboot /tftpboot
Step 2- Go on the secondary computer and type these commands
rm -rf /tftpboot
ln -s /replica/tftpboot /tftpboot
That is all there is to it!
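The copy, delete and link pattern from Step 1 can be wrapped in a small function if you find yourself moving more folders. This is only a sketch under my own assumptions: relocate_to_replica is a made-up name, and it uses "cp -a" rather than the tar commands above (the result should be the same, but it is a different technique). It refuses to run if the destination already exists, so it is meant for the primary only; on the secondary you would still just delete the folder and create the link.

```shell
#!/bin/sh
# Sketch: move a directory under the replica mount and leave a symlink behind.
# relocate_to_replica is an illustrative helper; it uses cp -a, not tar.
relocate_to_replica() {
    src="${1%/}"                      # e.g. /tftpboot
    replica="${2%/}"                  # e.g. /replica
    dest="$replica/$(basename "$src")"
    if [ ! -d "$src" ] || [ -L "$src" ]; then
        echo "skip: $src is missing or already a symlink" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "skip: $dest already exists" >&2
        return 1
    fi
    cp -a "$src" "$dest" || return 1  # copy first, delete only if the copy worked
    rm -rf "$src" && ln -s "$dest" "$src"
}
```

On the primary, "relocate_to_replica /tftpboot /replica" would do the same job as the Step 1 commands.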

APPENDIX G
Inserting a passive third node (no HA)
While DRBD and Heartbeat do not offer the ability to fail over to a third node, you can set up a third node for passive replication. It
will copy everything on the current DRBD cluster, and always be a secondary node. There are many complications with this process,
and the only real practical use is for disaster recovery. That being said, this is an order of magnitude more difficult than the two node
cluster, and this documentation proved too time intensive to write thoroughly. Hopefully, this will at least guide you.
Just a note - in order to do this, you basically have to recreate the original DRBD setup on your clustered partitions. If you already
are using DRBD, follow the guide in Appendix D, to the point where you fdisk your partitions. This will copy your asterisk
configs back to both servers so you do not lose your copies.
We have 3 servers. VOIPSERVER, VOIPBACKUP and VOIPDR (our passive node for disaster recovery).
VOIPDR will be 192.168.1.244.
Step 1- Follow the main guide for steps 1-9 on ALL servers.
Step 2- For step 10 on the original guide, on all servers, you will need to add this line in addition to the others in step 10:
192.168.1.244 voipdr.drbd
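Because this line has to go onto all three servers, it is easy to add it twice by accident. Here is a sketch of an add-if-missing helper; the name add_host_entry and the optional file argument are my own, and it assumes the host entries live in /etc/hosts.

```shell
#!/bin/sh
# Sketch: append "IP NAME" to a hosts file only if NAME is not already there.
# add_host_entry is an illustrative helper, not part of any package.
add_host_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"
    # -F: treat the name literally, -w: match it as a whole word
    if grep -qwF -- "$name" "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Running "add_host_entry 192.168.1.244 voipdr.drbd" on each server is then safe to repeat.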
Step 3- For Step 11 on the original guide, on all servers, add the following to drbd.conf AFTER the very last bracket:
resource r1 {
protocol A;
net {
}
syncer { rate 5M; }
stacked-on-top-of r0 {
device /dev/drbd10;
address 192.168.1.245:7788;
}
on voipdr.drbd {
device /dev/drbd10;
disk /dev/sda3;
address 192.168.1.244:7788;
meta-disk internal;
}
}
Step 4- Continue with the original guide, on only voipserver and voipbackup. STOP at step 17. DO NOT perform these steps on VoipDR.
- (skip from step 17 to 28 on the original guide)
Step 5- Perform step 28 & 29 in the original guide, on ALL SERVERS.
Step 6 - We now must configure heartbeat temporarily. This is much earlier than in the original guide, but it is required
to even get the third node up. Perform steps 30,31 and 32 on VOIPSERVER and VOIPBACKUP. There will be
NO HEARTBEAT on VOIPDR... EVER. If you try to do this on VOIPDR you will break everything.

Step 7- Instead of step 33, we must provide a very different /etc/ha.d/haresources file. The haresources file must
now have only the single following line and NO OTHERS:
voipserver.drbd IPaddr::192.168.1.245/24/eth0
Step 8 - You may now follow steps 34-36 in the original guide. Remember, VOIPDR should NOT have heartbeat running,
so you do not need to replicate these steps to VOIPDR.
Step 9 - Now is where we really deviate. We must make resource r1 on top of r0. Basically, we are putting a second
layer on our existing DRBD, which we will replicate with VOIPDR.
We will start on the primary, which should be VOIPSERVER. Type the following:
#####VOIPSERVER#####
drbdadm --stacked create-md r1
drbdadm --stacked up r1
drbdadm --stacked adjust r1
Step 10 - Now we prepare VOIPDR, but in a slightly different way
%%%%%% ON VOIPDR %%%%%%
drbdadm create-md r1
service drbd start
drbdadm adjust r1
Step 11- Now we must have VOIPSERVER assume the primary role for stacked resource r1, and force a replication
####### ON VOIPSERVER ########
drbdadm --stacked -- --overwrite-data-of-peer primary r1
You should wait until this has finished before continuing. Use "service drbd status" to check the replication progress.
Please note, we have not touched VOIPBACKUP at all in this process... and that is ok. It is not necessary. All of this
data will automatically replicate over to VOIPBACKUP if VOIPSERVER fails (or it will after we finalize the heartbeat settings).
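If you would rather not keep re-running the status command by hand, the wait can be scripted. The sketch below assumes the DRBD 8.x /proc/drbd layout, where a finished sync shows cs:Connected and ds:UpToDate/UpToDate on the line for the device (minor 10 for /dev/drbd10). The function name drbd_is_synced is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only when the given DRBD minor is connected and fully synced.
# Assumes a /proc/drbd style line such as:
#  10: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate A r-----
# drbd_is_synced is an illustrative helper, not a DRBD command.
drbd_is_synced() {
    minor="$1"
    file="${2:-/proc/drbd}"   # second argument allows testing against a saved copy
    awk -v m="$minor:" '
        $1 == m {
            found = 1
            ok = ($0 ~ /cs:Connected/ && $0 ~ /ds:UpToDate\/UpToDate/)
        }
        END { exit ((found && ok) ? 0 : 1) }' "$file"
}
```

On VOIPSERVER, something like "until drbd_is_synced 10; do sleep 10; done" would then block until the stacked resource has finished replicating to VOIPDR.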
Step 12 - Once this is completed, we can mount the resource on VOIPSERVER. Run the following commands:
####### ON VOIPSERVER #########
mkfs.ext3 /dev/drbd10
mkdir /replica
mount /dev/drbd10 /replica
Notice that we are using /dev/drbd10 instead of drbd0. drbd10 is the device that represents stacked resource r1, which is on top
of resource r0 and /dev/drbd0. All three devices will be using the stacked resource r1 & /dev/drbd10 to mount the data; this way
the data is accessible to all three nodes. If we used r0 & drbd0, VOIPDR would never receive any replicated data.
Step 13- We now must move the asterisk/elastix configuration onto our mounted volume. Still on ##VOIPSERVER##,
follow *only* STEP 20 from the original guide. Do not proceed past step 20.

Step 14- Now we are going to put in the real heartbeat configuration. The configuration we had before was only temporary,
and it was needed simply to bring VOIPDR into the replication. The following is a copy of the haresources file you should
have. It needs to match EXACTLY. This must be done on BOTH VOIPSERVER and VOIPBACKUP. Do NOT do this on
VOIPDR.
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 drbdupper::r1 Filesystem::/dev/drbd10::/replica::ext3 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailhere.com,your@emailhere.com::DRBD/HA_MAIN-NODE

Please note, if you have implemented the flash operator panel fix (Appendix B), please make the appropriate changes.
Step 15- At this point, we need to start preparing VOIPBACKUP, which is a little tricky. With a stacked resource it's a little
more complicated to unmount and switch primaries... but first things first. Let's put a /replica/ folder on VOIPBACKUP.
$$$$$$$$ ON VOIPBACKUP $$$$$$$$$$$
mkdir /replica
Step 16- At this point, the easiest way to finish up the process is to reboot and have VOIPBACKUP take the primary role
via heartbeat. Reboot both ##VOIPSERVER## and $$VOIPBACKUP$$, but give $$VOIPBACKUP$$ a 30 second headstart.
If VOIPBACKUP does not come up in the primary role, you can run "sh /usr/lib/heartbeat/hb_takeover" on it.
Step 17- The /replica/ folder should be populated now on $$VOIPBACKUP$$. If so, you can complete the transfer by
following step 25 on the original guide. Once you are done, you can give ##VOIPSERVER## control by typing in
"sh /usr/lib/heartbeat/hb_takeover" on that server.
THE GUIDE IS OVER. STOP HERE.
THE INSTRUCTIONS BELOW ARE FOR DISASTER RECOVERY ONLY!!!!!
DO NOT TRY TO FOLLOW THESE STEPS UNLESS YOU HAVE TO.
You may be wondering why you've done so little on the third node, VOIPDR. The short answer is that VOIPDR doesn't do much.
Once you set up DRBD on VOIPDR, it just sits there and replicates. There is no heartbeat, no failover, etc. If something ever
were to happen to the other two nodes, VOIPDR would have to be activated manually.
Assuming some disaster, where the other two nodes caught fire and burned, here is a quick rundown of what you'd do.
%%%%% ON VOIPDR - BUT ONLY IN THE EVENT OF A DISASTER %%%%%
Step 1)
mkdir /replica
drbdadm primary r1
mount /dev/drbd10 /replica
Step 2) FOLLOW STEP 25 ON THE ORIGINAL GUIDE to prepare the links and remove the old asterisk files.
Step 3) issue the following commands:
service mysqld start
service asterisk start
service httpd start
service elastix-updaterd start
service elastix-portknock start
fxotune -s
amportal start_fop
Now, edit the phone configurations so that they connect to VOIPDR. It's not nearly as elegant, but it will work. Alternatively,
instead of STEP 25, you can just delete the original folders, then copy them from the /replica/ folder back to their "real"
locations. After you do this, you can stop drbd ("service drbd stop"), and change the ip address of the server to that of the
old cluster (192.168.1.245). The phones will now automatically find the server and resume operations.
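In a real disaster you will probably be in a hurry, so the service start commands from Step 3 can be kept ready as a small loop that stops at the first failure. This is a sketch only: start_dr_services and the optional runner argument are my own inventions (the runner lets you rehearse with a dummy command), and the service list is copied from Step 3. The fxotune and amportal commands are left for you to run by hand afterwards.

```shell
#!/bin/sh
# Sketch: start the services from Step 3 in order, stopping on the first failure.
# DR_SERVICES and start_dr_services are illustrative names.
DR_SERVICES="mysqld asterisk httpd elastix-updaterd elastix-portknock"

start_dr_services() {
    runner="${1:-service}"    # pass another command name to do a dry run
    for svc in $DR_SERVICES; do
        "$runner" "$svc" start || {
            echo "failed to start $svc" >&2
            return 1
        }
    done
}
```

Calling "start_dr_services" with no arguments uses the normal service command, which is what you would want on VOIPDR.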

APPENDIX H
IP Sourcing
One of the problems I've encountered in a clustered environment is traffic originating from the original IP address of a device,
instead of the cluster's IP address. This can cause problems with static NAT, firewall rules and port forwarding (among other
things). Ideally, all of our heartbeat monitored services will send traffic from the cluster IP address of 192.168.1.245 . Luckily,
the heartbeat service can do just that, with the command IPsrcaddr::192.168.1.245 . This command should be added after
declaring the cluster IP address in /etc/ha.d/haresources .
It would look something like the following:
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 IPsrcaddr::192.168.1.245 drbddisk::r0 ......
Below are examples for our 2-way and 3-way replication guides
Two way replication:
voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 IPsrcaddr::192.168.1.245 drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailhere.com,your@emailhere.com::DRBD/HA_MAIN-NODE

Three way replication:


voipserver.drbd IPaddr::192.168.1.245/24/eth0/192.168.1.255 IPsrcaddr::192.168.1.245 drbdupper::r1 Filesystem::/dev/drbd10::/replica::ext3 mysqld asterisk httpd elastix-updaterd elastix-portknock
voipserver.drbd MailTo::your@emailhere.com,your@emailhere.com::DRBD/HA_MAIN-NODE

If you use the Flash operator panel, and use the fop_start guide in the appendix, remember to add the corresponding entry to
the end of your /etc/ha.d/haresources file.
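If you would rather not edit the long haresources line by hand, the IPsrcaddr entry can be slotted in with sed. The following is a sketch, not a tested production script: add_ipsrcaddr is a made-up name, it reads the file on standard input and writes the result to standard output, and it leaves alone any line that already has an IPsrcaddr entry or has no IPaddr entry (such as the MailTo line).

```shell
#!/bin/sh
# Sketch: insert "IPsrcaddr::IP" right after the IPaddr:: entry on each line.
# add_ipsrcaddr is an illustrative helper; it filters stdin to stdout.
add_ipsrcaddr() {
    ip="$1"
    # skip lines that already contain IPsrcaddr::, then append the new
    # entry after the whole IPaddr::... token (everything up to whitespace)
    sed "/IPsrcaddr::/!s#\(IPaddr::[^[:space:]]*\)#\1 IPsrcaddr::${ip}#"
}
```

For example, "add_ipsrcaddr 192.168.1.245 < /etc/ha.d/haresources > /tmp/haresources.new" writes a candidate file that you can compare against the examples above before copying it into place on both heartbeat nodes.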

I hope you found these documents useful.


Sincerely,
Nick Ross of TheServerExpert.com and CTC.
Please contact me - nick (at) theserverexpert.com
if you have any comments or find any mistakes with the guide.
