A split brain occurs when two independent systems configured in a cluster each assume they have
exclusive access to the same resources. In SFW HA (VERITAS Cluster Server) this can happen
when all cluster heartbeat links are lost simultaneously. Each cluster node then marks the
other cluster node as FAULTED. This is known as a "network partition".

This is represented in the figure below:

This scenario is possible when both LLT (Low Latency Transport) links, the cluster communication
links, connect to Node 3 via the same IP network, for example through the same network switch.
A shared network switch needs careful consideration in a Replicated Data Cluster (RDC), where
Node 3 may be located in another Data Centre and the LLT links should run over separate
network infrastructure.


Under cluster logic, VCS brings online any service groups that it now considers faulted. Those
service groups, however, are still online on the other cluster node(s), which have formed a new
cluster of their own. As each cluster attempts to online the "failed" service groups, disk
resources and volumes may be taken offline.

   
 

The symptom of a split brain is an attempt to bring a service group online on a cluster node
on the other "side" of the network partition while the group is still online elsewhere. The first
errors appear on the original node, which records disk access errors and loss of the disk group
reservation.

Using the above diagram as an example, a simultaneous LLT link failure creates two network
partitions:
- Partition A containing Node 0, 1, 2
- Partition B containing Node 3

a) In the system event log, LLT logs Event ID 10033 for each link that has expired to a node in
the other partition, so Node 3 will log messages such as:
ERROR 10033(0xc0072731) LLT <server> Link expired (tag=Adapter1, link=1, node=1)
ERROR 10033(0xc0072731) LLT <server> Link expired (tag=Adapter0, link=0, node=1)

for node=0, node=1 and node=2, and the cluster nodes in Partition A will log LLT link expired
messages for node=3.

b) In the application event log, the High Availability Daemon (HAD) logs that the cluster nodes in
the other partition have changed to state FAULTED, so Node 3 will log:
ERROR 10322(0x05dd2852) Had <server> VCS ERROR V-16-1-10322 System
<server> (Node '0') changed state from RUNNING to FAULTED

for Node '0', Node '1' and Node '2', and the cluster nodes in Partition A will log these messages
against Node '3'.
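
The two log signatures described in a) and b) can be searched for together. The following is a minimal sketch in Python, assuming the event logs have been exported to plain text lines that look like the samples quoted above. The regular expressions, the function names summarize and suspect_nodes, and the links_per_node parameter are illustrative assumptions and not part of SFW HA.

```python
import re

# Patterns modeled on the sample messages quoted above (assumption:
# an exported event log keeps this text layout).
LLT_EXPIRED = re.compile(
    r"ERROR 10033\(0x[0-9a-fA-F]+\) LLT \S+ Link expired "
    r"\(tag=(\w+), link=(\d+), node=(\d+)\)"
)
HAD_FAULTED = re.compile(r"\(Node '(\d+)'\) changed state from RUNNING to FAULTED")


def summarize(lines):
    """Collect expired LLT links per node and the nodes HAD marked FAULTED."""
    expired = {}      # node id -> set of link ids reported as expired
    faulted = set()   # node ids reported as FAULTED
    for line in lines:
        m = LLT_EXPIRED.search(line)
        if m:
            expired.setdefault(int(m.group(3)), set()).add(int(m.group(2)))
        m = HAD_FAULTED.search(line)
        if m:
            faulted.add(int(m.group(1)))
    return expired, faulted


def suspect_nodes(expired, faulted, links_per_node=2):
    """Nodes whose links all expired AND that were marked FAULTED.

    links_per_node is a hypothetical parameter: the number of LLT links
    configured towards each peer.
    """
    return sorted(
        node for node in faulted
        if len(expired.get(node, ())) >= links_per_node
    )
```

A node returned by suspect_nodes lost every LLT link and was then marked FAULTED, which is the pattern described above. This alone does not prove a split brain, because a genuine node failure produces the same messages, so the result should be checked against the logs of the nodes in the other partition.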


  
  

VCS uses heartbeats to determine the "health" of its peers. These can be private network
heartbeats and/or public (low-priority) heartbeats. Regardless of the heartbeat configuration,
VCS determines that a system has faulted when all heartbeats fail simultaneously. To prevent a
split brain, consider the following measures:

- Ensure at least two private heartbeats are configured, and keep them completely isolated
from each other so that the failure of one heartbeat link cannot possibly affect the other.
Configurations such as running two shared heartbeats to the same hub or switch, or using a
single virtual local area network (VLAN) to trunk between two switches, introduce a single
point of failure in the heartbeat architecture.

Refer to the reference Technote in the Related Document section for additional
recommendations on the private heartbeat configurations for SFW HA.

- A heartbeat over the public network generates minimal traffic on that network until only one
normal heartbeat link remains. It then becomes a fully functional heartbeat.
In a Replicated Data Cluster, minimize the effects of split brain by ensuring the cluster heartbeat
links pass through the same physical infrastructure as the replication links, so that if one
breaks, so does the other.
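
The isolation requirement for private heartbeats can be checked on paper before cabling. The following is a minimal sketch in Python, assuming each heartbeat link is described by the set of network components (hubs, switches, VLANs) it passes through. The function name shared_components and the component names in the usage note are hypothetical and are not read from any SFW HA configuration.

```python
def shared_components(links):
    """Return the components that every heartbeat link depends on.

    links maps a link name to the set of components (switches, hubs,
    VLANs) that link traverses. Any component in the result is a single
    point of failure: losing it drops all heartbeats at once.
    """
    paths = [set(path) for path in links.values()]
    if not paths:
        return set()
    # With only one link, every component on its path is a single
    # point of failure, which the intersection of one set reflects.
    return set.intersection(*paths)
```

For example, shared_components({"Adapter0": {"switch-a"}, "Adapter1": {"switch-a", "vlan-10"}}) reports switch-a, matching the "same hub or switch" case above, while two links on separate switches give an empty result.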

