Understanding MSTP
Posted by Petr Lapukhov, 4xCCIE/CCDE in Switching on Feb 22

Introduction
For some time I have been meaning to combine my two earlier blog posts about MSTP and to clarify the MSTP multi-region section. This new post recaps the information published previously and goes into more detail. It also discusses some MSTP design-related questions. Both single-region and multiple-region MSTP configurations are reviewed. The reader is assumed to have a good understanding of the classic STP and RSTP protocols as well as Cisco's PVST/PVST+ implementations.

Table of Contents
Due to the large size of the document, a table of contents is provided for ease of navigation.
- Historical Review
- Logical and Physical Topologies
- Implementing MSTP
- Caveats in MSTP Design
- MSTP Single-Region Configuration Example
- Common and Internal Spanning Tree (CIST)
- Common Spanning Tree (CST)
- Mapping MSTIs to CIST
- MSTP Multi-Region Design Considerations
- Interoperating with PVST+
- Scenario 1: CIST Root and CIST Regional Root
- Scenario 2: MSTIs and the Master Port
- Scenario 3: PVST+ and MSTP Interoperation
- Conclusions
- Further Reading

Historical Review
In the beginning there was the IEEE STP protocol, which was preceded by the DEC and IBM STP variants. All of them used the same logic originally proposed by Radia Perlman in the 1980s, while she was working at DEC. The IEEE version was adapted for use with multiple VLANs using 802.1Q frame tagging. A shared spanning tree, sometimes called Mono Spanning Tree (MST) by Cisco, or more often Common Spanning Tree (CST), was used to create a single loop-free topology. The drawback of this approach is the inability to perform VLAN traffic engineering across redundant links: if a link is blocked, it is blocked for all VLANs. Another issue stems from the way STP is constructed: more traffic is forwarded over the links closer to the root bridge, which puts higher demand on the root bridge's resources, both in terms of CPU and link capacity utilization.

To overcome these limitations, Cisco introduced the proprietary Per-VLAN Spanning Tree Protocol (PVST), which uses a separate STP instance per VLAN. Initially, PVST was created to be used with Cisco's proprietary ISL encapsulation only, but the later PVST+ version allowed for tunneling PVST BPDUs over 802.1Q trunks and IEEE STP domains. PVST allowed a different logical topology to be used with every VLAN, enabling basic Layer 2 traffic engineering. Every VLAN may use its own root bridge and forwarding topology, allowing for fairer resource utilization. This method has a limitation: it does not deal with the actual network link capacities and utilization, but rather statistically multiplexes VLANs onto different topologies. However, this limitation is inherent to any load-balancing method based on STP.

The main problem with PVST is that as the number of VLANs grows, it becomes a waste of switch resources and a management burden. This is because the number of different logical topologies is usually much smaller than the number of active VLANs. With time, PVST adopted the fast-convergence properties introduced by the IEEE RSTP protocol, but the core feature of keeping a separate copy of STP per VLAN did not change. Seeing the problems associated with the PVST approach, Cisco came up with the idea of decoupling the concepts of STP instances and VLANs. The initial implementation was called MISTP (Multiple Instances Spanning Tree Protocol) and later evolved into the IEEE 802.1s standard called MSTP (Multiple Spanning Tree Protocol).

Logical and Physical Topologies


The core idea of MSTP is to exploit the fact that a redundant physical topology has only a small number of different spanning trees (logical topologies). The figure below shows a ring topology of three switches and three different spanning trees that may result from different root bridge placements.

Instead of running an STP instance for every VLAN, MSTP runs a number of VLAN-independent STP instances (representing logical topologies), and the administrator then maps each VLAN to the most appropriate logical topology (STP instance). The number of STP instances is kept to a minimum (saving switch resources), but the network capacity is utilized in a more optimal fashion, by using all possible paths for VLAN traffic.

The switch logic for VLAN traffic forwarding changes slightly. In order for a frame to be forwarded out of a port, two conditions must be met: first, the VLAN must be active on this port (i.e. not filtered), and second, the STP instance the VLAN maps to must be in a non-discarding state for this port. The second property is normally enforced automatically, as MAC addresses are not learned on discarding ports. It is worth remembering that because multiple logical topologies are active on a port, the port could be blocking for one instance and forwarding for another (note that in (R)PVST+ a port is either forwarding or discarding for a VLAN). The figure below demonstrates six VLANs using two MSTP instances, thus reducing the number of STP trees that would be required with PVST from 6 to 2.
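The two forwarding conditions can be sketched in a few lines of Python. This is only an illustrative model, not switch code: the `Port` class, the `may_forward` helper, the VLAN numbers and the port name are all hypothetical, and the transient learning state is left out so that a port is simply forwarding or discarding per instance.

```python
from dataclasses import dataclass, field

FORWARDING = "forwarding"
DISCARDING = "discarding"

@dataclass
class Port:
    name: str
    allowed_vlans: set                                   # VLANs active (not filtered) on the port
    instance_state: dict = field(default_factory=dict)   # MST instance number -> port state

def may_forward(port, vlan, vlan_to_instance):
    """Apply the two conditions from the text to one VLAN on one port."""
    instance = vlan_to_instance.get(vlan, 0)              # unmapped VLANs belong to instance 0
    vlan_active = vlan in port.allowed_vlans
    state = port.instance_state.get(instance, DISCARDING)
    return vlan_active and state != DISCARDING

# Hypothetical example: six VLANs share two instances.
vlan_to_instance = {10: 1, 20: 1, 30: 1, 40: 2, 50: 2, 60: 2}
uplink = Port("uplink", {10, 20, 30, 40, 50, 60}, {1: FORWARDING, 2: DISCARDING})

print(may_forward(uplink, 10, vlan_to_instance))  # instance 1 forwards on this port
print(may_forward(uplink, 40, vlan_to_instance))  # instance 2 discards on this port
```

The same port gives different answers for VLAN 10 and VLAN 40, which is the per-instance behavior described above.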

Implementing MSTP
The following is a set of implementation-related questions that a theoretical implementation needs to address:
- Logical Topology Calculation. How to build multiple STP instances (logical topologies) in a single physical topology? Should we run multiple STP instances sending their BPDUs independently? If yes, then how would we distinguish every instance's BPDUs? VLAN tags cannot be used for that purpose, as STP instances are not bound to VLANs anymore.
- Information Distribution. What protocol should be used to distribute VLAN-to-instance mapping tables among switches? Should VLAN IDs be placed in BPDUs along with the respective instance numbers?
- Consistency Check. How to ensure the VLAN-to-instance mapping is consistent across all switches? That is, how would a switch know that another switch maps VLAN X to the same instance?

MISTP vs MSTP
The original pre-standard Cisco MISTP implementation sent separate BPDUs for each instance. Every BPDU contained the instance number and the list of VLANs mapped to that instance on the sending switch, which allowed for a consistency check between the switches. The table mapping VLANs to instance numbers had to be configured on each switch separately; there was no automated mechanism to distribute VLAN-to-instance mappings between the switches. The final implementation adopted by the IEEE 802.1s standard made these mechanics simpler and more elegant.

Before we proceed with discussing the IEEE implementation, let's define an MSTP region as a collection of switches sharing the same view of how the physical topology is partitioned into a set of logical topologies. For two switches to become members of the same region, the following attributes must match:
- Configuration name.
- Configuration revision number (16-bit value).
- The table of 4096 elements that maps the respective VLANs to STP instance numbers.

The IEEE 802.1s implementation does not send BPDUs for every active STP instance separately, nor does it encapsulate lists of VLAN numbers in configuration messages. Instead, a special STP instance number 0, called the Internal Spanning Tree (IST, aka MSTI0, Multiple Spanning Tree Instance 0), is designated to carry all STP-related information. The BPDUs for the IST contain all the standard RSTP-style information for the IST itself and carry additional informational fields. Among those fields are the configuration name, the revision number and a hash value calculated over the contents of the VLAN-to-MSTI mapping table. Using just this condensed information, switches can detect a misconfiguration in VLAN mappings by comparing the hash value received from the peer with the local value.
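The region membership check described above can be sketched as follows. This is a simplified model under stated assumptions: `RegionConfig`, `mapping_digest`, `region_config` and `same_region` are hypothetical names, and a plain MD5 over a 4096-entry table is used only as a stand-in for the hash, so the digest will not match what a real switch puts in its BPDUs.

```python
import hashlib
from dataclasses import dataclass

def mapping_digest(vlan_to_instance):
    """Hash the 4096-element VLAN-to-instance table (illustrative stand-in digest)."""
    table = bytearray()
    for vlan in range(4096):
        # Every VLAN not explicitly mapped belongs to instance 0 (the IST).
        table += vlan_to_instance.get(vlan, 0).to_bytes(2, "big")
    return hashlib.md5(bytes(table)).hexdigest()

@dataclass(frozen=True)
class RegionConfig:
    name: str
    revision: int
    digest: str

def region_config(name, revision, vlan_to_instance):
    return RegionConfig(name, revision, mapping_digest(vlan_to_instance))

def same_region(local, received):
    """Two switches share a region only if name, revision and digest all match."""
    return local == received

sw1 = region_config("REGION1", 0, {10: 1, 20: 1, 30: 1})
sw2 = region_config("REGION1", 0, {10: 1, 20: 1, 30: 1})
sw3 = region_config("REGION1", 0, {10: 1, 20: 1, 30: 2})  # VLAN 30 mapped differently

print(same_region(sw1, sw2))
print(same_region(sw1, sw3))
```

Only the condensed triple is compared, so a single differing VLAN mapping is enough to place two switches in different regions even though no VLAN list is ever exchanged.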

M-Records
By default, all VLANs are mapped to the IST. This represents the case of classic IEEE RSTP with all VLANs sharing the same spanning tree. Other MSTP instances can be enabled, and they are referred to as Multiple Spanning Tree Instances (MSTIs). Every MSTI assigns its own priorities to the switches and uses its own link costs to come up with a private logical topology, separate from the IST. Since MSTP does not send MSTI information in separate BPDUs, this information is piggybacked onto the IST's BPDUs using special M-Record fields (one for every active MSTI). Using TLVs (Type-Length-Value), those fields carry the root priority, designated bridge priority, port priority and root path cost, among others.

MSTI Tree Construction


Similar to RSTP, every switch emits its own configuration BPDUs, one every Hello seconds. The BPDUs have full information about the IST and carry M-Records for every active MSTI. Using the RSTP convergence mechanics (Proposal and Agreement bits), separate STP instances are constructed for the IST and every MSTI. It is important to notice that fundamental STP timers such as Hello, ForwardDelay and MaxAge can only be tuned for the IST. All other instances (MSTIs) inherit the timers from the IST; this is the natural result of all MSTI information being piggybacked in IST BPDUs.

MSTP has a special mechanism to age old information out of the domain. The IST BPDUs have a special field called Remaining Hops. The IST root sends BPDUs with the hop count equal to MaxHops (a configurable value), and every downstream switch decrements the hop count field on reception of an IST BPDU. As soon as the hop count becomes zero, the information in the BPDU is ignored, and the switch may start declaring itself the new IST root. The classic MaxAge and ForwardDelay timers are still used when MSTP interacts with RSTP, STP or (R)PVST+ bridges.
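The hop-count aging described above can be illustrated with a small sketch. It assumes a simple chain of switches below the IST root and models only the decrement-and-ignore rule from the text; `remaining_hops` is a hypothetical helper, not part of any real implementation.

```python
def remaining_hops(max_hops, switches_downstream):
    """Return the Remaining Hops value held by each switch in a chain below the IST root.

    None marks a switch where the count reached zero, so the information is ignored.
    """
    seen = []
    hops = max_hops                      # value the IST root places in its BPDU
    for _ in range(switches_downstream):
        if hops > 0:
            hops -= 1                    # each switch decrements on reception
        seen.append(hops if hops > 0 else None)
    return seen

print(remaining_hops(20, 1))   # a switch directly attached to the root, MaxHops 20
print(remaining_hops(3, 4))    # a chain longer than a deliberately small MaxHops
```

With MaxHops set to 20, a switch one hop away from the root holds a value of 19, which is consistent with the `rem hops 19` and `max hops 20` values in the single-region show output later in this post.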

STP Dispute
Cisco switches have long implemented the Loop Guard feature, which blocks a non-designated port when it stops receiving STP BPDUs. This is helpful for detecting unidirectional links (normally on fiber-optic links) and preventing Layer 2 loops that could go undetected by STP. Cisco's implementation of MSTP can detect a unidirectional condition by comparing the downstream port state reported in BPDUs. If the upstream switch sends superior root bridge information to the downstream bridge but receives BPDUs with the Designated bit set, the upstream switch concludes that the downstream switch does not hear its BPDUs. The upstream switch then blocks the port facing the downstream switch and marks the link as an STP dispute link.
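A rough sketch of the dispute check follows. It reduces the comparison to a tuple of (root bridge ID, root path cost, sender bridge ID), which is an assumption made for illustration; `PriorityVector` and `is_dispute` are hypothetical names, and the MAC addresses are simply borrowed from the show output in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PriorityVector:
    root_id: tuple        # (priority, MAC address) of the advertised root
    root_path_cost: int
    sender_id: tuple      # (priority, MAC address) of the sending bridge

    def key(self):
        # Lower compares as better, following the usual STP rule.
        return (self.root_id, self.root_path_cost, self.sender_id)

def is_dispute(own, received, received_role):
    """True when we advertise superior information yet the neighbor still claims Designated."""
    return received_role == "designated" and own.key() < received.key()

upstream = PriorityVector((8192, "0012.d939.3700"), 0, (8192, "0012.d939.3700"))
downstream = PriorityVector((32768, "0019.5684.3700"), 0, (32768, "0019.5684.3700"))

print(is_dispute(upstream, downstream, "designated"))  # neighbor does not hear us
print(is_dispute(upstream, downstream, "root"))        # neighbor accepted our information
```

A healthy downstream port would report a Root or Alternate role after hearing the superior BPDU, so a persistent Designated claim is the signal that traffic flows in one direction only.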

Caveats in MSTP Design


Some issues may arise from the fact that spanning-tree instances are not mapped one-to-one to VLANs. With PVST, pruning a VLAN on a link also disables the corresponding STP instance on the same link. Since MSTIs are decoupled from VLANs, every MSTI runs on every link in the region. The MSTIs differ only in their decisions to make a given link forwarding or blocking. By pruning VLANs you may end up in a situation where a VLAN is not enabled on the link where the corresponding MSTI is forwarding, OR is enabled on the link where the corresponding MSTI is blocking. Consider the following example to illustrate this idea:

In this topology, VLANs are manually pruned on trunks. Since the filtering is not consistent with the respective MSTI blocking decisions, VLAN 2's traffic is blocked between SW1 and SW2. To avoid this situation, do not use static VLAN pruning as a method of distributing VLANs across trunks when you have MSTP enabled.

An equivalent situation occurs when the ports connecting the switches are access ports. MSTP runs on these ports, and every logical topology is either blocking or forwarding on them. Depending on the VLAN-to-MSTI mapping, a given VLAN could be blocked on the access ports due to an MSTP decision: even though the access VLANs are different, they use the same STP instance. To avoid this problem, do not run MSTP on access ports and use them for connecting stub devices only, e.g. hosts and leaf switches.
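The pruning caveat above can be checked mechanically: a VLAN can only use links where it is allowed and where its MSTI is forwarding. The sketch below uses a hypothetical triangle of three switches, since the original figure is not reproduced here; the link list, the helper names and the VLAN-to-instance mapping are all assumptions chosen to reproduce the "VLAN 2 is blocked between SW1 and SW2" outcome.

```python
from collections import deque

def usable_links(vlan, vlan_to_instance, links):
    """Keep only links where the VLAN is allowed AND its instance is forwarding."""
    instance = vlan_to_instance.get(vlan, 0)
    return [(a, b) for a, b, allowed, forwarding in links
            if vlan in allowed and instance in forwarding]

def connected(src, dst, edges):
    """Breadth-first search over undirected edges."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

# Hypothetical setup: both VLANs share instance 0, which blocks the SW1-SW2 link.
vlan_to_instance = {1: 0, 2: 0}
links = [
    # (end A, end B, VLANs allowed on the trunk, instances forwarding on the link)
    ("SW1", "SW3", {1}, {0}),
    ("SW3", "SW2", {1}, {0}),
    ("SW1", "SW2", {2}, set()),   # VLAN 2 is allowed only where instance 0 blocks
]

for vlan in (1, 2):
    ok = connected("SW1", "SW2", usable_links(vlan, vlan_to_instance, links))
    print(f"VLAN {vlan}: SW1 reaches SW2 = {ok}")
```

VLAN 1 still reaches SW2 through SW3, while VLAN 2 has no usable path at all, because manual pruning and the MSTI blocking decision disagree about which link it should use.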

MSTP Single-Region Configuration Example


Now that we have a basic understanding of how MSTP works inside a region, let's create a sample configuration. Consider the following physical topology of three switches:

The topology has the following VLANs: 1, 10, 20, 30, 40, 50, 60. Our goals for this scenario are:
- Make VLANs 10, 20, 30 follow the link from SW3 to SW1.
- Make VLANs 40, 50, 60 follow the link from SW3 to SW2.
- If either of the above links fails, the affected VLANs should fall back to the other link.

To accomplish this, we create two MSTIs, numbered 1 and 2. SW1 will be the root for instance 1 and SW2 will be the root for instance 2. As for the IST (MSTI0), we make SW3 its root switch (though it is not recommended to assign root roles to access switches). As for the VLAN-to-MSTI mappings, VLAN 1 remains mapped to the IST, VLANs 10, 20 and 30 map to MSTI1, and VLANs 40, 50 and 60 map to MSTI2. Here is the configuration:

SW1:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
! Root for MSTI1
!
spanning-tree mst 1 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW2:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
! Root for MSTI 2
!
spanning-tree mst 2 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

SW3:
spanning-tree mode mst
!
spanning-tree mst configuration
 name REGION1
 instance 1 vlan 10, 20, 30
 instance 2 vlan 40, 50, 60
!
! Root for the IST
!
spanning-tree mst 0 priority 8192
!
interface FastEthernet0/13
 switchport trunk encapsulation dot1q
 switchport mode trunk
!
interface FastEthernet0/16
 switchport trunk encapsulation dot1q
 switchport mode trunk

The following show commands will demonstrate the effect our configuration has on traffic forwarding:

SW1#show spanning-tree mst configuration
Name      [REGION1]
Revision  0     Instances configured 3

Instance  Vlans mapped
--------  ---------------------------------------------------------------------
0         1-9,11-19,21-29,31-39,41-49,51-59,61-4094
1         10,20,30
2         40,50,60
-------------------------------------------------------------------------------

SW1#show spanning-tree mst

##### MST0    vlans mapped:   1-9,11-19,21-29,31-39,41-49,51-59,61-4094
Bridge        address 0019.5684.3700  priority      32768 (32768 sysid 0)
Root          address 0012.d939.3700  priority      8192  (8192 sysid 0)
              port    Fa0/16          path cost     0
Regional Root address 0012.d939.3700  priority      8192  (8192 sysid 0)
                                      internal cost 200000    rem hops 19
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Root FWD 200000    128.18   P2p

##### MST1    vlans mapped:   10,20,30
Bridge        address 0019.5684.3700  priority      8193  (8192 sysid 1)
Root          this switch for MST1

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/16           Desg FWD 200000    128.18   P2p

##### MST2    vlans mapped:   40,50,60
Bridge        address 0019.5684.3700  priority      32770 (32768 sysid 2)
Root          address 001e.bdaa.ba80  priority      8194  (8192 sysid 2)
              port    Fa0/13          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root FWD 200000    128.15   P2p
Fa0/16           Altn BLK 200000    128.18   P2p

SW1#show spanning-tree mst interface fastEthernet 0/13

FastEthernet0/13 of MST0 is designated forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 561, received 544

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Desg FWD 200000    128.15   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.15   10,20,30
2        Root FWD 200000    128.15   40,50,60

SW1#show spanning-tree mst interface fastEthernet 0/16

FastEthernet0/16 of MST0 is root forwarding
Edge port: no             (default)        port guard : none        (default)
Link type: point-to-point (auto)           bpdu filter: disable     (default)
Boundary : internal                        bpdu guard : disable     (default)
Bpdus sent 550, received 1099

Instance Role Sts Cost      Prio.Nbr Vlans mapped
-------- ---- --- --------- -------- -------------------------------
0        Root FWD 200000    128.18   1-9,11-19,21-29,31-39,41-49,51-59
                                     61-4094
1        Desg FWD 200000    128.18   10,20,30
2        Altn BLK 200000    128.18   40,50,60

Notice that the link cost values are much higher than the classic default STP costs, since MSTP uses the newer IEEE standard values, and that the output refers to MSTIx as MSTx (e.g. the IST is MST0). Aside from that, note the term Regional Root, which is explained in detail below.

Common and Internal Spanning Tree (CIST)


As mentioned before, every MSTP region runs a special instance of spanning tree known as the IST, or Internal Spanning Tree (= MSTI0). This instance mainly serves the purpose of disseminating STP topology information for the MSTIs. The IST has a root bridge, elected based on the lowest Bridge ID (Bridge Priority + MAC address). The situation changes with multiple MSTP regions in the network. When a switch detects BPDU messages sourced from another region (or STP/PVST+ BPDUs), it marks the corresponding port as an MSTP boundary. For convenience, we will call all other ports internal. A switch that has boundary ports is known as a boundary switch.

In the figure below you can see three MSTP regions interconnected in a ring topology, using a pair of links between every pair of regions. The links connecting the regions connect the boundary ports. Since every switch has a connection to some other region, all switches are boundary switches. Notice the simplified notation for link costs and bridge priorities. We will use those to demonstrate how the CIST is constructed. For simplicity, assume that all link costs inside a region have the same value of 1.

When multiple regions connect together, every region needs to construct its own IST, and all regions should build one common CIST spanning across the regions. To see how this is accomplished, first have a look at the structure of the MSTP BPDU. In the figure below, notice that MSTP uses protocol version 3, as opposed to RSTP's version 2. Version 4 is reserved for SPT (Shortest Path Tree), the new loop prevention and frame bridging standard defined in the emerging IEEE 802.1aq document.

The MSTP BPDU contains two important blocks of information. One, highlighted in red, is related to the CIST Root and CIST Regional Root election. As you will see later, the CIST Root is elected among all regions and a CIST Regional Root is elected in every region. The green block outlines the information about the CIST Regional Root (which becomes the IST Root in the presence of multiple regions). The CIST Internal Root Path Cost is the intra-region cost to reach the CIST Regional Root. It is important to keep in mind that IST Root = CIST Regional Root when multiple regions interoperate. This transformation is explained further in the text.

Now, to define the CIST Root and CIST Regional Root roles:
- The CIST Root is the bridge that has the lowest Bridge ID among ALL regions. This could be a bridge inside a region or a boundary switch in a region.
- The CIST Regional Root is a boundary switch elected for every region based on the shortest external path cost to reach the CIST Root. The path cost is calculated based on the costs of the links connecting the regions, excluding the internal regional paths. The CIST Regional Root becomes the root of the IST for the given region as well.

CIST Root Bridges Election Process


When a switch boots up, it declares itself the CIST Root and CIST Regional Root and announces this fact in outgoing BPDUs. The switch adjusts its decision upon reception of better information and continues advertising the best known CIST Root and CIST Regional Root on all internal ports. On the boundary ports, the switch advertises only the CIST Root Bridge ID and the CIST External Root Path Cost, thus hiding the details of the region's internal topology.

The CIST External Root Path Cost is the cost to reach the CIST Root across the links connecting the boundary ports, i.e. the inter-region links. When a BPDU is received on an internal port, this cost is not changed. When a BPDU is received on a boundary port, this cost is adjusted based on the receiving boundary port cost. As a result, the CIST External Root Path Cost is propagated unmodified inside any region.

Only a boundary switch can be elected as the CIST Regional Root, and this is the switch with the lowest cost to reach the CIST Root. If a boundary switch hears a better CIST External Root Path Cost on its internal link, it relinquishes its role of CIST Regional Root and starts announcing the new metric out of its boundary ports.

Every boundary switch needs to properly block its boundary ports. If the switch is a CIST Regional Root, it elects one of the boundary ports as the CIST Root port and blocks all other boundary ports. If a boundary switch is not the CIST Regional Root, it marks its boundary ports as CIST Designated or Alternate. A boundary port on a non-regional-root bridge becomes designated only if it has superior information for the CIST Root: a better External Root Path Cost or, if the costs are equal, a better CIST Regional Root Bridge ID. This follows the normal rules of the STP process.

As a result of the CIST construction, every region will have one switch with a single port unblocked in the direction of the CIST Root. This switch is the CIST Regional Root. All boundary switches advertise the region's CIST Regional Root Bridge ID out of their non-blocking boundary ports. From the outside perspective, the whole region looks like a single virtual bridge with the Bridge ID = CIST Regional Root ID and a single root port elected on the CIST Regional Root switch.

The region that contains the CIST Root has all boundary ports unblocked and marked as CIST designated ports. Effectively, the region looks like a virtual root bridge with the Bridge ID equal to that of the CIST Root and all ports being designated. Notice that in the region containing the CIST Root, the CIST Regional Root is equal to the CIST Root, as they share the same lowest bridge priority value across all regions.

Have a look at the diagram below. It demonstrates the CIST topology calculated from the physical topology we outlined above. First, SW1-1 is elected as the CIST Root, as it has the lowest Bridge ID among all bridges in all regions. This automatically makes region 1 a virtual bridge with all boundary ports unblocked. Next, SW2-1 and SW3-1 are elected as the CIST Regional Roots in their respective regions. Notice that SW3-1 and SW2-3 have equal External Costs to reach the CIST Root, but SW3-1 wins the CIST Regional Root role due to its lower priority. Keep in mind that in a topology with multiple MSTP regions, every region that does not contain the CIST Root has to change the IST Root election process and make the IST Root equal to the CIST Regional Root.
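The outcome of the election described above can be summarized in a short sketch. It is deliberately simplified: the external cost of every boundary switch is given as an input rather than learned from BPDUs, and the bridges, regions, priorities and MAC addresses below are hypothetical values, not the ones from the diagram.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Bridge:
    name: str
    region: str
    priority: int
    mac: str
    external_cost: Optional[int] = None   # cost to the CIST Root over inter-region links;
                                          # None for switches without boundary ports

def bridge_id(bridge):
    return (bridge.priority, bridge.mac)  # lower is better

def elect_cist(bridges):
    """Return the CIST Root and a dict mapping each region to its CIST Regional Root."""
    cist_root = min(bridges, key=bridge_id)
    # In the region that contains the CIST Root, the Regional Root is the CIST Root itself.
    regional_roots = {cist_root.region: cist_root}
    for region in {b.region for b in bridges} - {cist_root.region}:
        boundary = [b for b in bridges
                    if b.region == region and b.external_cost is not None]
        # Lowest external cost wins; the Bridge ID only breaks ties.
        regional_roots[region] = min(boundary,
                                     key=lambda b: (b.external_cost, bridge_id(b)))
    return cist_root, regional_roots

bridges = [
    Bridge("A1", "R1", 4096, "0000.0000.0001", 0),
    Bridge("B1", "R2", 32768, "0000.0000.0002", 10),
    Bridge("B2", "R2", 8192, "0000.0000.0003", 20),
    Bridge("C1", "R3", 16384, "0000.0000.0004", 10),
    Bridge("C2", "R3", 8192, "0000.0000.0005", 10),
]

root, regional = elect_cist(bridges)
print("CIST Root:", root.name)
for region in sorted(regional):
    print(region, "Regional Root:", regional[region].name)
```

In region R2 the switch with the worse priority wins because its external cost is lower, while in region R3 the costs tie and the lower Bridge ID decides, mirroring the two cases discussed in the text.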

Common Spanning Tree (CST)


From the above information, we may conclude that the CIST is essentially organized as a two-level hierarchy. The first level treats all regions as virtual bridges and operates with the External Root Path Cost. The first-level spanning tree is rooted in the CIST Root Bridge and encompasses the virtual bridges. This spanning tree is known as the CST, or Common Spanning Tree. The CST connects all boundary ports and perceives every region as a single virtual bridge with the Bridge ID equal to the CIST Regional Root Bridge ID.

The CST is also the construct where MSTP interoperates with IEEE STP/RSTP regions. The legacy switch regions join their STP instance with the CST and perceive MSTP regions as transparent virtual bridges, staying unaware of their internal topology. Thus, connecting to IEEE STP/RSTP domains extends the CST. MSTP discovers the appropriate STP version on a boundary link by listening to external BPDUs and switches to the respective mode of operation (e.g. RSTP/STP). It may happen that the switch with the lowest Bridge ID belongs to an RSTP/STP region. This situation results in all MSTP regions electing local CIST Regional Roots and considering the new CIST Root to be located outside the MSTP domain.

The second level of the CIST hierarchy consists of the various MSTP regional ISTs. Every MSTP region builds its IST instance using the internal path costs and following the optimal internal topology, using the CIST Regional Root as the IST Root. Changes to the CST may affect the ISTs in every region, as those changes may result in the re-election of CIST Regional Roots. Changes to the regions' internal topologies normally do not affect the CST, unless those changes partition the region.

Mapping MSTIs to CIST


MSTIs are constructed independently in every region, but they have to be mapped to the CIST at the boundary ports. This means it is impossible to load-balance VLAN traffic on the boundary links by mapping VLANs to different instances. All VLANs use the same non-blocking boundary ports, which are either upstream or downstream with respect to the CIST Root. This statement is only valid with respect to the CST paths connecting the regional virtual bridges. Inside any region, VLANs follow the internal topology paths, based on the respective MSTI configurations.

The MSTIs have no idea of the CIST Root whatsoever; they only use internal paths and the internal MSTI root to build their spanning trees. However, all MSTP instances see the root port (towards the CIST Root) of the CIST Regional Root bridge as a special Master Port connecting them to the CIST Root bridge. This port serves as the gateway linking the MSTIs to other regions. Recall that switches do not send M-Records (MSTI information) out of boundary ports, only CIST information. Thus, the CIST and the MSTIs may converge independently and in parallel. To avoid temporary bridging loops, the Master Port only begins forwarding when all respective MSTI ports are in sync and forwarding.

MSTP Multi-Region Design Considerations


Ethernet is known for its broadcast nature, which tends to propagate faults across the whole Layer 2 domain. There are three main problems with Ethernet that affect MSTP designs:
- Unknown unicast flooding results in traffic surges under topology changes. These are the result of either asymmetric routing or persistent topology changes. Every topology change causes massive invalidation of MAC address tables and unicast traffic flooding. This is a consequence of Ethernet being topology-unaware: the bridges do not know where MAC addresses are located.
- Broadcast and multicast flooding. This is a separate problem, as many core protocols (ARP, IGP, PIM) rely on multicasting or broadcasting. Those packets have to be delivered to every node in a broadcast domain, and under intense load the network could be congested at every point.
- Spanning-tree convergence. MSTP uses the RSTP procedure for STP re-negotiation. Since it is based on distance-vector behavior, it is prone to some convergence issues, such as counting to infinity (circulation of old information). This is especially noticeable in larger topologies with 10+ switches and under special conditions, such as failure of the root bridge.

The concept of the MSTP region allows for bounding STP re-computations. Since the MSTIs in every region are independent, any change affecting an MSTI in one region will not affect MSTIs in other regions. This is a direct result of the fact that M-Record information is not exchanged between the regions. However, CIST recalculations affect every region and might be slow to converge. This is why it is a good idea not to map any VLAN to the CIST and to avoid connecting MSTP regions to IEEE STP domains.

Topology changes in MSTP are treated the same way as in RSTP. That is, only non-edge links going to the forwarding state cause a topology change, and the switch detecting the change floods this information through the domain. However, a single physical link may be forwarding for one MSTI and blocking for another. Thus, a single physical change may have a different effect on the MSTIs and the CIST. Topology changes in MSTIs are bounded to a single region, while topology changes to the CIST propagate through all regions. Every region treats a TC notification from another region as external and applies it to CIST-associated ports only.

A topology change to the CST (the tree connecting the virtual bridges) affects all MSTIs in all regions and the CIST. This is because a new link becoming forwarding between the virtual bridges may change all paths in the topology and thus require massive MAC address re-learning. Thus, from the standpoint of topology changes, something happening to the CST has the most massive flooding impact in a set of interconnected MSTP regions.

The above observations suggest a good design rule for MSTP networks: separate meshy topologies into their own regions and interconnect the regions using a sparse mesh, keeping in mind the balance between redundancy and the effect of topology changes. This is an adaptation of the well-known design principle "separate complexity from complexity", which keeps networks more stable and isolates fault domains. In addition, exposing a lot of links to the CST reduces your load-balancing choices, as the CST supports only one STP instance. You want to avoid designs like the one diagrammed below, which effectively disables load balancing on the mesh of links that belong to the CST. The reason is that the full mesh of links now belongs to the CST, and it elects only one unblocked path between the two regions.

Even though region partitioning offers better fault isolation, it still does not eliminate well-known Ethernet issues such as unicast and broadcast flooding. Those may still occur and disrupt network connectivity. For example, unicast flooding could be caused by unidirectional traffic, and broadcast flooding may be the result of transient bridging loops when a root bridge fails. Transient bridging loops are a reality with RSTP/MSTP, especially in larger topologies, due to various synchronization problems resulting in count-to-infinity behavior. This problem is especially dangerous when a root bridge crashes and the remaining topology contains loops: old information may circulate until it is aged out using hop counting (counting to infinity).

Interoperating with PVST+


By design, PVST+ runs a separate STP instance for every VLAN. In contrast, MSTP maps VLANs to MSTIs, so the one-to-one mapping between a VLAN and an STP instance no longer holds. How should an MSTP switch operate on a border link connected to a PVST+ domain? MSTP runs multiple MSTIs inside a region and maps them all to the CIST on the border ports. The interoperation model needs to ensure that the internal MSTIs are aware of changes in any of the PVST+ trees. It is hard to automatically map VLAN-bound STP instances to the MSTIs, so the simplest way to accomplish the desired behavior is to join ALL PVST+ trees with the CST. By connecting the PVST+ trees to the CST, the solution ensures that changes in any PVST+ STP instance affect the CST and, as a consequence, all MSTIs. While not optimal, this ensures that no changes go unnoticed and no black holes occur in a single VLAN due to topology changes.

As with IEEE STP, every tree in the PVST+ domain perceives MSTP regions as virtual bridges with multiple boundary ports. A topology change in any of the PVST+ trees will affect the CST and impact every MSTI in all MSTP regions. This behavior makes the MSTP topology less stable and fully exposed to changes in the PVST+ domain.

The MSTP implementation simulates PVST+ by replicating CIST BPDUs on the link facing the PVST+ domain and sending those BPDUs on ALL VLANs active on the trunk. The MSTP switch consumes all BPDUs received from the PVST+ domain and processes them using the CIST instance. The PVST+ domain sees the MSTP domain as a PVST+ bridge with all per-VLAN instances claiming the CIST Root as the root of their STP. With respect to the common STP root elected between MSTP and PVST+, the following two options are possible:

The MSTP domain (either a single region or multiple regions) contains the root bridge for ALL VLANs. This means the CIST Root Bridge ID is better than any PVST+ STP root Bridge ID. If there is only one MSTP region connecting to the PVST+ domain, then all boundary ports on the virtual bridge will be unblocked and could be used by the PVST+ trees. This is the preferred design, as the administrator can manipulate uplink costs on the PVST+ side and obtain optimal traffic engineering results. In the figure below, VLANs 2 and 3 have their STP costs adjusted so that they select different uplinks connected to the MSTP region's boundary ports. Since the CIST Root is inside the MSTP region, both boundary ports are non-blocking designated ports and thus the load-balancing scheme works fine.

The PVST+ domain contains the root bridges for ALL VLANs. This is only true if all PVST+ root bridges' Bridge IDs, for all VLANs, are better than the MSTP CIST Root Bridge ID. This is not the preferred design, since all MSTIs map to the CIST on the border link, and you cannot load-balance the MSTIs as they enter the PVST+ domain.

The Cisco implementation does not support the second option. The MSTP domain should contain the bridge with the best Bridge ID, to ensure that the CIST Root is also the root for all PVST+ trees. In any other case, the MSTP border switch will complain and place the ports that receive superior BPDUs from the PVST+ domain in the root-inconsistent state. To fix this issue, ensure that the PVST+ domain does not have any bridges with Bridge IDs better than the CIST Root Bridge ID.

And lastly, a few words on MSTP and PVST interoperation. It operates in exactly the same manner and follows the same rules as PVST+ interoperation, except that ISL is used as the trunking encapsulation.
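The rule described above (the boundary port is flagged when a PVST+ instance advertises a root better than the CIST Root) can be sketched in a few lines of Python. This is only a simplified model of the comparison, not the actual switch logic: the `BridgeId` class, the `superior_pvst_vlans` function and the sample priorities and MAC addresses are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class BridgeId:
    """Bridge ID; field order gives the STP ordering (lower is better)."""
    priority: int  # configured priority plus extended system ID
    mac: str       # lower-case dotted hex, e.g. "0019.55e6.6800"


def superior_pvst_vlans(cist_root, pvst_roots):
    """Return the VLANs whose PVST+ root Bridge ID beats the CIST Root."""
    return sorted(vlan for vlan, root in pvst_roots.items() if root < cist_root)


# Illustrative values: PVST+ bridge at priority 4096, CIST Root at 16384.
pvst_mac, mst_mac = "0019.55e6.6800", "001b.8f0c.2a00"
pvst = {vlan: BridgeId(4096 + vlan, pvst_mac) for vlan in (1, 10, 20)}
print(superior_pvst_vlans(BridgeId(16384, mst_mac), pvst))  # [1, 10, 20]

# Lowering the CIST priority below every PVST+ priority clears the condition.
pvst = {vlan: BridgeId(8192 + vlan, pvst_mac) for vlan in (1, 10, 20)}
print(superior_pvst_vlans(BridgeId(4096, mst_mac), pvst))   # []
```

An empty result corresponds to the supported design, where the MSTP domain holds the root for every VLAN.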

Scenario 1: CIST Root and CIST Regional Root


In this scenario, we configure four switches in two regions. The first region consists of one switch (SW1) and the second region consists of three switches: SW2, SW3 and SW4. SW1 is the CIST Root thanks to its lowest priority; SW2 is the CIST Regional Root for REGION234. We modify link costs in REGION234 to make SW3 prefer the path to SW2 via SW4 rather than the directly connected link.

SW1:
spanning-tree mode mst
!
! Minimum Priority among all bridges
!
spanning-tree mst 0 priority 4096
!
spanning-tree mst configuration
 name REGION1
 exit

SW2:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
!
! SW2's priority is less preferred than SW4's, but it has better cost to CIST Root
!
spanning-tree mst 0 priority 16384
!
! This is the active boundary port
!
interface FastEthernet 0/13
 spanning-tree mst 0 cost 50
!
interface FastEthernet 0/14
 spanning-tree mst 0 cost 200
!
interface FastEthernet 0/16
 spanning-tree mst 0 cost 100

SW3:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
!
interface FastEthernet 0/16
 spanning-tree mst 0 cost 100
!
interface FastEthernet 0/19
 spanning-tree mst 0 cost 10

SW4:
spanning-tree mode mst
spanning-tree mst configuration
 name REGION234
 exit
!
! SW4 has better IST priority but higher cost to the CIST Root
!
spanning-tree mst 0 priority 8192
!
interface FastEthernet 0/13
 spanning-tree mst 0 cost 100
!
interface FastEthernet 0/16
 spanning-tree mst 0 cost 10
!
interface FastEthernet 0/19
 spanning-tree mst 0 cost 10

In the above configuration we adjusted link costs to ensure the following:

MSTP REGION234 elects SW2 as the CIST Regional Root due to its shortest path to the CIST Root. SW2 is the CIST Regional Root for REGION234 even though SW4 has the better switch priority inside the region.

SW3 selects the path through SW4 to reach the internal root bridge, because the path to SW2 (the CIST Regional Root) via SW4 has the lowest cost.

Verifications for the above configuration follow:

SW1#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 0019.55e6.6800  priority      4096  (4096 sysid 0)
Root          this switch for the CIST
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p
Fa0/14           Desg FWD 200000    128.16   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

SW1 says it is the CIST Root Bridge and all of its ports are unblocked (designated). The bridge priority is set to 4096; this will allow us to distinguish this switch in the show outputs below. The ports are not marked as boundary, since SW1 is not receiving any BPDUs on these ports: all downstream ports suppress sending their own BPDUs since SW1 is the root bridge. You may confirm this using the following debugging commands:

SW1#debug spanning-tree mstp bpdu receive
MSTP BPDUs RECEIVEd dump debugging is on
SW1#

Next, try dumping the BPDUs being sent by SW1. Notice that they all have Num_mrec set to zero, which means zero M-records. SW1 claims itself as the CIST Root and CIST Regional Root on all ports. The cost to both roots is set to zero.

SW1#debug spanning-tree mstp bpdu transmit
MSTP BPDUs TRANSMITted dump debugging is on

MST[0]:-TX Fa0/13 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:0 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :0
MST[0]: Reg_root : 4096.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID: 4096.0019.55e6.d380 Port_ID:32783
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION1 rev:0 Num_mrec: 0

MST[0]:-TX Fa0/14 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:0 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :0
MST[0]: Reg_root : 4096.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID: 4096.0019.55e6.d380 Port_ID:32784
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION1 rev:0 Num_mrec: 0

MST[0]:-TX Fa0/19 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:0 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :0
MST[0]: Reg_root : 4096.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID: 4096.0019.55e6.d380 Port_ID:32789
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION1 rev:0 Num_mrec: 0
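The Port_ID values in the debug output are the same identifiers that the show command prints in the Prio.Nbr column. As a small sketch, assuming the usual 16-bit layout (4 bits of priority in steps of 16, followed by a 12-bit port number), the conversion looks like this; the function name `decode_port_id` is just an illustration.

```python
def decode_port_id(port_id: int) -> str:
    """Convert a 16-bit STP port identifier to the Prio.Nbr notation."""
    priority = (port_id >> 12) << 4   # upper 4 bits, in steps of 16
    number = port_id & 0x0FFF         # lower 12 bits
    return f"{priority}.{number}"


# Port_ID values taken from the BPDU dumps for Fa0/13, Fa0/14 and Fa0/19.
for port_id in (32783, 32784, 32789):
    print(port_id, "->", decode_port_id(port_id))
# 32783 -> 128.15
# 32784 -> 128.16
# 32789 -> 128.21
```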

Inspect CIST statistics on SW2:

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 001b.8f0c.2a00  priority      16384 (16384 sysid 0)
Root          address 0019.55e6.6800  priority      4096  (4096 sysid 0)
              port    Fa0/13          path cost     50
Regional Root this switch
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Root FWD 50        128.15   P2p Bound(RSTP)
Fa0/14           Altn BLK 200       128.16   P2p Bound(RSTP)
Fa0/16           Desg FWD 10        128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

SW2 is the CIST Regional Root Bridge (that is, the IST root) with the priority value of 16384. SW2 learns that SW1 is the CIST Root with the priority value of 4096. The boundary root port is Fa 0/13, elected based on regular STP rules (shortest path, lowest upstream port priority). The other boundary uplink is blocking. Both ports show up as Boundary since they face the other MSTP region. Of course, SW2 is the Regional Root because it has the shortest path to the CIST Root. Dump the BPDUs being sent by SW2:

SW2#debug spanning-tree mstp bpdu transmit
MSTP BPDUs TRANSMITted dump debugging is on

MST[0]:-TX Fa0/16 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[FL] Age:1 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :50
MST[0]: Reg_root :16384.0019.564c.c580 Cost   :0
MST[0]: Bridge_ID:16384.0019.564c.c580 Port_ID:32786
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION234 rev:0 Num_mrec: 0

MST[0]:-TX Fa0/19 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[FL] Age:1 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :50
MST[0]: Reg_root :16384.0019.564c.c580 Cost   :0
MST[0]: Bridge_ID:16384.0019.564c.c580 Port_ID:32789
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION234 rev:0 Num_mrec: 0

SW2 only sends BPDUs out of its designated ports, Fa 0/16 and Fa 0/19. The output also shows that SW2 announces SW1 as the CIST Root and itself as the regional root. Notice the CIST External Root Path Cost (50) and the CIST Regional Root cost (0). Now, check the MSTI0 statistics on SW3:

SW3#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 000c.85be.c680  priority      32768 (32768 sysid 0)
Root          address 0019.55e6.6800  priority      4096  (4096 sysid 0)
              port    Fa0/19          path cost     50
Regional Root address 001b.8f0c.2a00  priority      16384 (16384 sysid 0)
                                      internal cost 20        rem hops 18
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/16           Altn BLK 100       128.16   P2p
Fa0/19           Root FWD 10        128.19   P2p

SW3 sees SW1 as the CIST Root and SW2 as the CIST Regional Root. Since SW2 is also the root for the IST, SW3 needs to select the root port to reach it. It selects the link via SW4, as our cost manipulations made this path preferred. The internal cost to reach the CIST Regional Root is 10+10=20. The CIST External Root Path Cost is 50, as it is not incremented when propagated from SW2 inside the region.
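The internal cost arithmetic can be reproduced with a short shortest-path sketch. It assumes a wiring inferred from the show outputs (SW2 connects to SW3 and SW4 directly, SW3 and SW4 connect to each other) and adds the cost of the receiving port at every hop, as regular STP does. The function name and the `rx_cost` table are illustrative.

```python
import heapq


def internal_root_path_costs(regional_root, rx_cost):
    """Dijkstra over receive-port costs inside one MSTP region.

    rx_cost[(receiver, sender)] is the cost configured on the receiver's
    port that faces the sender.
    """
    dist = {regional_root: 0}
    heap = [(0, regional_root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for (receiver, sender), port_cost in rx_cost.items():
            new_cost = cost + port_cost
            if sender == node and new_cost < dist.get(receiver, float("inf")):
                dist[receiver] = new_cost
                heapq.heappush(heap, (new_cost, receiver))
    return dist


# Port costs from the scenario configuration (assumed wiring).
rx_cost = {
    ("SW3", "SW2"): 100,  # SW3 Fa0/16, direct link to SW2
    ("SW3", "SW4"): 10,   # SW3 Fa0/19
    ("SW4", "SW2"): 10,   # SW4 Fa0/16
    ("SW4", "SW3"): 10,   # SW4 Fa0/19
}
external_cost = 50  # set at the region boundary, unchanged inside the region
for switch, internal in sorted(internal_root_path_costs("SW2", rx_cost).items()):
    print(switch, "external", external_cost, "internal", internal)
# SW2 external 50 internal 0
# SW3 external 50 internal 20
# SW4 external 50 internal 10
```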

SW4#show spanning-tree mst 0

##### MST0    vlans mapped:   1-4094
Bridge        address 000d.2840.ab00  priority      8192  (8192 sysid 0)
Root          address 0019.55e6.6800  priority      4096  (4096 sysid 0)
              port    Fa0/16          path cost     50
Regional Root address 001b.8f0c.2a00  priority      16384 (16384 sysid 0)
                                      internal cost 10        rem hops 19
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Altn BLK 100       128.13   P2p Bound(RSTP)
Fa0/16           Root FWD 10        128.16   P2p
Fa0/19           Desg FWD 10        128.19   P2p

SW4 has a numerically lower (better) bridge priority than SW2, but it is not elected as the CIST Regional Root, because SW4's path cost to the CIST Root is worse. Notice the External and Internal root path cost values: 50 and 10 respectively. Pay attention to the fact that SW4's port Fa 0/13 is marked as alternate and blocking, while the root port toward the CIST Regional Root is unblocked.
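The election result can be summarized as a comparison of tuples: the external cost to the CIST Root is compared first, and the bridge ID is only a tie-breaker. The sketch below is a simplified model under that assumption; the function name and the dictionary layout are illustrative, while the costs and priorities come from the scenario.

```python
def elect_cist_regional_root(boundary_bridges):
    """Pick the bridge with the lowest (external cost, priority, MAC) tuple."""
    return min(boundary_bridges, key=boundary_bridges.get)


# (external cost to the CIST Root, bridge priority, bridge MAC)
region234 = {
    "SW2": (50, 16384, "001b.8f0c.2a00"),
    "SW4": (100, 8192, "000d.2840.ab00"),
}
print(elect_cist_regional_root(region234))  # SW2

# With equal external costs the better bridge priority would decide.
region234["SW4"] = (50, 8192, "000d.2840.ab00")
print(elect_cist_regional_root(region234))  # SW4
```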

Scenario 2: MSTIs and the Master Port


In this scenario we add another STP instance to REGION234. We also create two VLANs, 10 and 20, on all switches and map them to MSTI 1 in REGION234.

SW1:
vlan 10,20

SW2, SW3 & SW4:
vlan 10,20
!
spanning-tree mst configuration
 instance 1 vlan 10, 20
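The effect of the instance mapping can be modelled as a simple table: every VLAN that is not explicitly mapped stays in instance 0 (the IST). The sketch below builds that table and prints it in the compressed range notation used by the "vlans mapped" field; the helper names are illustrative assumptions.

```python
def compress(vlans):
    """Render a set of VLAN IDs as a compact range string."""
    ranges = []
    for vlan in sorted(vlans):
        if ranges and vlan == ranges[-1][1] + 1:
            ranges[-1][1] = vlan          # extend the current range
        else:
            ranges.append([vlan, vlan])   # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def mst_vlan_map(instances):
    """Map instance number -> range string; unmapped VLANs fall into MST0."""
    mapped = {vlan for vlans in instances.values() for vlan in vlans}
    table = {0: [v for v in range(1, 4095) if v not in mapped]}
    table.update(instances)
    return {inst: compress(vlans) for inst, vlans in sorted(table.items())}


print(mst_vlan_map({1: [10, 20]}))
# {0: '1-9,11-19,21-4094', 1: '10,20'}
```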

We are concerned with the show commands for the new MSTI 1 in REGION234. Notice that we didn't do any path manipulations and simply mapped the new VLANs to MSTI 1. It was not even necessary to create the new VLANs: the configuration would work without the VLANs ever existing, as MSTIs are separate from the VLANs.

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 001b.8f0c.2a00  priority      32769 (32768 sysid 1)
Root          this switch for MST1

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Mstr FWD 200000    128.15   P2p Bound(RSTP)
Fa0/14           Altn BLK 200000    128.16   P2p Bound(RSTP)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

There is no Regional Root for MSTI 1, just a regular root, which is the root of the MSTI, elected using the regular STP rules. In the output above, SW2 itself is the root ("this switch for MST1"). Next, note the special Master Port. This is the uplink port of the CIST Regional Root. All MSTIs map to the CIST here and follow the single path. This port is also forwarding and provides the path upstream to the CIST Root for all MSTIs and their mapped VLANs. It is interesting to dump the MSTP BPDUs sent and received by SW2:

SW2#debug spanning-tree mstp bpdu receive
MSTP BPDUs RECEIVEd dump debugging is on

MST[0]: RX- Fa0/13 repeated designated BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:0 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :0
MST[0]: Reg_root : 4096.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID: 4096.0019.55e6.d380 Port_ID:32783
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION1 rev:0 Num_mrec: 0

MST[0]: RX- Fa0/14 repeated designated BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:0 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :0
MST[0]: Reg_root : 4096.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID: 4096.0019.55e6.d380 Port_ID:32784
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:64 region:REGION1 rev:0 Num_mrec: 0

Nothing special in the received BPDUs: only SW1 claiming itself as the CIST Root/CIST Regional Root. Look at the BPDUs that SW2 is sending, though:

SW2#debug spanning-tree mstp bpdu transmit
MSTP BPDUs TRANSMITted dump debugging is on

MST[0]:-TX Fa0/16 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:1 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :50
MST[0]: Reg_root :16384.0019.564c.c580 Cost   :0
MST[0]: Bridge_ID:16384.0019.564c.c580 Port_ID:32786
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:80 region:REGION234 rev:0 Num_mrec: 1
MST[1]:-TX> Fa0/16 MREC
MST[1]: Role     : Desg Flags[AFL] RemHops:20
MST[1]: Root_ID  :32769.0019.564c.c580 Cost   :0
MST[1]: Bridge_ID:16385.0019.564c.c580 Port_id:146

MST[0]:-TX Fa0/19 BPDU Prot:0 Vers:3 Type:2
MST[0]: Role     : Desg Flags[AFL] Age:1 RemHops:20
MST[0]: CIST_root: 4096.0019.55e6.d380 Cost   :50
MST[0]: Reg_root :16384.0019.564c.c580 Cost   :0
MST[0]: Bridge_ID:16384.0019.564c.c580 Port_ID:32789
MST[0]: max_age:20 hello:2 fwdelay:15
MST[0]: V3_len:80 region:REGION234 rev:0 Num_mrec: 1
MST[1]:-TX Fa0/19 MREC
MST[1]: Role     : Desg Flags[AFL] RemHops:20
MST[1]: Root_ID  :32769.0019.564c.c580 Cost   :0
MST[1]: Bridge_ID:16385.0019.564c.c580 Port_id:149

The output now shows one M-Record attached to the IST BPDU. This M-Record specifies SW2 as the root bridge for MSTI 1; you can tell that by looking at the Cost field, which is zero.

Scenario 3: PVST+ and MSTP Interoperation


In this scenario, we configure SW1 to use PVST+ and see how it interworks with MSTP. First, configure SW1 as the root bridge for all VLANs and make it win over any bridge in the MSTP region.

SW1:
spanning-tree mode pvst
spanning-tree vlan 1-4094 priority 4096

Let's see what happens on SW2:

%SPANTREE-2-PVSTSIM_FAIL: Superior PVST BPDU received on VLAN 5

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-9,11-19,21-4094
Bridge        address 001b.8f0c.2a00  priority      16384 (16384 sysid 0)
Root          address 0019.55e6.6800  priority      4097  (4096 sysid 1)
              port    Fa0/13          path cost     50
Regional Root this switch
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg BLK 50        128.15   P2p Bound(PVST)
Fa0/14           Desg BKN*200       128.16   P2p Bound(PVST) *PVST_Inc
Fa0/16           Root FWD 100       128.18   P2p Bound(RSTP)
Fa0/19           Altn BLK 200000    128.21   P2p Bound(RSTP)

Notice the syslog message at the beginning. It says that while emulating PVST+ operation, the MSTP code encountered a situation where the PVST+ domain claims the root role for one or more VLANs. That is, the PVST+ bridge has a better BID than the current CIST Root. As a result, even though MSTP considers the new root the legitimate new CIST Root, it blocks the uplink port as PVST Simulation Inconsistent. It is interesting to notice that SW1 is still considered to be the CIST Root and SW2 is the CIST Regional Root, but all ports toward the CIST Root are blocking!

Check the flow of BPDUs received from SW1. The first are the VLAN 1 BPDUs, perceived on SW2 as IEEE STP BPDUs. They claim SW1 as the Root Bridge; you may see the extended system ID carrying the VLAN number of 1.

SW2#debug spanning-tree mstp bpdu receive
MSTP BPDUs RECEIVEd dump debugging is on

MST[0]: RX- Fa0/13 repeated designated BPDU Prot:0 Vers:0 Type:0
MST[0]: Flags[] Age:0
MST[0]: CIST_root:     1.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID:     1.0019.55e6.d380 Port_ID:32783
MST[0]: max_age:20 hello:2 fwdelay:15

MST[0]: RX- Fa0/14 repeated designated BPDU Prot:0 Vers:2 Type:2
MST[0]: Role     : Desg Flags[FL] Age:0
MST[0]: CIST_root:     1.0019.55e6.d380 Cost   :0
MST[0]: Bridge_ID:     1.0019.55e6.d380 Port_ID:32784
MST[0]: max_age:20 hello:2 fwdelay:15

There are other BPDUs received on SW2: because 802.1Q is the trunking encapsulation, SW2 also receives PVST+ BPDUs for VLANs 10 and 20:

SW2#debug spanning-tree bpdu receive
Spanning Tree BPDU Received debugging is on

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/13 , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 00 19 55 E6 D3 8F 00 26 42 42 03
STP: Data 00000000010001001955E6D380000000000001001955E6D380800F0000140002000F00
STP: MST0 Fa0/13:0000 00 00 01 0001001955E6D380 00000000 0001001955E6D380 800F 0000 1400 0200 0F00

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/14 , linktype IEEE_SPANNING , enctype 2, encsize 17
STP: enc 01 80 C2 00 00 00 00 19 55 E6 D3 90 00 27 42 42 03
STP: Data 000002021E0001001955E6D380000000000001001955E6D38080100000140002000F00
STP: MST0 Fa0/14:0000 02 02 1E 0001001955E6D380 00000000 0001001955E6D380 8010 0000 1400 0200 0F00

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/13 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD 00 19 55 E6 D3 8F 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 0000000001000A001955E6D38000000000000A001955E6D380800F0000140002000F00
STP: MST0 Fa0/13:0000 00 00 01 000A001955E6D380 00000000 000A001955E6D380 800F 0000 1400 0200 0F00

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/14 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD 00 19 55 E6 D3 90 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 000002021E000A001955E6D38000000000000A001955E6D38080100000140002000F00
STP: MST0 Fa0/14:0000 02 02 1E 000A001955E6D380 00000000 000A001955E6D380 8010 0000 1400 0200 0F00

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/13 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD 00 19 55 E6 D3 8F 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 00000000010014001955E6D380000000000014001955E6D380800F0000140002000F00
STP: MST0 Fa0/13:0000 00 00 01 0014001955E6D380 00000000 0014001955E6D380 800F 0000 1400 0200 0F00

STP: MST0 rx BPDU: config protocol = mstp, packet from FastEthernet0/14 , linktype SSTP , enctype 3, encsize 22
STP: enc 01 00 0C CC CC CD 00 19 55 E6 D3 90 00 32 AA AA 03 00 00 0C 01 0B
STP: Data 000002021E0014001955E6D380000000000014001955E6D38080100000140002000F00
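The space-separated lines in the dump follow the classic configuration BPDU layout, so they can be decoded by hand or with a short script. The sketch below assumes the standard 35-byte field order (protocol ID, version, type, flags, root ID, root path cost, bridge ID, port ID, then four timers in units of 1/256 second); the function names are illustrative, and the input is the VLAN 10 BPDU received on Fa0/13.

```python
import struct


def split_bridge_id(raw):
    """Split an 8-byte bridge ID into priority, extended system ID and MAC."""
    word = int.from_bytes(raw[:2], "big")
    mac = raw[2:].hex()
    return {
        "priority": word & 0xF000,
        "sys_id_ext": word & 0x0FFF,
        "mac": ".".join(mac[i:i + 4] for i in range(0, 12, 4)),
    }


def decode_config_bpdu(hex_dump):
    """Decode the first 35 bytes of a BPDU given as a hex string."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    (proto, version, bpdu_type, flags, root, cost, bridge, port_id,
     msg_age, max_age, hello, fwd_delay) = struct.unpack(
        "!HBBB8sI8sHHHHH", raw[:35])
    return {
        "protocol": proto, "version": version, "type": bpdu_type,
        "flags": flags,
        "root": split_bridge_id(root),
        "root_path_cost": cost,
        "bridge": split_bridge_id(bridge),
        "port_id": port_id,
        # Timers are carried in units of 1/256 second.
        "message_age": msg_age / 256, "max_age": max_age / 256,
        "hello": hello / 256, "forward_delay": fwd_delay / 256,
    }


bpdu = decode_config_bpdu(
    "0000 00 00 01 000A001955E6D380 00000000 000A001955E6D380 "
    "800F 0000 1400 0200 0F00")
print(bpdu["root"])     # sys_id_ext 10 identifies the VLAN
print(bpdu["max_age"], bpdu["hello"], bpdu["forward_delay"])  # 20.0 2.0 15.0
```

The extended system ID in the root and bridge identifiers is what distinguishes the per-VLAN BPDUs from one another.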

The above output shows that both ports receive native IEEE STP BPDUs along with PVST+ SSTP BPDUs for VLAN numbers 0xA (10) and 0x14 (20). Now check how MSTI1 sees the inconsistent port:

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 001b.8f0c.2a00  priority      32769 (32768 sysid 1)
Root          this switch for MST1

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Mstr BKN*200000    128.15   P2p Bound(PVST) *PVST_Inc
Fa0/14           Altn BLK 200000    128.16   P2p Bound(PVST)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

Here we see that the Master Port is blocked as well, due to the PVST simulation inconsistency. To resolve this issue, you need to ensure that the MSTP domain contains the root bridge for all PVST+ trees. This is accomplished by tuning the priority value for the CIST to a number lower than any PVST+ bridge priority.

SW1:
spanning-tree mode pvst
spanning-tree vlan 1-4094 priority 8192

SW2:
spanning-tree mst 0 priority 4096

Now SW2 is the new CIST Root. Look at the show command output again:

SW2#show spanning-tree mst 0

##### MST0    vlans mapped:   1-9,11-19,21-4094
Bridge        address 001b.8f0c.2a00  priority      4096  (4096 sysid 0)
Root          this switch for the CIST
Operational   hello time 2 , forward delay 15, max age 20, txholdcount 6
Configured    hello time 2 , forward delay 15, max age 20, max hops    20

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 50        128.15   P2p Bound(PVST)
Fa0/14           Desg FWD 200       128.16   P2p Bound(PVST)
Fa0/16           Desg FWD 100       128.18   P2p
Fa0/19           Desg FWD 200000    128.21   P2p

SW2#show spanning-tree mst 1

##### MST1    vlans mapped:   10,20
Bridge        address 001b.8f0c.2a00  priority      32769 (32768 sysid 1)
Root          address 000d.2840.ab00  priority      32769 (32768 sysid 1)
              port    Fa0/19          cost          200000    rem hops 19

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Fa0/13           Desg FWD 200000    128.15   P2p Bound(PVST)
Fa0/14           Desg FWD 200000    128.16   P2p Bound(PVST)
Fa0/16           Desg FWD 200000    128.18   P2p
Fa0/19           Root FWD 200000    128.21   P2p

SW1#show spanning-tree vlan 1

VLAN0001
  Spanning tree enabled protocol ieee
  Root ID    Priority    4096
             Address     001b.8f0c.2a00
             Cost        19
             Port        15 (FastEthernet0/13)
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    8193   (priority 8192 sys-id-ext 1)
             Address     0019.55e6.6800
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time 300

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Fa0/13              Root FWD 19        128.15   P2p
Fa0/14              Altn BLK 19        128.16   P2p
Fa0/19              Altn BLK 19        128.21   P2p

All SW2 ports are now designated. SW2 correctly emulates the PVST+ interactions, and SW1 sees SW2 as the root of all PVST+ instances. SW1 will then block its redundant uplinks based on the regular STP rules. In this situation, traffic may flow between the PVST+ and MSTP domains, and you can achieve optimal load-balancing using PVST+ cost tuning on SW1.

Conclusions
MSTP was designed to overcome one major problem with the classic STP protocol: the inability to use blocked links for traffic forwarding because only a single STP instance is present. This is accomplished by running multiple spanning trees in a topology and mapping VLANs to different trees for traffic forwarding. Even though this feature does not allow for precise and optimal traffic engineering, it improves redundant link utilization. By using regions, MSTP allows for isolating different physical topologies from each other while maintaining Layer 2 connectivity between the regions. However, even with improved fault isolation, MSTP still suffers from the problems inherent to Ethernet topologies: unicast and broadcast flooding and slow spanning-tree convergence. This limits MSTP deployments to small Layer 2 domains, such as a single access-distribution switch block. Larger MSTP deployments should be planned carefully and require strict administrative control. As a suggestion, Private VLANs could be used in larger domains to minimize traffic flooding.

Further Reading
IEEE 802.1 Series Standards
MSTP Configuration Guide
RFC 5517: Cisco Systems' Private VLANs
About Petr Lapukhov, 4xCCIE/CCDE:
Petr Lapukhov's career in IT began in 1988 with a focus on computer programming, and progressed into networking with his first exposure to Novell NetWare in 1991. Initially involved with Kazan State University's campus network support and UNIX system administration, he went through the path of becoming a networking consultant, taking part in many network deployment projects. Petr currently has over 12 years of experience working in the Cisco networking field, and is the only person in the world to have obtained four CCIEs in under two years, passing each on his first attempt. Petr is an exceptional case in that he has been working with all of the technologies covered in his four CCIE tracks (R&S, Security, SP, and Voice) on a daily basis for many years. When not actively teaching classes, he is developing self-paced products, studying for the CCDE Practical & the CCIE Storage Lab Exam, and completing his PhD in Applied Mathematics.


64 Responses to Understanding MSTP


February 22, 2010 at 12:49 am

Ronald
Damn.. I needed *some* MSTP information. This seems like it! Much appreciated.

February 22, 2010 at 3:12 am

bitje
Please keep them coming!! Understanding Frame-Relay, Understanding RIP/EIGRP/OSPF/BGP/Multicast and last but not least Understanding routing filtering using standard access-list/extended access-lists etc

February 22, 2010 at 3:15 am

bitje
Oh and also Understanding MPLS/MPLS VPNs

February 22, 2010 at 4:14 am

C.
Hi,.. brilliant explanations. Very well done. Could You please stress on the point, why MSTP would be limited to small L2 deployments ? I thought this would be the great advantage of MSTP compared to R-PVST+. Many thanks,.. /Christian

February 22, 2010 at 4:54 am

MCL.Nicolas
Omg Petr lol ! MST is the hardest Spanning-tree protocol to understand for me .. I am a CCNP but I used to hate all the MST question. I have not the required lvl to really understand it deeply again I have read the article but still a lot of confusion for me Hope that I ll understand each sentence after few more reads

February 22, 2010 at 5:04 am

Jihad
excellent post. really excellent. Well done Petr. the good news is STP is reaching EOL soon and will be replaced with TRILL. so people won't get confused with all STP hassles. Thanks.

February 22, 2010 at 5:40 am

Terry Vinson
Great Post! (I think I wore my down arrow out on this one though)

February 22, 2010 at 6:05 am

Dennis
Peter you are great! !

February 22, 2010 at 10:03 am

Iman
Thanks Peter ! I think it was very great !

February 22, 2010 at 10:05 am

Internets of Interest:22 Feb 10 | My Etherealmind


[...] Understanding MSTPCCIE BlogDefinitive, complete, with the backstory and references on the MSTP. [...]

February 22, 2010 at 10:36 am

Ofori
Great post!!!

February 22, 2010 at 1:20 pm

martijn
Very well written! I've enjoyed reading it.

February 22, 2010 at 1:27 pm

Petr Lapukhov, CCIE #16379


@Jihad Well, it's hard to EOL something that has been deployed for years. Plus, even though TRILL offers solutions to some of Ethernet's problems, it retains dynamic data-plane learning and flooding. I believe Ethernet should be converted into something similar to a Fibre Channel fabric, with explicit node login/logout and separated identifier/location addresses (e.g. by using 802.1x). But that would wear off the plug-and-play behavior: you have to pay the price!

February 22, 2010 at 1:29 pm

Petr Lapukhov, CCIE #16379


@Christian The problem is mostly not related to MSTP regions, but rather to STP and Ethernet characteristics. MSTP regions bound topology changes, but they cannot prevent slow CIST convergence for large domains, nor can they eliminate unicast/broadcast flooding. Those are the two main factors limiting large Ethernet deployments.

February 22, 2010 at 4:05 pm

orestis
Petr Would there be in the future an article for REP and Metro-Ethernet applications ? BR Orestis

February 22, 2010 at 4:23 pm

Petr Lapukhov, CCIE #16379


@orestis I was thinking of making a writeup on REP, RRRP, EAPS etc. for quite some time. Those protocols are relatively simple and predictable, unlike STP. The only issue is having access to the lab gear supporting REP or EAPS (Extreme Networks). Without the lab equipment the article would remain purely theoretical.

February 23, 2010 at 3:06 am

Manouchehr
Petr, WONDERFUL, I was looking for it so badly. Thanks!

February 23, 2010 at 10:15 am

jonbov
Issue tested in my lab a couple of weeks ago: I have 4 core switches connected together on a shared media. (multipoint QinQ this is a replica of production network) All switches currently running RSTP. My plan was to upgrade 2 switches in one maintenance window, and the other 2 a few weeks later. Step one: configure root for all VLAN one switch, and configure MST on this switch. No problem. Step two: add one more switch to MST. Problems started: this non root switch will now recieve MST BPDUs from MST root + RSTP BPDUs from RSTP switches. As long as this situation occur the non root MST switch will be inconsistent and STP state Broken. So I need to redo my plan. Bring all 4 switches to MST at same time would probably be easiest. Make core links PtP another possibility.

February 23, 2010 at 11:15 am

MSTP Tutorial Part II: Outside a Region - CCIE Blog


[...] Understanding MSTP [...]

February 23, 2010 at 11:16 am

MSTP Tutorial Part I: Inside a Region - CCIE Blog


[...] Understanding MSTP [...]

February 24, 2010 at 3:32 pm

Antonio Soares
Petr, Cisco should invite you to write the new Cisco LAN Switching book.

February 24, 2010 at 9:06 pm

Petr Lapukhov, CCIE #16379


@Antonio, hehe, I wish they would! I still have so much crazy stuff to tell about! Just not enough time to spill it on your heads.

February 25, 2010 at 5:59 am

Rajj Anbu
Petr, is there any issue between MST0 and changing the native vlan? what can happen if I assign vlan 1 to another spt instance or change a native vlan to a another which belongs to another spt instance? /RA

February 25, 2010 at 9:54 am

Petr Lapukhov, CCIE #16379


@Rajj MSTP instances are not linked to VLANs anymore, as they were in PVST+. You may map VLAN 1 to any STP instance, just make sure it is not filtered on a trunk where the corresponding instance is active. Changing a trunk's native VLAN only affects tagging, not STP decisions. Though you'd better map everything out of MST0 and use other instances for traffic engineering.

February 26, 2010 at 8:50 am

rakesh
Thank you Peter . Things made much sense after the post Regards Rakesh

March 16, 2010 at 7:22 pm

Peter Ashwood-Smith
Yes, a very interesting explaination. By the way, should folks be interested in the evolution of MSTP etc. we have a wikipedia page on IEEE 802.1aq Shortest Path Bridging at [ http://en.wikipedia.org/wiki/IEEE_802.1aq ] Essentially shortest path routing for Ethernet Frames using IS-IS link state and either .1Q or .1ah encapsulation for transport. Cheers Peter

March 16, 2010 at 8:11 pm

Petr Lapukhov, CCIE #16379


@Peter I think I made a blog post on OTV which actually involved some discussions of TRILL/802.1AQ and others

April 5, 2010 at 12:07 am

Understanding STP and RSTP Convergence - CCIE Blog


[...] and RSTP Convergence By Petr Lapukhov, CCIE #16379 For some time, I believed a companion post to Understanding MSTP is required in order to completely cover all aspects of MSTP. The post should discuss convergence [...]

April 12, 2010 at 7:05 pm

ajay
Notice that SW3-1 and SW2-3 have equal External Costs to reach the CIST Root but SW3-1 wins the CIST Regional Root role due to lower priority. Do you mean to say SW3-1 and SW3-3? Possibly a typo?

April 23, 2010 at 11:15 am

Aaron
Radia Perlman's name is misspelled in this article, just an FYI if someone is searching for additional information on her.

May 3, 2010 at 6:00 am

Lukasz Grzesiowski
Does MSTP allows to communicate between devices in the same vlan number but in different mst regions? What conditions must be met to ensure this kind of communication?

May 3, 2010 at 8:07 am

Petr Lapukhov, CCIE #16379


@Lukasz Sure, MSTP separates STP instances (forwarding paths) from VLANs (virtual bridges), and there are no restrictions on allowing two VLANs in different MSTP regions to communicate. MSTP may only affect the forwarding paths between the regions.

May 9, 2010 at 4:11 pm

Sergey U
Thanks Petr, this is another great article. I could not find any good up-to-date fundamental book on switching, your articles are of a huge value for me.

June 7, 2010 at 6:05 pm

Samit
Great post! Thanks Petr, 1. Can you let me know how many no. of switches is recommended in one flat L2 network running MSTP? 2. How many no. of Vlans can be mapped in each instance? any limitation? 3. If the core switch support 4000 Vlan but the aggregation switch support 1000vlan then how does the MSTP mapping takes place? In this scenario are we limited to 1000vlan only in total? Thanks once again.

Reply
August 1, 2010 at 1:37 pm

DPS GUARD
Great article. So how can the benefits of MSTP be reaped if you have only a single root (core) switch and you are not in a position to EtherChannel the multiple links going to the closets? Assume we have a 10G uplink and a 1G uplink for backup. At this time regular RSTP will block the 1G uplink, which could be utilized for normally carrying VoIP traffic (so both links are active), and if the main 10G link fails, all VLANs will be carried by the 1G uplink.

Reply
August 10, 2010 at 3:49 pm

Bibil
Hi, will MSTP increase the diameter of the network to more than 40? (STP/RSTP set a maximum of 40 hops due to the MaxAge limit.) As each region has its own STP, does it mean that each region can have a diameter of 40, which would allow a network with multiple regions to scale beyond a 40-hop diameter? Thanks, Bibil

Reply
September 20, 2010 at 7:53 am

MST ! My Cisco Learning Path


[...] http://blog.ine.com/2010/02/22/understanding-mstp/ [...]

Reply
September 26, 2010 at 12:55 am

CCDE Practical Exam Recommended Reading | CCIE Blog


[...] Understanding MSTP Tags: ccde practical, reading list Download this page as a PDF [...]

Reply
September 28, 2010 at 11:48 am

Jimbo
And how are MaxHops and network diameter related? Isn't the diameter defined as the max # of switch hops between any two end stations? And in a tree scenario, isn't this by default larger than the # of hops from the root to the end of the tree, since two end stations could be at the end of two different branches? So why is network diameter limited to 7 while MaxHops can be as much as 40? Or am I missing something? Thanks, jk

Reply
September 28, 2010 at 1:54 pm

Petr Lapukhov, 4xCCIE/CCDE


@Jimbo The Diameter value was never an explicit limit for STP. Rather, it was an input value used to calculate optimum STP timers. The MaxHops counter is the number of hops a BPDU may traverse before being aged out. It is similar to MaxAge used in STP, but has been given a more concrete meaning.
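
As a rough illustration of the MaxHops counter described above, the following Python sketch walks a BPDU down a linear chain of switches. It assumes the commonly documented behaviour in which each receiving switch decrements the remaining hop count by one and discards the information once the count reaches zero. The function name `switches_accepting_root` and the chain topology are hypothetical and used only for illustration; the exact boundary may differ between implementations.

```python
def switches_accepting_root(chain_length: int, max_hops: int) -> int:
    """Count the switches in a linear chain (root first) that keep the root's information.

    Assumption: the root sends BPDUs with remaining hops = max_hops, and each
    receiving switch decrements the value before deciding whether to keep it.
    """
    accepted = 1          # the regional root itself
    remaining = max_hops  # hop count the root places in its BPDU
    for _ in range(chain_length - 1):
        remaining -= 1    # the receiving switch decrements the remaining hop count
        if remaining <= 0:
            break         # information is discarded; this switch and those behind it never learn the root
        accepted += 1
    return accepted


if __name__ == "__main__":
    print(switches_accepting_root(chain_length=7, max_hops=6))
    print(switches_accepting_root(chain_length=7, max_hops=20))
```

Under these assumptions, a chain of seven switches with MaxHops set to 6 on the root leaves the last switch without usable root information, while a larger MaxHops value covers the whole chain.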

Reply
October 7, 2010 at 8:43 pm

FRANCIS XAVIER SAUCEDO


Understanding MSTP PROBLEM WITH THIS DOCUMENT LINK. DOWNLOAD THIS PAGE AS A PDF. OPEN RETURNS A BLANK PDF PAGE. SAVE RETURNS A CORRUPT FILE ERROR MESSAGE.

Reply
November 1, 2010 at 1:46 am

Patrick
Thanks for a really useful post, particularly with all of the MSTP info in one place. Looking at scenario 1 and the CIST regional root (and by definition the IST root) for region 234: I assume that if we shut down F0/13-14 on SW2 and F0/13 on SW4, SW4 would become the IST root for region 234? I suppose that this behaviour would be undesirable in a production environment, but cool to demonstrate the virtual bridge behaviour in a lab configuration.

Reply
January 5, 2011 at 1:44 am

abjoint
Hi. I have a question: there are 7 switches (1-2-3-4-5-6-7), all of them in the same region. Switch 1 is the root and I configure MaxHops to 6 on switch 1. Now switch 7's Regional Root and CIST Root are itself, and the port roles between 6 and 7 are all Designated, but their state is Blocking. Is this correct or incorrect, and why? Thanks!

Reply
January 5, 2011 at 3:37 pm

Petr Lapukhov, 4xCCIE/CCDE


@abjoint I think what you have is an STP dispute condition. Both switches detect their peer having the Designated bit set in the BPDU, while the local port is Designated as well. This results in the link being blocked to avoid potential loops.
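
The dispute condition described above can be sketched as a simple check. This is a deliberately simplified model that follows only the description in this reply (both ends claim the Designated role on the same link); real implementations evaluate further conditions, such as whether the received information is inferior. The names `Role` and `is_dispute` are hypothetical.

```python
from enum import Enum


class Role(Enum):
    ROOT = "root"
    DESIGNATED = "designated"
    ALTERNATE = "alternate"


def is_dispute(local_role: Role, peer_role_in_bpdu: Role) -> bool:
    """Return True when both ends of a link claim the Designated role.

    Simplified assumption: this alone is treated as a dispute, and the
    local port would then be kept in the blocking (discarding) state.
    """
    return local_role is Role.DESIGNATED and peer_role_in_bpdu is Role.DESIGNATED


if __name__ == "__main__":
    # Switch 6 and switch 7 both believe they are Designated on the shared link.
    print(is_dispute(Role.DESIGNATED, Role.DESIGNATED))
    # Normal case: the peer reports a Root port toward us.
    print(is_dispute(Role.DESIGNATED, Role.ROOT))
```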

Reply
January 5, 2011 at 6:35 pm

abjoint
Thanks Petr. All of the switches are in MSTP mode. May I understand it as follows? Switch 7 will drop the BPDU received from switch 6 because the remaining hop count is 1, and it will announce itself as root bridge. But switch 6 knows the root is still switch 1, so there are two CIST roots in this network. And may I say that they are actually in different regions even though their MST configurations are the same? The question is: should the ports between switch 6 and switch 7 stay in the blocking state forever in this situation? I can't find a description of how to deal with this situation in 802.1Q or 802.1s. Thanks again!

Reply
January 10, 2011 at 4:59 am

Don
Thanks for the wonderful write-up. STP just keeps getting more and more confusing! One thing I'm confused about: I understand what a regional root is and how it is elected, but why can't multiple MST regions work with only an IST root in each region, without electing regional roots? Why do you need regional roots?

Reply
January 22, 2011 at 9:03 am

JJ
Can you please add your recommendation regarding the situation when you want to add to or modify the VLAN-to-instance mapping? Also, can you go into a little detail on the use of the revision number? Thank you for this great document!

Reply
January 24, 2011 at 1:01 pm

JJ
I really tried to find some references about the revision number. All I got till now is that this revision number is only used by administrators to keep track of their changes (manually modified). My real problem is this: when you need to modify the vlan-to-instance mapping, the bridge where you 1st perform it, will become isolated from the rest in the region (because the vlan-to-instance mapping does not match anymore) => there will be STP recalculation for ALL VLANS (!!) => there will be outage. Question: what is the best practice to modify the MST vlan-to-instance mapping in order to have the least outage in the network ?
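
The isolation effect described in this comment follows from how switches decide whether a neighbor is in the same region: the region name, the revision number, and a digest of the VLAN-to-instance table must all match. The sketch below models that comparison in Python. It is an illustration under stated assumptions: `DEMO_KEY` is a placeholder and not the key defined by the standard, and the names `MstConfigId`, `mapping_digest`, `config_id`, and `same_region` are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Placeholder key for illustration only; NOT the key defined by the standard.
DEMO_KEY = b"demo-key"


def mapping_digest(vlan_to_instance: dict[int, int]) -> bytes:
    """Digest of a 4096-entry VLAN-to-instance table (unmapped VLANs use instance 0)."""
    table = b"".join(
        vlan_to_instance.get(vlan, 0).to_bytes(2, "big") for vlan in range(4096)
    )
    return hmac.new(DEMO_KEY, table, hashlib.md5).digest()


@dataclass(frozen=True)
class MstConfigId:
    name: str
    revision: int
    digest: bytes


def config_id(name: str, revision: int, vlan_to_instance: dict[int, int]) -> MstConfigId:
    return MstConfigId(name, revision, mapping_digest(vlan_to_instance))


def same_region(a: MstConfigId, b: MstConfigId) -> bool:
    # All three fields must match for two switches to share a region.
    return a == b


if __name__ == "__main__":
    sw_a = config_id("REGION234", 1, {10: 1, 20: 1})
    sw_b = config_id("REGION234", 1, {10: 1, 20: 2})  # mapping changed on one switch only
    print(same_region(sw_a, sw_b))
```

In this model, changing the mapping on a single switch changes its digest, so its neighbors treat it as a separate region until they are reconfigured to match. A differing revision number has the same effect even when the mapping is identical.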

Reply
February 1, 2011 at 9:49 am

Angu
Petr, please clarify my doubt about scenario 2: how did SW2 become root for MSTP instance 1? It has the default priority for instance 1 and its MAC address seems to be higher compared with SW3 and SW4. If all the priorities are default inside the MSTP region for instance 1, will SW3 not become the root for instance 1? Also, I am confused by the line "In our case, the root is SW4 and the root port is Fa 0/19." in scenario 2.

Reply
February 11, 2011 at 9:03 am

VinceM
Very nice article, thank you! I have a question regarding scenario 3 and the damage generated by the PVST switch: would RootGuard be helpful in that case, to prevent the PVST switch from becoming the CIST root bridge?

Reply
March 8, 2011 at 9:34 am

Abel
By the way this is a very well written article I have depended upon in many of the MST deployments. I can say this is very very well written and spot on! Cheers, -Abel

Reply
April 20, 2011 at 6:12 pm

vishnu
Hello all, I am wondering if an IEEE 802.1s implementation can be converted to PVST+. If we run a separate CIST for each PVST+ neighbor, does it work? My goal is to have one implementation that works either purely as IEEE 802.1s or as a PVST+ switch.

Reply
July 16, 2011 at 1:11 pm

Lilou
Many, many thanks for your blog! Just amazing how you have made the subject clear! Cheers, Lilou

Reply
August 2, 2011 at 12:25 am

Pawel K.
Hi Petr, great article, it helps me a lot with understanding MST. But I have noticed that there is a small mistake in the following: "Notice that SW3-1 and SW2-3 have equal External Costs to reach the CIST Root but SW3-1 wins the CIST Regional Root role due to lower priority." Shouldn't there be SW3-3 instead of SW2-3?

Reply
August 21, 2011 at 3:39 am

screp
Thank you! Excellent article !!!

Reply
September 6, 2011 at 8:42 am

mongolio
Can anyone explain how the tie will be broken if the Regional Root in one region is connected to another region containing the CIST Root by multiple links with equal cost?

Reply

September 12, 2011 at 5:55 am

Cisco / MST. Multi-region implementation considerations by SOS Admin!


[...] , (4xCCIE) Understanding MSTP. [...]

Reply
September 21, 2011 at 10:25 pm

Dhivya
Great job!!! I was really confused about MSTP. This seems awesome. You have the ability to teach!!!

Reply
September 28, 2011 at 9:06 pm

Nick
First of all, thanks for the fantastic blog posting that you developed. It was very interesting reading. While MST does a nice job of eliminating the overhead of running many per-VLAN STP instances, I find the requirement for all external (non-MST) devices to form ONE STP domain to be cumbersome. In your example, consider REGION234 to be the core of a service-provider network, and multiply REGION1 by 100 (each representing a separate customer network). By default, with every REGIONx talking to MST0, and seeing it as the root, we would have to adjust every customer switch to force the two switches in each REGION to communicate using their direct links, instead of across the service-provider backbone. We really want each customer to be the root of their own STP, but MST doesn't allow for such a design. Instead of per-VLAN STP, one really wants per-Customer STP, chunking the VLANs for the customer into their own STP instance (of which they could be the root). Additional issues are that any change to the MST configuration requires an outage-producing maintenance window, and that, by forcing all the switches into one giant CST instance, any problem with the CST STP can take down the whole network. Great writeup though!!

Reply
October 10, 2011 at 9:02 am

MSTP en INE en "CCIE en castellano"


[...] http://blog.ine.com/2010/02/22/understanding-mstp/ [...]

Reply
January 4, 2012 at 6:48 am

paramesh
Thanks, this was really helpful to me. I started my career in networking three years ago at an ISP. Initially my switching network was in PVST mode; due to the increase in our customer base we planned to upgrade to MSTP. This is the best article and tool for migrating an existing network to MSTP. Really thankful.

Reply
February 6, 2012 at 1:17 pm

dj
WELL done ! This article helped me very very much in understanding MSTP.

Reply
February 6, 2012 at 5:32 pm

Cisco VTP on HP Switches? | Daz's bits and bobs


[...] MSTP nice blog on its inner workings here - http://blog.ine.com/2010/02/22/understanding-mstp/ [...]

Reply
March 14, 2012 at 9:53 am

Ashvin Paija
Nice explanation with example!!! Cisco should add all your blogs on different technology in their books. The content of books would be more nice and explanatory then. Ashvin

Reply


2011 INE, Inc., All Rights Reserved
