STP is your friend
By stretch | Thursday, October 15, 2009 at 4:36 p.m. UTC
Every once in a while I encounter someone who wants to disable Spanning Tree Protocol (STP) on their LAN switches. This is a bad idea. Consider the typical LAN topology below:
The two access switches are trunked, and Switch 1 maintains a trunk to the local router. All access ports are assigned to VLAN 10. An admittedly simplistic but valid design. Let's observe what happens when a redundant layer two connection is introduced between the two access switches, first with STP enabled and then without.
In our first scenario, STP is left in its default configuration, with one exception: Switch 1 has been manually configured as the STP root for clarity.
Switch1# show spanning-tree summary
Switch is in pvst mode
Root bridge for: VLAN0001, VLAN0010
Extended system ID is enabled
Portfast Default is disabled
PortFast BPDU Guard Default is disabled
Portfast BPDU Filter Default is disabled
Loopguard Default is disabled
EtherChannel misconfig guard is enabled
UplinkFast is disabled
BackboneFast is disabled
Configured Pathcost method used is short

Name                   Blocking Listening Learning Forwarding STP Active
---------------------- -------- --------- -------- ---------- ----------
VLAN0001                      0         0        0          1          1
VLAN0010                      0         0        0          1          1
---------------------- -------- --------- -------- ---------- ----------
2 vlans                       0         0        0          2          2
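The article does not show the command used to make Switch 1 the root. One plausible way, offered only as a sketch, is to set the bridge priority to 0 for both VLANs. This is an assumption, but it is consistent with the Root ID priority of 10 that Switch 2 reports for VLAN 10 (priority 0 plus the extended system ID of 10).

```
! Assumed reconstruction; the original post does not show this step.
! Priority 0 plus the VLAN ID (extended system ID) yields 1 and 10.
Switch1(config)# spanning-tree vlan 1,10 priority 0
```

The `root primary` macro is a common alternative, but it typically selects a priority such as 24576 rather than 0, so it would not produce the value seen here.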
Now imagine that after some office furniture was rearranged, two cable drops have been mistakenly cross-wired by end users. (Note that newer Auto-MDIX-enabled switches such as the Catalyst 2960 and 3560 will form a link even with a normal non-crossover cable.) Port F0/19 on Switch 1 has been connected to F0/8 on Switch 2, forming a layer two loop within VLAN 10.
With STP enabled, this isn't a problem; the redundant connection is marked as an alternate path to root on Switch 2 and subsequently blocked to prevent a switching loop:
Switch2# show spanning-tree vlan 10

VLAN0010
  Spanning tree enabled protocol ieee
  Root ID    Priority    10
             Address     000f.345f.1680
             Cost        19
             Port        1 (FastEthernet0/1)
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32778  (priority 32768 sys-id-ext 10)
             Address     000e.8316.f500
             Hello Time   2 sec  Max Age 20 sec  Forward Delay 15 sec
             Aging Time 300

Interface           Role Sts Cost      Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Fa0/1               Root FWD 19        128.1    P2p
Fa0/8               Altn BLK 19        128.8    P2p
This makes for a happy network. But now we'll test what happens if STP is disabled by a negligent administrator.
Switch1(config)# no spanning-tree vlan 10
Switch2(config)# no spanning-tree vlan 10
When the redundant connection is made again, we immediately begin receiving MAC address flapping notifications:
%SW_MATM-4-MACFLAP_NOTIF: Host 001d.60b3.0184 in vlan 10 is flapping between port Fa0/8 and port Fa0/1
These are triggered by the never-ending broadcast storm initiated by every layer two broadcast (such as an ARP request) from any host on the VLAN. Because STP is not present, neither switch can detect the switching loop. Ethernet frames have no concept of TTL, so they circle the loop endlessly.
We can confirm the occurrence of a broadcast storm by examining the traffic counters on either of the inter-switch links:
Switch2# show interfaces f0/1
FastEthernet0/1 is up, line protocol is up (connected)
  Hardware is Fast Ethernet, address is 000e.8316.f501 (bia 000e.8316.f501)
  MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
     reliability 255/255, txload 57/255, rxload 57/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 100Mb/s, media type is 10/100BaseTX
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:01, output 00:00:08, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 22566000 bits/sec, 13031 packets/sec
  5 minute output rate 22569000 bits/sec, 12969 packets/sec
     4655802 packets input, 1003268438 bytes, 0 no buffer
     Received 4654349 broadcasts (0 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 4637 multicast, 0 pause input
     0 input packets with dribble condition detected
     4609557 packets output, 1003475840 bytes, 5676 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 PAUSE output
     5676 output buffer failures, 0 output buffers swapped out
You may note that the observed throughput is only about 22% of the maximum speed (roughly 22 out of 100 Mbps, matching the txload and rxload of 57/255). This is because the average throughput is calculated over a five-minute interval by default, and the above capture was taken roughly a minute into the storm. (As an aside, the default five-minute interval can be modified with the load-interval command under interface configuration.)
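As a minimal sketch of the load-interval aside, the averaging window on the inter-switch link could be shortened as follows. The 30-second value is an illustrative choice, not something taken from the original capture; on the IOS releases I am familiar with, the interval is given in seconds in multiples of 30.

```
! Illustrative value only: average the rate counters over 30 seconds
! instead of the default 300 seconds.
Switch2(config)# interface FastEthernet0/1
Switch2(config-if)# load-interval 30
```

With a shorter interval, the input and output rates reported by show interfaces should climb toward the actual load much sooner after a storm begins.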
One might be tempted to argue that a network is invulnerable to this effect if routed links are used between all access switches. Wrong. It is still possible for a layer two loop to occur if two or more erroneous connections are made (versus just the one in this example).
Bottom line: STP exists for a reason. Let it be.
About the Author
Jeremy Stretch is a network engineer living in the Raleigh-Durham, North Carolina area. He is known for his blog and cheat sheets here at Packet Life. You can reach him by email or follow him on Twitter.
October 15, 2009 at 4:52 p.m. UTC
I agree. I had to deal with this problem at one of my sites in a new job. A technician connected two ports from two switches to a hub and brought the whole site down. STP fixed it. The two switches had no spanning tree configured.
October 15, 2009 at 6:22 p.m. UTC
A common misconception seems to be that switches are like 'extension cords': their only purpose is to provide more holes to plug things into. Spanning tree is one area of my studies that I'm heavily interested in. I'm currently working with a network that is running purely PVST, which had been working great until non-Cisco switches were introduced.
Spanning-tree is a topic I find interesting personally.
October 15, 2009 at 9:02 p.m. UTC
STP isn't necessary if you're using Flex Links. Just a thought.
October 15, 2009 at 11:26 p.m. UTC
@ekaleido: STP is still necessary. Flex links do nothing to protect against loops on access ports.
October 16, 2009 at 6:03 a.m. UTC
In a company I worked for a few years ago, STP was disabled by default for all unused VLANs. It was in the company's switch config templates. Silly enough, but I'm pretty sure the guy who wrote the templates did not know exactly what STP is. Of course, one day someone configured a new VLAN and forgot to enable STP... The entire network went down almost instantly! Now, guess what, the templates have been adapted and STP is always enabled, even on standalone switches. :-)
October 16, 2009 at 11:04 a.m. UTC
I'm very interested in Jim's comment: there are so many interaction problems when using PVST+ with other standard STP implementations.
The "[...] to maintain compatibility bpdu's are sent over the native vlan [...]" affirmation seems to be false in many scenarios.
One example of this is the interaction between a Cisco Catalyst and a Mikrotik Router OS (or a linux bridge): bpdu's are sent and received only if the trunk between the boxes contains the vlan 1.
October 16, 2009 at 2:14 p.m. UTC
STP killed my parents, burned down my house, and ran off with my girlfriend. STP is not my friend.
October 17, 2009 at 4:30 a.m. UTC
Using Flex Links in addition to broadcast storm protection (which you should have on whether you use STP or not) absolutely works as an alternative to STP. Access ports and all. Much simpler to implement and troubleshoot than STP.
October 17, 2009 at 1:00 p.m. UTC
@mmilholland: One more time: STP is still necessary.
Storm control does nothing to prevent broadcast storms (that's what STP is for); it's simply a damage control tool to limit the amount of bandwidth they can assume. Without STP, every broadcast frame will continue to circle a loop unless the storm control threshold has been met for that interval. Note that this also inhibits legitimate broadcast traffic.
October 17, 2009 at 1:05 p.m. UTC
STP is for chumps.. Layer-3 at the access layer!
October 17, 2009 at 3:42 p.m. UTC
You are flat out wrong. Storm control err-disables the port if someone loops a hub or switch off the access port. So to say it does nothing to prevent broadcast storms is pure fiction. Further, we have many broadcast-based applications in our production network (which uses Flex Links and storm control) and NONE of them are affected by storm control (AGAIN, you should have this feature on anyway). Storm control doesn't affect any broadcast-based traffic unless it reaches a certain percentage of the bandwidth of the port.
October 17, 2009 at 4:00 p.m. UTC
@mmilholland: The default action of storm control is exactly what I described. Check the documentation if you don't believe me. You can optionally configure error-disabling on a breach of the threshold, though careful consideration should be made before doing so as it opens an avenue for self-DoS.
While I agree that implementing storm control is an excellent measure, it is not a substitute for spanning tree. Not sure why this is taking so long to sink in.
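To illustrate the distinction between the default behavior and error-disabling, here is a hedged sketch of broadcast storm control on an access port. The interface and the 5.00 percent threshold are assumptions chosen for illustration, not values from this discussion.

```
! Assumed port and threshold, for illustration only.
Switch1(config)# interface FastEthernet0/19
! Default action: broadcast traffic above the threshold is dropped
! for the remainder of the measurement interval.
Switch1(config-if)# storm-control broadcast level 5.00
! Optional: error-disable the port when the threshold is breached.
Switch1(config-if)# storm-control action shutdown
```

Without the second command the loop keeps circulating frames up to the threshold; with it, the port is shut down, which is where the self-DoS risk comes from.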
October 17, 2009 at 7:40 p.m. UTC
My access ports are configured with portfast and protected with bpduguard as they should be. Flexlinks is an absolute substitute for STP in this respect. Yes we can debate whether STP is involved in light of portfast and bpduguard but why waste either of our time? My $0.02.
October 18, 2009 at 5:12 p.m. UTC
First, I believe STP should always be left on, no matter if you have a very small topology or are using Etherchannels or are convinced you have alternate systems in place. This is because there can always be fat finger moments, and STP will protect your network from meltdown. Simple Belt and Braces. A good logging system will warn you something has gone wrong (STP changes) and you have an opportunity to fix things before the customer gets on the phone!
Second, thanks for this great article! I'm just working on something similar for our company internal blog. There are a lot of IT staff who think STP causes problems or is unreliable. In my experience this is often down to buggy STP implementations on cheap switches (e.g. too much CPU load). I say: don't disable STP, just buy better switches.
Third, I've also seen that in a loop scenario the packets per second can overwhelm small switches with low bus/backplane capacities, even if the data quantity is lower than the link bandwidth.
Again, thanks for the good article, keep it up!
October 18, 2009 at 7:41 p.m. UTC
I must include the security aspect. STP has no authentication; I suggest using root guard and/or BPDU guard.
October 23, 2009 at 3:49 p.m. UTC
STP is way past its sell-by date!!
I never design a network with STP anymore, I have had my fingers burned too many times :(
Cisco need to come up with another technology like their competitors. Extreme have redundant ports, with loopback prevention on access ports. Nortel have S-MLT. Cabletron (I know they are not around anymore) had Securefast, which was OSPF at layer 2.
1) Layer2 at the edge and HSRP/GLBP/VRRP at the core. This allows multiple active links to the edge switches.
2) Layer3 throughout E.g. OSPF
October 23, 2009 at 8:58 p.m. UTC
STP is by no means an obsolete technology; it is a core component to any Ethernet network. Also, engineers don't get "burned" by protocols, but rather by poor design choices, whether their own or others'.
Layer three to the access edge is a very useful tool, but it does not negate the need for spanning tree, as I noted at the end of the article.
October 24, 2009 at 8:52 p.m. UTC
I just enjoy how this site gets everyone talking, funny I did not realize anyone would be against STP, but at least they present good arguments. What I think is that STP must be a dinosaur, I mean, look at all the enhancements that have been necessary to fix it. UplinkFast, BackboneFast, UDLD, Root Guard, etc etc..
October 24, 2009 at 9:02 p.m. UTC
@Dano: STP has undergone many transitions since its inception, and I think that's the primary reason so many people fail to understand it thoroughly. Enhancements like UplinkFast and BackboneFast were Cisco-proprietary modifications to the legacy spanning tree protocol (IEEE 802.1D, first published in 1990 and revised in 1998). These modifications were contributed to the IEEE 802.1w amendment to form Rapid STP, which was merged back into IEEE 802.1D in 2004. Likewise, the 802.1s amendment which defined Multiple Spanning Tree (MST) was merged into the 802.1Q specification in 2003 (MST relies on RSTP).
The Spanning Tree cheat sheet actually includes a pseudo-timeline of the various STP specifications, including Cisco's proprietary PVST+ family.
October 26, 2009 at 8:59 p.m. UTC
I see a few people arguing why they might be able to get away without STP, but I'm curious to hear why they would want to. I've never heard a good explanation (other than superstition and cargo-cult thinking) for why someone would disable STP.
Even if you are using something other than STP as your primary inter switch redundancy protocol (L3 links, etc), it still seems stupid to disable STP.
March 29, 2010 at 6:07 a.m. UTC
For some protocols like VRRP and OSPF, STP is a nightmare because of its convergence time of approximately 55 seconds. When using these protocols we always disable STP on the switches, or rather move to RSTP (convergence time of < 1 sec).
April 12, 2010 at 12:24 a.m. UTC
Well, to be frank, arguing that turning off STP is a valid solution because you have other tools that "protect" you, and because STP is so "hard" to troubleshoot, just points out that you have books to read and docs to study.
If you think STP is difficult, wouldn't it be a better solution to master it than to implement fancy workarounds that'll just bite you in the ass later on?
May 19, 2010 at 8:34 a.m. UTC
I have to say, from a beginner's point of view (around 3/4 of the way through the CCNA Exploration course), this article has been very informative and has certainly helped me understand STP and its importance, just from looking at the differing points of view and the other technologies highlighted that enhance or replace (depending on your viewpoint) STP.
February 27, 2013 at 3:26 p.m. UTC
I fully endorse this and recommend that anyone designing a network keep layer 2 as small as possible and incorporate RSTP. This is a dead easy config, and it leaves the more complex network design to layer 3 only. More stable and manageable!! It worked on a 3,500-node network I designed for a multinational company.
March 24, 2014 at 12:37 a.m. UTC
Hi guys... Not sure if this is the right place to post this, but I've been reading about STP tiebreakers and I have a little question:
As far as I know, STP tiebreakers for electing the root ports are:
- Lowest BID (electing the root bridge)
- Lower path cost (selecting the best route to the root bridge)
- Lower BID (if two or more paths have the same cost to the root bridge)
- Lower received port priority (this only happens if a switch has multiple links connected to the same neighbor switch)
- Lower local port priority (this only happens if a switch has multiple ports connected to the same port of a neighbor switch, through a hub, for example)
I've run some tests and the tiebreakers work fine. But I've also noticed that these tiebreakers apply when electing root ports, and not when electing designated ports.
Let's say that I have a root switch (Switch A) connected to Switch B. Switch A has two ports connected to the same segment (ports 1 and 2, through a hub). Switch B has one port on that segment (port 1). Obviously, Switch B is going to elect that port as its root port, but Switch A has to choose one designated port and block the other one.
In that case, according to the tiebreaker rules, Switch A should go to rule number 4 (lower received port priority), because it received the same path cost to root (0) on both ports, and the same BID (its own). Port 1 should receive a port priority of 128.2, which is the default priority sent by port 2, and port 2 should receive a priority of 128.1, which is the default priority sent by port 1. The rule says that the lower received port priority should break the tie, so port 2 receives the lower priority and should be elected as the designated port.
When I did my test, port 1 was elected as the designated port, ignoring rule number 4 and going directly to rule number 5.
So, when electing designated ports, do the tiebreaker rules apply?
I'm sorry if this post is too long or if it is in the wrong place, but this seems a good place to ask because I found really good information on this site.
Thank you all.