Spanning tree, the bad guy on the block?

Posted on: May 23, 2012 by Adam Dolman

I recently spent a week at Cisco Live, and one theme I kept seeing across presentations was how 'insert acronym here' is the solution that means no more spanning tree loops.

Spanning tree, in my opinion, is an old hero of networking and, personally, something I still see as very much relevant. But let's take a look at some of these alternatives...

What we seem to have as helpers and alternatives are a minefield of acronyms... MEC, VPC, VCS, FabricPath, TRILL, VPLS, OTV... Some of these are complementary technologies; others are there to solve different problems, and some just don’t like each other! And just as a forewarning, without a basic knowledge of spanning tree, some of this blog post might not make much sense!

So what problem are we solving in kicking out our old saviour of resilient topologies and relegating it to the rubbish bin? The goal of all these new technologies is to prevent loops and through that to utilise the full fabric, maximise bandwidth, and of course to prevent the broadcast storm that takes down your network!

There's no denying that spanning tree (or STP), despite allowing us to build resilient networks for the last 20 years, has a lot of flaws. A number of these have been addressed over the years, but ultimately you always ended up with links doing nothing, waiting for a failure before stepping up to action. And on the occasions when something went wrong, whether through human error, hardware failure or a simple software bug, the results could be catastrophic for a network. So let's say we're addressing two issues: maximum utilisation of the infrastructure, and preventing the failures that could take down a network. Sounds simple enough!
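To make the "links doing nothing" point concrete, here is a rough Python sketch. It is not how STP actually elects anything (real STP compares bridge IDs and path costs); it simply builds a loop-free tree with a breadth-first search from a chosen root and reports which links are left idle. The four-switch topology and the switch names are made up for illustration.

```python
from collections import deque


def spanning_tree_links(links, root):
    """Return (forwarding, blocked) link sets for a loop-free tree rooted at `root`.

    Simplification: plain breadth-first search with every link costing the same,
    standing in for STP's bridge ID and path cost comparison.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    visited = {root}
    forwarding = set()
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for peer in sorted(neighbours.get(switch, [])):
            if peer not in visited:
                visited.add(peer)
                forwarding.add(frozenset((switch, peer)))
                queue.append(peer)

    all_links = {frozenset(link) for link in links}
    return forwarding, all_links - forwarding


# Hypothetical topology: two core switches, two dual-homed access switches.
links = [
    ("core1", "core2"),
    ("core1", "acc1"), ("core2", "acc1"),
    ("core1", "acc2"), ("core2", "acc2"),
]
forwarding, blocked = spanning_tree_links(links, root="core1")
print(f"{len(forwarding)} links forwarding, {len(blocked)} links blocked")
for link in sorted(tuple(sorted(l)) for l in blocked):
    print("idle until a failure:", link)
```

In this small example, two of the five links carry no traffic at all, which is exactly the wasted capacity the newer technologies are trying to win back.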

STP has certainly been doing well on gathering replacements; there are a lot of standards out there that act to either replace or reduce the need for it. Let’s briefly look at some of these acronyms.

MEC (Multichassis EtherChannel) is a Cisco proprietary technology tied to their VSS technology, which allows multiple links to act as one across two switches running as a VSS pair. This was an area where spanning tree would traditionally be required. No loops... problem solved. Except it's only applicable with a 6500 VSS solution...

So VPC then... except that relates to the Cisco Nexus 7000 product line, and ends up doing a similar job to MEC... but it's a different technology. And VCS is Brocade’s version of the same thing. Three acronyms, all incompatible, all doing something along the same lines!
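What these bundling technologies have in common is that several physical links behave as one logical link, with each traffic flow pinned to a single member. A minimal Python sketch of that idea follows. The hash used here (CRC32 over the MAC pair) is only a stand-in, since the real hashing inputs and algorithms are platform-specific, and the link names and MAC addresses are invented for the example.

```python
import zlib


def pick_member_link(src_mac, dst_mac, member_links):
    """Pick one member link of a bundle for a flow.

    Assumption: CRC32 over the source/destination MAC pair stands in for
    whatever hash a real switch uses. The same flow always lands on the
    same link, so frames within a flow stay in order.
    """
    key = f"{src_mac}-{dst_mac}".encode()
    return member_links[zlib.crc32(key) % len(member_links)]


# Hypothetical bundle: one uplink to each switch of the pair.
bundle = ["uplink-to-switch-A", "uplink-to-switch-B"]
flows = [(f"00:00:00:00:00:{i:02x}", "00:00:00:00:ff:01") for i in range(1, 9)]

for src, dst in flows:
    print(src, "->", pick_member_link(src, dst, bundle))
```

Because both uplinks are members of one logical link, there is no loop for spanning tree to block, and both links carry traffic.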

FabricPath is again a Cisco proprietary technology that came out before the TRILL (Transparent Interconnection of Lots of Links) standard was finalised in 2011. Both are quite similar and use ‘Layer 2 Multi-Pathing’ (effectively layer 2 routing) to make use of your resilient links, but do this through different (and incompatible) methods. This might bring back some memories of Cisco’s spanning tree enhancements, released to improve spanning tree before standards caught up. Even though they aren’t interoperable, they do give us a solution in the data centre for removing spanning tree and making use of redundant links. The biggest issue, again, is limited platform support. Presuming we do get a wider range of platform support, surely that's the solution then?
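The multi-pathing idea can be sketched in a few lines of Python. This is only an illustration of "use every equal-cost path" on a made-up two-leaf, two-spine topology, with hop count as the cost. It does not model the link-state protocol that TRILL or FabricPath actually run, nor their encapsulation.

```python
from collections import deque


def equal_cost_paths(links, src, dst):
    """Return every shortest path (by hop count) from src to dst."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    # Breadth-first search gives the hop distance of every switch from src.
    distance = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for peer in neighbours.get(node, []):
            if peer not in distance:
                distance[peer] = distance[node] + 1
                queue.append(peer)

    if dst not in distance:
        return []

    # Walk back from dst, following only links that get one hop closer to src.
    def walk(node):
        if node == src:
            return [[src]]
        paths = []
        for peer in sorted(neighbours[node]):
            if distance.get(peer) == distance[node] - 1:
                paths.extend(path + [node] for path in walk(peer))
        return paths

    return walk(dst)


# Hypothetical fabric: every leaf connects to every spine.
links = [(leaf, spine) for leaf in ("leaf1", "leaf2") for spine in ("spine1", "spine2")]
paths = equal_cost_paths(links, "leaf1", "leaf2")
for path in paths:
    print(" -> ".join(path))
```

Where a spanning tree would keep only one of these paths, a multi-pathing fabric can spread traffic over both.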

But no... At least not yet, in my opinion. VPC and MEC are widely deployed and understood... they create etherchannels, a staple of networking for years. FabricPath and TRILL, by contrast, are basically replacing spanning tree with something new... and they certainly don't have the proven track record yet to show they are free of the problems that dogged spanning tree in the first place! However, they are a big step forward, and both have methods inherited from routing protocols to prevent loops.

The last two acronyms on my starting list are something different. They are designed to connect data centres (or perhaps even just areas in one data centre) together and allow the same LANs to exist across them, while keeping each site as its own spanning tree (or STP replacement) domain. Certainly one of the bigger disasters waiting to happen was connecting data centres together via trunks and using spanning tree to make it all work. As soon as there was a resilient link, there was the potential to take out not one but two or more data centres. VPLS is the open standard of the two, OTV the Cisco proprietary one. And of course they both work quite differently, and presently have different scale and hardware support.

So where does that leave us? What should I do if I were implementing a network today? Indeed, will the technology I decide to implement still be a supported standard in a few years? Should my hardware platform be dictated by these technologies?

Please feel free to comment and share your stories of how you view the various new and old technologies. For me, I see a hybrid approach based on the infrastructure you have. MEC, VPC and VCS, although I hope there will be unification, all have a place on their particular hardware platforms. TRILL might be the new guy, but if you are running a multi-vendor network that supports it, why not give it a trial run? Or try FabricPath if you’ve got lots of Cisco Nexus switches around and don’t intend to deploy another vendor (or to inter-operate with TRILL in the future). But in my opinion, to fully allow interoperability and ensure your network will work with whatever might need to be plugged in, you certainly can't go typing in 'no spanning-tree' on your switch for a while to come, even if you do think you maintain a loop-free topology through the above technologies.

About Adam Dolman

Head of Microsoft Azure Cloud Product Engineering, Atos Global IDM and member of the Scientific Community
Adam Dolman is currently the Head of Azure Cloud Product Engineering for Atos Global IDM, a member of the Atos Scientific Community and an Atos Expert. He is responsible for the engineering and development of the Microsoft Cloud offerings as part of the Atos Orchestrated Hybrid Cloud suite. Previously he worked for Atos Major Events for 7 years, most recently as the Technical Operations Manager for the Rio 2016 Olympic Games, responsible for the architecture, security, deployment and management of the Rio 2016 Games infrastructure. He has worked at Atos since 2005, starting in networking, with particular interests in cloud, digital transformation and virtualisation. Adam holds an MA in Computer Science from the University of Oxford and numerous Cisco and VMware qualifications.
