Stretch's Hierarchy of Network Needs
By stretch | Monday, December 14, 2015 at 3:07 a.m. UTC
Remember Maslow's hierarchy of needs from school? The theory asserts that every human shares the same set of basic physical and psychological needs in order to be happy, with more primal needs like food and shelter taking precedence over emotional needs like love and companionship.
A while back, I was pondering what would be necessary to fully automate a network, and it occurred to me that a very similar hierarchy of needs can be laid out for a computer network to achieve its optimal state.
1. A Functioning Network
At the very bottom of the hierarchy is everything a network requires to function: Routers, switches, cabling, power, and so on, just as tier one of Maslow's hierarchy encompasses everything a human needs to stay alive. At this stage, a network can function, and can even function well, but it cannot adapt or grow.
Many small businesses operate their networks at this stage for years with no major problems. After all, when left alone, computers and networks tend to just keep chugging along. And if your entire network comprises a cable modem, a switch, and a few access points, it's entirely possible that it will run for years without needing any tweaking.
2. Someone to Operate the Network
Of course, more advanced networks need a human operator to make changes and address faults. At this stage, a network is associated with some human operator capable of inspecting and manipulating network components to modify the network's overall behavior. This could be a full-time network staff or just a systems administrator who pokes at a router every now and then.
At this stage, a network can grow and adapt. However, every change imposes a substantial overhead, as the human operator must first research the network's state (ascertaining the network topology, routing table, etc.) and apply all configuration changes by hand. I think it's fair to say that most networks today exist at this stage.
3. Abstraction of Network State
This is where things start to get interesting. This stage entails defining all of the network's components and how they should interoperate, in some entity separate from the network itself. As an illustration, consider running show lldp neighbors on a switch. The output will tell you what's currently connected to that switch and via which interfaces. However, it doesn't tell you what should be connected. Similarly, just reviewing the running configuration on a router doesn't tell you whether that configuration is correct, only that it exists.
But if we employ IP address and infrastructure management systems and configuration templating, we can form some assertions indicating how the network should function. This model becomes the "source of truth" for the network: If some attribute doesn't exist in the model, it shouldn't exist in the real world. We can compare the real-world state of the network to the source of truth to identify and resolve any discrepancies.
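To make the comparison concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the device and interface names are made up, and the two dictionaries stand in for data that would really come from a source-of-truth system and from parsed LLDP output.

```python
# Hypothetical data: each dictionary maps a local interface name to the
# neighbor that is expected (intended) or actually seen (observed) there.
intended = {
    "Ethernet1": "core-sw1",
    "Ethernet2": "core-sw2",
    "Ethernet3": "server-a",
}
observed = {
    "Ethernet1": "core-sw1",
    "Ethernet2": "server-b",   # cabled to the wrong device
    "Ethernet4": "printer-1",  # not in the model at all
}


def diff_neighbors(intended, observed):
    """Return a list of (interface, problem) pairs, ordered by interface name."""
    problems = []
    for iface in sorted(set(intended) | set(observed)):
        want = intended.get(iface)
        have = observed.get(iface)
        if want == have:
            continue  # real world matches the model
        if have is None:
            problems.append((iface, f"missing: expected {want}"))
        elif want is None:
            problems.append((iface, f"unexpected: found {have}"))
        else:
            problems.append((iface, f"mismatch: expected {want}, found {have}"))
    return problems


discrepancies = diff_neighbors(intended, observed)
for iface, problem in discrepancies:
    print(f"{iface}: {problem}")
```

Note that the model is treated as authoritative in both directions: a neighbor that is missing and a neighbor that shouldn't be there are both reported as discrepancies.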
4. Automated Provisioning
At this stage, we abandon manual configuration changes in favor of deployments driven through automation. For example, instead of loading a configuration into a new switch by hand, we simply hook the switch up to the network and it downloads the appropriate configuration — generated according to our source of truth established in stage three — through an automated provisioning service. Humans no longer need to be involved in the process.
Extending this idea further, we can develop systems to automatically apply configuration changes to keep the network synchronized with its abstracted state. If we want to add a link between two nodes, we simply update the model through some human-friendly UI, and the system automatically configures the relevant interface on either node in anticipation of the new link.
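As a rough sketch of that last idea, the Python below takes one link from the model and renders an interface snippet for each end. The link structure, the device names, and the template text are all hypothetical; the configuration syntax is only illustrative and would depend on the platform, and pushing the result to a device is left out entirely.

```python
from string import Template

# Hypothetical template; real syntax depends on the network platform.
INTERFACE_TEMPLATE = Template(
    "interface $interface\n"
    " description link to $peer_device $peer_interface\n"
    " no shutdown\n"
)


def render_link(link):
    """Return {device: config snippet} covering both ends of one modeled link."""
    a, b = link["a"], link["b"]
    configs = {}
    # Render once from each end's point of view.
    for local, remote in ((a, b), (b, a)):
        configs[local["device"]] = INTERFACE_TEMPLATE.substitute(
            interface=local["interface"],
            peer_device=remote["device"],
            peer_interface=remote["interface"],
        )
    return configs


# A new link as it might be recorded in the source of truth (made-up names).
new_link = {
    "a": {"device": "sw1", "interface": "Ethernet10"},
    "b": {"device": "sw2", "interface": "Ethernet20"},
}
configs = render_link(new_link)
for device, snippet in configs.items():
    print(f"--- {device} ---")
    print(snippet)
```

The point of the design is that the human only ever edits the link record; both interface configurations fall out of it, so the two ends can't drift apart.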
5. Automated Remediation
This is really the holy grail of our field: networks that fix themselves. And I don't mean routing traffic via an alternate path when a link goes down. I mean things like:
External monitoring alerts for sub-SLA performance of traffic via a certain Internet provider. Edge routers are automatically reconfigured to de-preference links for that provider to restore performance to acceptable levels.
A router detects that one of its line cards is experiencing hardware failures. The line card is taken offline and a TAC case (complete with logs and inventory data) is raised automatically with the vendor to initiate an RMA.
A rack loses one of its two redundant access switches. The network detects the hazardous condition and begins migrating virtual machines to racks that have full connectivity.
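The first scenario above can be sketched as a simple decision function in Python. This is only an illustration under stated assumptions: the SLA threshold, the preference values, and the provider names are invented, latency is used as the sole performance metric, and actually applying the plan to edge routers is out of scope.

```python
# Hypothetical thresholds and preference values, chosen for illustration.
SLA_LATENCY_MS = 80.0
DEFAULT_PREFERENCE = 200
DEGRADED_PREFERENCE = 50


def plan_preferences(samples_by_provider, sla_ms=SLA_LATENCY_MS):
    """Return {provider: preference}, lowering the preference of any
    provider whose average measured latency exceeds the SLA."""
    plan = {}
    for provider, samples in samples_by_provider.items():
        if not samples:
            # No data: leave the provider at its default preference.
            plan[provider] = DEFAULT_PREFERENCE
            continue
        average = sum(samples) / len(samples)
        if average > sla_ms:
            plan[provider] = DEGRADED_PREFERENCE
        else:
            plan[provider] = DEFAULT_PREFERENCE
    return plan


# Made-up monitoring data: provider name -> recent latency samples in ms.
measurements = {
    "provider-a": [20.0, 25.0, 30.0],
    "provider-b": [90.0, 120.0, 150.0],
}
plan = plan_preferences(measurements)
for provider, preference in plan.items():
    print(f"{provider}: preference {preference}")
```

A real system would also need hysteresis or a hold-down timer so that a provider hovering around the threshold doesn't cause the routers to flap between preferences.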
Networks that reach this stage are truly awesome. And here's the best part: All of these things are easily in reach with modern technology. We just need to build it.
About the Author
Jeremy Stretch is a network engineer living in the Raleigh-Durham, North Carolina area. He is known for his blog and cheat sheets here at Packet Life. You can reach him by email or follow him on Twitter.
Posted in Opinion
December 14, 2015 at 5:46 p.m. UTC
Was this article written to get us in management all excited? :) I totally agree with you and very much desire this land of joy! Do you think we should start some follow-up discussions about possible solutions to your wonderful visionary Utopian world?
December 21, 2015 at 2:08 p.m. UTC
The parallel drawn between Maslow's hierarchy and an evolutionary network hierarchy is quite amazing!
December 30, 2015 at 8:53 a.m. UTC
Having implemented only up to Level 2, I will put down Level 3 for a new year's resolution!
January 4, 2016 at 1:00 p.m. UTC
The challenge I see with this is retrofit. This is probably the reason most are stuck at Level 2. Designing and building a network like that is pretty easy from a blank canvas, but...
On a large, complex network, making the move from Level 2 to Level 3 is a mammoth task, as it involves going through all the config on all the devices and ascertaining the state of the network. Half of that job is easy: obtaining a snapshot of the running config to tell you what the network IS doing. However, the not-so-easy second half of that task is figuring out WHY each part of the config exists. This route is in the routing table with a particular next hop, great. But why is it there? Why is that the next hop? What is the end goal of that and the related config?
Answering those questions is no small task in configs that have existed through several hardware life cycles and staff life cycles.
Then once you get the network blueprint down it must be maintained accurately. It's your one truth by which all else is measured, the truth cannot turn out to be a lie!
This is a problem, as the end goal for a lot of network config is application related. Making things link up between teams is tough. App teams stop a listener on a server, and this changes the needs of the network. This needs to be updated on the blueprint, as the correct state of the network no longer needs that port reachable across it. This then gets implemented at the next true-up, and ACLs are modified to no longer include it. How likely is that process to actually be followed? Since when do app teams care about the network team's documentation?
However, once those rather large hurdles are overcome... Levels 4 and 5 are somewhat trivial to achieve if your state blueprint is good.
I am definitely working towards this, but it's likely going to be years yet before I get there.
January 28, 2016 at 11:28 p.m. UTC
I wonder if there is software for steps 3 and 4. Is SDN going to be the ultimate solution?
March 15, 2016 at 7:18 p.m. UTC
Nice work. I had seen something similar from Jim French (I believe it's frenchjim.com), but yours is good (there is little similarity except in expansion).
I had planned on expanding upon this concept (Maslow : network); I will build on your insights, reference them, and bounce ideas off of you. The content you have made is good work and important to consider when layering in automation frameworks and SDN/SD-WAN. (It will be on williamnellis.com.)