The premiere source of truth powering network automation. Open and extensible, trusted by thousands.

NetBox is now available as a managed cloud solution! Stop worrying about your tooling and get back to building networks.

Cumulus Linux: First Impressions

By stretch | Wednesday, October 1, 2014 at 2:05 a.m. UTC

Typically, when you buy a network router or switch, it comes bundled with some version of the manufacturer's operating system. Cisco routers come with IOS (or some derivative), Juniper routers come with Junos, and so on. But with the recent proliferation of merchant silicon, there seem to be fewer and fewer differences between competing devices under the hood. For instance, the Juniper QFX3500, the Cisco Nexus 3064, and the Arista 7050S are all powered by an off-the-shelf Broadcom chipset rather than custom ASICs developed in-house. Among such similar hardware platforms, the remaining differentiator is the software.

One company looking to benefit from this trend is Cumulus Networks. Cumulus does not produce or sell hardware, only a network operating system: Cumulus Linux. The Debian-based OS is built to run on whitebox hardware you can purchase from a number of partner Original Device Manufacturers (ODMs). (Their hardware compatability list includes a number of 10GE and 40GE switch models from different vendors.)

Cumulus Linux is, as the name implies, Linux. There is no "front end" CLI as on, for example, Arista platforms. Upon login you are presented with a Bash terminal and all the standard Linux utilities (plus a number of not-so-standard bits). Like any OS, Cumulus handles interactions with the underlying hardware and among processes.

cl_architecture.png

The Platform

As you would expect, the software is licensed. (However, the combined cost of the Cumulus software coupled with whitebox hardware is remarkably competetive.) A license is required to activate the front-panel interfaces, which are presented to user space as swp* interfaces, starting with swp1. These appear as Ethernet interfaces normally do on Linux, visible with ip link show. (ifconfig has long been deprecated by the community in favor of the iproute2 family of tools.)

cumulus@leaf1$ ip link show
1: lo:  mtu 16436 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0:  mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 00:e0:ec:25:7d:43 brd ff:ff:ff:ff:ff:ff
3: swp1:  mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT qlen 500
    link/ether 00:e0:ec:25:7d:45 brd ff:ff:ff:ff:ff:ff
4: swp2:  mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT qlen 500
    link/ether 00:e0:ec:25:7d:45 brd ff:ff:ff:ff:ff:ff
5: swp3:  mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 500
    link/ether 00:e0:ec:25:7d:46 brd ff:ff:ff:ff:ff:ff
...

Interfaces are configured the same as most other Linux platforms as well. For example, configuring an LACP port-channel is done with the following bit of config in /etc/network/interfaces:

auto swp1
iface swp1

auto swp2
iface swp2

auto bond0
iface bond0
  bond-slaves swp1 swp2
  bond-mode 802.3ad
  bond-miimon 100
  bond-use-carrier 1
  bond-lacp-rate 1
  bond-min-links 1
  bond-xmit-hash-policy layer3+4

The interface is then brought up manually with sudo ifup bond0. Its status can be read from /proc/net/bonding/bond0:

cumulus@leaf1$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
cumulus@leaf1$ cat /proc/net/bonding/bond0 | less
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 1
Aggregator selection policy (ad_select): stable
System Identification: 65535 00:e0:ec:25:7d:45
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 17
        Partner Key: 17
        Partner Mac Address: 00:e0:ec:25:7d:af

Slave Interface: swp2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: 00:e0:ec:25:7d:45
Aggregator ID: 1
Slave queue ID: 0

Slave Interface: swp1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:e0:ec:25:7d:44
Aggregator ID: 1
Slave queue ID: 0

Similarly, we can configure switch ports as part of a bridge group (analogous to a VLAN or SVI interface):

auto swp5
iface swp5

auto swp6
iface swp6

auto vlan100
iface vlan100
  bridge-ports swp5 swp6
  bridge-stp on 

Again, we then turn up the interface. brctl displays the status of all bridge interfaces.

cumulus@leaf1$ sudo ifup vlan100
cumulus@leaf1$ brctl show
bridge name     bridge id               STP enabled     interfaces
vlan100         8000.00e0ec257d48       yes             swp5
                                                        swp6

We can use ethtool to display layer 1 and layer 2 link information:

cumulus@leaf1$ sudo ethtool swp1
Settings for swp1:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Advertised link modes:  100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Auto-negotiation: on
        Current message level: 0x00000000 (0)

        Link detected: yes

Yet another utility, mstpctl, displays STP information:

cumulus@leaf1$ mstpctl showbridge
vlan100 CIST info
  enabled         yes
  bridge id       8.000.00:E0:EC:25:7D:48
  designated root 8.000.00:E0:EC:25:7D:48
  regional root   8.000.00:E0:EC:25:7D:48
  root port       none
  path cost     0          internal path cost   0
  max age       20         bridge max age       20
  forward delay 15         bridge forward delay 15
  tx hold count 6          max hops             20
  hello time    2          ageing time          300
  force protocol version     rstp
  time since topology change 137s
  topology change count      0
  topology change            no
  topology change port       None
  last topology change port  None

How about LLDP? Yep, there's a utility for that too:

cumulus@leaf1$ sudo lldpcli show neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface:    eth0, via: LLDP, RID: 1, Time: 0 day, 06:09:36
  Chassis:     
    ChassisID:    mac 08:9e:01:ce:dc:22
    SysName:      colo-mgmt-switch
    SysDescr:     Cumulus Linux
    MgmtIP:       100.64.3.2
    Capability:   Bridge, on
    Capability:   Router, on
  Port:        
    PortID:       mac 08:9e:01:ce:dc:34
    PortDescr:    swp18
-------------------------------------------------------------------------------
Interface:    swp2, via: LLDP, RID: 3, Time: 0 day, 05:39:46
  Chassis:     
    ChassisID:    mac 00:e0:ec:25:7d:ad
    SysName:      leaf2
    SysDescr:     Cumulus Linux
    MgmtIP:       192.168.0.12
    Capability:   Bridge, off
    Capability:   Router, on
  Port:        
    PortID:       ifname swp2
    PortDescr:    swp2
...

None of this should look unfamiliar to anyone who spent a fair amount of time configuring networking on Linux, as none of the commands above are specific to Cumulus (although Cumulus does come packaged with a good number of custom tools).

Dynamic routing is all handled by Quagga, which offers a modal (Cisco IOS-like) configuration interface through vtysh. In the interest of automatability, Cumulus also includes their own non-modal (Juniper "set" style) configuration utility. For example, the two configurations below are identical in effect:

quagga(config)# router bgp 65002
quagga(config-router)# redistribute  static
cumulus@switch:~$ sudo cl-bgp as 65002 redistribute add static

The routing table is, of course, displayed with ip route show:

cumulus@leaf1$ ip route show
default via 192.168.0.1 dev eth0 
192.168.0.0/24 dev eth0  proto kernel  scope link  src 192.168.0.11 

Linux? In my Switch?

I'm not going to lie: Cumulus Linux was not immediately appealing to me. Network engineers have never been quite at peace with the idea of using Linux for traffic forwarding, even if it is running under the hood of more network devices than we care to admit. Cumulus is unapologetically Linux, and makes no effort to replicate something comfortable for the experienced Cisco or Juniper CLI jockey. Unlike most network platforms, it doesn't have a portable, atomic configuration file, and its command line experience is probably never going to be as smooth as IOS or Junos. But that's okay. Cumulus is focused on the future: network automation and orchestration.

Most customers who deploy Cumulus today rarely, if ever, configure a switch by hand. Instead, they rely on tools like Ansible, Chef, or Puppet to automate configuration tasks for them. Practically speaking, this is the only way to effectively manage a data center with hundreds of racks of gear, and that's where Cumulus has a distinct edge.

When you order a whitebox switch from an ODM, it probably won't come with Cumulus installed. It will have only a small bootloader named ONIE. The Open Network Install Environment is essentially GRUB and busybox: It provides just enough functionality to download, install, and boot a "real" operating system like Cumulus. (Note that ONIE is not specific to Cumulus: It can theoretically be used to download whatever OS you choose.) This is analogous to how PXE booting is used to automatically provision servers.

Cumulus supports optional zero-touch provisioning. I haven't played with it yet, but the idea is that you can precreate a configuration script for every device to be deployed and keep them on a file server somewhere inside your network. When a switch boots up, it downloads and runs the instructed script. With a bit of planning, it's entirely feasible to deploy switches straight out of the box into production without any interaction with the local command line, and that's pretty attractive prospect.

Cumulus is still a relatively young platform, and sure to be full of bumps and rough edges, but it holds quite a bit of promise, especially positioned as a cost-effective underlay platform for NSX or Contrail or whatever comes out next week. I'm hoping to write up some more posts on Cumulus once I've gotten to spend more time with it.

Posted in Data Center, Hardware

Comments


Jeremy Duncan
October 1, 2014 at 2:22 p.m. UTC

Cumulus Networks has quite a few automated labs you can follow along in their workbench. https://support.cumulusnetworks.com/hc/en-us/articles/201787566-Setting-up-a-Basic-Puppet-Lab


Dee
October 1, 2014 at 3:49 p.m. UTC

Pretty good write up.

I would love to see this automation actually working on a network


Gabriel
October 1, 2014 at 9:22 p.m. UTC

The front end Arista offers presents some additional (and quite useful) info compared to the vanilla Linux. For instance, regarding LACP:

A1(config)#sh port-channel 3
Port Channel Port-Channel3:
  Active Ports: Ethernet13 PeerEthernet13
  Configured, but inactive ports:
       Port             Reason unconfigured
    ------------------- -----------------------------------------
       Ethernet3        Link down while waiting for LACP response
       PeerEthernet3    Link down while waiting for LACP response

I wasn't able to find details about the state of the LACP negotiation in /proc/net/bonding/bond0 or elsewhere.


noobie
October 3, 2014 at 11:32 a.m. UTC

is this bash shell susceptible to shell shock?


Kamlesh Shah
October 3, 2014 at 4:46 p.m. UTC

This is a good write up! The discussion on value for Linux based OS is very interesting as well. For instance, it would be good to understand differentiation Cumulus Linux has from any other Linux for a network operations team. Especially given network operations team know and love their tools, how would this change or impact the tooling etc. In the end it is about dollars and cents.


Ron
October 3, 2014 at 8:06 p.m. UTC

I tried to find pricing, but every vendor tells me to contact them first.

I would be interested in PoE switches running on Linux, so I can put our VoIP product on them.


klr
October 6, 2014 at 3:38 p.m. UTC

Can you lab this on a normal server or even esx?


James
March 27, 2015 at 12:01 a.m. UTC

There another product that's been quietly in the market doing this very thing. Check out Mikrotik.com

http://mikrotik.com

It runs on off the shelf hardware or they have their own. It's robust and support almost all of the modern routing protocols.

I've been using them for years now.


CCNA
July 1, 2015 at 3:34 p.m. UTC

There are carrier equipment vendors that are using solely Linux in their boxes. Check Extreme Networks for example.

Comments have closed for this article due to its age.