Ever since I started in networking, it’s been beaten into my head that you can’t touch the production network during the day. The reasons for that mindset were numerous and mostly centered on the possibility of misconfiguration or of the network reacting in an unexpected fashion. While this practice is common with most data center infrastructure, it’s not unique to infrastructure teams. Application delivery teams have historically followed these same guidelines.
However, in the past few years we’ve seen a steady rise in the practice of continuous deployment in the application space. These practices allow developers to make many small changes to production systems at any time rather than waiting until after hours to make much larger bulk changes. It’s my belief that infrastructure teams, particularly network teams, can and should start learning from this methodology and adopt similar practices.
When you examine how different application teams accomplish continuous deployment, there’s one similarity across the board: The methodology is reliant on specific toolsets and is heavily automated. Up until very recently, networking has had little focus in either of these areas. In other words, we’re missing some of the prerequisites needed to make this model feasible in the network space. If we’re going to seriously consider the possibility of continuous deployment in networking, we must first ensure we have an operational model that lends itself to this type of methodology. In many cases, this will require significant changes to how the network is built and operated.
The first thing that has to change is the means by which we configure the network. The practice of configuring devices individually by copying configurations into their terminals has to stop. Adopting a practice of network automation to manage the configuration of many devices is an easy first step in this direction. Configuration management tools like Puppet, Chef, and Ansible have recently started gaining traction in the network space. The use of a configuration manager provides a central point of control and deployment for a large set of devices. Many of these tools also provide the ability to template configurations, allowing the base configurations to be more generic while still providing the ability to insert device-specific information.
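To make the templating idea concrete, here is a minimal sketch in Python using only the standard library's `string.Template`. It is not how Puppet, Chef, or Ansible work internally; it only illustrates the split between a generic base configuration and device-specific values. The hostnames, interface names, NTP address, and configuration syntax are all hypothetical.

```python
from string import Template
from typing import Dict

# Hypothetical generic base configuration shared by every device.
# The command syntax is illustrative, not tied to a specific vendor.
BASE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface $uplink\n"
    " description uplink to $upstream\n"
    "ntp server $ntp_server\n"
)

# Hypothetical values that apply to the whole fleet.
SHARED_VALUES = {"ntp_server": "192.0.2.10"}

# Hypothetical device-specific values; in practice these would come
# from an inventory system rather than a literal dictionary.
DEVICE_VALUES = {
    "sw-access-01": {"uplink": "Ethernet1", "upstream": "sw-core-01"},
    "sw-access-02": {"uplink": "Ethernet2", "upstream": "sw-core-02"},
}


def render_config(hostname: str) -> str:
    """Fill the generic template with one device's specific values."""
    values = {"hostname": hostname, **SHARED_VALUES, **DEVICE_VALUES[hostname]}
    # substitute() raises KeyError if the template needs a value we lack,
    # which catches incomplete inventory data before anything is deployed.
    return BASE_TEMPLATE.substitute(values)


def render_all() -> Dict[str, str]:
    """Render a configuration for every device from one central place."""
    return {name: render_config(name) for name in DEVICE_VALUES}


if __name__ == "__main__":
    for name, config in render_all().items():
        print(f"--- {name} ---")
        print(config)
```

Changing the NTP server for the whole fleet then becomes a one-line edit to the shared values rather than a login to every device.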
While a configuration manager is a powerful tool on its own, it becomes much more powerful when coupled with a version-control platform such as Git. We need to start treating our configuration files like source code. Any changes to device configuration files or templates need to be properly recorded in the version control system. While there are many reasons why this is ideal, the chief reason is collaboration: Version control systems are inherently designed to support many active users working on the same code base. Not only does this promote the concept of collaboration, it provides other hidden benefits such as accounting and statistics, which can be helpful during problem isolation.
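As a rough illustration of what treating configuration like source code buys you, the sketch below uses Python's standard `difflib` module to produce the kind of line-by-line change record that a version control system such as Git would store and display. Git computes its own diffs; this is only a stand-in to show the idea. The file name and configuration lines are hypothetical.

```python
import difflib


def config_diff(old: str, new: str, name: str) -> str:
    """Return a unified diff between two versions of a configuration."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
        lineterm="",  # inputs have no trailing newlines, so add none
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical before and after versions of one device's configuration.
    before = "hostname sw-access-01\nntp server 192.0.2.10\n"
    after = "hostname sw-access-01\nntp server 192.0.2.20\n"
    print(config_diff(before, after, "sw-access-01.cfg"))
```

A record like this, attached to an author and a timestamp, is what makes it possible to ask who changed what and when during problem isolation.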
Another area we can improve upon is communication. While we have well-defined processes to document network changes, they are typically non-interactive and viewed as more of a formality. Alert notifications often fall into a similar category, taking too long to reach the right team to be acted upon. We need to put renewed focus on communication and find ways to make it more consumable. Collaboration through group-chat systems has become a popular way for teams to interact with each other. Imagine if that same group-chat system were tied into configuration and alert management systems: Everything from alerts to notifications of when people enter configuration mode on a device would be instantly consumable.
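One way to picture that integration is a small translation layer that turns network events into chat messages. The sketch below is an assumption-laden illustration: the `NetworkEvent` fields, the event kinds, the `#network-ops` channel name, and the JSON payload shape are all hypothetical and do not correspond to any particular chat product's API. It only builds the message and leaves delivery to whatever chat system is in use.

```python
import json
from dataclasses import dataclass


@dataclass
class NetworkEvent:
    """Hypothetical event emitted by an alerting or configuration system."""
    kind: str    # e.g. "alert" or "config_session" (illustrative values)
    device: str
    user: str    # empty when no person is involved, as with most alerts
    detail: str


def to_chat_message(event: NetworkEvent) -> str:
    """Turn an event into a short line a team can read at a glance."""
    if event.kind == "config_session":
        return (
            f"[config] {event.user} entered configuration mode "
            f"on {event.device}: {event.detail}"
        )
    if event.kind == "alert":
        return f"[alert] {event.device}: {event.detail}"
    # Unknown kinds are still posted rather than silently dropped.
    return f"[{event.kind}] {event.device}: {event.detail}"


def to_payload(event: NetworkEvent, channel: str = "#network-ops") -> str:
    """Build a JSON body; the payload shape here is an assumption."""
    return json.dumps({"channel": channel, "text": to_chat_message(event)})


if __name__ == "__main__":
    event = NetworkEvent(
        kind="config_session",
        device="sw-core-01",
        user="alice",
        detail="change window for uplink migration",
    )
    print(to_payload(event))
```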
While improved network operations provide a solid foundation, I’ll be careful to point out that this doesn’t represent a checklist for continuous deployment of network configuration. I believe automation, version control, and improved communications are prerequisites for this to work, but every network is different. Just as not all applications benefit from continuous deployment, not all networks will.
That said, this is mostly uncharted territory. We may find that there’s very little appetite for network changes during peak hours. On the other hand, with the right deployment pipeline, network configuration may become more of an on-demand model. In either case, my hope is that by improving network operations, we may start finding value in at least some types of changes being executed in a more continuous manner.