Automating network configuration and management has many clear benefits, from simplifying and scaling repetitive, error-prone tasks to increasing the stability of networks through consistency and automatic testing. However, there can still be organizational, political, and cultural challenges to overcome. There are four pillars often discussed in the DevOps movement and referred to as CAMS: culture, automation, measurement, and sharing. Culture is first, not just because it makes a great acronym; it's fundamental to success.
We see a variety of cultural and political challenges to network automation. They often parallel what happened in the server world a few years ago and can be summarized as fear and lack of trust. To be fair, the stakes are high in the network. It's costly and complicated to build highly redundant networks and, even then, certain configuration errors can propagate quickly and defeat the redundancy of the entire design. Moreover, while misconfiguration of a server might be isolated to a single application or several VMs, a network misconfiguration could affect 30 to 40 servers, which could mean hundreds or more VMs and containers.
Executive decisions to implement automation may not be received well if leadership does not communicate the business drivers or understand that automation doesn’t happen overnight. Automation is a process that requires time and other resources and must be supported and nurtured along the way. Engineers can become frustrated and resistant when told to change how they work without understanding the reasoning behind the new process or tool selection. Involving the affected team early in the process and providing sufficient education go a long way. The team may even know of additional solutions. Leadership also has to be willing to address roadblocks along the journey. Simply telling people to change without providing sufficient resources and protecting them from conflicting influences throughout the process is a recipe for disaster.
In other cases, a network team may want to take advantage of automation and needs consent and support from leadership to move forward. To get executive buy-in, engineers need to translate their world into business terms. Demonstrate the initiative to get a proof-of-concept working, then present it in terms of benefit to the business as a whole. Assess frequent, repetitive activities, such as provisioning new host-facing top-of-rack ports, deploying new VLANs, or updating access lists for new applications, then understand your workflow. Look at where you can use APIs to reduce repetitive entry of data. Measure it. How long did it take you manually? How much more can you accomplish with automation? What could you get to that has been on your to-do list for far too long?
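The "measure it" step can be as simple as a few lines of arithmetic. The sketch below is one possible way to total the hours saved per month; the task names, timings, and request counts are hypothetical placeholders, not measurements from any real network, and you would substitute your own.

```python
from dataclasses import dataclass


@dataclass
class TaskMeasurement:
    """Timing for one repetitive activity (all values are illustrative)."""
    name: str
    manual_minutes: float      # time per request when done by hand
    automated_minutes: float   # time per request with automation
    requests_per_month: int

    def hours_saved_per_month(self) -> float:
        saved = (self.manual_minutes - self.automated_minutes) * self.requests_per_month
        return saved / 60


# Hypothetical numbers; replace them with what you actually measured.
tasks = [
    TaskMeasurement("provision top-of-rack port", 20, 2, 40),
    TaskMeasurement("deploy new VLAN", 30, 3, 10),
]

total_hours_saved = sum(t.hours_saved_per_month() for t in tasks)
for t in tasks:
    print(f"{t.name}: {t.hours_saved_per_month():.1f} hours saved per month")
print(f"total: {total_hours_saved:.1f} hours saved per month")
```

A total like this is a starting point for the business conversation, since hours saved can then be restated as faster delivery of requests.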
This is the real goal: condense the results to terms that matter to your audience. This is also the most challenging part for many engineers, but you have to translate your technical information into business impact. Reduced errors and time to complete items are good, but take it up a notch. The faster you can complete a request, the more rapidly the business can deploy a new application to respond to new opportunities in the market. Reducing errors, and thus outages, may increase customer satisfaction and, for an external-facing system, allow the company to increase revenue.
This is the type of information that helps executives assess the value of a solution compared to the cost. They are generally less interested in how cool the tech is that you used to make it happen. Your peers, however, are more likely to want to discuss implementation details. Be sure to share your successes as early and broadly as you can within your team and organization. That gets others on board and builds momentum.
Education and long-term support are also required. That support includes leadership at all levels to help adapt or eliminate change review boards. Network misconfigurations, which can cause outages, have driven political solutions like change review boards, which can stymie automation efforts. For example, when my team helped a customer automate some processes, a routine change took four to six weeks to get approved, and that was only one small part of a nine-month process to deploy a new application. In that case, we discovered that the actual network work was already automated and demonstrated how the automation itself was a documented process that could be approved once and reused as an ITIL Standard Change.
When we treat the “code” that defines the network like any other code in a software development environment, with automated testing, peer code review, and revision control, we are able to build even greater levels of trust in automated deployment. Automation is able to deploy the same change in the same way as many times and in as many places as necessary with exactly the same result. Nothing gets changed or skipped between test and production environments or when updating the 30th or 200th device. Once the automation has been technically reviewed and approved, if a change review is still required, it can focus on higher-level goals, such as adding a new site or rack, instead of the details of the actual configuration commands.
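As a minimal sketch of what treating network configuration as testable code might look like, the example below renders a VLAN stanza from a template and checks it with two small tests. The template text and the `render_vlan` helper are hypothetical and vendor-neutral; real configuration syntax depends on the platform, and a real project would more likely use a test runner than bare function calls.

```python
from string import Template

# Hypothetical, vendor-neutral template; real syntax depends on the platform.
VLAN_TEMPLATE = Template("vlan $vlan_id\n name $vlan_name\n")


def render_vlan(vlan_id: int, vlan_name: str) -> str:
    """Render the VLAN stanza, rejecting IDs outside the 802.1Q range."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID {vlan_id} is outside 1-4094")
    return VLAN_TEMPLATE.substitute(vlan_id=vlan_id, vlan_name=vlan_name)


def test_same_input_same_output() -> None:
    # Rendering for the 1st and the 200th device yields identical text.
    rendered = {render_vlan(110, "app-web") for _ in range(200)}
    assert rendered == {"vlan 110\n name app-web\n"}


def test_invalid_vlan_rejected() -> None:
    try:
        render_vlan(5000, "bad")
    except ValueError:
        return
    raise AssertionError("expected ValueError for an out-of-range VLAN ID")


test_same_input_same_output()
test_invalid_vlan_rejected()
```

Because the tests and the template live in revision control together, a reviewer can approve the logic once instead of rereading every generated command.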
Once automation is implemented, it's important to encourage team members not to slip back into manual ways. Be especially vigilant about manual changes made while troubleshooting; otherwise, there could be an unpleasant surprise the next time the automation touches that same setting. The other side of long-term support is technical. Just like any other software, automation needs to be maintained and updated. Time must be allocated regularly for maintenance.
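One way to catch manual changes before they cause a surprise is to compare the configuration the automation intends with what the device is actually running. The sketch below uses Python's standard `difflib` for that comparison; the `config_drift` helper and the sample configuration text are hypothetical, and retrieving the running configuration from a device is assumed to happen elsewhere.

```python
import difflib
from typing import List


def config_drift(intended: str, running: str) -> List[str]:
    """Return unified-diff lines; an empty list means no drift was found."""
    diff = difflib.unified_diff(
        intended.splitlines(),
        running.splitlines(),
        fromfile="intended",
        tofile="running",
        lineterm="",
    )
    return list(diff)


# Hypothetical example: someone renamed the VLAN by hand while troubleshooting.
intended_config = "vlan 110\nname app-web\n"
running_config = "vlan 110\nname app-web-debug\n"

drift = config_drift(intended_config, running_config)
for line in drift:
    print(line)
```

Run on a schedule, a check like this turns a forgotten troubleshooting change into a visible diff that the team can either revert or fold back into the automation.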
Take time to understand how your message will be received when sharing your automation vision. A little empathy and knowing how to translate your message into terms that resonate with your audience have a significant impact on getting everyone on board successfully. You will likely encounter some resistance, but don’t be discouraged. Demonstrate successes, even small ones, and be a champion when sharing.