Over my career as a network engineer and architect at several organizations, I've inherited a number of networks. My pattern has been to start a job, get acclimated, and then crawl the network by hand so that I learn how it's actually been configured. My process is to start in the core and work my way out, documenting important layer two and layer three connections as I go.
Along the way, I inventory device configurations, analyzing how well they've been secured and the overall consistency of configuration from device to device. And then I usually have a bit of a cry. For whatever reason, it seems that many organizations struggle with enforcing a standard configuration across the board. Sometimes the issue is the lack of an overall standard. In other cases, there is no audit process to enforce standards compliance.
There are many tools on the market to help with device standards and compliance. The larger the organization, the more likely you've deployed one or more of them. But what if you're not ready to go with a tool like that, which is probably expensive and difficult to implement? What if you don't even have a standard for network device configuration yet? How do you get started? While the most appropriate standard will vary from one organization to the next, there are several features that I like to bring to any network I'm responsible for. Let's start with SNMP and logging.
SNMP
Most network devices are managed via Simple Network Management Protocol (SNMP) these days. Network management systems at all price points rely on SNMP to poll statistics, using the gathered information to create alarms, generate reports, graph trends, and display statuses. Here are my recommendations for configuring SNMP.
1. Use SNMPv3. While SNMPv2c is easier to configure, the data is sent across the network unencrypted, including the community string (effectively a password) required to gain access to the device's SNMP engine. The data shared in SNMP is potentially quite interesting to someone sniffing it off the wire, something certain kinds of malware can do. SNMPv3 can authenticate users and encrypt the data going between the network device and the SNMP manager, provided you configure it for both authentication and privacy.
2. Consider making views. SNMP views are limited parts of the SNMP database that authorized users can access. Think of this like limiting access to the folders of a server by assigning specific permissions to a user or group. While it is more work to maintain, there's merit in giving specific users access only to the specific SNMP object identifiers that they require.
3. Limit the hosts allowed to query SNMP. All too often, I've seen devices configured so that anyone, anywhere who knows the community strings can access the SNMP engine of a device. The best practice is to limit which hosts are allowed to access the SNMP engine, typically with an access list that names only your management stations. SNMP has a lot of valuable and potentially sensitive data about your network to offer someone with malicious intent; that data should be carefully protected.
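If you don't have a compliance tool yet, the three recommendations above can be spot-checked with a few lines of script run against saved device configurations. What follows is only a rough sketch in Python, assuming Cisco IOS-style "snmp-server community" and "snmp-server group" lines. The function names and the wording of the findings are my own inventions for illustration, and other platforms will need different parsing.

```python
def community_has_acl(line: str) -> bool:
    """True if an IOS-style 'snmp-server community' line references an ACL."""
    tokens = line.split()[3:]  # skip 'snmp-server community <string>'
    if "view" in tokens:       # drop the optional 'view <name>' pair
        i = tokens.index("view")
        del tokens[i:i + 2]
    # Whatever remains after removing ro/rw is treated as an ACL reference.
    return any(t.lower() not in ("ro", "rw") for t in tokens)


def audit_snmp(config: str) -> list[str]:
    """Return a list of findings for one device's configuration text."""
    findings = []
    lines = [line.strip() for line in config.splitlines()]

    communities = [l for l in lines if l.startswith("snmp-server community ")]
    if communities:
        findings.append("v1/v2c community configured (sent in clear text)")
    for line in communities:
        if not community_has_acl(line):
            findings.append("community string with no ACL restricting hosts")

    groups = [l.split() for l in lines if l.startswith("snmp-server group ")]
    v3_groups = [g for g in groups if "v3" in g]
    if not any("priv" in g for g in v3_groups):
        findings.append("no SNMPv3 group configured with priv (encryption)")
    for g in v3_groups:
        if "access" not in g:
            findings.append(f"SNMPv3 group {g[2]} has no access ACL")
    return findings


if __name__ == "__main__":
    sample = "snmp-server community public RO\nsnmp-server group NMS v3 auth\n"
    for finding in audit_snmp(sample):
        print(finding)
```

A check for views is left out on purpose, since whether to use them is a judgment call. The point is less the script itself than having something repeatable to run against every device, so that exceptions to the standard stand out.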
Logging
When it comes to logging, I see the same problems over and over again: insufficient local logging, inconsistent remote logging, and logging to servers that no longer exist. Network device logging is absolutely critical both when troubleshooting and when performing forensic analysis of network outages or security break-ins. The time to discover you haven't got the logs you need is not when you actually need them.
1. Make sure the local logging buffer is used effectively. When troubleshooting an issue, the first place network engineers will look is on the local device itself. Different network devices handle their logging in different ways, depending on their underlying operating system. For instance, the logging process on a Juniper Junos-based device is distinct from that of a Cisco device running IOS. However the device in question performs local logging, the key is to make sure that there's enough local buffer to be useful for initial troubleshooting. For example, on Cisco IOS, the default buffer is only 4K. I usually bump this up to 128K, which typically provides several days' worth of logs.
2. Ensure the network device is logging to a remote server, or even servers if you have a redundant logging design. Also, make sure that all devices are logging to the same server(s). Many times, I've run into network devices that don't log off-box at all. If the box itself is inaccessible, you want to be able to access whatever logs it had to offer before it fell off the network. In addition, you want to be sure that logs are being sent to a log server that's actually still there. It's not uncommon for log server infrastructure to be replaced by the server team without the network team being in the loop. The network team then doesn't realize there's an issue until it goes looking for log data that was shipped into the darkness.
3. Log the important stuff. Juniper firewalls don't log traffic flows by default. Cisco routers don't log SSH connection attempts by default. Not all platforms log routing adjacency changes consistently. There's no one-size-fits-all answer here, but the big idea is that enabling logging as a feature does not enable the logging of all possible events. The network engineer needs to spend the time to ensure the different types of important events are logged.
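The first two logging checks lend themselves to the same kind of scripted audit. Here is a minimal sketch in Python, again assuming Cisco IOS-style "logging buffered" and "logging host" lines and IPv4 log servers only. The function names, the 128K threshold constant, and the device names and addresses in the example are all hypothetical. Checking that the important events are logged is too platform-specific to generalize, so it is not attempted here.

```python
import re

MIN_BUFFER_BYTES = 128 * 1024  # the 128K figure mentioned above


def buffer_size(config: str):
    """Return the explicitly configured buffer size in bytes, or None."""
    for line in config.splitlines():
        tokens = line.split()
        if tokens[:2] == ["logging", "buffered"]:
            # Assumption: any integer of 4096 or more is a size, not a severity.
            sizes = [int(t) for t in tokens[2:] if t.isdigit() and int(t) >= 4096]
            if sizes:
                return sizes[0]
    return None


def log_servers(config: str) -> set:
    """Return the IPv4 syslog destinations found in the configuration."""
    pattern = re.compile(r"\s*logging (?:host )?(\d{1,3}(?:\.\d{1,3}){3})\b")
    matches = (pattern.match(line) for line in config.splitlines())
    return {m.group(1) for m in matches if m}


def audit_logging(configs: dict, expected_servers: set) -> dict:
    """Map each non-compliant device name to a list of problems."""
    findings = {}
    for device, config in configs.items():
        problems = []
        size = buffer_size(config)
        if size is None or size < MIN_BUFFER_BYTES:
            problems.append(f"local buffer below {MIN_BUFFER_BYTES} bytes")
        servers = log_servers(config)
        if not servers:
            problems.append("no remote log server")
        else:
            missing = expected_servers - servers
            unexpected = servers - expected_servers
            if missing:
                problems.append("missing servers: " + ", ".join(sorted(missing)))
            if unexpected:
                problems.append("unexpected servers: " + ", ".join(sorted(unexpected)))
        if problems:
            findings[device] = problems
    return findings


if __name__ == "__main__":
    configs = {
        "core1": "logging buffered 131072\nlogging host 10.0.0.5\n",
        "edge1": "logging buffered 4096\nlogging 10.9.9.9\n",
    }
    for device, problems in audit_logging(configs, {"10.0.0.5"}).items():
        print(device, problems)
```

Comparing every device against one expected set of servers is what catches the log server that was quietly replaced. A script like this cannot tell you whether the server is actually receiving logs, so it's still worth confirming on the server side.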
Next steps
With a good handle on monitoring your network devices and the events they generate, a good next step is securing remote access to them. I've found that network device authentication, authorization, and accounting (AAA) are often under-addressed. I frequently see a single, generic local account with administrative privileges shared across all devices in the network.
This approach has the benefit of convenience, but isn't particularly secure. Anyone who knows the account's credentials can potentially log in to every device that shares them. In addition, individual users are hidden behind the generic account ID, making it easier for someone with malicious intent to cover their tracks. In the next post, I'll discuss a practical strategy for implementing AAA.