Doing things at scale is a hard proposition. There are inherent limits to what a system can do with a given set of resources, and when we tie multiple systems together to increase availability or performance, the number of factors contributing to complexity grows exponentially. However, the drive to software-define various parts of IT has given us hope on the scaling front. I will examine this trend, but first, let's look at the problem.
Part of the scaling issue comes from design decisions. Distributed systems have been the workhorse of enterprise IT for decades. They function in much the same way as the global Internet: if one system fails, the rest stay functional. That requires a great deal of decentralization, because each system needs enough information to keep operating even when everything else is taken away from it.
Decentralization, however, is a huge barrier to scaling. If every unit has a supervisor module or director program, then there are many brains in charge of one body, and a single system doesn't work well with multiple supervisors. Most redundant systems therefore use an active/passive design that shuts down all but one brain.
This allows the system to function in a decentralized manner but introduces complexity: protocols and rules are needed to ensure that the passive brains are not brought online unless the active brain has failed. And how do you scale a system that has only one active supervisor unit? The upper limit of your system is set by the amount of data that single unit can process.
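To make that trade-off concrete, here is a minimal sketch in Python of the heartbeat logic behind an active/passive pair. The names and timeout value are illustrative assumptions, not any vendor's implementation: the passive brain promotes itself only when the active brain goes silent, and every decision still funnels through whichever single node is active.

```python
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds of silence before failover (illustrative value)

class Supervisor:
    """One 'brain' in an active/passive pair."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active
        self.last_peer_heartbeat = time.monotonic()

    def receive_heartbeat(self):
        """Record that the active peer is still alive."""
        self.last_peer_heartbeat = time.monotonic()

    def check_peer(self):
        """A passive brain promotes itself only when the active brain goes silent."""
        silent_for = time.monotonic() - self.last_peer_heartbeat
        if not self.active and silent_for > HEARTBEAT_TIMEOUT:
            self.active = True
            print(f"{self.name}: active peer lost, taking over")

# Usage: the passive node calls check_peer() on a timer. Every decision still
# flows through whichever single node is active, which is exactly the ceiling
# described above.
standby = Supervisor("standby")
standby.check_peer()  # no failover yet; the peer has not been silent long enough
```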
Enter software. With software making the decisions, many of the artificial limitations imposed by hardware fall away. The software-defined (SD) movement in networking and storage has allowed for some interesting developments that promise to break down the obstacles to scaling.
For example, storage company Coho Data is using OpenFlow-capable switches from Arista to scale past the limitations of NFS. This allows it to present a single array to the hypervisor whose performance scales linearly. In this demo, Andrew Warfield, Coho's CTO, shows how the Coho system uses software to take care of complex I/O operations while presenting a single IP address to clients.
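As an illustration of the general idea rather than Coho's actual implementation, the sketch below shows how one virtual IP can front many storage nodes: each client flow is deterministically steered to a backend, so adding nodes adds capacity without changing what clients see. All addresses and names here are hypothetical.

```python
import hashlib

# Hypothetical addresses: clients only ever see the virtual IP.
VIRTUAL_IP = "10.0.0.100"                              # the one address clients mount
STORAGE_NODES = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]   # backends behind the switch

def pick_backend(client_ip: str) -> str:
    """Deterministically map a client flow to one backend node."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return STORAGE_NODES[int(digest, 16) % len(STORAGE_NODES)]

# Each client flow aimed at the virtual IP is steered to one node, so adding
# nodes adds capacity without changing the address the clients use.
for client in ("10.0.2.10", "10.0.2.11", "10.0.2.12"):
    print(f"{client} -> {VIRTUAL_IP} steered to {pick_backend(client)}")
```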
[Read Howard Marks' analysis of Coho Data's approach in "Coho Applies SDN to Scale-Out Storage."]
Networking is doing many of the same things today. VMware's NSX and Cisco's ACI platforms both attack the problem of abstracting complexity behind a policy engine. NSX seeks to make the network more agreeable to server administrators; ACI seeks to tie pieces of the network together and present a unified interface to those who work on the network. The key is policy deployment.
Engineers know what outcome they are trying to accomplish when they program a system. They may want quality of service applied to a given packet stream, or they may need a packet stream diverted to a secondary location before being sent on its way. These are complex tasks that break at scale; hardware intrusion detection systems (IDSes) and firewalls show what happens when packets are forced through choke points that can't handle the load.
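A rough sketch of what policy-driven intent might look like, using entirely hypothetical names rather than the NSX or ACI object models: the engineer declares the outcome, such as a priority marking or a detour through an inspection point, and a policy engine is assumed to translate that intent into device configuration.

```python
from dataclasses import dataclass

@dataclass
class QosPolicy:
    match: str   # a traffic class or application tag
    dscp: int    # priority marking to apply

@dataclass
class RedirectPolicy:
    match: str
    via: str     # a secondary hop, e.g. an IDS, before normal forwarding

# The engineer states the desired outcome...
policies = [
    QosPolicy(match="voice", dscp=46),
    RedirectPolicy(match="untrusted-web", via="ids-cluster"),
]

def render(policy) -> str:
    """Stand-in for the engine that pushes intent down to the devices."""
    if isinstance(policy, QosPolicy):
        return f"mark '{policy.match}' traffic with DSCP {policy.dscp}"
    return f"send '{policy.match}' traffic through {policy.via} before forwarding"

# ...and the software works out where and how to enforce it.
for p in policies:
    print(render(p))
```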
With software running the show, that complexity can be handled automatically in the background without engineer intervention. Networks can scale to include the necessary security services without breaking due to bad design.
Scaling is hard. If it were easy, everyone would be doing it already. Thanks to software, companies are examining new ways to make their products grow with customer needs instead of relying on forklift upgrades that only reach the next artificial limit. I think we are going to see explosive growth around the idea of starting small and building big, a piece at a time.