My first job in the storage market was working for a backup software manufacturer in the late '80s. We were one of the first companies to advocate backing up data across a network to an automated tape library. For a lot of reasons, it was very complicated and failed more often than it worked. Fast-forward almost 25 years, and the data protection process overall still doesn't work very well.
Why is this the case? Why is backup the job that nobody wants? Why do you even have to think about backups? Great questions. While backup software and the devices we back up to have made significant improvements in capabilities and ease of use, they still fail almost as often as they work.
Rapid file growth
The single biggest challenge to the backup process is not the size but the quantity of the data. The size, or total amount, of data that you have to protect is clearly an issue, but it's one we have dealt with for years. Addressing it means constantly upgrading network connections and deploying faster backup storage devices.
The bigger challenge is the number of files that need to be protected. We used to warn customers about servers that had millions of files; that is now commonplace. Now we warn customers about billions of files. Backing up these servers via the file-by-file copy method common in legacy backup systems is almost impossible. In many cases, it takes longer to walk the file system than it does to actually copy the files to the backup target.
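To make the walk problem concrete, here is a minimal sketch (not taken from any particular backup product) of the metadata scan a file-by-file backup has to complete before it copies a single byte. The root path and cutoff time are placeholders; with billions of files, the per-file stat() calls alone can consume most of the backup window.

```python
import os
import time

def scan_for_changes(root, last_backup_time):
    """Walk the tree and stat every file to find candidates for an
    incremental backup. Nothing is copied here -- this is the pure
    metadata traversal a legacy file-by-file backup must finish
    before any data moves."""
    changed = []
    files_examined = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)      # one metadata lookup per file
            except OSError:
                continue                # file vanished mid-scan
            files_examined += 1
            if st.st_mtime > last_backup_time:
                changed.append(path)
    return changed, files_examined

if __name__ == "__main__":
    start = time.time()
    # Placeholder values: point these at a real tree and the time of
    # the previous backup run.
    changed, examined = scan_for_changes("/data", time.time() - 24 * 3600)
    print(f"{examined} files examined, {len(changed)} changed, "
          f"scan took {time.time() - start:.1f}s")
```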
Rapid server growth
Virtualization of servers, networks, storage, and just about everything else brings significant flexibility to datacenter operations. It has also led to the creation of an "app" mentality among users and line-of-business owners. Everything is an app now, and that means yet another virtual machine created in the virtual infrastructure. The growth rate of VMs within an organization after a successful virtualization rollout is staggering.
All of these VMs, or at least most of them, must be protected. While most, if not all, backup solutions have resolved the issue of in-VM backups, few deal well with the massive growth in VM count. Often each VM needs to be its own job, which means managing and monitoring potentially hundreds of jobs.
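As a rough illustration of how the job count scales, the sketch below builds one backup job per VM using an entirely hypothetical inventory and job definition; real backup APIs differ, but the bookkeeping burden is the point: every job is something an administrator has to schedule, monitor, and troubleshoot.

```python
from dataclasses import dataclass

@dataclass
class BackupJob:
    vm_name: str
    schedule: str            # e.g. a cron expression
    last_status: str = "never run"

def build_jobs(vm_inventory):
    """One job per VM -- hypothetical, but it mirrors how many legacy
    backup tools end up being configured."""
    return [BackupJob(vm_name=vm, schedule="0 1 * * *") for vm in vm_inventory]

# A modest virtual infrastructure quickly turns into hundreds of jobs.
inventory = [f"app-vm-{i:03d}" for i in range(1, 401)]   # 400 VMs, for example
jobs = build_jobs(inventory)
print(f"{len(jobs)} jobs to schedule, monitor, and troubleshoot")
```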
Growth of user expectations
These realities are compounded by the fact that user expectations are at an all-time high. Users now interact with online services that never seem to go down, and they expect the same from their IT. In other words, recovery has to be instant -- or at least fast. Even the time it takes to copy data back from the backup server may be too long, especially when there are billions of files to manage.
The fix may be better primary storage
The fix for all this may be to make primary storage more accountable for its own protection. Clearly it already does that to some extent, providing protection from drive and controller failure. But given all the above challenges, it also needs to provide longer-term, point-in-time data protection, so that if an application corrupts its data you can instantly roll back to a copy made an hour earlier.
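One way primary storage can deliver that kind of point-in-time protection today is through snapshots. The sketch below is purely an example and not tied to any product discussed here: it drives ZFS from Python to take hourly snapshots of a dataset and roll back after corruption. The dataset name tank/appdata and the retention count are assumptions; substitute your own.

```python
import subprocess
import time

DATASET = "tank/appdata"   # assumed ZFS dataset name -- substitute your own
KEEP = 24                  # keep roughly a day of hourly snapshots

def take_snapshot():
    """Create a point-in-time, copy-on-write snapshot of the dataset."""
    name = f"{DATASET}@hourly-{time.strftime('%Y%m%d-%H%M')}"
    subprocess.run(["zfs", "snapshot", name], check=True)
    return name

def list_snapshots():
    """List this dataset's hourly snapshots, oldest first."""
    out = subprocess.run(
        ["zfs", "list", "-t", "snapshot", "-H", "-o", "name", "-s", "creation"],
        check=True, capture_output=True, text=True).stdout
    return [s for s in out.splitlines() if s.startswith(DATASET + "@hourly-")]

def prune_snapshots():
    """Destroy the oldest snapshots beyond the retention window."""
    for name in list_snapshots()[:-KEEP]:
        subprocess.run(["zfs", "destroy", name], check=True)

def rollback(snapshot_name):
    """Return the dataset to the chosen point in time almost instantly.
    The -r flag discards any snapshots taken after it."""
    subprocess.run(["zfs", "rollback", "-r", snapshot_name], check=True)

if __name__ == "__main__":
    take_snapshot()
    prune_snapshots()
    # Example recovery: rollback(f"{DATASET}@hourly-20240101-0100")
```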
At the same time, data protection needs to change. We've seen intelligence added to systems so they can incrementally and rapidly back up large file stores. We've also seen instant-recovery products that allow a VM to run directly from a backup. But there are challenges with instant recovery that still need to be addressed, such as how well an instantly recovered VM will perform when running from a disk backup device, and how that VM will be migrated back into production.
I'll dive more into some of the potential solutions, such as better protected primary storage and smart data protection, in my next couple of columns.
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on storage and virtualization systems. He writes InformationWeek's storage blog and is a regular contributor to SearchStorage, eWeek, and other publications.