Ask any IT administrator -- storage can be a headache. It's expensive, troublesome, and it has a shelf life of roughly three years before the hardware needs to be replaced. While other elements of data center infrastructure, such as networking and compute, have evolved to the point where the latest innovations solve many of the management issues inherent in large-scale deployment, the common problems storage systems create still remain.
Flash, for example, has established itself as the fastest -- and most expensive -- tier in the data center. With every major vendor pitching a flash vision, it's not surprising that flash is seen as the answer to most data storage problems. However, flash is a major investment for most companies, and its high price makes an all-flash data center an unrealistic option for any environment at scale.
The real issue IT pros struggle with when it comes to budgeting for flash storage isn’t just coming up with the capital; it’s scaling out the right amount of storage in all tiers, and distributing data in the way that creates the most business value for users. Next time you find yourself questioning whether an expanded flash investment will pay off, ask the following questions about the lifecycle of your data.
1. How much performance do we really need?
In all areas of technology, speed tends to be the most overvalued feature. Adding performance can create new problems and often bumps up against practical limitations. This is the same reason that airplane and car manufacturers aren’t introducing faster models each year: Each incremental speed improvement comes with its own baggage.
For storage, increased performance can require major investments in software, networking and data protection. Even so, some IT pros make the mistake of assuming that expensive performance is the answer to most of the problems in their data centers.
To extend the car metaphor: even if you can buy a Ferrari capable of going 200 miles per hour, will you ever be able to use that speed on any road or track, and to what end? You could probably find a way to do it, but only at great expense and risk, and you certainly wouldn't do it every day.
The same tradeoff applies to your storage infrastructure. If you can identify the areas of your application infrastructure where higher storage performance actually produces a payoff, you can spend far more efficiently on flash. Only a very small percentage of primary data is accessed with any regularity -- in most cases, 10% or less -- and, ideally, that is the amount of high-performance storage you would deploy.
Even in a non-ideal world, though, the vast majority of your data storage footprint should be of the lower-cost variety (i.e., not flash).
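If you want to put rough numbers behind that, a back-of-the-envelope calculation is enough to see the effect. The sketch below is purely illustrative -- the capacities, per-terabyte prices and the 10% hot-data figure are assumptions you would replace with your own -- but it shows how quickly the tiered and all-flash price tags diverge.

```python
# Back-of-the-envelope flash sizing sketch.
# All figures are illustrative assumptions, not vendor pricing.

total_primary_tb = 500       # total primary data footprint (TB), assumed
hot_data_fraction = 0.10     # "10% or less" of data is accessed regularly
flash_cost_per_tb = 400      # assumed $/TB for flash capacity
disk_cost_per_tb = 60        # assumed $/TB for lower-cost capacity

flash_tb = total_primary_tb * hot_data_fraction
disk_tb = total_primary_tb - flash_tb

tiered_cost = flash_tb * flash_cost_per_tb + disk_tb * disk_cost_per_tb
all_flash_cost = total_primary_tb * flash_cost_per_tb

print(f"Flash tier:      {flash_tb:.0f} TB")
print(f"Disk tier:       {disk_tb:.0f} TB")
print(f"Tiered cost:     ${tiered_cost:,.0f}")
print(f"All-flash cost:  ${all_flash_cost:,.0f}")
```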
2. Can we easily move data between tiers of storage?
It's possible, but not easy; the Holy Grail of tiering has yet to be found. The problem with data sets is that they are not all one temperature -- some data is hot and frequently accessed, some is cold and rarely touched -- and they do not stay at the same temperature forever. Today's technology generally moves data between tiers so slowly that, by the time the move completes, it no longer reflects current usage, which defeats much of the benefit. The brute-force alternative is to physically segregate data based on predicted usage, an approach that only magnifies user pain when the prediction turns out to be wrong.
Of course, underlying all of this is the supposition that every tier is big enough to accommodate all of the data that might need to move into it at any point in the future. That creates a risk of its own: an administrator may end up overspending at multiple price points rather than just at the high end.
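To make the temperature idea concrete, here is a minimal sketch of what an age-based tiering decision and a capacity check might look like. The thresholds, data sets and tier sizes are invented for illustration -- real tiering engines are far more sophisticated -- but the sketch shows how easily a wrong size estimate for any one tier turns into an over-capacity problem.

```python
from datetime import datetime, timedelta

# Minimal sketch of an age-based ("temperature") tiering decision.
# Thresholds, data sets and tier capacities are illustrative assumptions.

HOT_WINDOW = timedelta(days=7)     # accessed in the last week  -> flash
WARM_WINDOW = timedelta(days=90)   # accessed in the last quarter -> disk

def pick_tier(last_access: datetime, now: datetime) -> str:
    age = now - last_access
    if age <= HOT_WINDOW:
        return "flash"
    if age <= WARM_WINDOW:
        return "disk"
    return "archive"

now = datetime.now()
datasets = {                        # name -> (size in TB, last access), assumed
    "oltp_db":      (4.0,   now - timedelta(days=1)),
    "file_share":   (180.0, now - timedelta(days=30)),
    "old_projects": (200.0, now - timedelta(days=400)),
}
capacity_tb = {"flash": 10.0, "disk": 150.0, "archive": 1000.0}

placement = {"flash": 0.0, "disk": 0.0, "archive": 0.0}
for name, (size_tb, last_access) in datasets.items():
    tier = pick_tier(last_access, now)
    placement[tier] += size_tb
    print(f"{name}: {tier}")

for tier, used in placement.items():
    if used > capacity_tb[tier]:
        print(f"WARNING: {tier} tier over capacity "
              f"({used:.0f} TB > {capacity_tb[tier]:.0f} TB)")
```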
3. Will our environment scale as our data footprint grows?
Scaling effectively is by far the most difficult data storage problem. Companies build intellectual property as they grow. That data must be preserved and protected, but it's really no more important than the hoards of data employees amass in a growing business -- emails, file versions and personal folders. Whether these files are critical to the business or unstructured and simply taking up space, they are proprietary and can't be compromised or lost.
As a result, as individual tiers of storage grow, all tiers -- disaster recovery, backup, archive -- and the data management systems themselves must scale in order to provide uninterrupted support. The way to do this today is to keep close track of your usage and work with vendors to ensure that newly acquired storage is sized appropriately for your predicted needs. The process involves many tradeoffs, and your users and your vendors may have conflicting interests, so it needs to be actively managed.
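A simple compound-growth projection is often all it takes to keep those conversations grounded. The figures below -- current footprint, growth rate, planning horizon and headroom -- are placeholders for the numbers from your own usage tracking.

```python
# Simple compound-growth projection for sizing the next storage purchase.
# All inputs are assumptions; substitute the figures from your own tracking.

current_tb = 200.0      # current footprint for a tier (TB)
annual_growth = 0.30    # observed year-over-year growth rate
years_to_plan = 3       # how far ahead the purchase should carry you
headroom = 1.20         # 20% buffer for protection copies and bursts

projected_tb = current_tb * (1 + annual_growth) ** years_to_plan
recommended_tb = projected_tb * headroom

print(f"Projected footprint in {years_to_plan} years: {projected_tb:.0f} TB")
print(f"Recommended purchase target: {recommended_tb:.0f} TB")
```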
Any great technology is made better when it's used strategically, and often that means applying it in moderation, in the situations where it can have its greatest impact. If flash storage is already part of your data center -- and for many IT admins, it is -- it can deliver even better results when it's used deliberately to support the rest of your data management landscape.