As IT infrastructures grow more complex and automation technologies move to the foreground, new technologies continue to emerge in the virtualization space to solve business problems.
One of the more interesting and practical developments in IT recently is the use of predictive analytics in a virtualization- and infrastructure-aware context. While Fortune 500 organizations have been using predictive analytics to answer key questions about customers, products, and services for years, applying predictive analytics techniques to infrastructure is new.
Predictive analytics works by applying statistical analysis techniques to existing data silos. IT teams essentially give the analytics product access to a fully loaded network and systems management (NSM) system with a large pool of data to analyze, then let it go to work.
The predictive analytics software attempts to establish baseline operating data and find patterns. Regression analysis is then used to derive mathematical formulas that describe the relationships between data sets; because those formulas are compact, correlations can be drawn with only small data volumes held in memory, enabling rapid predictions. Additionally, advances in modern artificial neural networks are rapidly reducing the time needed to detect patterns in large data sets.
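To make the regression idea concrete, here is a minimal sketch in Python using NumPy. The metric names and sample values are invented for illustration; they are not drawn from any particular NSM product.

```python
import numpy as np

# Hypothetical hourly samples pulled from an NSM data silo: network
# throughput from building 1 (MB/s) and latency (ms) on the storage
# array it shares with building 2. Values are illustrative only.
net_io = np.array([120, 135, 128, 410, 133, 125, 415, 130, 140, 405], dtype=float)
storage_latency = np.array([4.1, 4.3, 4.0, 18.5, 4.2, 4.4, 19.1, 4.1, 4.5, 18.2])

# Fit a first-order regression: latency = slope * net_io + intercept.
# Only these two coefficients need to stay in memory, not the raw samples.
slope, intercept = np.polyfit(net_io, storage_latency, 1)

# The correlation coefficient shows how strongly the two metrics move together.
correlation = np.corrcoef(net_io, storage_latency)[0, 1]
print(f"latency = {slope:.3f} * net_io + {intercept:.2f}  (r = {correlation:.2f})")

# With the model in hand, a forecast network load yields a latency prediction.
print(f"predicted latency at 420 MB/s: {slope * 420 + intercept:.1f} ms")
```

The point of the sketch is the storage trade-off: once the relationship has been reduced to a couple of coefficients, the raw telemetry can be discarded and predictions become a single multiplication.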
So what's the payoff? Predictive analytics technology works against one or more existing data silos and starts analyzing historical data. Ideally, the software stores only regression models of the relationships between seemingly dissimilar infrastructure elements. After an initial period of analysis, the product begins to make predictions based on the current data set. For instance, the analytics software watches storage I/O statistics collected by an NSM system. At the same time, it watches network connectivity statistics from two buildings, 1 and 2, which are served by the same storage array. The software notices that every Wednesday at 10 a.m., network I/O from building 1 rises; that when this happens, the shared storage array comes under I/O contention; and that during this time, help desk tickets categorized as "performance" issues rise sharply at building 2.
For the first two weeks of analysis, we don't get any alerts. Then, at 9:30 a.m. on the Wednesday of week three, we get one: "Building 2 performance will be degraded in 30 minutes, confidence: 85%."
The analytics has discovered a causal, mathematical relationship among network I/O, storage I/O, time of day, and service desk tickets submitted at building 2. While the analytics system has no idea what a network, a storage array, or a help desk actually is, it doesn't need to. Because it has found the causal relationships in the data and abstracted them to simple math, it can make predictions without specific knowledge of what the system under analysis does.
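As a toy illustration of how such an alert might be produced, the sketch below (Python, with invented data) scores the recurring Wednesday pattern by how often the network spike has been followed by a ticket spike. A real product would derive confidence from its regression models; this frequency-based measure is only a stand-in.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class WeekObservation:
    """What happened during the Wednesday 10 a.m. window in one week."""
    net_io_spiked: bool      # building 1 network I/O rose
    tickets_spiked: bool     # building 2 "performance" tickets rose shortly after

def predict_degradation(history: list[WeekObservation],
                        min_confidence: float = 0.8) -> str | None:
    """Emit an alert when the learned pattern is strong enough.

    Confidence here is simply the fraction of past weeks in which the
    network spike was followed by a ticket spike (a crude stand-in for
    the probability a regression model would produce).
    """
    spikes = [week for week in history if week.net_io_spiked]
    if not spikes:
        return None
    confidence = sum(week.tickets_spiked for week in spikes) / len(spikes)
    if confidence >= min_confidence:
        return (f"Building 2 performance will be degraded in 30 minutes, "
                f"confidence: {confidence:.0%}")
    return None

# After a few weeks of history, the pattern is strong enough to alert on.
history = [
    WeekObservation(net_io_spiked=True, tickets_spiked=True),
    WeekObservation(net_io_spiked=True, tickets_spiked=True),
    WeekObservation(net_io_spiked=True, tickets_spiked=True),
    WeekObservation(net_io_spiked=True, tickets_spiked=True),
    WeekObservation(net_io_spiked=True, tickets_spiked=False),
]
print(predict_degradation(history))  # "... confidence: 80%"
```

Notice that nothing in the code knows what a building, a network, or a help desk is; it only tracks whether two abstract events keep occurring together, which is exactly the point made above.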
So far, there aren't many companies offering this type of analysis. Netuitive, a predictive analytics vendor for IT, is the only one we're aware of that provides true regression-based analysis of IT systems. Large vendors in the NSM space are quickly joining in, however, and will likely offer viable products within the next couple of years. IBM is probably the closest, through its acquisition of SPSS, a small public predictive analytics company headquartered in Chicago. The SPSS acquisition dramatically strengthens IBM's analytics portfolio and positions the company to sell analytics into a very large customer base.
Predictive analytics is expensive now, but the technology is starting to make its way to more mainstream pricing. It's only a matter of time before vendors start offering "analytics as a service" at bargain rates.