In the data-driven enterprise, IT departments are on the hook to move the business forward, but this is impossible when all IT resources are going toward putting out fires and remediating avoidable issues. Being stuck in a reactive stance precludes the development of an innovation culture. It’s difficult for IT managers to gain credibility with executives and business leaders if basic operations are not running smoothly most of the time. Without a solid, collaborative relationship with business teams, it’s harder to procure and deploy new technology, not to mention justify the expense of hiring skilled specialists to run it.
When new tools are purchased and rolled out, it’s nearly impossible to realize their full potential if the necessary procedures, policies, and personnel are not already in place. Failing to implement important features, make effective use of collected data, or align new capabilities with strategic goals guts the value of the investment. Moreover, incomplete or flawed deployment of software and hardware systems often leaves the enterprise open to security risks.
We all know that enterprise technology is not plug-and-play, no matter what vendor hype would have you believe. We can get closer to this ideal only by building a company-wide culture that wholeheartedly supports and continuously improves upon efficient and thorough processes underlying everything from procurement to analytics, capacity management, testing, and incident response.
In today’s virtualized computing environments, complexity and interdependence present constant challenges. It's impossible, even for the most mature IT organizations, to avoid negative incidents such as outages, breaches, and supply chain failures. When an outage occurs, the speed and effectiveness of the response makes the difference between a minor slip and a major setback. In highly complex systems, finding and fixing the root cause of incidents is a hit-and-miss exercise unless the working dynamic among technology, people, and process components is carefully cultivated, monitored, and tested.
Far too many enterprises don’t even know what they don’t know. Among IT managers TeamQuest surveyed, 42% believed they had reached the “service” level of maturity, characterized by the ability to predict and prevent incidents, continuously increase efficiency, and proactively manage service quality. After completing an IT maturity assessment, 53% of these managers found their organization actually qualified as “chaotic,” a level characterized by a near-constant state of putting out fires with no strategy or process for proactively managing service quality. In fact, nearly three-quarters of those polled landed in the two lowest maturity levels: chaotic and reactive.
On average, IT managers deal with eight unexpected issues each week. Each incident requires an average of 3.5 hours to resolve and takes up the time of seven staff members. The most common “fires” were network slowdowns and outages, poorly performing applications, and availability and equipment failures. On the surface, this may look like technology failure, but most of these incidents can be avoided or quickly remediated with proper capacity planning and management, automated analytics, well-trained staff, and strategic alignment with business processes and objectives.
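The survey figures above imply a substantial hidden cost. A quick back-of-the-envelope calculation makes it concrete:

```python
# Figures from the survey cited above: 8 incidents per week,
# 3.5 hours to resolve each, 7 staff members tied up per incident.
incidents_per_week = 8
hours_per_incident = 3.5
staff_per_incident = 7

# Total staff-hours per week consumed by unplanned firefighting.
staff_hours_per_week = incidents_per_week * hours_per_incident * staff_per_incident
print(staff_hours_per_week)  # → 196.0
```

That is 196 staff-hours every week, nearly five full-time positions, absorbed by unplanned work before any proactive improvement can begin.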
Having only enough resources to deal with current problems prevents IT managers from getting ahead of the curve. If only 10% of their staff’s time, on average, is spent on proactive improvement efforts like application tuning, capacity planning, or data management, how will they climb out of the pit of reactivity to reach greater IT maturity and proactivity? How can they help move the business forward when they are too busy keeping the servers running? It’s important to remember that availability, while an important goal, is only a baseline function, and doesn’t prepare your organization to be agile or innovative. If merely maintaining availability is all-consuming, competitive advantage drifts further and further out of reach.
To move beyond reactivity into optimization efforts, IT leaders say they need more visibility into performance, more staff and better training, and more sustainable and repeatable processes for resolving incidents by identifying and fixing root causes. In complex, virtualized ecosystems, visibility and proactive remediation require automated analytics capabilities. Ongoing, systematic analysis of incoming component data ensures you will have enough information to quickly characterize and mitigate incidents.
Adding in historical data makes capacity planning and forecasting more accurate. Further aggregating sensor, configuration, and facilities data with business metrics like assets, costs, and transaction volumes will significantly advance your analytics program, enabling diagnostic and predictive capabilities.
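To illustrate how historical data sharpens forecasting, here is a minimal trend-extrapolation sketch, an ordinary least-squares fit over hypothetical utilization history (real capacity-planning tools account for seasonality, workload mix, and business drivers as well):

```python
def linear_forecast(history, periods_ahead):
    """Fit an ordinary least-squares trend line to historical utilization
    and extrapolate it `periods_ahead` periods past the last observation."""
    n = len(history)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + periods_ahead)

# Hypothetical monthly storage utilization (%) over the past six months.
usage = [55, 58, 60, 63, 65, 68]
print(round(linear_forecast(usage, 6), 1))  # → 83.1
```

A projection like this turns a vague “we’re filling up” into a dated, quantified claim: at the current trend, utilization crosses 80% within six months, which is exactly the kind of evidence that justifies budget before the outage rather than after it.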
Armed with advanced, automated analytics, your people are empowered to identify more interdependencies, generate new insights, and offer creative solutions. If they are also encouraged to understand the context in which technology and services are used, remediation and planning efforts can become even more focused and impactful. Following through with documentation, evaluation, and testing deeply integrates processes and boosts the credibility of your efforts. At this level of IT maturity, you have shifted focus from putting out fires to addressing nuances of service health based on response times, latency, and end-user experience.
With a clear picture of strategic priorities, component dependencies, and forecasted future behavior based on empirical trends, you can proceed with confidence to maximize efficiency and prevent incidents. When you “get” business, and not just machines, you can prioritize actions and communicate in terms that make sense to your customers. That’s the sweet spot: When technology, process and people are interwoven, the fabric of the enterprise is stronger and more resilient.