Artificial Intelligence for IT Operations – AIOps for short. You’ve no doubt heard the term being bandied around – but what is it, and what can it do for your organization?
Let’s begin with a definition from Gartner, the research firm which coined the term a couple of years ago, and has since been responsible for popularizing both the phrase and the concept.
“AIOps platforms combine big data and machine learning functionality to support all primary IT operations functions through the scalable ingestion and analysis of the ever-increasing volume, variety and velocity of data generated by IT. The platform enables the concurrent use of multiple data sources, data collection methods, and analytical and presentation technologies.”
Translation: AIOps is essentially an umbrella term to describe the use of machine learning and big data analytics technologies to automate the identification and subsequent resolution of common IT issues.
Why Do We Need a New Approach to IT Operations – and a New Term to Go with It?
It’s no secret that IT operations are in the midst of a revolution. Driven by the forces of digital transformation, businesses are going through rapid change. To remain competitive and meet evolving market demands, organizations are rushing to implement new technologies, develop applications and migrate to the cloud – and IT operations are suddenly dealing with a plethora of new systems in a whole new complex and connected environment that simply didn’t exist in the past.
And here’s the problem. All of these new systems, services and applications produce huge volumes of log and performance data for IT to contend with – and it’s straining traditional performance and service management tools and strategies to breaking point.
You can’t manage today’s rapidly-evolving IT landscape with yesterday’s tools. Instead, Gartner says, organizations need a software system that combines big data and machine learning to enhance the processes of IT operations – including analysis, performance monitoring, and IT service management – across an increasingly diverse range of cloud, third party services, SaaS integrations, mobile, and much more besides.
AIOps is the term Gartner has coined to describe both the platforms and the paradigm shift required to handle digital transformation in IT operations. In essence, AIOps is a new platform approach that utilizes AI to overhaul and automate various IT operations and performance monitoring processes that have traditionally been performed manually.
At its heart, AIOps relies on two things – big data and machine learning. In its Market Guide for AIOps Platforms, Gartner uses the diagram below to illustrate how an AIOps platform works.
(Image source: gartner.com)
AIOps platforms enhance and automate IT operations by analyzing big data collected from various IT operations tools and devices, so that issues can be identified and reacted to automatically, in real time.
The approach aggregates observational data (i.e. data found in job logs and monitoring systems) alongside engagement data (i.e. data usually found in incident, ticket, and event records) within a big data platform. Analytics and machine learning are then applied to that IT data in order to produce continual, real-time insights that enable continuous improvements through automation.
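To make the aggregation step a little more concrete, here is a minimal sketch of how observational and engagement records might be merged into a single per-host timeline. The record fields (`ts`, `host`, `kind`, `detail`), the sample data and the `unify` function are assumptions made for illustration; they are not taken from Gartner's guide or from any particular AIOps product.

```python
from collections import defaultdict

# Hypothetical observational data (metrics and logs).
observational = [
    {"ts": 100, "host": "web-01", "kind": "metric", "detail": "cpu_percent=97"},
    {"ts": 130, "host": "web-01", "kind": "log", "detail": "worker restarted"},
]

# Hypothetical engagement data (incidents and tickets).
engagement = [
    {"ts": 120, "host": "web-01", "kind": "ticket", "detail": "site is slow"},
    {"ts": 90, "host": "db-01", "kind": "ticket", "detail": "backup question"},
]

def unify(*sources):
    """Merge records from every source into one time-ordered list per host."""
    timeline = defaultdict(list)
    for source in sources:
        for record in source:
            timeline[record["host"]].append(record)
    for records in timeline.values():
        records.sort(key=lambda r: r["ts"])
    return dict(timeline)

for host, records in unify(observational, engagement).items():
    print(host, [(r["ts"], r["kind"]) for r in records])
```

Once both kinds of data sit on one timeline, the ticket about a slow site appears right between the CPU spike and the worker restart, which is the kind of cross-source view the analytics layer needs.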
Gartner says that the goal of the analytics effort is the discovery of patterns – novel elements that can be used to look forward in time to predict possible incidents and emerging usage profiles, and to look backward in time to determine the root causes of current system behaviors. As such, AIOps can enhance a broad range of IT operations processes and tasks, including performance analysis, anomaly detection, event correlation and analysis, IT service management and automation.
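As a rough illustration of the anomaly detection mentioned above, the sketch below flags metric readings that deviate sharply from the readings just before them. It uses a plain trailing-window z-score as a stand-in for the machine learning a real platform would apply; the `find_anomalies` name, the window size, the threshold and the sample CPU figures are all assumptions.

```python
from statistics import mean, stdev

def find_anomalies(values, window=5, threshold=3.0):
    """Return the indexes of values that sit more than `threshold`
    standard deviations away from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as anomalous.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical CPU utilization samples with one obvious spike.
cpu_percent = [41, 43, 42, 40, 44, 42, 95, 43, 41]
print(find_anomalies(cpu_percent))  # [6]
```

A fixed threshold like this is easy to reason about but crude; the appeal of a learned model is that it can adapt the notion of "normal" to each system and time of day.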
What Problems Will AIOps Solve?
The short answer is that adopting an AIOps approach will give an organization immediate access to all of its data – spread across increasingly complex and interconnected IT environments – and make it simple to effectively use that data to drive meaningful action.
Rapid growth in data volumes (generated by expanding IT infrastructure and applications) combined with the increasing variety of data types (generated by both machines and humans) and the increasing velocity at which this data is generated puts existing monitoring tools under severe stress – stress that they are simply unequipped to cope with. Moreover, today’s monitoring tools do not cut across the multiple data types required for extracting useful insights.
Instead, AIOps aims to bring together information and insights that were previously locked in silos – and it does so by integrating with existing tools and processes. AIOps platforms draw from huge data sources and unify them – along with IT resources – to optimize work processes. All data from these sources is processed by machine learning algorithms, which are able to identify significant events automatically without requiring laborious manual pre-filtering. Then, a second layer of algorithms analyzes these events to identify similarities between them that point towards symptoms of the same underlying issue.
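The two layers described above can be sketched in a few lines. This is only an illustration under stated assumptions: a severity cut-off stands in for the first layer, and word-overlap (Jaccard) similarity between messages stands in for the second. The event fields, thresholds and function names are hypothetical, and a real platform would use trained models rather than these simple rules.

```python
def significant(events, min_severity=3):
    """Layer 1: keep only the events worth looking at."""
    return [e for e in events if e["severity"] >= min_severity]

def jaccard(a, b):
    """Share of distinct words that two messages have in common (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def correlate(events, min_similarity=0.5):
    """Layer 2: group events whose messages resemble the first event of a group."""
    groups = []
    for event in events:
        for group in groups:
            if jaccard(event["message"], group[0]["message"]) >= min_similarity:
                group.append(event)
                break
        else:
            groups.append([event])
    return groups

# Hypothetical event stream.
events = [
    {"id": 1, "severity": 4, "message": "db-01 connection timeout"},
    {"id": 2, "severity": 1, "message": "nightly backup completed"},
    {"id": 3, "severity": 4, "message": "db-01 connection refused"},
    {"id": 4, "severity": 5, "message": "disk full on web-02"},
]

groups = correlate(significant(events))
print([[e["id"] for e in g] for g in groups])  # [[1, 3], [4]]
```

Here the two database connection errors end up in one group, suggesting a single underlying issue, while the routine backup message never reaches the second layer at all.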
As such, an AIOps platform should bring a number of new beneficial capabilities to the enterprise. First, routine practices – including user requests and non-critical IT system alerts – can be automated. Rather than having staff spend hours manually sifting through every single alert, the platform performs the evaluation automatically and can determine, for example, that an alert does not require action because the relevant metrics and supporting data are within normal parameters.
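A minimal sketch of that kind of automated alert evaluation might look like the following. The metric names, the "normal" ranges and the `triage` function are assumptions for illustration; in practice a platform would be expected to learn its baselines from historical data rather than read them from a hard-coded table.

```python
# Hypothetical normal ranges per metric (low, high), inclusive.
NORMAL_RANGES = {
    "cpu_percent": (0, 85),
    "memory_percent": (0, 90),
    "error_rate": (0.0, 0.01),
}

def triage(alert, normal_ranges=NORMAL_RANGES):
    """Return ('no_action', []) when every metric is within its normal range,
    otherwise ('needs_review', [names of the metrics that caused escalation])."""
    suspicious = []
    for name, value in alert["metrics"].items():
        if name not in normal_ranges:
            # No baseline for this metric: escalate rather than guess.
            suspicious.append(name)
            continue
        low, high = normal_ranges[name]
        if not low <= value <= high:
            suspicious.append(name)
    return ("needs_review", suspicious) if suspicious else ("no_action", [])

routine = {"id": "A-1", "metrics": {"cpu_percent": 42, "memory_percent": 61}}
serious = {"id": "A-2", "metrics": {"cpu_percent": 99, "error_rate": 0.2}}

print(triage(routine))  # ('no_action', [])
print(triage(serious))  # ('needs_review', ['cpu_percent', 'error_rate'])
```

Escalating on unknown metrics is a deliberately cautious choice in this sketch: it is safer for an automated filter to pass too much to a human than to silently close something it does not understand.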
Second, AIOps platforms are able to identify and prioritize serious problems faster and with greater accuracy than humans. For example, human teams tend to sort alerts by severity of threat. This means a known malware event taking place on a minor system may take precedence over an unfamiliar download or process starting on the main data center because the team isn’t looking for it. AIOps solves this problem by prioritizing the event on the critical system – i.e. the issue putting the organization’s most important assets at the most risk – and deprioritizing the known malware event by automatically running an anti-malware function.
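The prioritization logic in that example can be sketched as follows. Everything here is assumed for the sake of illustration: the `Event` fields, the 1-to-5 criticality scale and the rule that any event with a known signature is handed to automation. It shows the idea of ranking by what is at risk rather than by how familiar the threat is, not how any specific product works.

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    asset_criticality: int  # hypothetical scale: 1 (minor system) to 5 (business critical)
    known_signature: bool   # True when an automated remediation already exists

def plan(events):
    """Split events into a human review queue, ordered by the criticality of
    the affected asset, and a list that can be remediated automatically."""
    automated = [e for e in events if e.known_signature]
    queue = sorted(
        (e for e in events if not e.known_signature),
        key=lambda e: e.asset_criticality,
        reverse=True,
    )
    return queue, automated

events = [
    Event("known malware on test laptop", asset_criticality=1, known_signature=True),
    Event("unfamiliar process in main data center", asset_criticality=5, known_signature=False),
    Event("unusual login on staging server", asset_criticality=2, known_signature=False),
]

queue, automated = plan(events)
print([e.name for e in queue])
print([e.name for e in automated])
```

The unfamiliar process on the critical system lands at the top of the human queue, while the known malware event is dealt with automatically and never competes for the team's attention.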
Third, AIOps enables streamlined interactions between IT groups and teams by drawing from all data sources at once and providing each functional group with relevant data and perspectives. Traditionally, teams have had to share, parse and process information by manually sending around data. But with AIOps, the platform learns what monitoring and analysis data to show to each group, ensuring everyone sees what they need to see, when and how they need to see it.
In case you haven’t had your ear to the ground: AIOps is expected to become the next big thing in IT management. Set to play a pivotal role in overcoming the complexities of the current enterprise IT operations model, the approach paves the way for more streamlined digital transformations, and it’s highly likely that we’ll see more AIOps strategies and technologies being implemented in the very near future – watch this space.