The role of AI in AIOps

AIOps solutions need both traditional AI and generative AI, says Lori MacVittie, F5 Distinguished Engineer.

Generative AI has breathed new life into AIOps, but it would be a mistake to believe that generative AI is the only type of AI that AIOps needs to stay alive in the future.

In 2022, AIOps seemed to be a technology on life support. Nearly every so-called AIOps solution was little more than traditional monitoring on steroids. The use cases AIOps was applied to at that point were mundane and did very little to advance the vision of adaptive applications: bringing intelligence, and the ability to execute real-time changes, to operations.

But then generative AI entered the room and suddenly, AIOps was given a second chance at life. Except it wasn’t. Not really.

To understand why, we need to define AIOps in terms of the capabilities it is intended to deliver.

What is AIOps?

AIOps is a broad term generally used to refer to the use of artificial intelligence for IT operations. AIOps solutions — usually called platforms — typically provide four distinct capabilities:

  • Observe: Covers telemetry ingestion and anomaly detection
  • Analyse: Uses AI to uncover patterns, correlate events in context, identify root causes, and generate insights
  • Engage: Enables users to visualise and interact with data and insights
  • Act: Turns insights into action through automation.

What we’ve seen thus far is largely the integration of generative AI with the engage capability. Specifically, generative AI has advanced the interaction functions of this capability by exchanging GUIs (Graphical User Interfaces) and APIs for NLIs (Natural Language Interfaces). This moves AIOps toward its most often cited use cases, according to a ZK Research report:

  • 64% IT operational efficiencies/productivity
  • 54% improved network or app performance
  • 54% improved security or compliance.

There are some indications that generative AI is being used to act by generating policies and configurations to mitigate risks and address incidents, but these are faint at the moment and largely confined to a single target ecosystem, i.e., one vendor or provider’s portfolio.  

Generative AI has done very little to forward the ability to analyse the data collected by the observe capability. That’s not a mark against it, but rather recognition that it wasn’t designed to do this. For that, we need to rely on good old traditional AI.

What is traditional AI?

Since the introduction of generative AI, what we’ve always referred to as AI has needed a moniker of its own. The market has settled on traditional AI to describe ‘every AI but generative.’

Traditional AI uses many of the same training techniques and underlying principles as generative AI, but it is intended to analyse and even predict results by identifying patterns and relationships in a corpus of largely structured data. Traditional AI is excellent at classification and recognition.

Traditional AI has been used for decades to:

  1. Identify bots to prevent abuse and fraud such as account takeovers (ATO)
  2. Detect attacks based on behavioural patterns in network (DDoS) and application (L7) traffic
  3. Make recommendations for products and/or services based on consumption patterns
  4. Recognise handwriting and images.

The role of AI in AIOps

The ability of traditional AI not only to make predictions from existing patterns and relationships but also to uncover new ones, in near real time, is what makes the technology invaluable to a true AIOps solution. A true AIOps solution needs a model that can analyse new data and infer the presence of an attack, or predict a problem that will interfere with availability or degrade performance. Traditional AI is particularly adept at both: it gives the observe capability the ability to detect anomalies, and it gives the analyse capability the discovery of patterns and relationships.
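To make the anomaly detection idea concrete, here is a minimal sketch in Python. It is not how any particular AIOps platform works: it assumes a simple rolling z-score as a stand-in for a trained model, and the function name, window size, threshold, and latency figures are all hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate sharply from the
    preceding `window` samples (a rolling z-score check).

    Illustrative only: a statistical stand-in for a trained model.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response-time telemetry in milliseconds, with one spike.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(detect_anomalies(latency_ms))
```

With this sample data the check flags only the spike at index 7. A production system would likely replace the z-score with a model trained on historical telemetry, but the shape is the same: score each new data point against learned "normal" behaviour and flag what falls outside it.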

Generative AI can generate code, configurations, and content. It replicates existing patterns and applies them to generate new content. It does not create anything, but instead relies on relationships between objects that are either strengthened or weakened based on feedback.

Indeed, asking generative AI to analyse new data is fraught with risk. It may give you a correct answer, but it might also hallucinate and give you a wrong one. That’s because generative AI is ultimately a numbers game: if the relationships or patterns between data points are weak or missing, it will simply fill in the blanks, right or wrong.

What generative AI offers us is accessibility to the data — because we do not need to be experts in a query language or rely on developers to build us an interface — and the ability to automatically generate code or configurations that can be executed to resolve a problem. That too is accessibility, as it alleviates the need to be an expert in writing code or leveraging multiple APIs.

But without traditional AI to analyse telemetry data in real time, such a system only partially addresses two of the four capabilities needed for a fully functional AIOps solution. Thus, we need to use both traditional and generative AI:

  • Observe: traditional AI for anomaly detection
  • Analyse: traditional AI to uncover patterns and relationships
  • Engage: generative AI to enable users to visualise and interact with data and insights
  • Act: generative AI to turn insights into action.   

So, while I’m bullish on generative AI and its ability to simplify and even accelerate operations, I’m also aware that to realise the full potential of the AI in AIOps, it will take both traditional and generative AI.
