Leaping at the opportunities AI and cloud have to offer, businesses are accelerating investment in IT infrastructure, with Australia’s spend set to tip over $133 billion.
In fact, Australian organisations were expected to have spent more than $23.3 billion on public cloud services in 2024 alone. Despite this heightened spend, most organisations aren’t diving into a cloud-first strategy but rather focusing on a hybrid multi-cloud approach.
This comes as demand for flexibility, disaster recovery, and enhanced data sovereignty grows. By relying on multiple cloud providers and on-prem storage methods, organisations have a better recovery plan if things should go sideways.
Yet, as more cloud providers and applications come onboard, it becomes more difficult for security and operations teams to have clear visibility over their growing platforms. Australian businesses are simultaneously implementing protective layers, with security and risk management spend expected to reach $7.3 billion this year.
In the quest to implement strong security, organisations unknowingly overwhelm systems and teams. These foundational measures are compounding and causing “alert storms”, leaving businesses bombarded by notifications – and not all of them are threats or problems.
There is no shortage of real, serious cyber threats. In the 2023 financial year, the Australian Cyber Security Centre reported 94,000 cyber crimes. Where there is heightened activity, it’s important to consider – and take seriously – the adverse effects of alert fatigue.
Like the villagers in the age-old tale of The Boy Who Cried Wolf, panicked service and operations teams have no option but to work out which alerts pose genuine threats. Better safe than sorry, right?
As you would expect, that task becomes more difficult as threat vectors expand with increased technology spend. Sifting through a barrage of alerts not only overwhelms technology teams but also diverts resources that could be spent on higher-value projects.
If teams feel they are spending most of their time in the weeds of alert diagnoses, they’re not likely to feel satisfied, challenged, or supported in their role. Talented people are hard to find, and losing them to alert storms is a real waste.
To minimise the impact of alert storms and, in turn, keep talent, businesses have an opportunity to establish a framework to filter through the “noisiest alerts”, or those triggered most often, to ensure critical issues aren’t overlooked or hidden among the noise. IT chiefs can alleviate some of this pressure by continuously reviewing and updating their organisation’s monitoring strategy, targeting the removal of unnecessary or unhelpful alerts.
This is especially important for larger companies that generate thousands of alerts due to multiple service dependencies and potential failure points. The priority is to find the problem alerts – those that are triggered most often and can be broadly categorised as either predictable or unstable.
Predictable alerts form consistent patterns and include notifications about the start and end of automated backups, as well as warnings about regularly occurring issues that tend not to pose any real danger.
On the other hand, unstable alerts are a bit more attention-seeking. They occur at an unnecessarily high frequency, typically informing teams that something is switching back and forth between different states. In both scenarios, such alerts can quickly overwhelm a team, potentially causing genuine warnings to be missed in all the noise.
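As a rough illustration of that triage, the Python sketch below ranks alerts by how often they fire and then labels each one. Everything in it is an assumption made for the example: the event format, the function names and the thresholds are hypothetical and not taken from any particular monitoring product.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical event log: (timestamp in seconds, alert name, state).
events = [
    (0, "backup", "info"), (86400, "backup", "info"),
    (172800, "backup", "info"), (259200, "backup", "info"),
    (10, "disk", "alert"), (25, "disk", "ok"), (31, "disk", "alert"),
    (70, "disk", "ok"), (74, "disk", "alert"), (130, "disk", "ok"),
]

def noisiest_alerts(events, top_n=3):
    """Return the alert names that fire most often, with their counts."""
    return Counter(name for _, name, _ in events).most_common(top_n)

def classify(events, name, max_variation=0.1, min_flips=4):
    """Label one alert as 'predictable', 'unstable' or 'other'.

    predictable: fires at near-constant intervals (e.g. a nightly backup).
    unstable:    keeps switching between states (often called flapping).
    Both thresholds are illustrative assumptions.
    """
    own = sorted((t, state) for t, n, state in events if n == name)
    times = [t for t, _ in own]
    states = [s for _, s in own]
    gaps = [b - a for a, b in zip(times, times[1:])]
    if len(gaps) < 2:
        return "other"  # not enough history to judge
    mean_gap = mean(gaps)
    # Coefficient of variation: spread of the gaps relative to their mean.
    if mean_gap > 0 and pstdev(gaps) / mean_gap <= max_variation:
        return "predictable"
    flips = sum(1 for a, b in zip(states, states[1:]) if a != b)
    if flips >= min_flips:
        return "unstable"
    return "other"

top = noisiest_alerts(events)
labels = {name: classify(events, name) for name, _ in top}
```

With this sample data the backup notice is labelled predictable and the disk alert unstable, which gives a team a shortlist of alerts to silence, reschedule or re-tune.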
So, how can technology teams combat this?
Alerts can be managed to reduce their frequency and increase their effectiveness. By extending the monitoring evaluation window and leaning on artificial intelligence, teams can define which data is relevant and evaluate it against alert conditions.
The system then assesses more data points before deciding whether to trigger, giving automated processes more time to do their work. This reduces the frequency of alerts and ensures a user is only notified when an alert stands out as an anomaly.
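To see why a longer evaluation window cuts noise, consider the minimal Python sketch below. It uses a plain rolling average rather than any AI technique, and the metric values, threshold and function name are all assumptions made for the example.

```python
from collections import deque

def count_alerts(samples, threshold, window):
    """Count how many times an alert would fire for a series of readings.

    The condition is evaluated on the average of the last `window`
    readings, and a notification is counted only when the condition
    changes from not breached to breached.
    """
    recent = deque(maxlen=window)
    firing = False
    alerts = 0
    for value in samples:
        recent.append(value)
        # Wait until the window is full before evaluating the condition.
        breached = len(recent) == window and sum(recent) / window > threshold
        if breached and not firing:
            alerts += 1
        firing = breached
    return alerts

# Hypothetical CPU readings (%): brief spikes, then a sustained problem.
cpu = [95, 60, 96, 58, 97, 61, 95, 59, 98, 97, 99, 98, 97]

single_point = count_alerts(cpu, threshold=90, window=1)
five_points = count_alerts(cpu, threshold=90, window=5)
```

Evaluating each reading on its own fires five separate alerts for this series, while the five-reading window fires once, when the high usage is sustained. The trade-off is a slower reaction to genuine problems, so the window length is a judgment call for each alert.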
However, once your team manages to reduce the frequency of alerts, the next task is perhaps the most important: what is this alert telling you?
As with most things in life, alerts are not working in isolation, and it is this correlation to other factors that is key. Like a detective solving a crime, one cannot simply find the answer by looking at the final act. You must understand the history of the perpetrator and analyse the interconnected web of activities and patterns that led to the final action.
Finding the correlation between a singular alert and the many factors that may be contributing is the key to assessing whether it poses a serious outage or security threat.
Ask yourself: what do you actually need to know about this alert? What makes it an anomaly, and why has it been brought to your attention?
When answering these questions, gaining greater visibility over the origins of the alert is essential. In a sense, you have to undertake a behavioural analysis of the alert in the broader context of the past to understand why it needs your attention. This is crucial, as often the loudest alerts are not the problem. The root cause isn’t the one you know about; it’s the one you don’t.
Although you may be getting an alert about a certain process failure, there is every possibility that the outage is coming from a separate issue that happens to be connected to the original alert.
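One simple way to picture this kind of correlation is to walk the dependencies of the alerting service and look for unhealthy components further upstream. The Python sketch below assumes a hand-written dependency map and a set of services currently showing problems. Both are hypothetical, as are the service names, and real correlation tooling typically draws on far richer signals.

```python
def likely_root_causes(alerting, depends_on, unhealthy):
    """Return upstream services that may be the real source of an alert.

    Starting from the alerting service, follow every dependency and keep
    the unhealthy services whose own direct dependencies all look healthy.
    """
    seen = set()
    stack = [alerting]
    roots = []
    while stack:
        service = stack.pop()
        if service in seen:
            continue
        seen.add(service)
        deps = depends_on.get(service, [])
        if service in unhealthy and not any(d in unhealthy for d in deps):
            roots.append(service)
        stack.extend(deps)
    return sorted(roots)

# Hypothetical topology: checkout relies on payments and catalog,
# and both of those rely on the same database.
depends_on = {
    "checkout": ["payments", "catalog"],
    "payments": ["database"],
    "catalog": ["database"],
    "database": [],
}
unhealthy = {"checkout", "payments", "database"}

roots = likely_root_causes("checkout", depends_on, unhealthy)
```

Here the alert arrives from checkout, but the walk points to the database as the likely origin, which is the component a responder should examine first.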
Once the root cause of the alert has been identified, the remediation steps need to be executed quickly. You can do this either by running workflows manually or by kicking off automation.
With the number of both genuine and false-positive alerts accelerating, Australian businesses need to put both the reduction of alerts and the slowing down or blocking of attacks top of mind.
Nobody likes unnecessary notifications or inefficient response times. There’s nothing worse than being woken up at 4am by your health app telling you it has seen a change in activity levels. But for technology teams, the bombardment of annoying notifications could result in a critical threat slipping through the cracks.