When AWS experienced its third outage in just one month (December 2021), it brought down major apps and services across the internet. Slack, Google, Venmo, McDonald’s, DoorDash and Spotify were just some of the companies affected, and it’s fair to say the impact was far-reaching.
According to AWS – the world’s largest public cloud provider – the issue was caused by a power outage at a data centre in Northern Virginia, which went on to trigger roughly three hours’ worth of severe connectivity issues for some of the world’s biggest brands.
If a similar issue happened at a major bank or financial services (FS) institution, it could cause catastrophic economic damage and unprecedented market turmoil.
Although cyber-attacks are increasing in sophistication, a more common cause of system outages is user error. This is typically compounded by the fact that technology environments are composed of a complex stack of often outdated and fragile systems. So, what’s keeping this all together in the FS sector?
Enter digital operational resilience
The Financial Conduct Authority (FCA) is aiming to improve digital resilience in the UK over the next three years. Last year, it published guidance on digital operational resilience for banks and FS firms, and, according to its 2022-25 strategy, will also look to include critical vendors in the future. In parallel, the EU is progressing its Digital Operational Resilience Act (DORA) through the final stages.
These strict new rules mean that banks, credit card providers, insurers and other financial institutions will have to identify any vulnerabilities in their operational resilience and take proactive steps to address them. This includes ensuring their services are available to customers in the event of disruption and that any outage is reported within two hours of occurring.
To comply with the FCA’s guidance, as of the end of March 2022, firms must have identified their important business services and set impact tolerances for the maximum tolerable disruption to these areas. This entails carrying out sophisticated mapping and testing, as well as identifying any vulnerabilities in their operational resilience. Crucially, between now and March 2025, FS firms must work to remain within those tolerances. After that, any firm that fails to comply will potentially face hefty penalties and fines.
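To make the idea of an impact tolerance concrete, here is a minimal sketch of checking disruption against tolerances. The service names, tolerance values and outage durations are hypothetical and chosen purely for illustration; the guidance described above does not prescribe any particular data model or figures.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BusinessService:
    """An important business service and its impact tolerance (illustrative)."""
    name: str
    impact_tolerance: timedelta  # maximum tolerable disruption


def breaches(services, outages):
    """Return the names of services whose outage exceeded their tolerance."""
    by_name = {service.name: service for service in services}
    return [
        name
        for name, duration in outages.items()
        if name in by_name and duration > by_name[name].impact_tolerance
    ]


# Hypothetical services and tolerances, not taken from any regulator.
services = [
    BusinessService("card-payments", timedelta(hours=2)),
    BusinessService("online-banking-login", timedelta(hours=4)),
]

# Hypothetical disruption observed during a test scenario.
outages = {
    "card-payments": timedelta(hours=3),
    "online-banking-login": timedelta(minutes=45),
}

print(breaches(services, outages))  # ['card-payments']
```

In practice, the durations would come from scenario testing or incident records rather than a hard-coded dictionary, and each tolerance would be set by the firm itself.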
Creating a fit-for-purpose cloud strategy
Unsurprisingly, the move has huge implications for FS firms of all shapes and sizes, as they must have a back-up plan ready to activate should any part of their complex technology estate suffer an outage – whether it runs on an on-prem server, a virtual machine or public cloud.
With DORA and the FCA’s guidance pushing FS organisations to consider the risks of reliance on a single third-party cloud provider, many businesses are choosing specific clouds for specific workloads. This is also reflected outside the FS sector, with Flexera’s State of the Cloud 2022 report stating that nearly nine in 10 organisations (89%) are pursuing a multi-cloud strategy.
However, pursuing a multi-cloud strategy to switch business-critical applications between public clouds (what’s known as interoperability) can be expensive and challenging; cloud applications are typically built using provider-specific tools and services, meaning failover between clouds is technically difficult.
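One way to picture the difficulty is to look at what failover would require in code. The sketch below assumes a hypothetical provider-neutral storage interface (`ObjectStore`) with an in-memory stand-in (`InMemoryStore`) in place of real cloud adapters; none of these names come from any cloud provider’s SDK. It is an illustration of the pattern, not a recommended implementation.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical provider-neutral interface the application codes against."""
    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a provider-specific adapter (illustrative only)."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self._data: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not self.healthy:
            raise ConnectionError(f"{self.name} unavailable")
        self._data[key] = data

    def get(self, key: str) -> bytes:
        if not self.healthy:
            raise ConnectionError(f"{self.name} unavailable")
        return self._data[key]


class FailoverStore:
    """Writes to every reachable provider; reads from the first that answers."""

    def __init__(self, stores: list[ObjectStore]) -> None:
        self.stores = list(stores)

    def put(self, key: str, data: bytes) -> None:
        written = 0
        for store in self.stores:
            try:
                store.put(key, data)
                written += 1
            except ConnectionError:
                continue  # skip the unavailable provider
        if written == 0:
            raise ConnectionError("no provider accepted the write")

    def get(self, key: str) -> bytes:
        for store in self.stores:
            try:
                return store.get(key)
            except (ConnectionError, KeyError):
                continue  # provider down or missing the key: try the next one
        raise ConnectionError("no provider could serve the read")


primary = InMemoryStore("cloud-a")
secondary = InMemoryStore("cloud-b")
store = FailoverStore([primary, secondary])

store.put("ledger/2022-03-31", b"balances")
primary.healthy = False  # simulate an outage at the first provider
print(store.get("ledger/2022-03-31"))  # b'balances'
```

Even in this toy form, the cost is visible: every provider-specific service the application touches needs its own adapter behind a common interface, and data has to be kept in step across providers before a failover can succeed.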
In addition, according to Advance, two-thirds of large enterprises are still running mainframe apps dating back well over a decade – and in the FS sector, some systems and applications were built more than 30 years ago. This means that cloud adoption and estate modernisation are creating increasingly complex, large-scale hybrid IT environments for banks and FS firms. Because of the risks associated with moving these often old, business-critical applications to the cloud, the hybrid estate will be the norm for a long time, despite the growth in the cloud.
Accelerated orchestration key to compliance
The continuous resilience regulators are aiming for can only be delivered with orchestration: a combination of intelligent infrastructure automation and panoramic visibility that provides the control and governance needed to ensure every part of a disparate technology environment functions in harmony with the others. With the new regulations, it’s not enough just to be resilient – you have to demonstrate how you’re attaining resilience.
This can only be achieved by tools that operate both on- and off-premises, managing your hybrid estate and enabling the (hyper)automation of workloads ‘at the right time, using the right technology, in the right location, for the right price’.
Gartner last year announced a new product category that meets this need: the Digital Platform Conductor (DPC) tool. DPCs go beyond solving the resilience problem; they are a solution to the complexity crisis many large organisations are facing. DPCs enable business and IT leaders to strategically manage the full spectrum of their hybrid estate, irrespective of the environment or location. They can be deployed across a series of verticals, including banking and finance, and are a crucial tool for building resilience, as well as dealing with automation at scale and bringing the public cloud-like benefits of such automation to the entirety of the IT estate.
It’s clear the sector is at the start of a long journey towards compliance. Three years may sound like a long time, but when you consider the complex legacy environments teams need to grapple with, March 2025 will be here before many of us know it. Now’s the time to start getting serious about compliance and creating a business strategy to achieve continuous resilience and competitive advantage at the same time.