Why financial institutions need forensic data lakes

It’s no secret that there’s a direct relationship between the volume of digital assets in use at an organisation and its risk of cybersecurity attacks.

Financial institutions are prone to excessive damage from attacks and should heed the directives provided by the June 2022 Digital Operational Resilience Act (DORA), adopted by the Council of the European Union in November 2022.

Welcome to DORA

A formal political agreement announced by the EU, DORA provides guidelines to reduce vulnerabilities and enhance the operational resilience of the critical entities supporting EU citizens, including banks, insurance companies and investment firms.

DORA prescribes a number of critical steps, such as:

  • Increasing Information and Communications Technology (ICT) risk management and governance
  • Creating a consistent incident management mechanism
  • Managing third-party risks
  • Establishing thorough testing of ICT systems
  • Implementing organisational resilience assessments

Achieving DORA compliance is easier said than done

The guidelines are clear, but organising financial operations to align with them is not straightforward.

To act on DORA successfully, financial institutions must implement a comprehensive repository of logs collected from their cloud service providers (CSPs) and software-as-a-service (SaaS) vendors. This is because many business services and application workloads have moved from on-premises data centres to various forms of cloud computing.

It’s important to note that an attacker’s dwell time can exceed 250 days before the cybersecurity team identifies the breach. Cloud and SaaS log repositories should therefore hold at least that much forensic data as the baseline for successful investigations and for ongoing organisational readiness and resilience activities. Here, however, various challenges come into play:

  • There are limits on how information technology (IT) teams can access cloud and SaaS logs
  • Many cloud providers store logs for only short periods (e.g., 90 days), if at all, and access to anything older than seven days is throttled and capped by the CSP or SaaS vendor.
  • For larger organisations, it can take more than one minute to extract one minute of forensic data. At that rate, the oldest data ages out of retention before it can be extracted.
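
The arithmetic behind these constraints can be sketched in a few lines. This is a minimal illustration that uses the figures quoted above (a 250-day dwell time and 90-day retention) together with an assumed extraction ratio. The function names are invented for this example and do not correspond to any vendor API.

```python
from typing import Optional


def uncovered_days(dwell_days: int, retention_days: int) -> int:
    """Days of attacker activity that predate the oldest retained log."""
    return max(0, dwell_days - retention_days)


def catch_up_days(backlog_days: float, extract_ratio: float) -> Optional[float]:
    """Wall-clock days needed to drain a log backlog, or None if impossible.

    extract_ratio is wall-clock time spent per unit of log time
    (1.0 means one minute of extraction per minute of logs).
    New logs keep arriving at one day per day while extraction runs,
    so the backlog only shrinks when extract_ratio is below 1.0.
    """
    if extract_ratio >= 1.0:
        return None  # extraction never gets ahead of newly generated logs
    # Solve T / extract_ratio = backlog_days + T for T.
    return backlog_days * extract_ratio / (1.0 - extract_ratio)


if __name__ == "__main__":
    print(uncovered_days(250, 90))   # 160 days with no vendor-side logs at all
    print(catch_up_days(90, 1.0))    # None: the backlog can never be drained
    print(catch_up_days(90, 0.5))    # 90.0 days to catch up at twice real time
```

Under these assumptions, a 90-day vendor window leaves 160 days of a 250-day intrusion without any logs, and an extraction ratio of 1.0 or worse means an after-the-fact backfill can never complete. That is the case for collecting continuously rather than on demand.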


Enter the forensic data lake

Three key aspects for reducing a cyber attack’s damage can be tied to the concept of a forensic data lake. Once established, the forensic data lake collects and stores high-value forensics artefacts on a continuous basis:

  • The forensic data lake allows pre-existing, normalised data sets to be rapidly queried with pre-built signatures to answer time-critical questions. Because collection, normalisation and query building are done in advance against a known environment, responders get to these answers days faster than in a response initiated in the wake of an incident.
  • Reporting can flow naturally from the forensic data lake directly to a reporting platform, which can be utilised in training (such as executive tabletop exercises) and during a crisis.
  • The quality and availability of the data in the forensic data lake quickly build confidence that a well-qualified answer can be obtained, and in how soon. Stakeholders might not have all the information they would like before making a decision, but they can be confident that they are basing their judgments on the best forensic data available.
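
To make the first point concrete, here is a minimal sketch of pre-built signatures running over already-normalised events. The event schema, the action names and the signature names are all hypothetical and chosen only for illustration. A real forensic data lake would define its own schema and would typically run such checks in a query engine rather than in application code.

```python
from datetime import datetime, timezone

# Hypothetical normalised events: every source is mapped to the same fields
# at ingestion time, so signatures never need per-vendor parsing.
EVENTS = [
    {"ts": datetime(2023, 1, 10, 8, 0, tzinfo=timezone.utc),
     "source": "csp_audit", "actor": "alice", "action": "console.login"},
    {"ts": datetime(2023, 1, 10, 8, 5, tzinfo=timezone.utc),
     "source": "saas_audit", "actor": "alice", "action": "mfa.disable"},
    {"ts": datetime(2023, 1, 10, 8, 9, tzinfo=timezone.utc),
     "source": "csp_audit", "actor": "alice", "action": "access_key.create"},
]

# Hypothetical pre-built signatures: named predicates written before any
# incident, so they can be run the moment a question comes up.
SIGNATURES = {
    "mfa_disabled": lambda e: e["action"] == "mfa.disable",
    "new_access_key": lambda e: e["action"] == "access_key.create",
}


def run_signatures(events, signatures):
    """Return {signature name: matching events}, preserving event order."""
    hits = {name: [] for name in signatures}
    for event in events:
        for name, matches in signatures.items():
            if matches(event):
                hits[name].append(event)
    return hits


if __name__ == "__main__":
    for name, matched in run_signatures(EVENTS, SIGNATURES).items():
        print(name, [(e["ts"].isoformat(), e["actor"]) for e in matched])
```

Because the events share one schema, each signature is a one-line predicate, and adding a new question does not require touching the collection or parsing code.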

In this manner, a forensic data lake can:

  • Decrease the time to respond to a breach by days
  • Improve the quality and speed of reporting
  • Allow stakeholders to have confidence in executing decisions with the best data available

Getting it right

Despite the prevalence of security information and event management (SIEM) platforms and increased investments in security data lakes, cloud incident response (IR) and forensic investigation, challenges persist for the various teams tasked with cybersecurity.

In the context of supporting DORA, we find the forensic data lake approach preferable to security data lakes or SIEM platforms because the latter are often characterised by:

  • Heavy, upfront investment in parsing and normalisation
  • Lack of cloud and SaaS integrations and support
  • Limited scale
  • Lack of control over data formats (normalised and enriched vs raw form)


In comparison, a forensic data lake provides:

  • Optimisations for forensic investigations, critical incident response, and threat hunting, as they are designed for use by incident responders, threat hunters, and cloud digital forensics incident response (DFIR) specialists
  • Enhanced SaaS integrations and support, with forensic data that includes resource configuration snapshots and historical data, as well as the security data found in SIEMs and security data lakes (e.g., alerts, telemetry, and standard security info)
  • Cloud-scale architectures that enrich and normalise cloud and SaaS data
  • Easier investigation of lateral movement between systems across cloud and SaaS platforms, by combining multiple data sources in a single query
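
As an illustration of the last point, the sketch below joins a cloud audit table and a SaaS audit table in one SQL query to surface a possible pivot between platforms. It uses Python's built-in sqlite3 module as a stand-in for the data lake's query engine. The table names, columns, sample rows and the "same source IP, different principal, within a time window" heuristic are all assumptions made for this example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical normalised tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE csp_events  (ts TEXT, principal TEXT, src_ip TEXT, action TEXT);
        CREATE TABLE saas_events (ts TEXT, principal TEXT, src_ip TEXT, action TEXT);
    """)
    # Timestamps are UTC; IPs come from documentation-only address ranges.
    conn.executemany("INSERT INTO csp_events VALUES (?, ?, ?, ?)", [
        ("2023-01-10 08:00:00", "svc-backup", "203.0.113.7", "role.assume"),
        ("2023-01-10 09:00:00", "alice", "198.51.100.2", "bucket.list"),
    ])
    conn.executemany("INSERT INTO saas_events VALUES (?, ?, ?, ?)", [
        ("2023-01-10 08:20:00", "bob", "203.0.113.7", "mailbox.export"),
        ("2023-01-10 10:00:00", "carol", "192.0.2.9", "login"),
    ])
    return conn


# One query across both sources: same source IP, different principal,
# activity on the two platforms within `window_hours` of each other.
LATERAL_QUERY = """
    SELECT c.src_ip, c.principal, s.principal, c.ts, s.ts
    FROM csp_events AS c
    JOIN saas_events AS s ON s.src_ip = c.src_ip
    WHERE s.principal <> c.principal
      AND abs(julianday(s.ts) - julianday(c.ts)) * 24 <= :window_hours
    ORDER BY c.ts
"""


def find_pivots(conn: sqlite3.Connection, window_hours: float = 1.0):
    """Return (ip, cloud principal, SaaS principal, cloud ts, SaaS ts) rows."""
    return conn.execute(LATERAL_QUERY, {"window_hours": window_hours}).fetchall()


if __name__ == "__main__":
    for row in find_pivots(build_demo_db()):
        print(row)
```

The join is only this simple because both tables already share field names and timestamp formats. With raw vendor logs, most of the effort would go into reconciling those differences before any correlation could start.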


More specifically, in the context of DORA, a forensic data lake enhances the ability of financial institutions in EU member states to:

  • Conduct risk assessments
  • Identify cloud and SaaS risks that could disrupt service delivery
  • Enhance organisational resilience


By building a forensic data lake, an organisation’s data centre operations and IT specialists can transition to a proactive cloud IR approach that no longer depends on limited cloud vendor logs. A forensic data lake strongly supports the efforts DORA calls for to improve incident reporting and increase resilience.
