New research reveals that the uncontrolled surge of enterprise data across cloud, hybrid, and SaaS environments is becoming a top headache for security teams and will only escalate. A report by Proofpoint finds that roughly one-third of UK organisations saw data volumes increase by 30 % or more in the last year, and 41 % of large enterprises now manage over a petabyte of data — all while more than half of firms say they struggle with cloud and SaaS data sprawl. Additional data shows that 85 % of organisations worldwide experienced data loss in the past year, with human error and insider threats leading the way. The advent of generative AI and “agentic AI” adds another layer of complication, as 40 % of organisations view AI-related data loss as a top concern, and 54 % say they lack visibility into AI-driven data activity. Experts warn that evolving transparency and privacy regulations will magnify the challenge and compel firms to embed privacy-by-design and security-by-default into data strategies rather than treat them as afterthoughts.
Sources: Dark Reading, Wix.io
Key Takeaways
– As enterprise data proliferates across multiple platforms (cloud, SaaS, hybrid), organisations face a vastly expanded attack surface and struggle to maintain visibility, control, and consistent policy enforcement.
– The leading causes of data loss remain human error, insider threat and lack of oversight — and these risks are being compounded by the rise of generative AI, shadow systems and neglected data sets.
– Traditional security approaches are proving inadequate: firms must shift toward governance frameworks, unified visibility tools and “security-by-default” mindsets to keep pace with the evolving regulatory and technological landscape.
In-Depth
In today’s rapidly evolving IT landscape, data sprawl has moved from nuisance to existential risk. Organisations are no longer simply generating more data; they’re storing, processing and replicating it across a dizzying array of cloud platforms, hybrid systems and third-party services. This diffusion of data assets creates what is in effect an ever-growing perimeter, one that’s far harder to defend. The Proofpoint report found that roughly a third of UK organisations saw their data volumes grow by 30 % or more in the past year, and that 41 % of large enterprises now manage more than a petabyte of data. Volume on that scale makes comprehensive oversight enormously difficult.
Compounding the challenge, many enterprises lack the necessary visibility into where data resides, how it’s accessed and by whom. According to one source, 54 % of organisations say they lack sufficient visibility and controls over generative AI tools. With AI systems now able to ingest and replicate data automatically, unintended exposures are multiplying faster than defenders can track. In that sense, AI becomes less a security enhancer and more another route for data to leak or be misused. Meanwhile, human-driven incidents continue to dominate root causes: Proofpoint reports that 66 % of organisations attribute their most serious data loss events to careless employees or third-party contractors, and malicious insiders add further to the risk.
The consequences of uncontrolled data sprawl span several dimensions. First is the sheer expansion of the attack surface: each unused, forgotten or loosely controlled data set is another opportunity for malicious actors. One security study pointed out that exposed secrets, like API keys and access tokens, are proliferating in unlikely places, for example residual data in customers’ support cases, abandoned cloud buckets and developer repositories. That kind of “secret sprawl” illustrates how fragmented visibility creates real breach vectors. Second, regulatory compliance becomes much more complicated when data is scattered across jurisdictions and systems. Laws like GDPR, CCPA, HIPAA and others require clear accounts of how data is stored, processed and accessed, a requirement that becomes steadily harder to meet the more widely that data is distributed. Third, operational cost and inefficiency loom large: redundant data, unused storage and overlapping systems consume resources while offering no real business value, only more risk.
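To make the “secret sprawl” point concrete, here is a minimal Python sketch of how a team might scan free text (a support ticket, a config file, a commit) for strings that look like credentials. The pattern names, the function find_secrets and the rule set are illustrative assumptions, not something taken from the cited reports; production scanners use far larger and more carefully tuned rule sets.

```python
import re

# Hypothetical, deliberately small rule set (an assumption for illustration).
SECRET_PATTERNS = {
    # AWS access key IDs: a 4-letter prefix followed by 16 uppercase letters/digits.
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # PEM header that precedes a private key.
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # Generic "api_key = <long token>" style assignments.
    "generic_api_key": re.compile(r"(?i)\bapi[_-]?key\b\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{20,}"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


if __name__ == "__main__":
    ticket = "customer pasted their config:\naws_key = AKIAIOSFODNN7EXAMPLE\n"
    for line_number, name in find_secrets(ticket):
        print(f"line {line_number}: possible {name}")
```

Pattern matching of this kind only flags candidates; it will miss secrets in unfamiliar formats and will raise false positives, so it complements rather than replaces the visibility and governance measures discussed here.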
From a right-leaning perspective, it’s clear that organisations must stop treating data security as a checkbox exercise and embrace responsibility from the top. Boards, executives and security teams must accept that the era of perimeter defence and siloed solutions is over. Rather than reacting to the next breach, firms should proactively tighten governance, invest in unified data-security platforms, and ensure clear accountability for data management. In practical terms, that means regular audits of data repositories, elimination of forgotten storage silos, segmentation of access privileges, strong user-training programmes, and integration of AI-governance frameworks rather than deploying AI without oversight. It also means simplifying the tool stack and avoiding the trap of “more tech equals more security.” Complexity breeds gaps. As one study notes, bloated multi-vendor security environments delay threat detection and containment by days or weeks.
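As a rough illustration of what a repository audit for forgotten storage could look like, the Python sketch below lists files under a directory that have not been modified within a chosen number of days. It is a simplification under stated assumptions: the function name find_stale_files and the age threshold are hypothetical, and a local directory stands in for what would in practice be cloud buckets or SaaS stores queried through their own inventory tools.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days, now=None):
    """Return sorted paths under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # st_mtime is the last modification time in seconds since the epoch.
            if os.stat(path).st_mtime < cutoff:
                stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Example: list anything under the current directory untouched for a year.
    for path in find_stale_files(".", max_age_days=365):
        print(path)
```

Modification time is only a proxy for whether data is still needed, so a list like this is a starting point for review by the data owner, not a deletion queue.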
Finally, regulation is not the enemy here — it is the wake-up call. Transparency requirements, privacy rules and cross-border data-flow laws are forcing businesses into a new model: one where privacy and security are embedded from inception rather than bolted on. Organisations that adapt will gain competitive advantage, lower risk exposure and reduce legal liability. Those that don’t will increasingly find themselves exposed — not just to cyber threats, but to regulatory fines, reputational damage and operational disruption. The lesson is clear: managing data sprawl isn’t optional — it’s foundational to modern enterprise resilience.

