Data Retention and Deletion — When and How to Remove Data

Storage is cheap, deletion is work, and so most organisations quietly default to keeping everything. The old CRM export, the closed account, the support tickets from four years ago, the analytics nobody queries - all of it sits there, costing almost nothing to store and a great deal to be caught holding. Under the GDPR, data that is no longer needed isn’t a neutral asset gathering dust. It’s a liability accruing interest: more to breach, more to disclose, more to defend, and eventually something a regulator finds. Retention is the discipline of deciding, on purpose, what to keep and what to let go.

Art 5(1)(e) - Storage limitation: keep personal data no longer than the purpose requires

No fixed clock - GDPR sets no universal retention period; the purpose and the law set it

€14.5M - The German fine for an archive that couldn’t delete what it no longer needed

764 audited - Controllers reviewed across Europe on erasure in the regulators’ 2025 coordinated action

The Principle Isn’t “Delete Fast” - It’s “Justify Keeping”

Storage limitation is widely misread as a command to delete data the moment it’s been used. It isn’t. The principle is narrower and more demanding: personal data may be kept only as long as there is a purpose that still needs it and a lawful basis that still supports it. Data can be retained for years where a genuine reason exists - tax law, employment records, an active contract, a live legal claim. What the regulation forbids is the opposite default: keeping data because deleting it is effort, because it might be useful someday, or because no one ever decided it should go. The burden is to justify retention, not to justify deletion.

Retention Periods Come From the Purpose, Not a Template

There is no single number. A retention schedule is built bottom-up, from the purpose of each category of data: how long an active customer relationship lasts, how long after it ends a record must survive for tax or warranty or dispute reasons, how long employment files are kept after someone leaves, how long marketing consent stays valid before it goes stale. Different data, collected for different reasons, expires on different clocks - and the work is to define those clocks per category, write them down, and apply them automatically rather than from memory. A policy that says “we keep data as long as necessary” without ever defining necessary is not a policy. It’s a paragraph.

The discipline	What good looks like	Where it breaks
Defined periods	A clock per data category, tied to a stated purpose and basis	“As long as necessary,” with necessary never defined
A deletion trigger	Something fires automatically when the period ends	Periods written down, but nothing ever acts on them
Coverage everywhere	Deletion reaches live systems, backups, and processors	Deleted in the database, alive in backups and exports
Proof it happened	A log showing what was deleted, when, and why	Deletion claimed but unevidenced - “we think it’s gone”
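
A schedule of this kind can be written down as data rather than prose. The sketch below is illustrative only: the categories, the day counts, and the names RetentionRule, SCHEDULE, expiry_date, and due_for_action are all assumptions made for the example, and the periods are placeholders rather than legal guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    category: str   # data category the clock applies to
    purpose: str    # why the data is held
    basis: str      # lawful basis or legal obligation relied on
    keep_days: int  # illustrative period only, not legal advice


# Hypothetical schedule: set real periods per purpose and applicable law.
SCHEDULE = {
    "support_ticket": RetentionRule(
        "support_ticket", "resolve customer issues", "contract", 365 * 2
    ),
    "invoice": RetentionRule(
        "invoice", "tax record keeping", "legal obligation", 365 * 7
    ),
}


def expiry_date(category: str, clock_start: date) -> date:
    """Date on which a record of this category leaves its retention period."""
    rule = SCHEDULE[category]  # a KeyError means the category has no defined period
    return clock_start + timedelta(days=rule.keep_days)


def due_for_action(records, today: date):
    """Return ids of records whose retention period has ended as of `today`."""
    return [
        r["id"]
        for r in records
        if expiry_date(r["category"], r["clock_start"]) <= today
    ]


records = [
    {"id": 1, "category": "support_ticket", "clock_start": date(2020, 1, 1)},
    {"id": 2, "category": "invoice", "clock_start": date(2020, 1, 1)},
]
print(due_for_action(records, date(2023, 1, 1)))  # [1]
```

Only the support ticket is due here; the invoice is still inside its longer period. What turns the written period into a trigger is something that calls a function like due_for_action on a timer (a cron job or workflow engine, for instance) instead of waiting for someone to remember.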

Deletion Has to Reach the Copies You Forgot

Defining a retention period is the easy half. Enforcing it is where most programmes break, because data doesn’t live in one place. By the time a record is due for deletion, copies have spread to backups, analytics warehouses, data exports, the email platform, the CRM, and a handful of third-party processors. Deleting the original and leaving the copies isn’t deletion - it’s tidying. Real erasure means reaching every system the data reached, and the standard used in UK ICO guidance is that deleted data is “put beyond use,” not merely archived or moved offline. Backups are the perennial blind spot: data quietly survives there long after it’s gone from production, and the schedule has to account for the backup lifecycle rather than pretend it doesn’t exist.
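
One way to keep the forgotten copies in view is to register every system that holds data and have erasure report an outcome per system, so a surviving backup copy shows up as an open item rather than disappearing from sight. The sketch below uses in-memory stand-ins; Store, erase_everywhere, outstanding, and the system names are hypothetical, and a real implementation would call each system's own deletion mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Stand-in for one system that holds copies (hypothetical, in-memory)."""
    name: str
    rows: dict = field(default_factory=dict)  # subject_id -> payload
    deletable: bool = True                    # False models an immutable backup

    def erase(self, subject_id: str) -> str:
        if subject_id not in self.rows:
            return "absent"
        if not self.deletable:
            return "pending"  # copy survives until the backup itself expires
        del self.rows[subject_id]
        return "erased"


def erase_everywhere(subject_id: str, stores: list[Store]) -> dict[str, str]:
    """Attempt erasure in every registered store and report each outcome."""
    return {s.name: s.erase(subject_id) for s in stores}


def outstanding(result: dict[str, str]) -> list[str]:
    """Systems where the data is not yet gone and needs follow-up."""
    return [name for name, status in result.items() if status == "pending"]


stores = [
    Store("production_db", {"u42": "profile"}),
    Store("analytics_warehouse", {"u42": "events"}),
    Store("crm", {}),
    Store("nightly_backup", {"u42": "snapshot"}, deletable=False),
]
result = erase_everywhere("u42", stores)
print(result)
print(outstanding(result))  # ['nightly_backup']
```

The "pending" status is the point of the design: it forces the backup lifecycle into the record, with a date by which that copy will age out, instead of letting production-only deletion pass as complete.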

THE ARCHIVE THAT COULDN’T DELETE - DEUTSCHE WOHNEN (BERLIN DPA, 2019)

A German real estate company was fined €14.5 million - not for a breach, not for misusing data, but for keeping it. The regulator found that the company’s archive system had no functionality to remove tenant data that was no longer needed, so years-old records - pay slips, tax data, bank statements, health insurance details - sat there indefinitely, long past any purpose. The authority had warned the company to fix the architecture in 2017; by 2019 it still could not demonstrate a lawful state of storage. The violation was the design itself: a system built to accumulate, with no mechanism to forget. The lesson is pointed - storage limitation isn’t satisfied by a policy document. It has to be built into how systems actually work, or the policy is fiction.

Delete, Anonymise, or Justify - Pick One Per Dataset

When a retention period ends, data has three legitimate destinations and exactly one illegitimate one. It can be deleted - put beyond use across every system. It can be anonymised - stripped of anything that could identify a person, at which point it falls outside the regulation entirely and can be kept freely for analytics or statistics, provided the anonymisation is genuine and irreversible rather than a thin pseudonymisation a join could undo. Or its continued retention can be justified by a specific, documented reason that still applies. The illegitimate fourth option is the common one: keeping it with no decision at all. Every dataset past its period should be able to point to which of the three roads it took, and why.
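
The difference between pseudonymisation and anonymisation is easy to demonstrate. In the sketch below, a dataset is "anonymised" by replacing emails with unsalted SHA-256 digests, and then re-identified by anyone who holds a list of candidate emails. The data, the field names, and the pseudonymise helper are invented for illustration.

```python
import hashlib
from collections import Counter


def pseudonymise(email: str) -> str:
    """Replace an email with an unsalted SHA-256 digest (pseudonymisation only)."""
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


# Hypothetical dataset that has been "anonymised" by hashing the identifier.
events = [
    {"user": pseudonymise("ana@example.com"), "plan": "pro"},
    {"user": pseudonymise("ben@example.com"), "plan": "free"},
    {"user": pseudonymise("ana@example.com"), "plan": "pro"},
]

# Anyone holding a list of candidate emails can rebuild the mapping with a join.
known_emails = ["ana@example.com", "ben@example.com", "cy@example.com"]
lookup = {pseudonymise(e): e for e in known_emails}
reidentified = [lookup.get(row["user"]) for row in events]
print(reidentified)

# Dropping the identifier and keeping only aggregates removes the join key.
plan_counts = Counter(row["plan"] for row in events)
print(dict(plan_counts))  # {'pro': 2, 'free': 1}
```

Every row is recovered, which is why hashed identifiers remain personal data. The aggregate at the end is closer to the anonymised road, though it is not a sufficiency test: very small counts or rare combinations of attributes can still single a person out.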

The Regulators Are Now Looking Specifically Here

Retention used to be the quiet principle nobody enforced - the headline fines were for breaches and consent. That has changed. In their 2025 coordinated enforcement action, Europe’s data protection authorities turned their collective attention to erasure and retention specifically, with thirty-two regulators auditing 764 organisations across the continent on how they actually delete data. The recurring weaknesses they found were structural and familiar: no systematic classification of what data exists, and no automated mechanism to act when a retention period ends. The signal is simple - “we keep things tidy” is no longer a defensible posture. Retention is being inspected as a discipline in its own right, on the assumption that an organisation should be able to show, not assert, that it deletes.

Hoarding vs. Governed Retention

Across real retention reviews, the gap between hoarding by default and retention by design tends to follow the same shape:

Looks like retention	Is actually governed retention
✗ “We keep data as long as necessary”	✓ A defined period per data category, tied to a purpose
✗ Periods on paper, nothing enforcing them	✓ An automated trigger that fires when the clock ends
✗ Deleted in production only	✓ Put beyond use across live systems, backups, processors
✗ “Anonymised” by hiding a name	✓ Genuinely irreversible anonymisation, or honest deletion
✗ Old data kept in case it’s useful someday	✓ A decision - keep, anonymise, or delete - for each dataset
✗ Deletion assumed, never evidenced	✓ A log proving what left, when, and why
✗ Backups treated as out of scope	✓ Backup lifecycle built into the retention schedule

Build the Schedule on the Map, Automate the Trigger

A retention programme that works rests on two things an organisation should already have or build. The first is the data map: you cannot set a retention clock on data you can’t locate, so the inventory comes first. The second is automation: a period that depends on someone remembering to run a deletion job is a period that will be missed. The schedule defines, per category, how long and on what basis; the system enforces it without a human in the loop; and a deletion log captures the proof. Anonymisation is the pressure valve for data with genuine analytical value past its identifiable life. Everything else goes - on schedule, everywhere at once - and the organisation can show it.
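
A deletion log has to prove that data left without itself becoming a new store of personal data. A minimal sketch, assuming the log records only the category, system, count, and reason: each entry is chained to the previous one by hash so later edits are detectable. The hash chain is an illustrative design choice, not something the regulation prescribes, and append_entry and verify are hypothetical names.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    """Stable SHA-256 over an entry's fields."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, category: str, system: str, count: int,
                 reason: str, when: datetime) -> dict:
    """Append a deletion record chained to the previous entry's hash."""
    body = {
        "when": when.isoformat(),
        "category": category,  # what kind of data was deleted, not the data itself
        "system": system,
        "count": count,
        "reason": reason,
        "prev": log[-1]["hash"] if log else "0" * 64,
    }
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash and link; False if an entry was altered or the chain is broken."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


log = []
ts = datetime(2025, 1, 31, 2, 0, tzinfo=timezone.utc)
append_entry(log, "support_ticket", "production_db", 1280, "retention period ended", ts)
append_entry(log, "support_ticket", "analytics_warehouse", 1280, "retention period ended", ts)
print(verify(log))   # True
log[0]["count"] = 0  # simulate tampering with the evidence
print(verify(log))   # False
```

A chain like this detects edits and removed middle entries, but not truncation of the most recent ones, so the latest hash would need to be recorded somewhere independent. One entry per system also makes it visible when a category was purged in production but never in the warehouse.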

Final Thought

Keeping data feels prudent and is usually the opposite. Every record held past its purpose is something that can be stolen, must be disclosed on request, and may be found by a regulator who now looks specifically for it. Deletion is unglamorous, effortful, and the clearest signal a privacy programme has matured past policy into operation - because it’s the one obligation an organisation can’t fake. The data is either gone or it isn’t.

The test: pick any category of personal data and answer three things without a project - how long is it kept and on what basis, what fires when that period ends, and could the deletion be proven across every system six months from now. If the honest answer is “most things stay and almost nothing is deleted,” the next audit already has its finding.
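
The three questions can be run as a check over a data inventory. The sketch below is one hypothetical shape for that: the field names (period_days, basis, trigger, systems, evidenced_systems) and the sample inventory are assumptions, not a standard format.

```python
def retention_gaps(category: dict) -> list[str]:
    """List which of the three questions a data category cannot answer."""
    gaps = []
    if not (category.get("period_days") and category.get("basis")):
        gaps.append("no defined period and basis")
    if not category.get("trigger"):
        gaps.append("nothing fires when the period ends")
    systems = category.get("systems", [])
    evidenced = set(category.get("evidenced_systems", []))
    missing = [s for s in systems if s not in evidenced]
    if not systems or missing:
        where = ", ".join(missing) or "unknown systems"
        gaps.append("deletion not provable in: " + where)
    return gaps


# Hypothetical inventory entries for two data categories.
inventory = {
    "support_ticket": {
        "period_days": 730,
        "basis": "contract",
        "trigger": "nightly purge job",
        "systems": ["production_db", "backups"],
        "evidenced_systems": ["production_db"],
    },
    "newsletter_signup": {"systems": ["email_platform"]},
}

for name, entry in inventory.items():
    print(name, retention_gaps(entry))
```

In this sample the support tickets fail only on backups, while the newsletter sign-ups fail all three questions. An empty list for every category is the state the test describes.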

Frequently Asked Questions

Does GDPR set a fixed retention period?

No. GDPR sets no universal retention period. The storage-limitation principle (Article 5(1)(e)) says you keep personal data no longer than the purpose requires — the purpose, and any specific legal obligation, set the clock.

Does a deletion request include backups?

Yes. Deletion has to reach live systems, backups, exports, and processors. Deleting in the production database while the data lives on in backups and vendor copies is a common and serious failure.

What is storage limitation?

The GDPR principle that personal data must be kept in a form that permits identification for no longer than is necessary for the purposes for which it is processed.

How do I build a retention schedule?

Bottom-up: a clock per data category, tied to a stated purpose and legal basis, with an automatic trigger that fires when the period ends and coverage that reaches every system, backup, and processor.