Think about the last form you filled out online. Now picture everything your product quietly picked up around the fields you actually typed into: the IP address, the device model, the timestamp, a handful of cookie IDs, maybe the rough city you were in. Under GDPR, most of that is personal data too. Not just the name and email, but the passive exhaust your browser left behind. This is the single biggest misunderstanding founders bring into a compliance conversation.
The Definition, Plainly
Article 4(1) of GDPR defines personal data as any information relating to an identified or identifiable natural person. Two terms carry almost all the weight: identifiable and relating to.
"Identifiable" does not mean the data names someone directly. It means someone could be singled out by using that data, alone or combined with other information you reasonably have access to. The regulator's question is never "does this piece of data name a human?" It is "could a human be picked out of a crowd using this?"
The "That Is Personal Data?" Table
| Looks Harmless | Actually Personal Data? |
|---|---|
| Dynamic IP address in server logs | Yes. The CJEU ruled on this in Breyer v. Germany (2016). If you can combine the IP with ISP data to identify a person, it is personal data. |
| Cookie ID used for analytics | Yes. Online identifiers are called out explicitly in Recital 30 of GDPR. |
| Mobile advertising ID (IDFA, GAID) | Yes. Persistent identifier that maps to one device and usually one human. |
| Licence plate in dashcam footage | Yes. Identifiers that resolve to a person via a public register still qualify. |
| Voice recording from a support call | Yes. A recorded voice relates to an identifiable speaker. If it is processed to uniquely identify someone (a voiceprint), it becomes biometric data and can be special category data. |
| Pseudonymised customer ID (user_a7f9c2) | Still personal data. Pseudonymisation reduces risk; it does not remove you from GDPR scope. |
| Photo uploaded to your signup flow | Yes. A photo of a face is personal data, and it becomes biometric data the moment it is processed to uniquely identify someone. |
| Work email: jdoe@acme.com | Yes. B2B personal data is still personal data. This is the most common founder blind spot. |
The Combination Trap: Where Fields Become Personal
A postal code (PIN code) on its own is not personal data. A browser fingerprint on its own is not personal data. A purchase amount on its own is not personal data. But when your analytics warehouse joins all three on a session ID (which most do, by design), the combined record points to one person. The moment that join happens, the whole record becomes personal data. Regulators do not evaluate fields in isolation; they evaluate what you can reconstruct by joining them.
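To make the join concrete, here is a minimal Python sketch. The three per-session tables, their names, and the sample values are hypothetical and chosen only for illustration; the point is that fields which each match several sessions can still single out one session once they are joined.

```python
from collections import Counter

# Hypothetical per-session tables, each keyed by session ID.
postal_codes = {"s1": "560001", "s2": "560001", "s3": "110001", "s4": "110001"}
fingerprints = {"s1": "fp_a", "s2": "fp_b", "s3": "fp_a", "s4": "fp_b"}
purchase_amounts = {"s1": 499, "s2": 1299, "s3": 1299, "s4": 499}


def join_on_session(*tables):
    """Join tables on session ID, keeping sessions present in every table."""
    shared = set(tables[0]).intersection(*tables[1:])
    return {sid: tuple(t[sid] for t in tables) for sid in sorted(shared)}


def singled_out(values_by_session):
    """Return session IDs whose value is shared with no other session."""
    counts = Counter(values_by_session.values())
    return [sid for sid, value in values_by_session.items() if counts[value] == 1]


# Each field checked alone: every value is shared by two sessions.
alone = {
    "postal_code": singled_out(postal_codes),
    "fingerprint": singled_out(fingerprints),
    "purchase_amount": singled_out(purchase_amounts),
}

# The same fields after the join: every combination is unique.
joined = join_on_session(postal_codes, fingerprints, purchase_amounts)
unique_after_join = singled_out(joined)

print(alone)
print(unique_after_join)
```

In this toy data no single field isolates a session, yet after the join all four sessions are unique. Running a similar uniqueness count over your own joined tables is a quick way to see how much of the warehouse singles people out.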
Special Category Data: The Sharper Edge
A narrow slice of personal data carries extra obligations under Article 9. Processing it needs an Article 9(2) condition on top of the usual Article 6 legal basis, and usually a Data Protection Impact Assessment:
- Health and medical records: anything from a doctor's note to a fitness tracker step count, if tied to a person
- Biometric data: face scans, voiceprints, fingerprints, when used to identify someone
- Genetic data: including saliva tests and ancestry results
- Racial or ethnic origin: including information inferred from photographs
- Political opinions, religious beliefs, philosophical beliefs
- Trade union membership: often caught in HR platforms
- Sex life and sexual orientation: including data inferable from dating apps or ad targeting segments
If your product touches any of these, even as a side effect, your risk posture changes materially. A fitness app logging heart rate is in Article 9 territory. A recruitment tool that extracts photos from CVs and runs face matching on them is doing biometric processing. These are not edge cases; they are common product features whose compliance footprint gets underestimated.
"But We Anonymised It": Did You, Though?
Anonymous means the data cannot be linked back to a person by anyone, using any reasonably available means. Truly anonymous data is out of GDPR scope entirely. It is also hard to achieve.
Pseudonymous means you have replaced direct identifiers with tokens, but the mapping back to real people still exists somewhere: in a separate table, a key management service, or a vendor system. Pseudonymisation is a good security practice. It is not anonymity. Pseudonymous data is still fully inside GDPR.
If your engineering team can, given a week, rebuild the link between the dataset and the original user, it is pseudonymous. If nobody on earth can rebuild that link, including you, it is anonymous. Almost every "anonymised" production dataset we have audited turns out to be the first kind.
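A minimal Python sketch of why keyed tokens are pseudonymous rather than anonymous. It assumes a hypothetical tokenisation scheme (HMAC-SHA256 over the email, a made-up key, a `user_` prefix); none of this is taken from a specific product. Anyone who holds the key and a list of candidate emails can rebuild the link.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it might sit in a key management service.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(email: str, key: bytes = SECRET_KEY) -> str:
    """Replace an email with a keyed token (HMAC-SHA256, truncated to 12 hex chars)."""
    digest = hmac.new(key, email.lower().encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


def relink(token: str, candidate_emails, key: bytes = SECRET_KEY):
    """Rebuild the link by re-tokenising known emails and comparing the results."""
    for email in candidate_emails:
        if hmac.compare_digest(pseudonymise(email, key), token):
            return email
    return None


token = pseudonymise("jdoe@acme.com")
recovered = relink(token, ["asmith@acme.com", "jdoe@acme.com"])

print(token)      # the email no longer appears in the dataset...
print(recovered)  # ...but the key holder can still recover it
```

As long as the key, or a lookup table that serves the same purpose, exists anywhere, the tokenised dataset stays on the pseudonymous side of the line described above.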
A Quick Test to Run on Your Own Product
Pick any table in your production database. Ask four questions about a single row:
- Does it contain something that points to one specific human: by name, by ID, by a persistent identifier?
- Even if no single field does, could you combine two or three fields in this table, or across tables you operate, to pick out one person?
- Could someone else who got a copy of this row (a vendor, a contractor, an attacker) combine it with information they might reasonably have, and identify the person?
- Is any of the content special category: health, biometric, religious, political, sexual orientation, ethnic origin?
If the answer to any of the first three is yes, that row is personal data under GDPR. If the fourth is yes, you have heightened obligations. Most product schemas fail at least the first two questions on most tables, and that is the honest starting point for building a real compliance programme. For a comparison of how GDPR personal data scope differs from India's DPDP Act, read our What Counts as Personal Data Under DPDP Act guide.
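The decision rule behind the four questions can be written down as a short Python sketch. The class and field names are hypothetical, and the yes/no answers still have to come from a person reviewing the row; the code only records how the answers combine.

```python
from dataclasses import dataclass


@dataclass
class RowAnswers:
    """Yes/no answers to the four questions, filled in by a reviewer for one row."""
    points_to_one_human: bool          # Question 1
    identifiable_by_combining: bool    # Question 2
    identifiable_by_third_party: bool  # Question 3
    special_category: bool             # Question 4


def assess(answers: RowAnswers) -> dict:
    """Combine the answers: any yes on questions 1-3 means personal data."""
    personal = (
        answers.points_to_one_human
        or answers.identifiable_by_combining
        or answers.identifiable_by_third_party
    )
    return {
        "personal_data": personal,
        "heightened_obligations": answers.special_category,
    }


# Hypothetical analytics row: no name or ID, but fields that join up to one person.
events_row = RowAnswers(
    points_to_one_human=False,
    identifiable_by_combining=True,
    identifiable_by_third_party=False,
    special_category=False,
)
result = assess(events_row)
print(result)
```

Keeping the answers per table in a structure like this gives you a simple inventory to revisit whenever the schema changes.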
Frequently Asked Questions
Is an IP address personal data under GDPR?
Yes, in most practical contexts. The Court of Justice of the European Union ruled in Breyer v. Germany (2016) that even dynamic IP addresses can be personal data when the entity holding them has the legal means to identify the individual, for example by combining the IP with ISP records or user account data.
Is a work email address personal data?
Yes. A work email address identifies a specific natural person and is personal data under GDPR. The fact that it is professional or issued by an employer does not change this. Treating B2B contact data as exempt is one of the most common misconceptions behind GDPR enforcement action.
Is pseudonymised data still personal data under GDPR?
Yes. Pseudonymisation replaces direct identifiers with tokens, but the mapping back to real people still exists. It is a good security practice that reduces risk, but it does not remove the data from GDPR scope. Only truly irreversible anonymisation, where no one can reconstruct the link, takes data outside GDPR.
What is the difference between anonymous and pseudonymous data?
Anonymous data cannot be linked back to a person by anyone using any reasonably available means, so it is outside GDPR scope entirely. Pseudonymous data has had direct identifiers replaced with tokens, but the reverse mapping exists somewhere. The test: if your team can rebuild the link given a week of effort, it is pseudonymous and still GDPR-covered.
Is inferred or derived data personal data?
Yes. Inferred or derived data (credit scores, predicted churn risk, health risk assessments) is personal data if it relates to an identifiable individual. It does not matter that you generated it rather than collected it directly. All GDPR obligations, including access, rectification, and erasure rights, apply to inferred data.