Pseudonymisation & “means reasonably likely to be used” for identification: when does data become personal?

As I was in a meeting when the European Data Protection Board opened registration for its pseudonymisation stakeholder event of 12 December 2025, I missed the short (approx. 1h) registration window and they placed me on a waiting list instead – a pity given my frequent interventions on the EU Court of Justice’s SRB judgment of 4 September 2025 and on the issue of what is “personal data”.

So here are some practical suggestions for anyone who will attend. Perhaps the EDPB will kindly take them into account too?

A. Context: from Pseudonymisation Guidelines to SRB

The general context is that of the European Data Protection Board’s Pseudonymisation Guidelines, a first version of which was adopted in January 2025 and which was open to public consultation. I shared some preliminary thoughts on the topic, and was pleased to be able to contribute to comments of several trade associations on the issue.

After the EU Court of Justice’s SRB judgment came out (see my lengthy analysis, including on broader consequences and open questions), some of us pointed out that the Pseudonymisation Guidelines included wording that conflicted with the Court of Justice’s reasoning. The contradictions were not so surprising, given that the EDPB had already come to the aid of the European Data Protection Supervisor, one of the parties in the case, in order to defend an “absolute” notion of personal data (as shown by the judgment’s own wording), and given that the SRB judgment rejected this view.

While the EDPB was publicly promoting pseudonymisation further to the SRB judgment, it appears to have also understood that its Pseudonymisation Guidelines could not remain unchanged. Hence the stakeholder event of 12 December 2025, to gather input on how SRB (+ other cases such as Scania and OLAF) and pseudonymisation work together and the consequences for the concept of “personal data”.

B. A note on terminology

First, the questions (listed below) all include the word “controller”, but some questions appear to be using this term incorrectly (see notably the first sentence of question 3).

In my view, it is critical to distinguish between three main parties:

  • The pseudonymising controller, i.e. the person/entity who is the initial controller and pseudonymises the data (or has a processor pseudonymise them);
  • The initial recipient of pseudonymised data;
  • The subsequent recipient of pseudonymised data, which receives the pseudonymised data from the initial recipient.

The questions concern the issue of whether the pseudonymised data should be viewed as “personal data” from the perspective of the initial recipient or of the subsequent recipient, and whether this should make them a controller. The initial recipient can be a recipient or a transmitting party depending on the scenario and depending on the (part of a given) question.

The EDPB should clearly be careful when updating its pseudonymisation guidelines, so that it does not use the word “controller” in relation to a party who is not yet a controller.

C. The EDPB’s questions

The questions themselves are convoluted and overlap quite a bit, which is why I will treat some of the key topics independently of any particular question (with some notable exceptions). I am copying them for the sake of completeness.

Question 1:

According to the Court, the relevant perspective for assessing identifiability depends, in essence, on the circumstances of each individual case. Based on your experience, what are the use cases where further guidance could be beneficial regarding the contextual assessment of the relevant perspective(s)?

Further, are there any specific GDPR provisions which pose particular challenges for this assessment? For example, what open questions remain in practice considering different roles in processing, e.g. controller-processor relationship, joint controllership?

Question 2:

According to the case law, a controller may need to assess the means of identification available through a transmission of the data in question to third parties. In relation to this, the data could possibly change its nature (e.g. data considered anonymous could become personal) due to (potential) transmissions between different parties, which may also have consequences for the initial controller.

Which types of use cases (e.g. connected with third country transfers, publication to the general public) present practical challenges to ascertain the presence or absence of means of indirect identification?

Which kind of measures could controllers take to recognise the presence of such means?

Question 3:

The Court emphasised the restriction of the analysis to means reasonably likely to be used by the controller or another person. Circumstances determine which means are ‘reasonably likely’ to be used.

What kind of measures can a controller implement to limit the means ‘reasonably likely’ to be used?

How can this be done in the case of subsequent transmissions by intended recipients to third parties who may be able to identify the data subject?

Question 4:

In your experience, in which use cases would a controller processing data that has undergone pseudonymisation (pseudonymised data) have problems in deciding whether they are personal for a given recipient?

What would be technical and organisational means that the pseudonymising controller could apply and that a recipient could not lift?

D. Commentary on the questions and relevant response elements

D.1. “Personal data” is a relative concept that depends on the context

It is clear from Recital 26 of the GDPR, paragraphs 42 to 48 of the Breyer judgment of the Court of Justice of the European Union (CJEU), and paragraphs 77, 81-84 and 100-111 of the SRB judgment that the issue of whether information constitutes “personal data” is a matter of perspective and context. And as the CJEU put it in SRB, “the relevant perspective for assessing whether the data subject is identifiable depends, in essence, on the circumstances of the processing of the data in each individual case” (para. 100). The issue of identifiability – which is the key element to transform information into personal data – is therefore a relative one, one that depends on the context.

Recital 26 of the GDPR illustrates this well, stating that identifiability – and in particular, determining whether means are “reasonably likely to be used” to identify a natural person – requires taking into account “all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments”. If excessive time or money is needed to link information to a natural person, that information might therefore have to be viewed as falling outside the scope of “personal data”.

Some are now claiming that this – and the Digital Omnibus proposal, despite the fact that in practice it echoes case law – could be misinterpreted to allow for a purely “subjective” approach, where it might be sufficient not to look too closely at the data to avoid the GDPR applying. This is obviously not what the case law (or the Digital Omnibus) provides, nor has this to my knowledge been claimed by anyone.

Recital 26 explicitly talks about “objective” factors, and anyone I have seen supporting the relative approach is always keen to emphasise the reasonableness to which Recital 26 refers when assessing identification measures, in combination with Article 11(1) of the GDPR (which states explicitly that there is no obligation for a (potential) controller to “maintain, acquire or process additional information in order to identify the data subject for the sole purpose of complying with [the GDPR]”).

To answer Question 1, are there any particular use cases that require “further guidance”, and are there any “open questions”?

There has been quite a bit of debate on the issue of whether processors can benefit from SRB’s GDPR exclusion if the processor is unable to get to identifying data, but I’m not sure whether this is something the EDPB should weigh in on at this stage (beyond the question as to whether it forms part of the mandate that was given internally last year within the EDPB for the drafting of the pseudonymisation guidelines). While on joint controllership the CJEU has given so much input over the past few years that it might be possible to take a clear position, on the issue of processorship the CJEU has left a lot of the issues surrounding the scope of “authority” of the controller untouched. It might be better for the EDPB to wait for the issue to be tackled by the CJEU before it takes a stand on the nature of the controller-processor relationship.

D.2. “Potential” transmissions or only actual transmissions?

Question 2 states the following:

the data could possibly change its nature (e.g. data considered anonymous could become personal) due to (potential) transmissions between different parties, which may also have consequences for the initial controller.

Yet that word between brackets – potential – is a fabrication by the EDPB, one that is not found in the case law of the CJEU.

The very idea that transmissions to a third party can change the nature of data was examined in the Scania judgment, in which the CJEU held as follows:

“49. […] where independent operators may reasonably have at their disposal the means enabling them to link a VIN [= vehicle identification number] to an identified or identifiable natural person, which it is for the referring court to determine, that VIN constitutes personal data for them, within the meaning of Article 4(1) of the GDPR, and, indirectly, for the vehicle manufacturers making it available, even if the VIN is not, in itself, personal data for them, and is not personal data for them in particular where the vehicle to which the VIN has been assigned does not belong to a natural person.” (emphasis mine)

Basically, the vehicle manufacturer needs to actually make information available to a third party before that third party’s identification capabilities become relevant.

This is also apparent from the SRB judgment in which the CJEU stated, by reference to Scania, that “data which are in themselves impersonal may become ‘personal’ in nature where the controller puts them at the disposal of other persons who have means reasonably likely to enable the data subject to be identified. It is apparent, in particular, from the [Scania] judgment that – where those data are put at their disposal – those data are personal data both for those persons and, indirectly, for the controller” (para. 84 – emphasis mine). The choice of wording differs slightly from one linguistic version to another, but is always linked to the specific context of an actual transmission (the French version reads “dans le contexte d’une telle mise à disposition” – i.e. in the context of such placing at the disposal, wording that is also found in e.g. the Dutch and Spanish versions).

Even the OLAF judgment (OC v Commission, C‑479/22 P), which is often quoted as regards identifiability, makes it clear in its paragraphs 57 & 58 that actual recipients have to be taken into account:

“57 In the particular case of a press release issued by an investigating authority in order to inform the public about the outcome of an investigation, that press release is, by its very nature, intended for journalists in particular, with the result that they cannot be distinguished from an ‘average reader’, to whom paragraph 76 of the judgment under appeal refers.

58 However, the fact that an investigating journalist has, as in the present case, disseminated the identity of a person who is the subject of a press release cannot, alone, lead to the conclusion that the information contained in that press release must necessarily be classified as personal data within the meaning of Article 3, point 1, of Regulation 2018/1725 and exempt from the obligation to examine whether the person in question is identifiable.”

Put differently, “potential” transmissions are irrelevant; only actual transmissions matter. Question 2 is in this respect unfortunately worded – one may wonder whether it betrays a potential intention of the EDPB to look for hypothetical transmissions to justify a broader notion of “personal data”.

D.3. “Means reasonably likely to be used”: principles & burden of proof

The issue of what constitute “means reasonably likely to be used” to identify a natural person (according to Recital 26 of the GDPR, and Recital 26 of Directive 95/46/EC before that) has been examined in case law, in particular in the Breyer judgment and in the SRB judgment.

According to the CJEU in Breyer, to qualify as “reasonable”, the means used for identification must be lawful and proportionate: “whether the possibility to combine a dynamic IP address with the additional data held by the internet service provider constitutes a means likely reasonably to be used to identify the data subject” (paragraph 45) would “not be the case if the identification of the data subject was prohibited by law or practically impossible on account of the fact that it requires a disproportionate effort in terms of time, cost and man-power, so that the risk of identification appears in reality to be insignificant” (paragraph 46, emphasis mine). This was confirmed by the CJEU in OLAF (para. 51) and in SRB (para. 82).

Therefore, any means that are unlawful or disproportionate are not taken into account in order to assess identifiability. If such means are actually used, they may of course be taken into account when assessing the lawfulness of the processing, but they do not change the “ex ante” assessment of identifiability.

This means in practice that theoretical access to identifying information that cannot lawfully be processed under the GDPR is irrelevant to the assessment of whether information is personal data for a particular entity.

Who then bears the burden of proof? Should it be a question of proving that reasonable means exist (proof of a positive – i.e. identification is by default not possible, unless means are shown to exist), or that something precludes identification (refutable presumption of a positive – i.e. identification is by default possible, unless excluded)? The CJEU itself seems to be unsure.

The wording of Recital 26 of the GDPR is unambiguously in favour of proving reasonable means exist (i.e. proof of a positive), rather than the opposite:

“To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.”

This clearly says that we have to assess which means exist (= proof of a positive), not assume that those means exist and then prove otherwise (= refutable presumption).

In the Scania judgment, the CJEU held that non-personal data becomes personal data for an independent operator (as recipient of the non-personal data) “where independent operators may reasonably have at their disposal the means enabling them to link a VIN to an identified or identifiable natural person” (para. 49). The “may” here is ambiguous, but the other linguistic versions of the judgment make it clear that this should be read as “can” or “are capable of”, not “are authorised” or “might” (see e.g. French [“peuvent raisonnablement disposer”], Dutch [“redelijkerwijs kunnen beschikken”], Spanish [“pueden disponer razonablemente”]).

In SRB, the CJEU reaffirmed this: “data which are in themselves impersonal may become ‘personal’ in nature where the controller puts them at the disposal of other persons who have means reasonably likely to enable the data subject to be identified” (para. 84, emphasis mine). So far, it looks like proof of a positive.

Yet para. 85 of that same SRB judgment seems to veer towards a refutable presumption: “in so far as it cannot be ruled out that those third parties have means reasonably allowing them to attribute pseudonymised data to the data subject, such as cross-checking with other data at their disposal, the data subject must be regarded as identifiable as regards both that transfer and any subsequent processing of those data by those third parties”.

This to me is very awkward and not in line with previous case law or with the “positive” wording of Recital 26 GDPR, so I think that this particular sentence presupposes some sort of knowledge. The only way that this can be viewed as consistent with Recital 26 GDPR, para. 49 of Scania and para. 84 of SRB is if we consider that if there are particular indicators that there are means available, the burden of proof gets reversed.

Put differently, if a transmitting party reasonably considers (= on the basis of tangible, demonstrated factors and indications) that there are means available to the recipient, it is up to the recipient to rule out the applicability or relevance of such means.

Basically, it’s a part-way compromise: proof of a positive at first, but reversal to a refutable presumption once a sufficient degree of certainty of the existence of means is established.
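To make this two-step reading concrete, here is a minimal sketch in Python. It only illustrates the reading proposed above and is in no way a legal test: the names `Means`, `indicated`, `rebutted` and `is_identifiable_for_recipient` are hypothetical, and reducing lawfulness or proportionality to a simple yes/no flag is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Means:
    """A hypothetical means of identification, as seen by the transmitting party."""
    name: str
    lawful: bool            # not "prohibited by law" (Breyer, para. 46)
    proportionate: bool     # no "disproportionate effort" (Breyer, para. 46)
    indicated: bool         # tangible indications that the recipient has this means
    rebutted: bool = False  # recipient has shown the means is inapplicable or irrelevant


def is_identifiable_for_recipient(means: list[Means]) -> bool:
    """Step 1: proof of a positive. Step 2: refutable presumption."""
    for m in means:
        # Unlawful or disproportionate means are left out of the assessment.
        if not (m.lawful and m.proportionate):
            continue
        # Step 1: without tangible indications, nothing is presumed.
        if not m.indicated:
            continue
        # Step 2: the presumption stands unless the recipient refutes it.
        if not m.rebutted:
            return True
    return False
```

In this sketch, an empty list of means, a means without indications, or a means that the recipient has rebutted all lead to `False`; only a lawful, proportionate, indicated and unrebutted means leads to `True`.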

D.4. What precisely are those “means reasonably likely to be used” to identify a natural person?

It should be fairly obvious that if a pseudonym A1B2C3 is available and a recipient has a correlation table showing that “A1B2C3 = John Doe, living at 123 Big Boulevard in Major City and with phone number 123456789”, that person will be identified.
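The correlation-table scenario can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names `pseudonymise` and `reidentify` and the field names are hypothetical, the pseudonyms are simply random tokens, and the sketch deliberately ignores the separate question of whether the remaining attributes could allow indirect identification.

```python
import secrets


def pseudonymise(records, identifying_fields):
    """Return (rows to share, correlation table kept by the pseudonymising party)."""
    correlation_table = {}
    shared_rows = []
    for record in records:
        pseudonym = secrets.token_hex(4)
        while pseudonym in correlation_table:  # guard against an unlikely collision
            pseudonym = secrets.token_hex(4)
        # The identifying fields go into the table, not into the shared row.
        correlation_table[pseudonym] = {f: record[f] for f in identifying_fields}
        rest = {k: v for k, v in record.items() if k not in identifying_fields}
        shared_rows.append({"pseudonym": pseudonym, **rest})
    return shared_rows, correlation_table


def reidentify(row, correlation_table):
    """Return the identity behind a row, or None if the table does not cover it."""
    return correlation_table.get(row["pseudonym"])


records = [{"name": "John Doe", "phone": "123456789", "score": 7}]
rows, table = pseudonymise(records, ["name", "phone"])
print(reidentify(rows[0], table))  # holder of the table gets the identity back
print(reidentify(rows[0], {}))     # a party without the table gets None
```

The same row is thus attributable to John Doe for whoever holds the table, and not attributable (by this means at least) for whoever does not.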

Beyond that, things get a bit harder. Is an Internet Protocol (IP) address sufficient to identify a natural person? The Breyer case tells us no, because you first need to contact someone like an Internet Service Provider to get access to the identity – and even then, you need legal means to do so. Some now claim that this is irrelevant because databases exist that link an IP address to someone’s identity, but are those databases lawful? If not, they are irrelevant to this assessment (see section D.3 – the use of unlawful data can make processing unlawful, but the mere theoretical availability of unlawful data doesn’t make the pseudonymised data itself identifiable).

“But an IP address or a cookie identifier, even without being linked to the actual identity of a natural person, is surely enough if we can use it to display a particular ad to an individual,” some may say. Some (including some academics) have even taken to highlighting “singling out” as the key point of their reasoning: if a person can be singled out, that must be seen as sufficient to transform information into personal data.

Yet this is wrong – and that is where things get truly interesting about identifiability: singling out is not the same as identifying a natural person (and Recital 26 GDPR even makes that clear).

Identifying a natural person requires in essence three things: (i) the ability to attribute certain information to a natural person, (ii) the ability to distinguish that person from any other persons (i.e. so it is a specific natural person, not just any natural person) and (iii) that distinction must be of such a nature as to make it possible to act upon or in relation to such person.

Where do I find these conditions?

First, “identifiable” comes from Art. 4(1) GDPR, the definition of “personal data”, which says that “personal data” is information relating to an “identified” or “identifiable” natural person. Identifiable then literally means “capable of being identified”, and Art. 4(1) GDPR applies to information about a person who is actually identified as well as to information about a person who is capable of being identified.

Recital 26 GDPR then tells us that “Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person”.

So we recognise that pseudonymised data is about an “identifiable” natural person if it can be attributed to a natural person by way of certain types of “additional information”. Not just any natural person – a specific one, one that becomes identifiable by way of that additional information.

Next, Recital 26 GDPR continues by stating that “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly”. “Means” is an interesting word in English, as it is both singular and plural, but “singling out” is clearly a means of getting to identification – yet not necessarily the means of getting to identification.

Statistics make this very clear. In many statistics, outliers can be singled out. For instance, if I receive a table of responses from someone else about self-reported responses to “how high can you jump?”, respondent 34 might have reported being capable of jumping 2m high while everyone else is in the 0.5m-1.5m range. Yet this does not mean that I can do anything about respondent 34. I have no means of actually identifying who respondent 34 is or doing anything about respondent 34. So yes, I can distinguish respondent 34 from the others (= singling out), and I know from the context that respondent 34 is a specific person, but I cannot do anything about that person. It would be excessive to consider singling out as always leading to personal data, as that would effectively mean that many situations in which there is no possible action in relation to an individual become covered.
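The jumping example can be reproduced with a small Python sketch. The figures and respondent numbers below are made up for illustration, the function name `single_out` is hypothetical, and the median-absolute-deviation rule is just one simple way among many to flag an outlier.

```python
from statistics import median


def single_out(responses, factor=3.0):
    """Return the respondent numbers whose answer lies far from the median.

    "Far" means more than `factor` times the median absolute deviation.
    """
    values = list(responses.values())
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    return [rid for rid, v in responses.items() if abs(v - centre) > factor * mad]


# Self-reported jump heights in metres, keyed by respondent number only.
responses = {12: 0.8, 17: 1.0, 23: 1.2, 34: 2.0, 41: 0.9, 56: 1.1}
print(single_out(responses))  # [34]
```

The output is a respondent number and nothing more: the table contains no name, contact detail or other handle, so the function distinguishes respondent 34 from the others without giving any way to reach or act upon that person.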

In other words, singling out can be insufficient to transform information into personal data. It can help, and it may come to be seen as a necessary condition, but it is not always a sufficient condition.

A recent judgment by the Polish Supreme Administrative Court provided a useful illustration of this, highlighting that “the phrase “can be identified” [= “identifiable”] should be understood […] not only as the possibility of referring specific information to a specific person, but as the possibility of identifying that person, understood as actually distinguishing them from other persons”. This judgment reinforces the notion that identifiability requires more than a theoretical possibility – it requires the practical ability to distinguish and act upon a person.

Singling out is therefore not identification. It can be, but it is not necessarily. Singling out can help achieve one, perhaps even two of the conditions I suggest above, but it is not always enough for all three.

For instance, medical research conducted on biological samples might fail the third condition if there is no feedback loop – no mechanism by which the individual whose sample is used can be affected by or benefit from the research on a personal level. Similarly, displaying an advertisement to a particular IP address might fail the first or second condition depending on whether it is possible to distinguish and act upon a specific person.

D.5. What must a transmitting party do in assessing means available to a recipient?

As highlighted above, information is not personal data if there are no lawful and proportionate means of identifying the natural person to whom that information might relate.

The key question is then not whether a recipient has any means whatsoever at its disposal to enable identification but whether the transmitting party can reasonably know or foresee that this is the case.

As explained above, the burden of proof depends on indications of means, not the exclusion of any possibility of identification. This means that the transmitting party cannot be expected to guarantee that no identification will ever occur, only to assess whether it appears reasonably likely to occur.

The transmitting party must therefore act based on reasonable awareness, considering objective factors (e.g. role of the recipient, nature of the data, foreseeable lawful and proportionate means).

The transmitting party’s perspective is however not the only one that matters, because of the burden of proof mechanism I explained above: if there are reasonable indications that means are likely to be used, the recipient can still overrule that by showing that the means in question are inapplicable or do not lead to identification in relation to the information actually transmitted.

One example I have given a few times is that of telephone numbers and telecom providers. A phone number can be linked by a telecom provider to a customer’s identity – but only if that phone number is actually one managed by that telecom provider. The phone numbers linked to other telecom providers are not ones that this particular telecom provider can readily identify. For some of those phone numbers linked to other telecom providers, some information may be available in phone directories – but even then not every number and identity is listed. So while a transmitting party sharing phone numbers and other information with a telecom provider can reasonably assume based on the relevant circumstances that the recipient, as telecom provider, might have means at its disposal to identify the natural person, the telecom provider can show that this is not the case.

Put differently, the reasonable indications themselves (Recital 26 GDPR, para. 49 Scania, para. 84 SRB) are refutable (para. 85 SRB) in these specific circumstances.
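The telecom example can be sketched as follows in Python. Everything here is illustrative: the function name `split_by_identifiability`, the phone numbers and the customer names are invented, and a real provider's systems would obviously look nothing like two dictionaries.

```python
def split_by_identifiability(transmitted, own_customers, directory=None):
    """Split transmitted numbers into those the provider can link to a person and the rest.

    own_customers: numbers managed by this provider, mapped to customer identity.
    directory: optional public directory covering some numbers of other providers.
    """
    directory = directory or {}
    resolved, unresolved = {}, []
    for number in transmitted:
        identity = own_customers.get(number) or directory.get(number)
        if identity:
            resolved[number] = identity
        else:
            unresolved.append(number)
    return resolved, unresolved


own_customers = {"0470000001": "Customer A"}
directory = {"0480000002": "Listed person B"}
transmitted = ["0470000001", "0480000002", "0490000003"]

resolved, unresolved = split_by_identifiability(transmitted, own_customers, directory)
print(resolved)    # numbers the provider can actually link to someone
print(unresolved)  # numbers for which the presumed means turn out not to apply
```

The transmitting party sees a telecom provider and reasonably presumes means of identification; the provider can answer, number by number, that the presumed means do not cover part of what was actually transmitted.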

Can the transmitting party do anything else? Beyond confirmation by the recipient that presumed means are not available or relevant, any transmitting party (whether an actual controller or a mere recipient of pseudonymised data) can also take measures of its own to limit the likelihood of (re)identification by making such (re)identification “practically impossible” or “prohibited by law”, to quote Breyer (para. 46).

Examples of such measures will obviously include contractual measures (such as robust contractual clauses prohibiting the recipient from (re)identifying any person to whom the information might relate, requiring the recipient to stop processing that information if ever identification becomes possible, requiring notice in case of breach of the information, etc.), organisational measures (such as processes for the monitoring of vulnerabilities or leaks) and technical measures (such as encryption or any of the plethora of privacy-enhancing technologies available).

Yet as the GDPR only applies to the processing of personal data, as long as the information is not personal data the GDPR does not apply. This means that such measures are only voluntary from a GDPR perspective for the transmitting party if it is not an actual controller (note that some such measures may be required as part of compliance with other statutory frameworks, such as NIS2).

The voluntary nature of such measures does not signal a “free for all”, though. I expect that the failure to take any measures, even reasonable ones, might be held against the transmitting party if it transmits data to a third party that could foreseeably be expected to treat it as personal data and if that third party’s processing is unlawful (I assume many supervisory authorities would reach for the concept of joint controllership in this context).

E. Conclusion

In summary, this leads to the following findings:

  • “Personal data” is a relative concept that depends on the context;
  • Identifiability requires three conditions to be met: (i) the ability to attribute certain information to a natural person, (ii) the ability to distinguish that person from any other persons (i.e. so it is a specific natural person, not just any natural person) and (iii) that distinction must be of such a nature as to make it possible to act upon or in relation to such person;
  • When assessing whether information concerns an identifiable natural person, one should consider only lawful and proportionate means of identification;
  • Whether such means are “reasonable” depends also on whether they are foreseeable;
  • Singling out is an example of such means, yet while it can help attain identification, it is often insufficient in and of itself;
  • The burden of proof of the availability of means reasonably likely to be used to identify a data subject works in two steps: first, proof of a positive (the existence of means or at least of reasonable indications that such means exist), followed by the possibility to refute the resulting presumption of availability of means (whereby the existence of such means can be challenged, e.g. by the recipient);
  • Before assessing the impact of transmission to a third party on identifiability, one must limit the assessment to only actual recipients, not potential recipients;
  • Lack of identifiability in theory (whereby e.g. hypothetically available unlawful identifying data is not taken into account) does not mean that actual unlawful processing itself (e.g. based on actual unlawful processing of identifying data) falls outside of the scope of the GDPR;
  • There are means available to the transmitting party to limit the risk of (re)identification by a third-party recipient, such as contractual, organisational and technical measures, but such measures are at most voluntary if the transmitting party is not a controller.

This is just one point of view of course, but it is one built from case law and the law, not mere presuppositions. Hopefully it can inform those seeking to bring clarity to these concepts!

🫖

Did this analysis get you thinking? Reach out!

DataLaws.net is entirely open-access, and instead of getting your data in exchange for this content, how about another trade? If this commentary saved you research time or sparked an idea, feel free to invite me over for tea, chai or a hot chocolate next time you are around Brussels or Antwerp - or invite me over to your offices for a chat!
