Anonymous data: How “anonymous” are they really?
The question of which data, images, or information can be considered anonymous is one of the major data protection issues in privacy law, with an impact on every sector, including the Internet of Things and e-health.
In general, anonymous data can be defined as data that, by virtue of the method of collection, can never reasonably be connected with the person providing them. This can be accomplished through questionnaires that are returned by mail, questionnaires that are collected by one of a group of subjects and returned to the researcher, or internet surveys. The Article 29 Working Party, an advisory body to the European Commission on data protection matters, issued an opinion on anonymization techniques to provide guidance on the topic, identifying which practices convert identifiable data into anonymous data for privacy law purposes. According to the EU Data Protection Directive 95/46, in assessing whether a person is identifiable through the processed data, account should be taken of all the means “likely reasonably” to be used either by the controller or by any other person to identify that person.
In this respect, pseudonymised data can still be deemed personal data, because they can be connected to the individual they refer to by linking the pseudonym back to that individual’s name. Where randomization or generalization techniques are used, the answer is less straightforward and depends on the particulars of the case and the anonymization technique applied. A further issue is that anonymization techniques considered effective today might no longer be so in a couple of years as technology develops. Data protection obligations might therefore later become an issue for companies that assumed they had moved beyond those restrictions.
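The difference between pseudonymisation and generalization can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the records are invented, and the names `pseudonymise`, `generalise`, `relink`, and `SECRET_KEY` are hypothetical, not taken from the Working Party opinion or any particular library. It is not a recommended anonymization procedure, only a way to see why a pseudonym remains linkable to a person.

```python
import hashlib
import hmac

# Hypothetical records; all names and values are invented for illustration.
RECORDS = [
    {"name": "Alice Martin", "age": 34, "postcode": "67000", "diagnosis": "asthma"},
    {"name": "Bruno Keller", "age": 37, "postcode": "67100", "diagnosis": "diabetes"},
]

# Assumption: the controller keeps this key separately from the data set.
SECRET_KEY = b"example-key-held-by-the-controller"


def _token(name, key):
    """Derive a stable pseudonym from a name using a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def pseudonymise(record, key=SECRET_KEY):
    """Replace the name with a pseudonym; every other field stays intact."""
    out = {k: v for k, v in record.items() if k != "name"}
    out["pseudonym"] = _token(record["name"], key)
    return out


def generalise(record):
    """Drop the name and coarsen quasi-identifiers (age band, postcode prefix)."""
    low = (record["age"] // 10) * 10
    return {
        "age_band": f"{low}-{low + 9}",
        "postcode_prefix": record["postcode"][:2],
        "diagnosis": record["diagnosis"],
    }


def relink(pseudonymised, candidate_names, key=SECRET_KEY):
    """Whoever holds the key can recompute pseudonyms and recover the name."""
    for name in candidate_names:
        if _token(name, key) == pseudonymised["pseudonym"]:
            return name
    return None
```

In this sketch, `relink(pseudonymise(RECORDS[0]), ["Bruno Keller", "Alice Martin"])` recovers the original name, which is why pseudonymised data are still treated as personal data. With `generalise`, both sample records fall into the same age band and postcode prefix, but whether that is enough to make them anonymous still depends on the size of the data set and on what other information could be combined with it.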
In any case, according to the Article 29 Working Party, even if data protection laws do not apply to anonymous data, such data might still be subject to confidentiality obligations, and their storage must therefore be authorized by the individual to whom the data refer. Likewise, the use of anonymization techniques is itself deemed a data processing activity for the purposes of data protection laws; if it is not performed in compliance with privacy laws, it might be challenged and fined.
These types of anonymous, aggregate data sets can be incredibly valuable. Companies such as Google, Apple and INRIX are using smartphones and in-vehicle devices to map traffic patterns and how people move throughout cities in efforts to improve both commute times and urban planning. Social scientists accessing data from companies such as Google and Facebook could learn a lot about the intricacies of online behavior. And predictive analytics platforms such as Kaggle present an opportunity to optimize everything from business processes to health care.
One has to wonder, though, what types of policies and technologies will come about to keep data anonymous and available to the people who need it while still maintaining its utility. If true anonymization is really that difficult, perhaps the best bet is just to double down on security and try to ensure that valuable data — anonymous or not — doesn’t get into the wrong hands.
Master’s (M2) student in Digital Economy Law at UdS, lawyer in Greece and member of the Athens Bar.