The problems with de-anonymisation of data as a privacy protection
May 16, 2013
In the UK the Open Rights Group (ORG) has called for the new EU data protection laws, currently being worked on by EU law makers, to require consent for the sharing of anonymised data. The ORG made the recommendation after raising concerns about the practice of anonymisation. The genesis of the concern was the attempted sale of anonymised data by a mobile operator to the Metropolitan Police. See the BBC report EE defends user-data selling scheme following police interest, which provides:
Mobile operator EE has defended plans to sell its data, after a newspaper reported personal information was being offered to the Metropolitan Police.
Research company Ipsos Mori has an exclusive deal to sell on EE’s data, and has held talks with the force, according to the Sunday Times.
EE told the BBC the article was “misleading to say the least”.
The company said Ipsos Mori had access only to anonymised data grouped in samples of 50 people or more.
“We would never breach the trust our customers place in us and we always act to comply fully with the Data Protection Act,” a statement from EE said.
“The information is anonymised and aggregated, and cannot be used to identify the personal information of individual customers.”
The newspaper’s report said information about 27 million of EE’s customers was on offer – including their gender, age, postcode, the websites they visited, the time of day they sent texts and their location when making calls.
The Met Police confirmed to the BBC that they had held an “initial meeting” with Ipsos Mori to discuss how the data could be used to tackle crime, but added it “has made no offer to purchase data from Ipsos Mori nor has any intention of doing so”.
The force would not comment on whether it had made similar enquiries with other mobile operators.
In response to the story, Ipsos Mori – which is yet to fully finalise the terms of the deal with EE – told the BBC the data set was “not about individuals – it’s about behaviour”.
On its website, the research company outlined what powers it had:
- We can see the volume of people who have visited a website domain, but we cannot see the detail of individual visits, nor what information is entered on that domain
- We only ever report on aggregated groups of 50 or more customers
- Ipsos Mori only receives anonymised data without any personally identifiable information on an individual customer
- We do not have access to any names, personal address information, nor postcodes or phone numbers
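A floor of "aggregated groups of 50 or more customers" is weaker protection than it sounds. If an analyst can request counts for two overlapping groups that differ by a single person, subtracting one released figure from the other isolates that person, even though each figure individually covers 50+ people. The following is a minimal illustrative sketch of such a differencing attack; all of the data and the query interface are invented for illustration and are not EE's or Ipsos Mori's actual systems.

```python
# Illustrative "differencing attack" on aggregated reporting. Even if every
# released figure covers 50 or more people, two overlapping aggregates can
# be subtracted to isolate one individual. All data below is invented.

# Hypothetical per-customer data the operator holds (never released directly):
# whether each of 101 customers visited a sensitive website.
visited_site = {f"user{i}": (i % 7 == 0) for i in range(1, 102)}

def aggregate_count(user_ids, min_group=50):
    """Release a visit count only for groups of min_group or more customers."""
    if len(user_ids) < min_group:
        raise ValueError("group too small to report")
    return sum(visited_site[u] for u in user_ids)

everyone = list(visited_site)                                 # 101 customers
everyone_but_target = [u for u in everyone if u != "user7"]   # 100 customers

# Both queries satisfy the 50-person floor, yet their difference reveals
# whether user7 visited the site: 1 means visited, 0 means did not.
leak = aggregate_count(everyone) - aggregate_count(everyone_but_target)
print(leak)  # prints 1: user7's behaviour has been extracted from aggregates
```

Real aggregation schemes can mitigate this with query auditing or noise addition, but a bare group-size threshold alone does not rule it out.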
Monetisation of mobile-data intelligence is a major new revenue source for operators. Clues about a user’s location, and what they are interested in, are a potential goldmine for retailers looking to offer targeted advertising.
Other networks such as Vodafone and O2 also offer businesses the chance to capitalise on the personal information they hold on their customers.
“Aggregated, anonymised data based on analytics such as footfall and outdoor media tracking can enable an organisation to make informed decisions,” said Vodafone in a recent press release about services it offers.
Likewise, O2 offers “analytical insights” to retailers through parent company Telefonica, whose digital insights team – set up last year – promises “a digital headcount to help them understand the movement of crowds”.
“Retailers are quite good at measuring footfall inside their stores,” the company said.
“But this data will tell them where people go once they are outside, as well as their age and gender.”
Such schemes have attracted the concern of privacy rights campaigners – particularly at a time when debate over what access the government should have to private data is under scrutiny.
Last week, proposals for the Communications Data Bill – referred to by some as the Snoopers’ Charter – were left out of the Queen’s Speech.
The bill called for greater powers to investigate crime in cyberspace – but was opposed by the Lib Dems who said the measures went too far.
On news the Met Police was in contact with Ipsos Mori about mobile data, one privacy group told the BBC it was “alarmed”.
“There is no point in the government announcing that they don’t want a Snoopers’ Charter only to get a privatised one by the back door,” said Loz Kaye from the Pirate Party UK.
“Companies must start to realise that it is against their interests to treat their customers this way. Otherwise we just end up being commodities in a 21st Century data gold rush.”
The Data Protection Act requires that organisations ensure that personal data is processed fairly and lawfully and only collected for “one or more specified and lawful purposes and not further processed in any manner incompatible with those purposes”. The issue is whether data protection law applies to personal data that has been anonymised, and the position in the UK is that it does not.
The UK’s Information Commissioner has issued a code of practice (Anonymisation: managing data protection risk code of practice) which provides that data anonymisation techniques do not have to offer a 100% guarantee of individuals’ privacy in order for it to be lawful for organisations to disclose the information. Anonymised personal data can be disclosed even if there is a “remote” chance that the data can be matched with other information and lead to individuals being identified, it said. The Canadian Privacy Commissioner, in Dispelling the Myths Surrounding De-identification: Anonymization Remains a Strong Tool for Protecting Privacy, released in June 2011, argues that the issue of re-identification is overblown and the concern unnecessary. I think that view is unduly optimistic.
The re-identification of data is not a theoretical possibility; it is an actuality. The use of algorithms and other statistical and scientific means to de-anonymise data has been the subject of significant research, such as De-anonymizing Social Networks, Robust De-anonymization of Large Sparse Datasets, The ‘Re-Identification’ of Governor William Weld’s Medical Information: A Critical Re-Examination of Health Data Identification Risks and Privacy Protections, Then and Now and De-identified data and third party data mining: the risk of re-identification of personal information. I discussed this issue in my presentation “Managing Identity Online” at MIT 8.
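The basic mechanism behind much of this research is a linkage attack: an "anonymised" dataset has direct identifiers removed but retains quasi-identifiers (such as postcode, gender and birth year) that can be joined against a named public dataset, as in the Governor Weld re-identification. The sketch below illustrates the technique with invented data; the names, records and field choices are hypothetical, not drawn from any real dataset.

```python
# Illustrative linkage attack: re-identify "anonymised" records by joining
# them with a named public dataset on shared quasi-identifiers.
# All records below are invented for illustration.

# "Anonymised" medical records: names removed, quasi-identifiers retained.
anonymised = [
    {"postcode": "SW1A", "gender": "F", "birth_year": 1971, "diagnosis": "asthma"},
    {"postcode": "EC2M", "gender": "M", "birth_year": 1985, "diagnosis": "diabetes"},
    {"postcode": "SW1A", "gender": "M", "birth_year": 1971, "diagnosis": "hypertension"},
]

# A hypothetical public register (e.g. an electoral roll) holding names
# alongside the same quasi-identifiers.
public_register = [
    {"name": "Alice Example", "postcode": "SW1A", "gender": "F", "birth_year": 1971},
    {"name": "Bob Example", "postcode": "EC2M", "gender": "M", "birth_year": 1985},
]

def link(anon_rows, public_rows, keys=("postcode", "gender", "birth_year")):
    """Attach names to anonymised rows whose quasi-identifiers match uniquely."""
    matches = []
    for a in anon_rows:
        candidates = [p for p in public_rows
                      if all(p[k] == a[k] for k in keys)]
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches.append((candidates[0]["name"], a["diagnosis"]))
    return matches

print(link(anonymised, public_register))
# prints [('Alice Example', 'asthma'), ('Bob Example', 'diabetes')]
```

Two of the three "anonymous" records acquire names because their quasi-identifier combinations are unique in the register; only the record that is ambiguous within its postcode, gender and birth year stays unlinked. This is why the cited research treats removal of direct identifiers alone as weak protection.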
Too much faith is placed in de-identification of data as a definitive, or even an effective, means of privacy protection. Technology, in particular the use of advanced algorithms, is rendering the technique problematic. Privacy watchdogs seem to want to steer a pragmatic, business-friendly path through this issue. The problem is that technology has shown the UK Information Commissioner’s code of practice to be flawed in so far as it relates to de-anonymisation.