Information Commissioner completes inquiries into I-MED, Harrison.ai and Annalise.ai regarding allegations of sharing personal information. It found the information adequately de-identified
August 7, 2025
De-identification of personal information is critically important where data is being used for research. It has also been the subject of close scrutiny by regulators. The Victorian Information Commissioner produced a paper on the limits of de-identification after it found that Public Transport Victoria (PTV) breached myki users' privacy by releasing data, which PTV claimed to have de-identified, that exposed myki users' travel history. Academics from Melbourne University proved PTV wrong by identifying their own travel history and that of others. Apart from being a breach of the Privacy and Data Protection Act 2014 (Vic), it was embarrassing given the negative publicity. The federal Office of the Australian Information Commissioner has released general advice about de-identification.
On 19 September 2024 Crikey published "Australia's biggest medical imaging lab is training AI on its scan data. Patients have no idea." The nub of the article is that I-MED "handed over" scans of thousands of patients to a start-up company, Harrison.ai, which will use that data to train artificial intelligence. It posed the question of how the data could be legally used and disclosed to Harrison.ai. It made a number of valid points about the generally cavalier manner in which health organisations treat personal information.
The Privacy Commissioner responded with an investigation regarding the transfer of data. She has now closed that investigation and issued a report.
The key elements of the report are:
- paragraph 4.2, which sets out the usual two steps of de-identification: removing personal identifiers, and removing or altering other information which may allow a person to be identified;
- paragraph 5.1, the process adopted by I-MED, which involved:
  - segregating the patient data from the underlying dataset,
  - scanning the records with text recognition software,
  - using two hashing techniques (for unique identifiers such as patient ID numbers, and for names, addresses and phone numbers),
  - time-shifting dates (to a random date within a specified number of years),
  - aggregating certain fields into large cohorts to avoid identification of outliers, and
  - redacting any text that appears within, or within 10% of the boundary of, an image scan;
- paragraph 6.1, the appropriate de-identification practices identified by NIST, being:
  - utilising the 5-Safes Principles,
  - ensuring separation of the Annalise.ai and I-MED environments,
  - utilising a ‘Data Use Agreement Model’,
  - imposing prescriptive de-identification standards,
  - removing or transforming all direct identifiers, and
  - utilising top and bottom coding and aggregation of outliers;
- paragraph 6.2, which notes that while some personal information was provided to Annalise.ai, and therefore shared in error, due to failures in the de-identification process, this was remedied.
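For readers who want a feel for what some of these techniques look like in practice, here is a minimal Python sketch of three of them: hashing a direct identifier, time-shifting a date, and top and bottom coding an outlying value. It is an illustration only and is not I-MED's actual process, which the report describes only at a high level. The field names, the secret key, the two-year shift window, the age thresholds and the choice of a single keyed hash (HMAC-SHA256) with a per-patient date offset are all assumptions made for the example.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta

# Hypothetical secret key; in practice it would be held apart from the data.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymise(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def shift_date(original: date, patient_token: str, max_years: int = 2) -> date:
    """Shift a date by a pseudo-random offset within +/- max_years.

    Assumption: the offset is seeded from the patient token, so every record
    for one patient moves by the same amount and intervals are preserved.
    """
    rng = random.Random(patient_token)
    max_days = 365 * max_years
    return original + timedelta(days=rng.randint(-max_days, max_days))


def top_bottom_code(age: int, low: int = 20, high: int = 90) -> str:
    """Fold outlying ages into open-ended cohorts; band the rest by decade."""
    if age < low:
        return f"<{low}"
    if age >= high:
        return f"{high}+"
    decade = (age // 10) * 10
    return f"{decade}-{decade + 9}"


def deidentify(record: dict) -> dict:
    """Return a new record containing only transformed fields.

    Fields that are not listed here (for example a name) are dropped.
    """
    token = pseudonymise(record["patient_id"])
    return {
        "patient_token": token,
        "scan_date": shift_date(record["scan_date"], token),
        "age_band": top_bottom_code(record["age"]),
    }


if __name__ == "__main__":
    sample = {
        "patient_id": "P-001",          # hypothetical identifier
        "name": "Jane Citizen",         # hypothetical name, dropped on output
        "scan_date": date(2020, 5, 17),
        "age": 94,
    }
    print(deidentify(sample))
```

Even in a sketch like this the weak points are visible: the hash is only as safe as the key, and anything the code does not explicitly transform or drop, such as free text burned into an image, passes through untouched. That is the kind of gap paragraph 6.2 of the report appears to describe.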
It is interesting to note that there were data breaches, but they were not notified to the Privacy Commissioner until after she commenced her preliminary investigation. That is …