One step back for data release, but it needs to keep moving ahead

by Stephen Duckett

Published by The Canberra Times, Tuesday 11 October

The full potential of data sets held by governments is only just beginning to be realised. Collected for one purpose, such as paying Medicare benefits, their secondary or subsequent use has the potential to transform policy evaluation and economic analysis. These data sets can be used for multiple purposes, beyond just the initial transactional record for payment and accountability.

At Grattan Institute we have used them to identify ways of improving policy in areas such as paying for hospital care, and measuring inappropriate hospital care. However, the data sets need to be de-identified or encrypted before release so that individuals and organisations cannot be inadvertently identified.

In a positive move, the Commonwealth Department of Health recently released a data set containing a 10 per cent sample of Medicare and Pharmaceutical Benefits Scheme transactions. It has now withdrawn the data set from public access after computer security researchers at the University of Melbourne, working independently of the Department of Health, discovered that they could decrypt service provider identification codes from the data set.

The researchers sought to understand the mathematical techniques related to encryption and anonymisation in order to improve the protection of government data. The Department of Health had made the encryption algorithm available online at data.gov.au when the data set was released. According to the University of Melbourne researchers this was the right thing to do, since “keeping the algorithm secret wouldn’t have made the encryption secure, it just would have taken longer for security researchers to identify the (encryption) problem”.

The Department of Health is undertaking an audit of the release of the data and the Privacy Commissioner is also investigating it. According to the department, “no patient information has been compromised, and no information about the health service providers has been publicly identified or released”.

While this potential privacy breach is concerning, it should not dissuade the government from future release of data that researchers and organisations can use to improve policymaking. Secondary analysis of government data can yield insights into patterns of spending or service use that would otherwise not come to the attention of policymakers. Secondary analysis benefits from investments that have already been made in data collection and so is generally a less expensive form of research.

Government data holdings, derived from claims processing, should be seen as an important public resource to assist in policy-relevant research that will benefit the Australian community. Failure to harness fully the potential of these data sets represents a significant lost opportunity both for policy development and research.

This episode highlights the need for government-wide standards for encrypting publicly released data.

The government should not be scared off by this incident into stopping further data releases. Rather, all departments should learn from this experience, and hopefully that is where the Privacy Commissioner’s review will lead.

There is immense value in the data sets, but that value can only be harnessed if the Department of Health and other government agencies continue a managed program of data release, albeit now with stronger encryption safeguards.