What should we do with our health data?

The NHS is sitting on a gold mine. We’re a nation of 70 million people where almost everyone is covered by the same health system and much of their data is already logged in the same computer system. Unlike the data held by American insurance companies, data about patients' drug usage is stored on the same system that contains clinical notes. And the UK is relatively ethnically diverse, which means that findings based on UK health data are more likely to be applicable at a global scale.

For example, for everyone who was vaccinated against COVID-19 in the UK, the NHS has logged which vaccine they received, how many doses they had, and when they received them. Everyone who was hospitalised due to COVID-19 also has that information logged on the same system. Together, these records can give us high-quality evidence on which vaccines are the most effective, who benefits the most from boosters, and how well different vaccines perform against new variants.

There are many other plausible applications of NHS data.

  • We could use it to better monitor the effectiveness of new drugs. Post-market drug surveillance, as it is called, is useful because it shows you how drugs interact in a real-world context. Clinical trials are often run on patients with few ailments, but in the real world this is rarely the case. (In fact, people who have at least one disease are much more likely to suffer from multiple diseases.) Better post-market drug surveillance, aided by more data, would give us a better understanding of when to prescribe medicines and to whom.

  • Data about when and where people need to be treated can be used to deploy hospital resources more efficiently, as has worked well in Australia.

  • AI can be trained on banks of patient data and can be used to find links between different diseases or predict who needs pre-emptive medical care.

  • Smart watches and rings can be used to monitor things like a patient's blood sugar and hormone levels throughout the day, and this data can then be integrated with a patient’s medical records, to give doctors a more complete view of a patient’s health.

A few weeks ago, Ben Goldacre wrote a report for the government about better use of health data. It is comprehensive, covering everything from dealing with privacy and ethics concerns to the practicalities of gathering the data and finding people to use it. But it dodges, perhaps deliberately, a key debate: should private companies have access to our data, and how should they be allowed to use it?

Answering this question matters a lot. Get it right, and we will improve healthcare and lead happier and longer lives.

The life-saving drugs and treatments that the NHS gives us are not made by the government. Instead they are designed, tested, and manufactured by private companies. So capitalising on the full potential of NHS data will mean sharing it with businesses.

Goldacre’s report does acknowledge that there is a political battle to be fought:

In media and public discourse around access to NHS data two principal concerns dominate: appropriate safeguarding of patients' privacy; and the notion that NHS data is being “sold” for commercial use.

And it describes the following case, where allowing private businesses to use NHS data already clearly benefits the wider public:

When side effects are spontaneously reported for a given treatment, regulators will typically approach the relevant pharmaceutical company and require them to conduct pharmacoepidemiological research, using complex statistical models in electronic health records data – often from the NHS, using subsets of the GP data – to evaluate the extent to which a given adverse outcome is more or less common in recipients of different comparable drugs, or with comparable medical histories.

But it fails to acknowledge any of the other benefits of allowing private companies to use the data.

The NHS’s teams of analysts are not good enough to reap the benefits of the data either.

  • Training is low-quality and is often framed as a voluntary activity without there being a clear path to validate or certify skills gained. Private sector analysts are often given access to training, accredited courses, and conferences to keep them up to date with the latest statistical methods.

  • The path to promotion usually requires analysts to move into management roles, which means the best analysts either stay junior or spend less of their time working on data.

  • NHS analysts are typically paid under £45,000, while their peers in the private sector can expect to earn over £80,000.

Together this means that NHS analysts are poorly trained, likely to leave, or making major sacrifices to do public service.

To remedy this, the report proposes better training, recruitment, pay, and methods of empowering NHS analysts. This is not good enough. Many of the issues NHS analysts face are problems that are prevalent throughout the public sector. While it may be possible to make NHS data analysts a rare, shining example of a great public sector work environment, I think this will be a hard fight to win. Political pressures will always favour more spending on medical staff and healthcare over analysts. It is overly ambitious to think that the NHS will ever be able to compete with the likes of McKinsey or DeepMind for data science talent.

It is more realistic, instead, to focus on external collaborations. Of course, the easier it is for people outside of the NHS to access the data, the more concerned we have to be about data privacy. So we need robust security standards.

There are problems with the current system, which is over-reliant on pseudonymisation, a technique that doesn’t really work. Pseudonymised data just has a few key pieces of identifying information removed, like a patient’s name. But if I know a friend’s birthday and have seen their Facebook post about getting vaccinated with Pfizer in April, then it would not be difficult to put that information together to snoop on their private medical records.
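To make the risk concrete, here is a toy sketch of this kind of linkage attack. Every record, field name, and value below is invented for illustration, and the `link` helper is hypothetical; real NHS datasets are structured differently. The point is only that a handful of innocuous facts can single out one row even when names have been removed.

```python
# Toy illustration of a linkage attack on pseudonymised records.
# All records, field names, and values are invented for this sketch.

pseudonymised_records = [
    {"id": "a91f", "date_of_birth": "1990-03-14", "vaccine": "Pfizer",
     "vaccinated_month": "2021-04", "other_conditions": "asthma"},
    {"id": "7c2e", "date_of_birth": "1990-03-14", "vaccine": "AstraZeneca",
     "vaccinated_month": "2021-03", "other_conditions": "none"},
    {"id": "d405", "date_of_birth": "1985-11-02", "vaccine": "Pfizer",
     "vaccinated_month": "2021-04", "other_conditions": "diabetes"},
    {"id": "3b8a", "date_of_birth": "1972-07-21", "vaccine": "Moderna",
     "vaccinated_month": "2021-05", "other_conditions": "none"},
]


def link(records, **known_facts):
    """Return the records that match every fact the snooper already knows."""
    return [
        record for record in records
        if all(record.get(key) == value for key, value in known_facts.items())
    ]


# Facts learned outside the dataset: a birthday and a social media post.
matches = link(
    pseudonymised_records,
    date_of_birth="1990-03-14",
    vaccine="Pfizer",
    vaccinated_month="2021-04",
)

if len(matches) == 1:
    # A unique match exposes the rest of that person's record.
    print("Re-identified record:", matches[0])
```

No single field identifies anyone here, but the combination of all three does.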

Sometimes synthetic data is used, where a new dataset is created that is similar to what a real dataset would look like. This is problematic because synthetic data is generated by computer programs, so it can only reproduce the patterns those programs were built to capture. Machine learning algorithms trained on synthetic data could, plausibly, be useless.

The approach favoured by Goldacre, which I think is a good idea, is to create Trusted Research Environments where people can apply for access to NHS data.

A Trusted Research Environment (TRE) is a secure environment that researchers enter in order to work on the data remotely, rather than downloading it onto their own local machine. Users can extract and download the answers from their analyses – such as results tables, or graphs – but individual patients’ data always stays within the secure environment.
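As a rough sketch of the idea, the function below stands in for the kind of check that could sit between an analysis and the researcher's download. It releases only a table of counts, never individual rows. The `release_counts` name, the record layout, and the minimum cell size of 5 are all assumptions of this sketch, not something taken from the report; small-count suppression is simply one common disclosure control that such an environment might apply.

```python
from collections import Counter

# Hypothetical threshold: counts smaller than this are withheld so that
# a released table cannot single out a handful of patients.
MIN_CELL_SIZE = 5


def release_counts(records, field, min_cell_size=MIN_CELL_SIZE):
    """Return aggregate counts for one field, suppressing small cells.

    Individual records never leave this function; only the summary does.
    Suppressed cells are reported as None.
    """
    counts = Counter(record[field] for record in records)
    return {
        value: (n if n >= min_cell_size else None)
        for value, n in counts.items()
    }


# Invented patient-level data that stays inside the secure environment.
records = [{"vaccine": "Pfizer"}] * 6 + [{"vaccine": "Moderna"}] * 2

# What the researcher is allowed to download: a results table.
table = release_counts(records, "vaccine")
print(table)  # {'Pfizer': 6, 'Moderna': None}
```

The researcher still gets a usable answer, while the two Moderna recipients are not exposed as a tiny, potentially identifiable group.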

All of this implies that Goldacre and his fellow report writers believe that private researchers should have access to NHS health data. Despite this, the report does not mention startups or businesses and has only a cursory mention of pharmaceutical companies.

The danger is that we create a world-class library of healthcare data and fail to use it well. If the data is gated in such a way that it can only be used by researchers at large established pharmaceutical companies or universities, then we will miss out on some of the most innovative and novel therapies. It is not uncommon for pioneering drugs to be created by startups which are then purchased by larger pharmaceutical companies. While it is, of course, possible for a new startup to partner with a university or an established company to access the data, this creates an extra barrier to the creation of new drugs and biases the economy towards existing players.

The UK has a thriving, plausibly world-leading, life sciences research environment. But if we want to maintain an edge we have to be smart about what we do. The NHS’s bank of data should give us an easy advantage, leading to the creation of all kinds of healthcare innovations. It is in everyone’s best interest to get this right.

I want to see the NHS proactively gather more patient data. GP surgeries often keep their records in paper filing cabinets. This is a waste. Instead we should encourage surgeries, hospitals and clinics to upload their data to a central system. We should make this easy to do and make sure it fits in with other work that doctors are already doing, like recording patients’ blood work or giving out prescriptions. Patients should have the choice to opt out but data should be gathered by default.

The central data system should be held securely, and have dedicated teams making sure that the library is easy to use, well maintained, and that the data is accurate. Then, as Goldacre proposes, we should allow trusted researchers access to this data. After appropriate security measures are put in place, we should be generous about who we give access to. Provided they agree to certain ethical standards, research teams and startups should be given the information they need, and the process for gaining access should be quick and unbureaucratic.

If we manage to create it, this system will be unlike anything else in the world. If we can combine it with data from 23andMe, period and fertility trackers, and smart watches, it would be more powerful still. This is an ambitious goal, but it is worth pursuing as it will increase the quality of healthcare we can get, provide a new stream of income for the NHS, and create more viable therapies. Few government interventions are as win-win-win as this, and we are well placed to achieve it.