The rate of creation of real world healthcare data has not been matched by a similar rate of its use in research. However, in the UK, the NHS is uniquely capable of collecting and making available high-quality data at a population scale. It has made great strides through the NIHR Health Informatics Collaborative (1), NIHR BioResource (2) and NHS Digital (3) and the formation of Health Data Research UK (4). The NHS is clearly committed to the best use of health data, but there is an opportunity to leverage this more widely. While public opinion surrounding personal data is often cited as a concern, there are more regular monthly donors to medical research charities than there are members of all the major political parties combined.
This suggests that the problem is not the British public’s willingness to support research. At the Medicines Discovery Catapult, we believe the challenge is for us to support productive links with the UK’s outstanding biomedical and data science research community. If data really is ‘the new oil’ we have to improve how consented data is drilled, distilled and then made safely available for innovators to create high value products.
The problem has moved from creating data to accessing it
The problem used to be that the data wasn’t there or wasn’t good enough. Before the arrival of electronic patient records, it was prohibitively expensive to extract data at a large scale. Non-financial data was often unreliable and incomparable across sites. This has changed. Now we have accumulated national standardised datasets covering in-patient care, general practice, and disease registries such as cancer. Also, advanced statistics and data science can now create high value information from them. These findings were borne out by a survey of 84 biotech CEOs for the Medicines Discovery Catapult and BioIndustry Association’s State of the Discovery Nation 2018 report (5). 83 percent of CEOs agreed that ‘access to the right health data such as registries and activity is hugely important for innovative companies’.
However, small companies find it difficult to access the right data. In the same study, 63 percent disagreed with the statement that “It is easy to access the right health data such as registries and activity.” This is reflected in the low levels of use of large-scale NHS data by SMEs. There have been significant investments in improving the quality and quantity of data, such as the NHS’s £4.2bn for Paperless 2020 (6) and UKRI’s recent £210 million “From data to early diagnosis and precision medicine Industrial Strategy Challenge Fund” (7). But when 73 percent of biotech CEOs believe that the UK does not have a clear framework and process for the commercialisation of NHS health data, there is a risk that UK health data gets bigger and better but remains inaccessible.
The flow is slow
Several issues slow down or prevent the use of data in medical research. Overpromising by data owners and resellers about quality and quantity has not helped. The current legal framework for the protection of patient data is comprehensive and appropriate. Dame Fiona Caldicott’s work is excellent (8). However, the legal framework has introduced necessary administrative and legal burdens on data owners and users. GDPR will increase both the bureaucratic burden and organisational aversion to risk. Areas of policy need to catch up with this rapidly developing field, which Cambridge Analytica/Facebook, Royal Free/DeepMind, and care.data have brought into the spotlight. There is a legitimate range of patient and public views, but this should not prevent people from choosing to have their anonymised healthcare data used in medical research.
Patients want their data to be used
So, we currently have many patients who want to donate data and many organisations that want to use it: but they cannot. This is a dysfunctional market where costs and benefits have not been well matched, creating little incentive to make data available. Data access specialists such as NHS Digital are making significant progress in collecting and assuring quality, but they are challenged by conservatism around data-sharing. This is understandable; legal and reputational risks of data sharing are difficult to control. However, the views of patients and researchers are not always heard. Staff at the Medicines Discovery Catapult previously ran a cancer study where over 10,000 patients agreed to share anonymised genetic testing and real world data for research and development. Over 90% of patients consented. 500,000 patients have agreed to be in UK Biobank and 70,000 will be part of Genomics England.
How can we ensure that the millions of other people who want to give data, can?
First, we must acknowledge that giving (and receiving) data is not simple. The technical, legal and administrative burden is complicated and requires specialist support. Experts do not exist in every hospital, university and research team. So, we need to find a way to reduce the constant reinvention of the wheel, which gets in the way of data access. Second, we must move beyond the hype, the theory and the top-down initiatives. Instead, we must deliver specific projects that show how using data can be safe, predictable and valuable. Third, we must continuously work transparently and listen to patient groups to ensure costs and benefits work from their point of view.
What are the benefits?
The monetary value of well-curated health datasets linked to biological information is measured in millions, because they help develop products that patients value. Datasets can be used to validate diagnostic tests and new drug targets for more predictable, lower risk medicines and R&D programmes. They also have indirect, spillover value. Use of more real world data means trial methodologies need to continue to evolve, along with cross-organisational working. This doesn’t mean the end of randomised clinical trials; they are still the best way to assess causality.
However, the strict inclusion and exclusion criteria in traditional randomised clinical trials compromise their ‘generalisability’. Use of real world data in trials will improve the relevance of their results to the whole population. Working across multiple organisations is also a challenge, but the NIHR Health Informatics Collaborative shows that the UK can overcome the usual human and IT issues. Successful cross-organisational working can be done.
The Medicines Discovery Catapult is working to improve access for SMEs to consented datasets in medical R&D, working together with Health Data Research UK, UseMyData and others (9). Focussed on the practical issues, we are building a small specialist data access support team that can help R&D companies, and data owners, navigate the process of finding, contracting, accessing and using data in research. We will also assess the practical accessibility of the major UK datasets, to show who is enabling the most research and what the others could do to improve.
Chris Molloy is chief executive officer and James Peach is samples & data access lead, both at the Medicines Discovery Catapult.
Chris Molloy will today (24 May) chair a conversation focussed on “Humanising discovery and real world evidence and improving access for SMEs to the NHS Ecosystem” at the BioIndustry Association’s UK CEO and Investor Forum 2018. The event brings together experts and sector leaders to discuss and debate the potential implications of our continuously shifting international landscape and aims to predict what will happen next on the UK, European and Global stages.
5. State of the Discovery Nation 2018 and the role of the Medicines Discovery Catapult. Joint report by the BioIndustry Association and the Medicines Discovery Catapult. January 2018.