Trusted Data Sharing and Collaboration for Health and Biomedical Research – Pick Your Acronym

This article is somewhat UK-centric as far as the various acronyms are concerned, but the content applies to any Academic Health Centre globally that’s investing in technology and organisational capacity to explore the secondary use of data for clinical research and research collaborations, and ultimately to run clinical trials enabled by real-world data.

Acronyms and policy first though: NHS England last month published plans that, in essence, mean access to NHS data for secondary use, whether by NHS, academic or industry researchers, will be through approved Secure Data Environments (SDEs). Studies and trials using real-world data will therefore have to come to the NHS data and run through a local or regional SDE, rather than de-identified cohorts of data travelling outside the boundary of an NHS Trust.

Meanwhile, Universities are focused on establishing Trusted Research Environments, or TREs. Science is a collaborative ecosystem and researchers (in general) want to collaborate, but as those collaborations become more intertwined with other parties and other data, sensitivities around IP, security, GDPR and reproducibility take centre stage.

The acronyms differ, but the requirements are broadly the same: how to discover meaningful metadata, qualify it, obtain approval to work with it safely and securely across organisational and geographic boundaries, and ensure both reproducibility and equity in the results of a study, trial or innovation programme. There’s a lot to do to reduce the time and effort it takes to set up a multi-institutional collaboration and to ensure that the focus on clinical and scientific outcomes happens as quickly as possible. SDEs/TREs have the potential to make the common denominator pieces of a collaboration more repeatable and configurable, removing the ‘start from first principles’ default that’s all too prevalent. That various UK Government organisations representing the NHS, Academia and Industry have begun to define and accredit SDEs/TREs as an ‘actual thing’ is a great start.

Welcome as that is, there is still a big hurdle to clear if the UK is to regain its once prominent position in running advanced clinical trials. There are many reasons for this decline, and one of them is clearly the challenge around data, technology infrastructure and skills. To run an SDE or a TRE successfully, a Research Hospital or University effectively has to become a hybrid of a specialist Managed Service Provider and a Software and Security Development Provider. In the vast majority of cases they are neither, and unfortunately that problem is generally ignored, to the detriment of patients, scientific productivity and economic growth.

At Aridhia, we’ve run nearly 1,000 projects (neurodegenerative, cancer, rare and orphan disease, COVID-19, and AI/ML across a wide spectrum of structured and unstructured data) over the past 10 years through our Digital Research Environment (DRE) on cloud infrastructure. Sorry, that’s another acronym, but as we started this 10 years before any of the other jargon was in place, we’re sticking with our original name.

Here are a few observations drawn from our experience across 5,500 highly diverse datasets contributed by Healthcare, Academia and Industry, with people, data and code contributions from over 30 countries:

1. Data are messy – generally underpowered, with variable quality and variable completeness (what might be good for primary use isn’t necessarily complete for secondary study and linkage).

2. Data are diverse and fragmented – structured and unstructured, living in multiple locations, with highly variable cataloguing, dictionaries and modelling.

3. Long Tail of Use Cases – Use cases within clinical and scientific domains are extremely diverse. Data, people and code arrive at different times. Teams start, stop, change their approach, discover new knowledge and want to branch and incorporate it into their study.

4. Users are diverse – Clinical Scientists understand the domain and the data, not necessarily how to code. Data Scientists understand the code, not necessarily the domain or the data. Data Engineers understand data standards and modelling approaches, but not necessarily the analysis. And funders understand results.

5. Metadata content – Data are rarely annotated for secondary use and there are no incentives for this to change, so work has to be put in to describe and publish metadata and related content to make it meaningful at the point of search and exploration.

6. Data Use Agreements are diverse – Data controllers have a legal and ethical responsibility for the data they control. They are (correctly) very conservative around secondary use. Agreeing terms is often the longest pole in the tent for a project. HDR UK is helpfully trialling a standardised benchmark approach for Data Access Agreements, which we’re embedding into our Data Access Request workflow.

7. Economic models for collaboration – These are in their infancy. SDEs/TREs/DREs should provide the audit, reporting and billing needed to allocate costs and measure contribution (in essence, the mainstay of any collaborative environment).

8. Skill set shortage – The technical, clinical and scientific skill sets needed to deliver a step change in trials, studies and innovation are in short supply, and will become scarcer as new technologies (AI/ML, patient-derived biomarkers, streaming digital health data) make their way into the routine mix. Hospitals and Universities need to think through their business model for research and innovation at scale, and how they plan to be part of this.

9. Swiss-style neutrality – For multi-centre studies with multiple data controllers and many diverse users, there’s a requirement for ‘somebody’ to be Switzerland: to act solely as a data processor in GDPR terms and be the Trusted Service Provider for the collaboration.

The risk for the UK is that it implements a strategy mandating SDEs/TREs/DREs (for sound reasons) but then fails to appreciate the detail of execution, resulting in a lowest common denominator, best-efforts approach to delivery. The NHS and Universities are service providers for healthcare and education, not for software services.

A model for scale and effective outcomes is Great Ormond Street Hospital (GOSH). GOSH implemented the Aridhia DRE alongside the Epic EPR five years ago and has invested consistently and methodically in skills, partnerships, and research and clinical engagement. That partnership approach has delivered real benefit to patients, research funding, and national and international collaborations. GOSH has published the impact of its experience, which serves as a business model blueprint for how local implementations of SDEs/TREs/DREs should be adopted.

SDE/TRE deployments, and the various business models associated with them, need to accommodate not just today’s challenges and requirements but also what’s just round the corner: routine use of multi-omics, imaging and streaming digital health data, federated compute and learning approaches, and the complex modelling and analysis required for precision medicine. Policy and strategy are one thing; sustainable delivery over a long period of time is another.