The Aridhia DRE: A Federated Trusted Research Environment
Our Q1 blogs have detailed two important projects: our work with the DARE UK TREvolution project, enabling federated analytics in Trusted Research Environments, and our collaboration with Flower, integrating their federated learning framework with the DRE. While our work on making federated data access a native feature of the DRE continues, both initiatives recently reached significant milestones, so it feels like an opportune moment to review our progress and reflect on wider developments in this space.
Federated Research Platforms
The Federated Research Patterns Framework, published last year, provides a high-level framework for understanding the integration of federated analytics with a TRE. It splits the required components into six themes, detailed below:
| Pattern theme | Description | Component |
|---|---|---|
| Analytical | Understanding the analytics required for the research informs the statistics and determines the algorithm type, which in turn categorises what analyses could be supported. | Isolated, Connected, Centralised |
| Data Movement | As determined by the algorithm type, this is how data is required to move in and out of a TRE and how it is executed. | Summary, Model Weights, Row level data |
| Data Egress | The output checking method required for the egress of results. | Off, Manual, Semi-automated, Automated |
| Metadata | Data that assists the researcher to construct analyses to run with the weave. | Metadata specification |
| Initiate | The components that are functionally required to enable federation. | API Specification |
| Process | The components that are functionally required to enable federation. | API Specification |
As shown above, each theme has a number of possible components. For example, the analytical theme can have one of three possible components:
- Isolated: these are analyses that are replicated individually within each TRE and require no state to be maintained. The only results to be shared in this category will be a Summary Data Movement Pattern.
- Connected: these are analyses which require multiple rounds of local calculation and aggregation, requiring the TREs to receive results from other TREs to be included in the local calculations. These analyses may require state to be maintained, and the results will be shared as either a Summary or Model Weights (model parameters).
- Centralised: these analyses require data to be pooled, even temporarily, for the analyses to be performed. This will always require a row level data movement.
The framework calls any valid combination of these themes and components a Weave, and provides example weaves to help readers better understand the concept; you can find these here. This is a potentially useful way of thinking about data federation at the level of data flows and process, rather than particular technologies.
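To make the idea of a weave more concrete, here is a minimal Python sketch that models two of the six themes (Analytical and Data Movement) and checks the compatibility rules described in the list above. The names `Analytical`, `DataMovement`, `Weave`, `ALLOWED_MOVEMENTS` and `is_valid` are hypothetical and chosen purely for illustration; they are not part of the framework's specification. The sketch also assumes that "Model Parameter" in the Connected description corresponds to the "Model Weights" component in the table.

```python
from dataclasses import dataclass
from enum import Enum


class Analytical(Enum):
    ISOLATED = "Isolated"
    CONNECTED = "Connected"
    CENTRALISED = "Centralised"


class DataMovement(Enum):
    SUMMARY = "Summary"
    MODEL_WEIGHTS = "Model Weights"
    ROW_LEVEL = "Row level data"


# Data movements permitted for each analytical component, as described
# in the text above (assumption: "Model Parameter" == "Model Weights").
ALLOWED_MOVEMENTS = {
    Analytical.ISOLATED: {DataMovement.SUMMARY},
    Analytical.CONNECTED: {DataMovement.SUMMARY, DataMovement.MODEL_WEIGHTS},
    Analytical.CENTRALISED: {DataMovement.ROW_LEVEL},
}


@dataclass(frozen=True)
class Weave:
    """Hypothetical, partial weave covering only two of the six themes."""
    analytical: Analytical
    data_movement: DataMovement


def is_valid(weave: Weave) -> bool:
    """Return True if the data movement is permitted for the analytical component."""
    return weave.data_movement in ALLOWED_MOVEMENTS[weave.analytical]


print(is_valid(Weave(Analytical.ISOLATED, DataMovement.SUMMARY)))     # True
print(is_valid(Weave(Analytical.CENTRALISED, DataMovement.SUMMARY)))  # False
```

A fuller model would add the remaining four themes; they are omitted here because the text does not state constraints between them.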
Rereading the framework following the completion of our recent DARE and Flower projects is encouraging, as it casts a positive light on the level of flexibility we have achieved in enabling federated analysis and federated learning in the DRE.
Our DARE UK implementation uses the open-source Federated Node we have developed for the PHEMS project, and allows users to initiate an isolated analytical task on data held inside the DRE.
Our Flower implementation uses the Flower framework, and allows users to perform federated learning across datasets held in multiple workspaces, with the task managed from Flower’s SuperGrid.
These can be transcribed into the patterns framework as follows:
| Pattern theme | DARE UK | Flower |
|---|---|---|
| Analytical | Isolated | Connected |
| Data Movement | Summary | Model Weights |
| Data Egress | Manual (optional) | Off |
| Metadata | FAIR Data Services | FAIR Data Services |
| Initiate | Federated Node | Flower SuperGrid |
| Process | Federated Node | Flower SuperNode |
Apart from using FAIR Data Services as our source of metadata, these are completely different implementations under the Patterns Framework.
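As a small illustration of that comparison, the sketch below encodes the two columns of the table as plain Python dictionaries and lists the themes on which they agree. The variable and function names (`dare_uk`, `flower`, `shared_themes`) are hypothetical; the values are copied from the table.

```python
THEMES = ["Analytical", "Data Movement", "Data Egress", "Metadata", "Initiate", "Process"]

# Values transcribed from the comparison table.
dare_uk = {
    "Analytical": "Isolated",
    "Data Movement": "Summary",
    "Data Egress": "Manual (optional)",
    "Metadata": "FAIR Data Services",
    "Initiate": "Federated Node",
    "Process": "Federated Node",
}

flower = {
    "Analytical": "Connected",
    "Data Movement": "Model Weights",
    "Data Egress": "Off",
    "Metadata": "FAIR Data Services",
    "Initiate": "Flower SuperGrid",
    "Process": "Flower SuperNode",
}


def shared_themes(a: dict, b: dict) -> list:
    """Return the themes for which both implementations use the same component."""
    return [theme for theme in THEMES if a[theme] == b[theme]]


print(shared_themes(dare_uk, flower))  # ['Metadata']
```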
These are the most recent examples, but given our ability to connect the DRE to external data sources, run machine learning jobs securely inside a workspace, and check outputs with our SACRO integration and workspace airlock, we are well placed to provide further distinct implementations under the Federated Research Patterns Framework. Indeed, our recently announced partnership with Orrum is another step in our embedding of federation capabilities within the DRE.
If you would like to know more about the above, please contact us here.