February 11, 2020 | Gary
Data: It’s produced in huge quantities from every corner of the scientific and healthcare communities, and harnessing it is the key to driving forward medical advancements. There’s a problem, though: while you might know there are petabytes of information that would enrich your research, you don’t know where or how you can access it. The data is held in different silos, in different hosting organisations, all using different governance policies. Once you’ve derived new data and published your work, it is then held in the same manner, limiting researchers’ ability to ever access it. Research is being repeated, lost, or rendered practically useless to the next generation.
These problems led to the development of the FAIR Data Principles, in order to make data more Findable, Accessible, Interoperable and Reusable.
Findable – The data and metadata should be easily discoverable by both humans and machines through the use of standard identification mechanisms.
Accessible – Once found, the data should be easy to download and use, either locally or in a trusted digital research space. The repository holding it should also have plans in place to keep the metadata accessible even if the data itself is no longer available.
Interoperable – Data should use standard vocabularies and ontologies so that it can be easily mapped and combined with other datasets. This, along with the ability to transform data into standardised formats like FHIR, should enable sharing between various scientific disciplines and organisations.
Reusable – Data and metadata should be richly described and carry the least restrictive licences possible, allowing them to be easily reused in future research. Integration with other data sources should be easy, facilitated by proper citations and descriptions. Publishing derived results into easily searchable ecosystems has the potential to break apart data silos. Standard identification of items improves data provenance and allows researchers not only to reuse data, but to identify how aspects of it may be reproduced in their own research.
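To make the four principles concrete, here is a minimal sketch of what a FAIR metadata record can look like in practice, using the schema.org Dataset vocabulary (one common way to publish machine-readable dataset descriptions). Every name, identifier and URL below is an illustrative placeholder, not a real dataset.

```python
import json

# A minimal, hypothetical dataset description using schema.org's
# "Dataset" type. All identifiers and URLs are placeholders.
fair_record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    # Findable: a persistent, globally unique identifier
    "identifier": "https://doi.org/10.1234/example-dataset",
    "name": "Example cohort dataset",
    "description": "De-identified observations from a hypothetical study.",
    # Accessible: where and how the data can be retrieved
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://repository.example.org/datasets/example.csv",
        "encodingFormat": "text/csv",
    },
    # Interoperable: standard vocabularies used to code the data
    "keywords": ["SNOMED CT", "FHIR"],
    # Reusable: an explicit, least-restrictive licence
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

# Serialised as JSON-LD, this record is discoverable by both
# humans and machines -- search engines index exactly this shape.
metadata_json = json.dumps(fair_record, indent=2)
```

Notice that even if the CSV at `contentUrl` disappears, the record itself keeps the dataset findable and citable, which is the point of keeping metadata alive beyond the data.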
Aridhia’s aim is to create a single suite of software that addresses every principle, whether you’re a researcher or an entire organisation: the FAIR Data Services. This covers all the needs of the research lifecycle, comprising a dataset searching and metadata discovery tool, dataset querying, participant de-identification, an ontology service, data curation/transformation tools, and more. Aridhia currently has experience in deploying various FAIR services to research institutions and projects such as:
- Great Ormond Street Children’s Hospital, where we have a full deployment of FAIR including cataloguing and data dictionary description capabilities, data selection/querying, and a customised data delivery pipeline (including record de-identification) to transfer data to Workspaces.
- Large international Alzheimer’s Dementia (AD) initiatives, in which we achieve all of the above and facilitate data interoperability between various AD platforms/silos, allowing these platforms to participate in a network of data sharing.
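The record de-identification step mentioned above can take many forms; the sketch below shows one simple, widely used approach, pseudonymisation via salted hashing. This is an illustration only, not Aridhia’s actual pipeline, and the field names and salt are hypothetical. Real pipelines must also handle quasi-identifiers, dates, and free text.

```python
import hashlib

def pseudonymise(record, salt, direct_identifiers=("name", "nhs_number")):
    """Replace a record's direct identifiers with a salted hash.

    Direct identifiers are dropped and replaced by a single stable
    pseudonym, so rows from the same participant can still be linked
    without revealing who they are.
    """
    key = "|".join(str(record[f]) for f in direct_identifiers)
    pseudonym = hashlib.sha256((salt + key).encode()).hexdigest()[:16]
    # Keep only the non-identifying fields, plus the pseudonym.
    safe = {k: v for k, v in record.items() if k not in direct_identifiers}
    safe["participant_id"] = pseudonym
    return safe

# Hypothetical input row: identifiers plus a research variable.
row = {"name": "Jane Doe", "nhs_number": "943 476 5919", "age_band": "40-49"}
out = pseudonymise(row, salt="project-specific-secret")
```

Because the hash is deterministic for a given salt, the same participant maps to the same `participant_id` across deliveries, while a different (secret) salt per project prevents linking records between projects.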
Aridhia already has a lot of experience in delivering these services on private cloud infrastructure, but we are moving away from this: Azure implementations are currently in preview with customers in both the UK and the United States. In the run-up to the full release of FAIR Data Services, we’ll be highlighting in a series of articles how a Digital Research Environment like ours can help improve every aspect of the research lifecycle and, crucially, how we can make it FAIR.
Stay tuned for the next item in this series, which looks at data findability.