
OMOP in a Trusted Research Environment

We’ve previously blogged about the increasing adoption of Common Data Models (CDMs), with OMOP currently the most frequently adopted by our data owners. This blog provides an overview of current and upcoming Aridhia DRE features that make it easier for our users to work with OMOP data, and explores our future plans for further improving the experience of using CDM data in the DRE.

The Present – Existing FAIR and Workspaces features

As noted, the DRE already has a number of features that support data owners working with OMOP datasets. In FAIR, a data owner who already has one OMOP dataset can duplicate its common dictionaries when creating a new dataset, and the custom catalogue feature makes it simple for OMOP data owners to give users links to OMOP-specific resources such as Athena.

The FAIR Cohort Builder is fully compatible with OMOP datasets, allowing analysts to visualise and subset the data before requesting access. Users can compare the prevalence of different conditions within a cohort, or drill down through multiple dictionaries to identify subjects that meet their specific criteria.

The example below shows how the cohort builder can be used to compare the prevalence of different conditions within an OMOP dataset:

In Workspaces, users can now group database tables by schema. This is crucial when working with multiple OMOP datasets in one workspace, because the CDM mandates the same table names for every dataset. Work is already in progress to extend this feature to FAIR, allowing data owners to add a recommended schema to their datasets and have it applied automatically in the workspace when a dataset is transferred.
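To illustrate why schemas matter here, the following is a minimal sketch in Python using the standard sqlite3 module. It is not the DRE's implementation: the schema names site_a and site_b are hypothetical, the person table is cut down to two columns, and SQLite's attached databases stand in for workspace schemas. The point is only that two datasets with identical table names can coexist once each table is qualified by its schema.

```python
import sqlite3

# Hypothetical schema names, one per OMOP dataset in the workspace.
SCHEMAS = ["site_a", "site_b"]


def build_workspace():
    """Create an in-memory database with one schema per dataset."""
    conn = sqlite3.connect(":memory:")
    for schema in SCHEMAS:
        # In SQLite, each attached database behaves like a separate schema.
        conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
        # Both datasets use the same table name, as the CDM requires.
        conn.execute(
            f"CREATE TABLE {schema}.person ("
            "person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
        )
    conn.executemany(
        "INSERT INTO site_a.person VALUES (?, ?)", [(1, 1950), (2, 1984)]
    )
    conn.executemany("INSERT INTO site_b.person VALUES (?, ?)", [(1, 1972)])
    conn.commit()
    return conn


def person_counts(conn):
    """Count rows in each dataset's person table, keyed by schema name."""
    return {
        schema: conn.execute(
            f"SELECT COUNT(*) FROM {schema}.person"
        ).fetchone()[0]
        for schema in SCHEMAS
    }


workspace = build_workspace()
print(person_counts(workspace))  # prints {'site_a': 2, 'site_b': 1}
```

Without the schema prefix, the second person table could not be created alongside the first, which is the situation schema grouping is intended to avoid.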

The Future – a framework to support OMOP and other CDMs

While it is clear that the DRE already provides data owners and analysts with several features that help them manage and interrogate OMOP data, we believe this can be further improved by introducing an overarching support framework for CDMs in the DRE. Work to introduce this is expected to start in the second half of the year.

In the first instance, the framework will introduce supported CDMs (e.g. OMOP or SDTM) and allow data owners to categorise and identify their datasets by these in FAIR. In turn, this will allow standard users to search and filter datasets by CDM. On its own this addition to search would be a useful feature, but the introduction of a CDM framework opens up a far wider range of development possibilities.

Taking OMOP datasets as an example:

All OMOP datasets have the same dictionaries, so when a data owner identifies a dataset as OMOP during creation, the dictionaries could be generated automatically.

Given the usefulness of data schemas when working with multiple OMOP datasets, a data owner could be prompted to add a schema during the creation of an OMOP dataset.

FAIR Collections allows data owners to create collections of related datasets. Any new OMOP datasets could automatically be added to a default OMOP collection, making it easier for users to find and browse OMOP datasets.

As detailed above, Cohort Builder is fully compatible with OMOP. However, the nature of OMOP data means that some data fields must be omitted to ensure subject anonymity. This can be configured manually within the Cohort Builder, but where a dataset is known to use OMOP these settings could be applied automatically.

The introduction of a CDM framework also makes building cohorts across different datasets a far more attainable goal. Building cohorts from multiple datasets requires a degree of data harmonisation: by definition CDMs provide this. Introducing a CDM framework will make it possible for the system to identify which datasets are harmonised and could therefore be used to build a cross-dataset cohort.

Where the CDM used by a dataset is known, this could be passed to the user’s workspace, allowing it to be configured for the specific data type. We recently wrote about the growing complexity of data transfers, and this approach would be consistent with that trend.

Alternatively, the DRE conditions framework could be used to ensure that OMOP data is only transferred to workspaces with appropriate tooling already in place.
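The cross-dataset cohort idea in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of how the DRE or Cohort Builder works: the Dataset class, the cdm tag, the function names and the concept IDs (1001, 1002) are all hypothetical, and the OMOP condition_occurrence table is reduced to (person_id, condition_concept_id) pairs. It shows how a CDM tag lets a system pick out harmonised datasets and apply one cohort criterion to all of them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dataset:
    """Hypothetical catalogue entry for a dataset."""
    name: str
    cdm: Optional[str]  # e.g. "OMOP" or "SDTM"; None if no CDM is declared
    # Simplified stand-in for condition_occurrence rows:
    # (person_id, condition_concept_id)
    condition_occurrence: list = field(default_factory=list)


def harmonised(datasets, cdm):
    """Return only the datasets tagged with the given CDM."""
    return [d for d in datasets if d.cdm == cdm]


def cross_dataset_cohort(datasets, cdm, concept_id):
    """Collect (dataset name, person_id) pairs matching one condition."""
    cohort = set()
    for dataset in harmonised(datasets, cdm):
        for person_id, condition_concept_id in dataset.condition_occurrence:
            if condition_concept_id == concept_id:
                # person_id is only unique within a dataset, so keep the name.
                cohort.add((dataset.name, person_id))
    return cohort


catalogue = [
    Dataset("hospital_a", "OMOP", [(1, 1001), (2, 1002)]),
    Dataset("hospital_b", "OMOP", [(1, 1001)]),
    Dataset("trial_c", "SDTM", [(9, 1001)]),  # different CDM, so excluded
]
print(sorted(cross_dataset_cohort(catalogue, "OMOP", 1001)))
# prints [('hospital_a', 1), ('hospital_b', 1)]
```

The same criterion works across both OMOP datasets only because they share a structure and vocabulary, which is the harmonisation a CDM provides.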

The above are not short-term deliverables, and some of them may never actually be prioritised, but the list illustrates the possibilities that the introduction of a CDM framework to the DRE opens up. Currently the most likely candidate for delivery following the introduction of the framework is automatic metadata generation, but we are keen to receive user feedback on these plans.


Ross Stiven
