
Introducing Aridhia’s AI Research Assistant Framework: Secure AI for Healthcare and Life Sciences Organisations

AI is reshaping healthcare research, but the need for rigorous data governance hasn’t changed. Sensitive patient records, clinical trials, and genomic datasets require environments that protect confidentiality while enabling innovation. With this in mind, Aridhia proudly unveils the AI Research Assistant Framework (Aira), a new capability within the Digital Research Environment (DRE) that brings the value of secure AI to healthcare and life sciences organisations without compromising on the provenance of the data driving the model outputs.

Why Offline LLMs Are Critical for Healthcare Data

Healthcare datasets such as clinical records, omics data, and trial metadata are subject to strict data-sharing agreements, regulatory controls, and patient confidentiality standards. Public cloud-based AI services such as OpenAI's ChatGPT do not provide the required level of assurance, as they rely on transferring data outside the governed environment and lack robust legal privilege protections.

It’s not only the datasets themselves that are sensitive, but also the prompts, instructions, and other inputs researchers provide, as well as the outputs of models. These can include identifiable information, confidential research data, or intellectual property that must remain within a secure environment.

OpenAI CEO Sam Altman has recently acknowledged:

“There’s no legal confidentiality when using ChatGPT… If you talk to a therapist or a doctor, there’s legal privilege. We haven’t figured that out yet for AI.”

Sam Altman, CEO of OpenAI

This establishes the need for offline AI models that run within a safe, secure analysis sandbox, such as Aridhia's Workspaces, which in turn operates entirely within the security perimeter. These models must comply with the applicable data use conditions, whether those are set by a single organisation, by a multi-party collaboration, or for a particular research topic where multiple data controllers grant access to their datasets.

For Data Owners: Why Offline Must Be the Standard

As custodians of health data, data owners hold the responsibility of protecting subject privacy, honouring agreements, and upholding ethical research standards.
Offline modelling ensures that:

  • All data remains inside a defined and governed infrastructure.
  • No cloud services external to that infrastructure can access sensitive content.
  • Model behaviour can be controlled, audited, and validated.
  • Export of results can be tightly managed and reviewed.

Whether mandated by legal contracts, institutional policy, or regulatory bodies, offline deployment is not optional; it is essential for safe and scalable AI in healthcare.

The regulatory environment for AI is evolving quickly, with different countries and even individual states introducing new requirements around transparency, explainability, and data residency. Navigating these can be complex: in the EU, for example, the EU AI Act and the European Health Data Space initiative present further challenges when working with patient data and AI technologies, as described in this article from the European Law Blog.

For healthcare organisations operating across borders, this patchwork of rules adds another layer of complexity. Deploying AI entirely within a governed, compliant environment like the Aridhia DRE ensures you meet current obligations and are ready to adapt as regulations change.

It is unlikely that existing data-sharing agreements have considered the potential for unsafe use of AI technologies, even in what are considered “trusted research environments” (TREs). It is increasingly important that they do, and in some cases that existing agreements are revisited and updated accordingly.

Hallucinations and Safeguards: Integrity First

LLMs are excellent at generating code and interpreting natural language, but they do not perform mathematics, statistical reasoning, or validation of scientific results. Their outputs can include hallucinations: plausible but incorrect information that poses serious risks in clinical and scientific domains. Researchers should always use verified, peer-reviewed code for statistical calculations. LLMs may assist with code generation but are not reliable sources of numerical truth.
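
To illustrate that division of labour, here is a minimal Python sketch (not part of Aira; the data are simulated purely for illustration): a model may help draft code like this, but the statistic itself comes from a verified library, SciPy, rather than from the model's own arithmetic.

```python
# Minimal sketch: let a verified, peer-reviewed library do the arithmetic.
# An LLM may help draft code like this, but the numbers come from SciPy,
# never from the model itself. The data here are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=5.2, scale=1.1, size=120)  # simulated outcomes
control = rng.normal(loc=4.8, scale=1.1, size=118)

# Welch's t-test computed by SciPy, not a model-quoted p-value.
result = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```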

Aridhia’s DRE also provides several technical and governance safeguards to review code and output:

  • Code Versioning and Review via Gitea: Gitea integration enables isolated version control and peer review of code artifacts. This supports transparency, reproducibility, and auditability of all analytic logic.
  • Audited Model Use: Every model interaction is logged, creating a clear audit trail for model behaviour and data lineage.
  • Outbound Airlock Controls:
    • All data and results exiting the workspace are subject to structured governance inspection.
    • Review includes model output validation, provenance checks, and metadata tagging.
    • Built-in SACRO (Semi-Automated Checking of Research Outputs) support for enhanced airlock processes, enabling collaborative multi-step approval involving governance stakeholders; a flavour of this kind of output check is sketched below.
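
As a flavour of the kind of rule an output review might apply before results pass the airlock, here is a minimal, hypothetical Python sketch of small-cell suppression, a common statistical disclosure control technique. It is not SACRO's implementation; the threshold and column names are assumptions for illustration only.

```python
# Illustrative only: small-cell suppression, a common disclosure-control rule
# applied before aggregate results leave a secure environment. This is not
# SACRO's implementation; the threshold and column names are hypothetical.
import pandas as pd

THRESHOLD = 5  # suppress any aggregate built from fewer than 5 subjects

def suppress_small_cells(counts: pd.DataFrame, count_col: str = "n") -> pd.DataFrame:
    """Blank out counts below the threshold so individuals cannot be singled out."""
    safe = counts.copy()
    safe[count_col] = safe[count_col].astype("Int64")  # nullable ints allow <NA>
    safe.loc[safe[count_col] < THRESHOLD, count_col] = pd.NA
    return safe

cohort_summary = pd.DataFrame({"diagnosis": ["I10", "E11", "C50"], "n": [142, 37, 3]})
print(suppress_small_cells(cohort_summary))
```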

Inside the AI Research Assistant Framework

Designed for flexible, secure AI experimentation, the framework includes:

  • Offline Model Hosting: Administrators deploy models directly into the DRE, eliminating external data exposure. Models are made available to workspace users and isolated entirely to that workspace. The framework allows for model scheduling and prioritisation, with the flexibility to run on CPU only or with GPU acceleration.
  • Interactive & Batch Inference: Researchers can explore model results interactively through a direct user interface or automate workflows via OpenAPI-compatible endpoints. The Aira API is available from applications in the workspace, allowing batch inference to be run from R or Python code, either as embedded workspace applications or as background jobs (a minimal sketch follows this list).
  • Your Models, Your Way: Aira enables you to go beyond generalist or outdated models, with the platform supporting domain-specific pretrained models and straightforward bring-your-own-model integration. Deploy best-in-class models for your specific tasks, whether sourced from leading public leaderboards such as MEDIC LLM or developed as in-house small language models (SLMs), ensuring optimal performance for your use cases.
  • PyTorch and TensorFlow Compatibility: Researchers can work with popular frameworks, ensuring extensibility and consistency with existing infrastructure and broadening how they can interact with models.
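
As a flavour of what batch use might look like from Python inside a workspace, here is a minimal, hypothetical sketch. The endpoint URL, route, model name, and response field are assumptions made for illustration; since the framework exposes OpenAPI-compatible endpoints, the actual request shape should be taken from the workspace's own API reference.

```python
# Minimal, hypothetical sketch of batch inference against an Aira endpoint
# from Python inside a workspace. The URL, payload shape, and response field
# are placeholders; consult the workspace's API reference for the real ones.
import requests

AIRA_URL = "http://aira.workspace.local/api/v1/generate"  # hypothetical route

def classify_note(note: str) -> str:
    """Send one free-text note to the locally hosted model and return its answer."""
    payload = {
        "model": "clinical-llm",  # hypothetical locally deployed model name
        "prompt": f"Classify smoking status in this note as current/former/never:\n{note}",
    }
    response = requests.post(AIRA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["output"]  # field name assumed for illustration

notes = [
    "Patient quit smoking 10 years ago; no current tobacco use.",
    "Smokes 20 cigarettes per day for the last 15 years.",
]

# A batch job is simply a loop over the secure dataset, run interactively
# or submitted as a background job from the workspace.
results = [classify_note(n) for n in notes]
print(results)
```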

Aira enables customers to experiment with, fine-tune, and deploy models, whether open source, commercial, or in-house, all within the secure boundary of the DRE. This ensures model innovation can occur without any data leaving the environment. We recognise that newer models are not always better, as recent examples such as OpenAI’s ChatGPT-5 regressions have shown.
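
To show what "offline" means at the code level, the sketch below loads a model from a path inside the environment rather than from an external hub. It assumes the Hugging Face transformers library (on top of PyTorch) is installed in the workspace and that a model has already been deployed to the hypothetical path shown; it is an illustration, not a prescribed workflow.

```python
# Minimal sketch of fully offline model loading with PyTorch + transformers.
# The model directory is hypothetical; the point is that weights are read from
# inside the secure environment, with no calls out to external model hubs.
import os

os.environ["HF_HUB_OFFLINE"] = "1"  # set before import so no hub lookups occur

from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "/models/clinical-llm-7b"  # hypothetical local path set by an admin

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, local_files_only=True)

inputs = tokenizer("Summarise the trial inclusion criteria:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```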

Broader Use Cases for Aira

A primary focus for Aira is secure, assisted code generation for analytics in R, Python, and SQL. This capability helps researchers rapidly develop and refine analysis pipelines, query data, and create visualisations, all within the DRE environment. Every code snippet produced can be peer-reviewed, version-controlled, and validated before execution, ensuring reproducibility and scientific integrity.

However, beyond code generation, Aira provides support for a wide range of high-value use cases, including:

  • Natural Language Processing (NLP) on clinical notes
    Extract key clinical concepts, identify patient cohorts, or analyse patterns in free-text health records using offline language models. This accelerates research recruitment, outcomes tracking, and safety monitoring without exposing raw text outside the DRE.
  • Genomics and omics model fine-tuning
    Adapt domain-specific AI models, such as classifiers for genomic variants or proteomics patterns, using sensitive datasets. This enables tailoring to specific populations or disease areas while keeping raw sequence data securely contained.
  • Medical imaging batch inference and review
    Execute large-scale image classification or segmentation workflows (for example, tumour detection on MRI scans) entirely within the DRE. Outputs can be reviewed collaboratively through SACRO’s airlock process, ensuring results are validated before leaving the environment (a minimal sketch follows this list).
  • Agentic operations within the DRE
    Use AI-driven workflows that can autonomously chain together tasks such as data preparation, model execution, and result summarisation. Agentic capabilities allow more complex research processes to run end-to-end, always within the security and governance boundaries of the DRE.
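
As a flavour of the medical imaging use case above, here is a minimal, hypothetical sketch of batch inference with PyTorch and torchvision. The checkpoint path, image folder, and two-class labelling are assumptions for illustration; in practice the model and data would be whatever has been provisioned inside the workspace.

```python
# Minimal, hypothetical sketch of batch image classification inside the DRE.
# The checkpoint path, image folder, and class meaning are illustrative only.
from pathlib import Path

import torch
from PIL import Image
from torchvision import models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical locally stored, fine-tuned classifier (e.g. lesion vs no lesion).
model = models.resnet18(weights=None, num_classes=2)
model.load_state_dict(torch.load("/models/mri_classifier.pt", map_location="cpu"))
model.eval()

results = {}
with torch.no_grad():
    for path in sorted(Path("/data/mri_slices").glob("*.png")):
        x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        prob = torch.softmax(model(x), dim=1)[0, 1].item()
        results[path.name] = round(prob, 3)

# Results stay inside the workspace until reviewed through the airlock.
print(results)
```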

Together, these capabilities allow organisations to test, train, and deploy models without compromising the confidentiality of datasets, prompts, or outputs.

Aridhia’s AI Research Assistant Framework delivers the most compliant and capable platform for working with LLMs in life sciences. It puts full control in the hands of researchers and data owners, supporting both innovation and integrity.

Contact us today if you would like to learn more about the AI Research Assistant Framework.