CPRD Safe - our Trusted Research Environment

Trusted Research Environments (TREs) are considered the future method for access to healthcare data.

In recent years there has been a growing public awareness of data privacy and calls for greater transparency of data sharing, including assurances that healthcare data is safe, secure, and only used for its intended purposes.

CPRD has provided access to anonymised patient data for research for the benefit of public health for 35 years. We have an excellent track record in ensuring that data is safe, secure, and only used for its intended purposes, as recently validated by an external NHS Audit. To further strengthen the secure, safe use of data for research, we have created CPRD Safe, our dedicated TRE.

  • About Trusted Research Environments
  • About CPRD Safe
  • How CPRD Safe works
  • Technical details
  • Onboarding to CPRD Safe
  • Next steps

CPRD Safe continuing to provide secure data

An image showing data coming from a hospital and GP surgeries. The data passes through a lock, representing pseudonymisation, before going into a secure safe.

About Trusted Research Environments

Trusted Research Environments (TREs), also known as Secure Data Environments (SDEs) or Data Safe Havens, are highly secure computing environments that give approved researchers remote access to data for public health research.

This video short from Understanding Patient Data briefly explains what a TRE/SDE is.

TREs differ from widely used models of data access where researchers need to download data onto their computer to be able to use it for their analysis.

You can find out more about TREs in this document by HDR UK: https://www.hdruk.ac.uk/wp-content/uploads/2021/09/HDRUK_TRE-One-Pager.pdf

We are developing our dedicated TRE, CPRD Safe, in line with the Five Safes framework, allowing approved researchers to access our data in a secure and controlled way.

The Five Safes are:

  1. Safe People – Researchers are trained and authorised to use the data safely
  2. Safe Projects – Research projects are approved by data owners for public good
  3. Safe Setting – A secure environment that prevents unauthorised use
  4. Safe Data – Data is treated to protect any confidentiality concerns
  5. Safe Outputs – Approved outputs that are non-identifiable

An image of a safe surrounded by five shields, each containing an icon representing one of the Five Safes: Safe People, Safe Projects, Safe Setting, Safe Data and Safe Outputs.

Airlock safe

In response to patient feedback we have developed an “Airlock” which uses both human and automated checking to ensure any data leaving CPRD Safe is a “Safe Output”. 

Find out more in the news article Patients reviewed our CPRD Trusted Research Environment ‘airlock’ system.

About CPRD Safe

CPRD Safe gives approved researchers with approved projects secure access to CPRD healthcare data. All patient information in CPRD Safe is anonymised, which means that any identifying (or personal) information, such as names, addresses or NHS numbers, is removed and individuals cannot be identified.

An image of a file with a pseudonymised document being passed through CPRD Safe to become anonymised.

 

How CPRD Safe works – a guide for potential users

Using Government Digital Service (GDS) standards, we have created a secure, scalable, reliable and accessible service.
Please go to the training guide: CPRD Safe features: a guide for users | CPRD

The technical components of CPRD Safe

CPRD Safe consists of a secure shared workspace that users connect to via a virtual machine. The workspace is protected from the internet and provides access to research data, code libraries and analytics tools.

A diagram of CPRD Safe showing that data comes from Observational research servers and is placed into a CPRD Safe workspace by the Observational research team. An approved researcher accesses the workspace, which has statistics packages for analysis, via virtual machines. An airlock is used to quarantine outputs while automated and human checks are carried out, before they can be exported to an approved researcher. Code and scripts can be brought in via GitHub and Gitea.
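
The airlock flow in the diagram can be illustrated with a short Python sketch. It is not the CPRD Safe implementation: the class, field and status names are hypothetical, and the "rejected" status is an assumption about what happens when a check fails. It only shows the rule described above, that an output stays in quarantine until both the automated and the human check have passed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    QUARANTINED = "quarantined"  # waiting for one or both checks
    RELEASED = "released"        # both checks passed, export allowed
    REJECTED = "rejected"        # assumption: a failed check blocks export


@dataclass
class OutputRequest:
    """A hypothetical record for one file a researcher asks to export."""
    filename: str
    automated_check_passed: Optional[bool] = None  # None means not yet run
    human_check_passed: Optional[bool] = None      # None means not yet reviewed


def airlock_status(request: OutputRequest) -> Status:
    """Work out whether an output may leave the airlock."""
    checks = (request.automated_check_passed, request.human_check_passed)
    if any(result is False for result in checks):
        return Status.REJECTED
    if all(result is True for result in checks):
        return Status.RELEASED
    return Status.QUARANTINED


if __name__ == "__main__":
    request = OutputRequest("summary_table.csv")
    print(airlock_status(request).value)  # no checks run yet
    request.automated_check_passed = True
    request.human_check_passed = True
    print(airlock_status(request).value)  # both checks passed
```

In this sketch a single failed check is enough to block an export, and a pass from one check alone never releases an output.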

 

The Goldacre Review

In February 2021, Professor Ben Goldacre was commissioned by the UK Government to review safety and security in the use of health data for research and analysis.

The Goldacre Review – Better, broader, safer: using health data for research and analysis – was released in April 2022. It made 57 recommendations focusing on the use of TREs as the future direction of travel for strengthened and consistent management of, and access to, healthcare data.

In June 2022, the Department of Health and Social Care released the policy paper Data saves lives: reshaping health and social care with data, which supported the recommendations in the Goldacre Review.

In line with this UK-wide direction of travel, our major strategic aim is therefore to move to a predominantly TRE-based model for data access and analysis.

Most research that uses our data will take place within CPRD Safe, but there may be limited instances, such as patient-consented trials, where data can leave the TRE to be combined with other datasets for analysis.

What we have delivered

In April 2022 we began working on the first iteration of CPRD Safe. This initial version allowed our researchers to log into a secure environment and use tools such as R, Python and Stata to analyse synthetic (artificial) data.

In April 2023 we tested a second iteration of CPRD Safe with internal and external users. This version allowed researchers to log in to dedicated areas of the secure environment, called workspaces, and analyse anonymised real-world data.

By April 2024 we had completed and tested the third iteration of CPRD Safe. It was then launched, and we started to onboard our first research team.

In August 2024 we updated our airlock mechanism and output checking processes with improved security, whilst making them easier to use.

Onboarding to the TRE

A.    Initial Application:

Timings are dependent on the applicant.

1.    Use the eRAP (electronic Research Application Portal) to fill out an application.

B.    Validation of New Protocols to first RDG Outcomes:

2.    The RDG reviews the research protocol and requests clarifications or provides approval.

3.    For full applications (such as SSL) this step takes an average of 30 working days from valid application submission to first outcome.

C.    Contract Process:

From mail out to signature: this step usually takes a minimum of 4 weeks.

4.    The client receives a contract template.
       a.    The contract includes agreements related to end users and data usage.

5.    The client legal team reviews the contract and returns it signed.

D.    Data Specification:

This step usually takes 8-12 weeks.

6.    The client is contacted to start defining data requirements (using the data specification form) within 15 days, as per the SLA.

7.    The client completes the data specification form and the data specification is agreed; this takes 6 weeks on average.

E.    Data Delivery:

This phase spans a total of 6 weeks and includes concurrent activities:

8.    Data and online training:
·    Workspace Owner and Super User Training on CPRD Safe (4 weeks).
·    Data Cut: Extracting and preparing the data (within 30 days as per SLA).

F.    Workspace Creation

This phase takes 1 to 2 weeks.

9.    Data imported into CPRD Safe and checked.

10.    Researcher access to data within CPRD Safe.

11.    Research starts.

Remember that these timelines are averages, and actual durations may vary based on client-specific processes and our internal procedures. When planning research studies, consider these steps and their associated timeframes before data analysis can commence.
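
As a rough planning aid, the timeframes quoted above can be added up. The Python sketch below is illustrative only and rests on assumptions that are not stated on this page: that phases B to F run one after another, and that 30 working days equals 6 weeks (5 working days per week). Phase A is left out because its timing depends on the applicant. The names `Phase`, `PHASES` and `estimate_total_weeks` are made up for this example.

```python
from dataclasses import dataclass

WORKING_DAYS_PER_WEEK = 5  # assumption used to convert working days to weeks


@dataclass(frozen=True)
class Phase:
    name: str
    min_weeks: float
    max_weeks: float


# Durations as quoted in the onboarding steps above.
PHASES = [
    # 30 working days on average from valid submission to first outcome.
    Phase("B. Validation to first RDG outcome",
          30 / WORKING_DAYS_PER_WEEK, 30 / WORKING_DAYS_PER_WEEK),
    # Quoted as a minimum of 4 weeks, so the real figure may be higher.
    Phase("C. Contract process", 4, 4),
    Phase("D. Data specification", 8, 12),
    Phase("E. Data delivery", 6, 6),
    Phase("F. Workspace creation", 1, 2),
]


def estimate_total_weeks(phases):
    """Return (low, high) totals in weeks, assuming the phases are sequential."""
    low = sum(phase.min_weeks for phase in phases)
    high = sum(phase.max_weeks for phase in phases)
    return low, high


if __name__ == "__main__":
    low, high = estimate_total_weeks(PHASES)
    print(f"Rough estimate: {low:g} to {high:g} weeks after a valid application")
```

Under these assumptions the total comes to roughly 25 to 30 weeks. Because the contract figure is a minimum and several other figures are averages, the upper number is not a guaranteed maximum.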

Next steps

We continue to onboard SSL clients and further develop CPRD Safe for RDG and MSL services. 

December 2024: Launch of new CPRD Safe website content and online training materials. 

October 2025: RDG clients will be able to use CPRD Safe and we will begin migration of MSL clients to CPRD Safe. 

High-level roadmap milestones

Roadmap for TRE delivery up to Q4 2026

Our roadmap may change in response to legislative, technological and client needs.

We will continue to update this page as we progress through the development stages.

Further information

GitHub - MHRA/cprd-oss-tre: An accelerator to help organizations build Trusted Research Environments on Azure.

Medicines and Healthcare products Regulatory Agency Delivery Plan 2021-2023

CPRD Safe features: a guide for users | CPRD
