CPRD COVID-19 symptoms and risk factors synthetic dataset

Release date: 

Citation: Clinical Practice Research Datalink. (2020). CPRD COVID-19 symptoms and risk factors synthetic dataset (Version 2020.11.001) [Data set]. Clinical Practice Research Datalink. https://doi.org/10.11581/XG37-GT17


This synthetic dataset is based on anonymised real primary care patient data extracted from the CPRD Aurum database. The dataset focuses on patients presenting to primary care with symptoms indicative of COVID-19 and includes data on sociodemographic and clinical risk factors.

The development of this dataset was funded by NHSX. It used the synthetic data generation and evaluation framework developed under a grant from the Regulators’ Pioneer Fund, which was launched by the Department for Business, Energy and Industrial Strategy (BEIS) and is managed by Innovate UK.

The dataset includes 248,678 patients.

Further information is available on the Synthetic data web page.

Please contact enquiries@cprd.com if you have any questions.