Guidance on completion of a CPRD Research Data Governance (RDG) Application

General Information

  • All CPRD research applications must be completed and submitted via CPRD’s electronic research application portal (eRAP) (www.erap.cprd.com). Applications cannot be submitted on the CPRD RDG Protocol Application Form in Word format. An eRAP guide for users can be found here https://www.cprd.com/cprd-erap-guide-users
  • All research applications must be supported by a research team comprising at least two members.
  • All new research applications are subject to a two-stage validation process prior to entering the scientific review process. This validation is conducted separately by the CPRD RDG Secretariat and RDG Scientific Support team. It is essential that applicants adhere to the guidance relating to completing the application and protocol sections to avoid applications being returned as invalid. Please note that protocols that are poorly developed, that is, have a paucity of methodological details in most sections, may be returned to applicants as invalid during the application validation stage. All sections that are validated are indicated in the respective application and protocol sections.
  • During CPRD’s scientific review process, all observational research seeking access to CPRD primary care and/or linked data for public health research is reviewed for study feasibility and research team expertise/experience, the public health benefits/risks of the research and potential information governance risks (risks to patient confidentiality and privacy).
  • Protocols developed as theoretical frameworks for undertaking research are not eligible for consideration under the RDG process. This is because research frameworks do not provide sufficient details about the specific research to enable an explicit evaluation of study feasibility and research team expertise/experience, the public health benefits/risks of the research and potential information governance risks (risks to patient confidentiality and privacy) during the review process.
  • Applicants conducting programmes of research requested by, or on behalf of, medicine regulators, that is, organisations responsible for granting marketing authorisation for medicines, should contact the CPRD for further information (enquiries@cprd.com).
  • Where research in CPRD will be used to inform a wider body of research e.g., economic evaluations, clinical guidance development, software performance evaluation etc., the CPRD research application must focus on the research question to be addressed using CPRD data. For example, where CPRD data will be used to obtain estimates on the burden of illness or disease transition states to be used as input parameters in decision analysis models, the research protocol must focus on the use of CPRD data to generate the input parameters and not on the economic analysis per se. Information about the decision analysis models should be included as supporting information only, as an appendix to the protocol.
  • Where multi-database research is planned, please indicate this in the RDG protocol, including how CPRD data or findings from the CPRD study will be used alongside data or findings from the wider research. Please note that the RDG protocol must focus on the research that will be conducted in CPRD.
  • The outcomes of a research application are ‘Approved’, ‘Amendment required’ or ‘Rejected’. An outcome of ‘Rejected’ does not preclude applicants from submitting a new protocol. Please note that all CPRD research protocols have a lifespan of four calendar years from the date of first approval.
  • For the benefit of patients and the public, the following information on approved RDG applications is published on CPRD’s website 3 months after approval: Study title, Date of approval, Study protocol number, Lay Summary, Technical Summary, Health outcomes to be measured, Collaborators and Linked data requested. For further information on the publication of approved studies using CPRD data, please go to https://www.cprd.com/approved-studies-using-cprd-data
  • Where research is undertaken under a CPRD multi-study licence (MSL), CPRD may make the following information on approved RDG applications available to the administrators of the annual licence, if requested – study protocol number, study title, data sources requested, and all collaborators listed on the application. This information will be used by the licence administrators for data monitoring purposes. 

Part 1: Application Form 

GENERAL INFORMATION ABOUT THE PROPOSED RESEARCH STUDY

Question 1: Study Title (Max. 255 characters including spaces)

Please note that the study title is validated by CPRD.

Please note that the study title of approved research studies is published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised patient data collected by CPRD are used for research.

It is important to ensure that the title of the study is clear, concise, easy to understand, and accurately reflects the main purpose/focus of the study.

The title should be reflective of the overarching study aim. The title of a hypothesis-testing study should give a clear indication of the primary exposure(s) and outcome(s). Ideally, the title should also refer to the study design.

Example 1: Incretin based drugs and risk of adverse renal outcomes
Example 2: Topical corticosteroids and risk of type 2 diabetes: a nested case-control study

Similarly, for a descriptive study, an example of a good title would be ‘The prescribing of codeine for the treatment of pain in children: a descriptive study’.

Avoid catchy titles that are vague about the study aim. An example of an unsuitable title would be: ‘Pneumonia - the old man’s friend’.

Avoid including acronyms in the title. Where an acronym cannot be avoided, define it at first use.

Applications with titles of more than 255 characters will be returned as invalid.
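
As an informal pre-submission check, the title length rule above can be sketched in a few lines of Python. This is only an illustration: the function name `title_length_ok` is hypothetical, and it assumes that eRAP counts every character, including spaces, in the same way as Python's `len`. The portal's own count remains the authority.

```python
MAX_TITLE_CHARS = 255  # limit stated in this guidance (spaces included)


def title_length_ok(title: str, limit: int = MAX_TITLE_CHARS) -> bool:
    """Return True if the title is within the character limit, counting spaces."""
    return len(title) <= limit


# Example using one of the titles given above.
title = "Topical corticosteroids and risk of type 2 diabetes: a nested case-control study"
print(len(title), title_length_ok(title))
```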

Question 2: Research Area (Please note that this section is validated by CPRD)

Specify the research area of the proposed study. Applicants must select at least one box. 

Question 3: Purely Observational Research (Please note that this section is validated by CPRD) 

Where the research proposed only involves the use of anonymised primary care and linked data routinely available via CPRD, separate research ethics approval is not required for your research. However, approval from an NHS Research Ethics Committee is required if the proposed study is not purely observational, that is, it involves direct contact with patients, or the work proposed involves research about health professionals at CPRD contributing practices.

Question 4: GP or Patient Questionnaires/Contact (Please note that this section is validated by CPRD)

CPRD encourages consultation and/or piloting of questionnaires with the target population (health care professionals or patient groups) outside of the CPRD, where possible. Evidence of this work should be included in the research application, where available.

GP Questionnaires

All questionnaires to be completed by GP practice staff must be reviewed and approved via the CPRD RDG process before being used. All questionnaires must be included as an Appendix to the research application at the time of submitting the research protocol unless the research proposed will inform future development of the questionnaire or the questionnaire is being piloted/tested. Where the research proposed will inform future development of the questionnaire or is being piloted/tested, the protocol must state that these will be submitted for RDG approval, as a post approval amendment, prior to use.

Patient questionnaires/patient contact

All studies involving patient questionnaires and/or patient contact must have a favourable opinion from an NHS Research Ethics Committee (REC) to undertake this research. Where patient data from CPRD are also required, applicants must also seek RDG approval to access these data following NHS REC approval. The HRA approval letter, NHS REC favourable opinion, the REC-approved questionnaire template and all relevant patient-facing material approved as part of the study must be included as an Appendix to the CPRD RDG application at the time of submitting this on eRAP.

Pre-approval of Questionnaires

Applicants must seek pre-approval of their questionnaire design and timelines by submitting an enquiry to the CPRD Interventional Research team via enquiries@cprd.com.  

Please note the following requirements relating to questionnaires at the time of pre-approval: 

  • All CPRD questionnaire studies are now implemented via an online study platform. 
  • All questionnaire templates must include an explanation of the purpose of the study for the recipient and any guidance on completion.
  • Questionnaires must be provided in the form/format/layout in which they will be administered.
  • Questionnaire templates must not include free text fields. 
  • GP questionnaires are typically no longer than 10 questions (including sub-questions). 
  • Where validated instruments will be used in the research, you must provide evidence that the necessary licence/permissions to use the instrument have been granted, where applicable. Applicants must also make clear whether modifications of the instrument are permitted. 
  • Fees for implementation and reimbursement of GP practices are in addition to data access fees (https://cprd.com/cprd-prove-providing-online-verification-ehr-pricing).  
  • Fees and timelines for questionnaire administration will be confirmed by the CPRD Interventional Research team as part of the pre-approval enquiry. 
  • Applicants must quote the enquiry reference number relating to questionnaire pre-approval to support their protocol application.

Reviewer Assessment Criteria

GP Questionnaires

All GP questionnaires submitted for RDG review will be evaluated for feasibility, public health benefits/risks and information governance risks as indicated below: 

  • For an assessment of feasibility, reviewers will consider whether the questionnaire is based on treatment or management at the GP surgery or other settings as this may affect feasibility and reliability of the data collected. 
  • For an assessment of public health benefits/risks, reviewers will consider how the questionnaire population will be selected e.g., use of random selection or other justified approaches to minimise the risk of selection bias;  determine whether questions are clearly defined to minimise the risk of misinterpretation resulting in inaccurate or missing data;  and determine whether the questions include or reflect known variations in data recording in clinical practice which may result in incomplete data capture.
  • For information governance risks, reviewers will consider whether the condition is rare or uncommon and whether specific questions could reveal attributes that may lead to re-identification of patients in the study. Depending on the initial information governance assessment, reviewers may request a risk mitigation plan or refer the application for review by the CPRD Information Governance team.  

Please note that substantial changes to a GP questionnaire following RDG approval (e.g., adding or removing one or more questions) will require a post approval amendment of the application and questionnaire. Minor changes will be recorded by the CPRD Interventional Research team as part of the implementation (e.g., updates to question guidance, re-ordering a question, removing options from multiple choice questions).

Patient Questionnaires/Patient Contact

All patient questionnaires submitted for RDG review will be evaluated for feasibility, public health benefits/risks and information governance risks as indicated below. Please note that all studies involving patient questionnaires and patient contact will be subject to CPRD Information Governance review. 

  • For an assessment of feasibility, reviewers will consider whether the questionnaire can be implemented in the setting/s proposed, and whether data collection is likely to be reliable.
  • For an assessment of public health benefits/risks, reviewers will consider how the questionnaire population will be selected e.g., use of random selection or other justified approaches to minimise the risk of selection bias.
  • For information governance risks, the CPRD Information Governance review team will confirm that the patient population have consented to have their patient-reported data collected and combined with their CPRD primary care data. This will involve a review of the HRA approval letter, NHS REC letter of favourable opinion, as well as any/all other documentation (e.g. privacy notice) which is provided to patients. The REC-approved questionnaire template and all relevant patient-facing materials must be included as an Appendix to the CPRD RDG application.

Question 5: Chief Investigator (Please note that this section is validated by CPRD)

Please note that the name and affiliation of the Chief Investigator on approved research studies are published on the CPRD website for the benefit of patients and the public, to inform them of how CPRD’s anonymised patient data are used for research.

The Chief Investigator is responsible for ensuring that the research is undertaken with full adherence to CPRD RDG guidelines, and any contractual terms and conditions. As such, students and junior researchers are not eligible to act as the Chief Investigator on CPRD research studies.  

The full name, job title, organisation name, and e-mail address for corresponding with the Chief Investigator must be included. The organisational affiliation of the Chief Investigator will be the sponsor of the proposed study.

All applicants must also indicate whether they have statistical experience, experience of handling large datasets and/or experience practising in UK primary or secondary care, when registering for an account on the CPRD electronic research application portal (eRAP) (https://www.erap.cprd.com).

Research teams without statistical experience, experience of handling large datasets and/or experience practising in UK primary or secondary care are encouraged to collaborate with other researchers with experience in the relevant areas, prior to submitting their research for RDG review.

Please note that all research applications must be supported by a research team comprising at least two members, and at least one member of the research team must be listed as accessing the data for the research study.

Question 6: The Corresponding Applicant (Please note that this section is validated by CPRD)

Please note that the name and affiliation of the Corresponding Applicant on approved research studies are published on the CPRD website for the benefit of patients and the public, to inform them of how CPRD’s anonymised patient data are used for research.

The Corresponding Applicant is the direct point of contact for the RDG Secretariat and is authorised to submit the application on behalf of the Chief Investigator. It is also acceptable for the Chief Investigator to be the corresponding applicant.

All applicants must indicate whether they have statistical experience, experience of handling large datasets and/or experience practising in UK primary or secondary care, when registering for an account on the CPRD electronic research application portal (eRAP). Research teams without statistical experience, experience of handling large datasets and/or experience practising in UK primary or secondary care are encouraged to collaborate with other researchers with experience in the relevant areas, prior to submitting their research for RDG review.

Please note that all research applications must be supported by a research team comprising at least two members, and at least one member of the research team must be listed as accessing the data for the research study.

Question 7: Other investigators/collaborators (Please note that this section is validated by CPRD)

Please note that the name and affiliation of other investigators / collaborators on approved research studies are published on the CPRD website for the benefit of patients and the public, to inform them of how CPRD’s anonymised patient data are used for research.

Anyone who will access AND use CPRD data to conduct the research detailed in the protocol must be named in the CPRD RDG protocol. All investigators or collaborators must have an authorised eRAP account for a protocol to be submitted.

At the time of registering for an eRAP account, applicants must indicate whether they have statistical experience, experience of handling large datasets and/or experience practising in UK primary or secondary care. Research teams without this experience are encouraged to collaborate with other researchers who have such experience on their proposed CPRD research.

Please note that all research applications must be supported by a research team comprising at least two members. At least one member of the research team must be listed as accessing the data for the research study.

ACCESS TO THE DATA 

Question 8: Sponsor of the study (Please note that this section is validated by CPRD)

The sponsor for the study is a company, institution, organisation, or group of organisations that takes on responsibility for initiation, management, and financing (or arranging the financing) of the proposed research.

A sponsor can delegate specific responsibilities to any other organisation that is willing and able to accept them. Any delegation of responsibilities to another party should be formally agreed and documented by the sponsor.

It is the sponsor who determines what data is requested for the research study through the protocol.

The sponsor organisation is the affiliation of the Chief Investigator.

Question 9: Funding source for the study (Please note that this section is validated by CPRD)

Specify the primary funding source for the study. Any organisation, or group of organisations, providing funding for the research project should be listed, including any grants and the awarding bodies.

Funding organisations will need to be CPRD approved funders. Organisations can apply using the New Funder Request for Access to CPRD Data form, available on our website https://www.cprd.com/Data-access. This should be completed and returned to CPRD Enquiries (enquiries@cprd.com).

Question 10: Institution conducting the research (Please note this section is validated by CPRD)

Applicants must specify the name and address of the institution that will be conducting the research using CPRD data where this is not the sponsor organisation.

Question 11: Data Access Arrangements (Please note this section is validated by CPRD)

State the method that will be used to access the data for this study - a study-specific dataset agreement or an institutional multi-study licence. If a licence is to be used, please indicate the licensing institution name and address.

Please note that, for applicants requesting NCRAS Systemic Anti-Cancer Treatment (SACT) or NCRAS National Radiotherapy Dataset (RTDS) data, CPRD must extract and deliver all primary care and linked data for the study, regardless of whether a multi-study licence is in place.

Datasets that will be extracted by CPRD

Investigators must discuss requests for CPRD to extract full datasets with a member of the CPRD Research Team before submitting a CPRD RDG application. Please contact the CPRD Research Team (enquiries@cprd.com) to discuss your requirements. You must state the enquiry reference number associated with your contact with CPRD in this section.

Multiple data delivery

Applicants must state whether multiple data deliveries are needed. Multiple data delivery refers to planned repeated data extractions of a single-study dataset or linked data requests. There may also be a cost implication for CPRD to service multiple data deliveries.

Where multiple data extracts are needed over the lifespan of a research study, applicants must discuss their request with a member of the CPRD Research team before submitting the CPRD RDG application on eRAP. Multiple data deliveries are permitted in the following situations:

  • Planned interim reporting for regulatory requirements,
  • Research on the impact of the introduction of interventions/policies/changes in practice, with reporting planned at multiple time points.

Please note that:

  • subgroup analyses
  • multiple planned work packages
  • plans to first define cases and later define controls

are not acceptable reasons to approve a request for multiple data delivery. 

Split deliveries for single-study datasets are not considered under multiple data deliveries and should be discussed with the study team as part of drafting the dataset specification. Please contact the CPRD Research team (enquiries@cprd.com) and provide the following information about your protocol: the proposed study title, aims and objectives, justification for why multiple data deliveries are needed, how many are planned and the time points for delivery.

You must state the enquiry reference number associated with your contact with CPRD in this section.

Question 12: Data Processor(s) (Please note that this section is validated by CPRD)

We require information on any organisation that will be processing, accessing, or storing the data requested by the applicant.

For each location, applicants must: specify whether the organisation is processing, accessing, or storing data, and provide the organisation name, address, and processing area. The data processing areas are – UK, European Economic Area (EEA), or Worldwide. It may be that one location stores, processes, and analyses the data.

For studies conducting research under a CPRD single study licence (SSL), all data and research must be accessed and conducted on CPRD Safe (https://www.cprd.com/cprd-safe-our-trusted-research-environment). If this is applicable to your research, the data processing area must be set to 'UK'.

If the sponsor organisation will provide access to the data via a multi-study licence (MSL), the sponsor organisation must be listed as a separate data processing location.

INFORMATION ON DATA

Primary care data collected by the CPRD are linked to several other patient and area level datasets, including Hospital Episode Statistics, Office for National Statistics mortality data, Cancer Registry data, Index of Multiple Deprivation data etc., for patients at English practices where the practices have consented to participate in the linkage scheme.

Information on the range of linked data available via CPRD can be found here https://www.cprd.com/cprd-linked-data.

If you have any questions about accessing linked data, please contact CPRD Enquiries (enquiries@cprd.com).

Question 13: Primary Care data (Please note that this section is validated by CPRD)

Vision and EMIS are different clinical software systems used by general practices in the United Kingdom primary care setting. CPRD has historically collected data from practices using the Vision software system, which are referred to as the CPRD GOLD primary care data. CPRD also collects data from practices using the EMIS software system, which are referred to as the CPRD Aurum primary care data.

Question 14: Requests to access linked data (Please note that this section is validated by CPRD)

Please note that linked data sources used in approved research studies are published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised linked patient data collected by CPRD are used for research.

For all linked data requests, applicants must outline under the protocol section on “Planned use of linked data (if applicable), including the public health benefits to patients in England & Wales” how the main outputs of the proposed study will benefit patients in England and Wales. You may base your justification on how the study findings would improve patient care either directly or indirectly by informing clinical practice guidelines or public health policy. Please provide justification for each linked data source requested.

Where access to the linked data sources below is requested, at least one applicant named on the CPRD RDG application form must discuss the linkage with a member of the CPRD Research Team (enquiries@cprd.com), prior to submission of the RDG application:

  • NCRAS Systemic Anti-Cancer Treatment (SACT) data
  • NCRAS National Radiotherapy Dataset (RTDS) data

Applicants seeking access to NCRAS SACT or RTDS data must also complete CPRD’s NCRAS Data Selection form (available from CPRD on request) and submit the CPRD approved version via the CPRD RDG process as an appendix to the protocol.

Please note that applicants with approved access to NCRAS data for their research must also agree to the publication of their study title and study institution details on the UK Cancer Registry website.

As a risk minimisation measure, CPRD routinely provides only one practice and/or one patient level area linkage per study. If you require more than one practice or patient level area linkage (e.g. practice level IMD and Rural-Urban classification), this will need to be discussed with a member of the CPRD Research Team (enquiries@cprd.com), before submitting a CPRD RDG application. Applicants must include the enquiry reference number in the application to evidence the discussion with CPRD about the study requirements.

Question 15: Requesting non-standard data linkage (Please note that this section is validated by CPRD)

Investigators wishing to link to a dataset not listed under Question 14 of the CPRD research application form must receive CPRD approval for the respective linkage, prior to submitting a CPRD RDG protocol. 
Applicants must provide the Non-Standard Linkage (NSL) reference number for the CPRD approved linkage as part of their research application.

Please contact the CPRD Research Team (enquiries@cprd.com) for more information on accessing data as a non-standard linkage.  

Question 16: Patient identifiers (Please note that this section is validated by CPRD)

Investigators must state whether any person named in the study has access to the data in a patient identifiable form, or any associated identifiable patient index.

If the answer to this question is ‘Yes’, applicants must provide a re-identification risk management plan as an appendix and refer to it here and in relevant sections of the protocol.

The re-identification risk management plan should provide a thorough and robust account of how the risk of re-identifying patients in CPRD data will be made negligible. The plan should consider the following characteristics of the environment where CPRD data will be processed, and ensure appropriate controls are in place for each:

  1. Other Data: This relates to whether you or other members in the research team hold or have access to any information that could be linked to CPRD data, thereby enabling re-identification. This may include personal knowledge, information from publicly available sources, restricted access data sources, and other similar data releases.
  2. Agents: This relates to the people and entities accessing, using and interacting with the data. 
  3. Governance Processes: This relates to how agents’ relationships with the data are managed. This includes formal governance such as data access controls, licensing arrangements and policies which prescribe and proscribe agents’ interactions.
  4. Infrastructure: This relates to the structures and facilities that allow CPRD data to flow and shape the data environment, including security infrastructure and wider social and economic structures. At its narrowest level, infrastructure is best thought of as the set of interconnecting structures (physical, technical) and processes (organisational, managerial) that frame the data environment. At its broadest level, infrastructure can include intangible structures, such as political, economic, and social structures, that influence the evolution of technologies for data exploitation, as well as data access, sharing and protection practices.

For more information and guidance, please refer to Section 3 of the UK Anonymisation Network’s ‘The Anonymisation Decision-Making Framework: European Practitioners’ Guide’ (https://ukanon.net/wp-content/uploads/2020/11/adf-2nd-edition-1.pdf)  

ALL APPLICATIONS MUST BE COMPLETED AND SUBMITTED VIA THE CPRD ELECTRONIC RESEARCH APPLICATION PORTAL (eRAP) www.erap.cprd.com 

Part 2: Protocol Information 

Please remember the following points as you develop your research protocol.

  • Protocols that are poorly developed, that is, have a paucity of methodological details in most sections, may be returned to applicants at the application validation stage.
  • During CPRD’s scientific review process, all research seeking access to CPRD primary care and/or linked data for public health research is reviewed for study feasibility and research team expertise/experience, the public health benefits/risks of the research and potential information governance risks (risks to patient confidentiality and privacy).
  • Research protocols developed as theoretical frameworks for undertaking research are not eligible for consideration under the RDG process. This is because research frameworks do not contain sufficient details about the specific research to enable an explicit evaluation of study feasibility and research team expertise/experience, the public health benefits/risks of the research and potential information governance risks (risks to patient confidentiality and privacy) during the review process.
  • Where CPRD data will be used to inform a wider body of research e.g., economic evaluations, clinical guidance development, performance evaluation of software etc., the CPRD research application must focus on the research question that will be addressed using CPRD data. Information about the wider body of research should be included as supporting information only. For example, where CPRD data will be used to obtain estimates such as trends in disease, risk stratification, or disease transition states as input parameters in a cost effectiveness model, the protocol must not focus on the economic study but rather on the use of CPRD data to obtain the input parameters.

Study Title [max. 255 characters]

Please note that the study title is validated by CPRD.

Reviewer Assessment Criteria

In this section, reviewers will assess whether the study title clearly describes the main focus and purpose of the proposed research.

Application Requirements

Please note that the study title is published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised patient data collected by CPRD are used for research.

It is important to ensure that the title of the study is clear, concise, easy to understand, and accurately reflects the main purpose/focus of the study.

The title should be reflective of the overarching study aim. The title of a hypothesis-testing study should give a clear indication of the primary exposure(s) and outcome(s). Ideally, the title should also refer to the study design.
Example 1: Incretin based drugs and risk of adverse renal outcomes
Example 2: Topical corticosteroids and risk of type 2 diabetes: a nested case-control study

Similarly, for a descriptive study, an example of a good title would be ‘The prescribing of codeine for the treatment of pain in children: a descriptive study’.

Avoid catchy titles that are vague about the study aim. An example of an unsuitable title would be: ‘Pneumonia - the old man’s friend’.

Avoid including acronyms in the title. Where an acronym cannot be avoided, define it at first use.

Applications with titles of more than 255 characters will be returned as invalid.

Lay Summary [max. 250 words] 

Please note that the lay summary is validated by CPRD.

Please note that lay summaries are published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised patient data collected by CPRD are used for research.

Reviewer Assessment Criteria

In this section, reviewers will assess whether the proposed research could be easily understood, as a standalone summary, by non-scientific readers. The importance, relevance, and implications of the research to patients, clinical practice or the health care system will also be assessed.

Application Requirements

The lay summary should provide an overview of the research without the need to refer to the technical summary.

Lay summaries should include the following: 

  • A succinct outline of what prompted you to do the research - state the problem and what you are planning to do about the problem.
  • A statement about the potential impact/public health benefit of your work (what is going to change for patients, clinical practice, or the wider society).
  • Non-technical language, short sentences and a summary written in plain English. When writing, imagine that your research needs to be understood by readers aged 9-11 years, the average reading age of the UK population.
  • A logical order of presentation. This may not always coincide with a temporal order.
  • First person and active voice (“we agreed” rather than “it was agreed”).

Lay summaries should not include the following: 

  • Any technical details, such as the study design or statistical methods.
  • Jargon. If you must use it, explain it.
  • The word “identify” when referring to cohort definition or determining study eligibility.
  • Undefined abbreviations. These should be defined on first use or explained.
  • Superscripts, subscripts and references. These are not permitted.

Applications with lay summaries that do not adhere to these guidelines will be returned as invalid.
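
As an informal aid before submission, some of the lay summary requirements above can be checked mechanically. The sketch below is illustrative only: the function name `check_lay_summary` is hypothetical, the word count assumes whitespace-separated words (eRAP may count differently), and the superscript/subscript pattern is not exhaustive. It flags every use of “identify”, so flagged sentences still need to be read in context.

```python
import re

MAX_LAY_SUMMARY_WORDS = 250  # limit stated in the section heading

# Unicode superscript/subscript characters; an illustrative, not exhaustive, set.
SUPER_SUBSCRIPT = re.compile(r"[\u00b2\u00b3\u00b9\u2070-\u209f]")


def check_lay_summary(text: str) -> list[str]:
    """Return a list of possible problems found in a draft lay summary."""
    problems = []
    word_count = len(text.split())  # assumption: words are whitespace-separated
    if word_count > MAX_LAY_SUMMARY_WORDS:
        problems.append(f"{word_count} words exceeds the {MAX_LAY_SUMMARY_WORDS}-word limit")
    if re.search(r"\bidentif(y|ies|ied|ying)\b", text, flags=re.IGNORECASE):
        problems.append("uses the word 'identify'")
    if SUPER_SUBSCRIPT.search(text):
        problems.append("contains superscript or subscript characters")
    return problems


draft = "We will identify patients with asthma and study their treatment."
print(check_lay_summary(draft))
```

A check of this kind cannot judge readability or the absence of jargon; those still require a careful read, ideally by a non-scientific reader.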

Technical Summary [max. 300 words]

Please note that the technical summary is validated by CPRD.

Please note that technical summaries are published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised patient data collected by CPRD are used for research.

Reviewer Assessment Criteria

Technical summaries will be evaluated for transparency in communicating the purpose, methods, and benefits of the proposed research to scientific readers as a standalone summary. A high-level assessment of the relevance/feasibility of the stated methods and analytical approaches will also be undertaken. Reviewers will also assess whether the benefits of the proposed methods to achieve the objectives of the research outweigh potential risks, including information governance risks such as patient or practice confidentiality issues.

Application Requirements

The technical summary is written primarily for other researchers and clinicians who may be interested in your research. This should include enough technical details to provide a clear idea of your study aim and methods.
Your technical summary should be presented as 1-2 paragraphs providing a succinct overview of your research and include details on the following: 

  • Overarching aim and objective(s) 
  • Study population of interest 
  • Primary exposure(s) and outcome(s), where relevant 
  • Data sources that will be used to achieve the aim and objectives (e.g., hospital data will be used to determine outcomes, hospitalisations, outpatient attendance) 
  • Study design, methods including the main statistical tests
  • Intended public health benefit of the research

You should avoid vague and broad references to methods, for example time-to-event analysis or regression models, in favour of more specific terms such as Cox proportional hazards regression or linear regression. 
The use of the word “identify” should be avoided, or it should be made clear that it does not refer to identification of patients. Abbreviations should be defined at first use. The use of superscripts, subscripts and references is not permitted.

Technical summaries that do not adhere to these guidelines will be returned as invalid.

Outcomes to be Measured [max. 100 words]

Please note that study outcomes listed in this section are published on the CPRD website for the benefit of patients and the public, to inform them of how anonymised patient data collected by CPRD are used for research.

Reviewer Assessment Criteria

In this section, reviewers will assess whether the choice of primary and key secondary outcomes will achieve the intended benefits of the research. An assessment of the feasibility of ascertaining the outcome(s) in CPRD may also be reviewed in this section.

Application Requirements

Applicants should make a clear distinction between the primary and secondary outcomes of the research in a concise list, separated by semicolons e.g., “Complications of infection in primary or secondary care; Admission to Accident & Emergency; All-cause hospitalisation; All-cause mortality”. Please do not include operational definitions of study outcomes in this section. This information should be included under the section on ‘Exposures, Outcomes and Covariates’.

This section should not include statements relating to the study aims and objectives. For descriptive and feasibility studies, list the key variables in this section.

All definitions of the primary and key secondary outcomes should be included under the section on ‘Exposures, Outcomes and Covariates’.

Specific Aims, Objectives, and Rationale [max. 250 words]

Reviewer Assessment Criteria

In this section, reviewers will evaluate the clarity, scope, feasibility, and benefits that may be achieved through the study aim(s) and objectives. Protocols should include details to enable reviewers to evaluate the public health benefits of the research, assess the methods for implementing the stated objectives, determine whether inherent public health risk(s) may arise during the conduct of the research or whether there may be risks to patient/practice confidentiality and/or privacy.

Application Requirements

A general aim should be provided, followed by one or more specific and related objectives. Applicants should clearly state the primary and secondary study objectives and the primary hypothesis to be tested (where relevant). RDG reviewers will carefully consider whether research for all proposed objectives is outlined in later sections of the protocol, including the data analysis section.

Protocol Scope

Please note that research protocols that are developed as theoretical frameworks for undertaking research do not qualify as a research protocol under the RDG process. This is because frameworks are too broadly defined to enable an explicit evaluation of the research feasibility, public health benefits/risks and potential information governance risks, during the review process.

A protocol may be too extensive if it proposes a large amount of research that is under-specified, resulting in a general lack of clarity about what exactly will be done in each study or area of research. Such protocols do not enable an explicit evaluation of the research feasibility, its public health benefits/risks, and information governance risks, during the protocol review process. 

The following are examples of when a single protocol may be too extensive in scope:

  • when multiple phases of research are proposed 
  • when several distinct objectives are proposed 
  • when there are descriptive, hypothesis generating, and hypothesis testing elements included
  • when many hypotheses will be tested and there are likely to be issues of statistical multiplicity (with the potential for Type 1 errors)
  • when there are multiple exposures of interest (e.g., several drug classes or groups of conditions)
  • when there are multiple outcomes of interest
  • when there is potential for selective publication of findings

Where the research to be undertaken is extensive, a single research protocol may be split into more than one protocol and submitted simultaneously or sequentially for review.

As there is no straightforward answer about when a research protocol may be too extensive, applicants may wish to contact the CPRD (enquiries@cprd.com) for an initial assessment of the protocol scope, prior to formal submission for RDG review.

Rationale for the research:

Applicants must also provide a statement regarding the rationale/need and implications for the present study, including how this will improve patient care, either directly or indirectly, for example by informing clinical practice guidelines or public health policy.

Study Background [max. 250 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the study background highlights the importance, relevance, and public health value of the research. This should be linked to relevant published literature.

Application Requirements

Applicants should explain the reason for the research aim and objectives and support this with relevant information from scientific or other literature. This should highlight key issues that are currently unanswered or in dispute in the field. Background information may refer to previous or similar studies conducted in GPRD, CPRD or other data sources. All supporting statements should be duly referenced in this section with the full reference included in the “Reference” section of the protocol.

Ensure that you refer to any previous RDG protocols or protocols from the Independent Scientific Advisory Committee (ISAC) that may be related to your study. Any reference to a previous ISAC/RDG protocol should be accompanied by the relevant protocol number e.g., 15_101, 20_000001.

Study Type [max. 50 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the study type(s), as described below, are appropriate to address the stated aims and objectives of the research and to achieve the intended public health benefits.

Application Requirements

Please note, where CPRD data will be used to inform a wider body of research e.g., economic evaluations, clinical guidance development etc, the CPRD research application must focus on the research question that will be addressed using CPRD data. Information about the wider body of research should be included as supporting information only. For example, where CPRD data will be used to obtain estimates such as trends in disease, risk stratification, disease transition states as input parameters in a cost effectiveness model, the protocol must not focus on the economic study but rather on the use of CPRD data to obtain the input parameters.

Specify whether the study will be primarily descriptive, hypothesis generating, hypothesis testing, or a methodological piece of research. We recognise that a single research study may comprise one or more of the following study types:

  • Descriptive studies – These include ecological studies, cross-sectional analyses, drug utilisation studies, and case series assessment, which focus mainly on identifying patterns or trends in disease occurrence over time.
  • Exploratory/Hypothesis Generating – Exploratory or hypothesis generating studies are often descriptive studies that aim to reveal patterns associated with a specific condition or event, without an emphasis on testing pre-specified hypotheses. Thus, the emphasis of such studies is on estimation. Some quantities that can be estimated in exploratory studies are the prevalence and incidence of a disease, the resources required to treat a disease, or utilisation patterns of a product. Hypothesis generating, or exploratory, studies are acceptable provided that they are conducted within a defined framework (i.e., they do not constitute data mining) and there is a clear commitment to report the results accordingly.
  • Hypothesis Testing – Hypothesis testing studies in epidemiology involve the use of data to make statistical decisions about the associations of a disease, or the degree of exposure to an agent or product and its relationship with disease. Hypothesis testing studies are therefore intended to provide results by testing hypotheses with clearly defined exposures and outcomes. Analysis of the data must therefore be based on predefined valid analysis plans.
  • Methodological – Methodological studies include studies of statistical methods, comparisons of study designs, etc. The analysis of data should be based on a predefined valid analysis plan.

Study Design [max. 100 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the study design(s), as described below, are appropriate and can be reliably implemented in CPRD to achieve the intended benefit of the research. Feasibility of the design may also require an assessment of the methods - numbers expected in CPRD (feasibility counts), sample size considerations, data sources to be used, study exposures and outcomes or proposed data analysis.

Application Requirements

Applicants should briefly state the overall research design, strategy, and reasons for choosing the proposed study design.

Research designs include, for example: case-control, cohort, cross-sectional, nested case-control, or hybrid designs.

Applicants should clearly outline their study design to avoid reviewer confusion arising in relation to matched control groups, for example, a comparative cohort study being described as "case-control". 

Feasibility counts [max. 200 words]

Reviewer Assessment Criteria

In this section, reviewers will assess the feasibility of the research, that is, whether there is likely to be an adequate number of patients in CPRD to address the main objectives of the proposed research. Where numbers may be low, reviewers will also consider the sample size calculation, study design, and data analysis sections of the protocol to assess whether the research may present patient and/or practice re-identification risks. Applicants should include a risk mitigation plan in their protocol where there may be potential re-identification risks.

Application Requirements

Applicants must provide an estimate of the expected number of patients available in the CPRD and/or linked datasets for the proposed study. Applicants may refer to relevant publications using CPRD data to gauge study feasibility or support their application with feasibility counts based on CPRD data. In some cases, feasibility counts can be requested from CPRD and for more information applicants should contact enquiries@cprd.com.

A searchable list of publications using CPRD data, which is updated monthly, can be found at https://www.cprd.com/bibliography. Applicants can also request code browsers for free from CPRD which will allow them to search for all medical and treatment codes that are included in the CPRD primary care database to assess whether specific conditions, treatments or other exposures are captured in the CPRD database.
Where numbers expected in CPRD are low and may present a challenge to study feasibility, applicants may wish to consider the following approaches: 

  • Use data from CPRD GOLD and/or CPRD Aurum to increase the sample size
  • Use linked data sources for case definition/outcome ascertainment if conditions are more likely to be recorded in secondary care
  • Revisit the case definition 
  • Consider an alternative study design to increase study power e.g., matched study design

If options to increase your study population are not feasible, please outline mitigation approaches to minimise the risk of inadvertently identifying patients or practices during the conduct and/or publication of the study.

Sample size considerations [max. 200 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether a sample size calculation is needed and if so, whether the sample-size/power of the study will be sufficient to address the primary hypothesis of the research. Where numbers may be low, reviewers will consider your study design and proposed data analysis to assess whether your research may present potential patient and/or practice re-identification risks. Applicants should include a risk mitigation plan in their protocol where there may be potential re-identification risks.

Application Requirements

All protocols must include some consideration of whether the sample-size and study power for hypothesis testing studies will be sufficient to meet the primary outcome of the research. For a primary hypothesis with multiple outcomes, applicants should provide sample-size/study power estimates for the main outcome, commonest outcome, and/or rarer outcomes in the study. 

All protocols should include an estimate of the expected numbers of patients, exposures, or outcomes (as appropriate) that will be available. Applicants may refer to relevant publications using CPRD data to gauge study feasibility or support their application with feasibility counts based on CPRD data.

For hypothesis testing studies, it is necessary to demonstrate that the expected numbers are sufficient to investigate the primary study objective with adequate power. This may be demonstrated by carrying out a formal power or sample size calculation, in which case sufficient information should be given for a statistician to be able to repeat the calculation(s), including the method and the values of numerical inputs and their sources (e.g., references). Alternatively, it may be possible to make an informal argument that the expected numbers are sufficient by comparison to previously published studies.
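
As an illustration of the level of detail that allows a statistician to repeat a calculation, the sketch below computes the number of patients needed per group to compare two proportions using the standard normal-approximation formula with unpooled variances. This is only one of several possible methods and is not prescribed by CPRD. The function name and the input risks (10% and 15%) are hypothetical; in a protocol, each input should be accompanied by its source.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients needed per group to detect p1 vs p2 with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to the desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)   # unpooled variance of the two proportions
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical inputs: 10% outcome risk in the unexposed, 15% in the exposed.
n_per_group = sample_size_two_proportions(0.10, 0.15)
print(n_per_group)
```

Stating the method, the significance level, the power, and the assumed risks in this way makes the calculation reproducible. Time-to-event or matched designs would need a different formula.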

For hypothesis generating and descriptive studies, we typically expect demonstration that expected numbers will give reasonable precision around the effect estimates or numerical results to be calculated. For methodological studies, the appropriate approach to demonstrating that expected numbers are adequate will vary.
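
For a descriptive study, precision can be illustrated by the expected half-width of a confidence interval. The sketch below uses the normal-approximation (Wald) interval for a proportion, which is an assumption made for simplicity; other interval methods may be preferable for rare outcomes. The function name and the inputs (a 2% prevalence among 50,000 patients) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(proportion: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(proportion * (1 - proportion) / n)


# Hypothetical inputs: expected prevalence of 2% among 50,000 eligible patients.
half_width = ci_half_width(0.02, 50_000)
print(f"Expected 95% CI: 2.00% +/- {half_width * 100:.2f} percentage points")
```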

In all types of study, sample size/power calculations should, when relevant, reflect chosen approaches to dealing with multiple comparisons.

Where numbers expected in CPRD are low and may present a challenge to study feasibility, applicants may wish to consider the following approaches: 

  • Use data from CPRD GOLD and/or CPRD Aurum to increase the sample size
  • Use linked data sources for case definition/outcome ascertainment if conditions are more likely to be recorded in secondary care
  • Revisit the case definition 
  • Consider an alternative study design to increase study power e.g., matched study design

If applicants wish to make a case that it is worth proceeding with a study even though the expected numbers are lower than desired – for example, in studies of extremely rare conditions – then this should be identified and clearly acknowledged as a limitation in the research protocol and addressed in a risk mitigation plan.

Please be aware that post-approval, CPRD will also review whether any data requests are supported by the sample size estimates stated in the approved protocol and whether there is a clear justification for large sample size requests, to demonstrate compliance with data minimisation principles.

While there is no specific limitation on the size of the study population, the size must be clearly justified in the protocol. Proportionate data minimisation measures will be applied when any Primary Care or linked dataset comprises >600k patients. These measures will consider feasibility counts, sample size calculation, data linkages requested (including study/coverage period), definition of the study population (including inclusion and exclusion criteria), comparison groups, exposure, outcomes, and covariates definition. Please contact CPRD (enquiries@cprd.com) if you have any questions regarding data minimisation.

Planned use of linked data (if applicable), including the public health benefits to patients in England & Wales [max. 200 words]

Reviewer Assessment Criteria

Where applicable, reviewers will assess whether the linked data sources requested are relevant and feasible (can support cohort/comparison identification, exposure definition, outcome ascertainment or covariate definition) to address the study aims and objectives. An explicit review of how the outputs of the proposed study using linked data will benefit patients in England & Wales will also be evaluated.

Application Requirements

Any proposed use of linked data sets must be appropriate to the research. This will be assessed against statements made on the CPRD RDG application form and any other relevant information documented in the protocol. For proposals to use data sources routinely linked to CPRD data, for example, Hospital Episode Statistics (HES), Office for National Statistics (ONS) Mortality data, Cancer Registry data, practice/patient area-level data, please describe why the linked data are necessary for the study and how they will be used.

Applications must outline how the main outputs of the proposed study will benefit patients in England and Wales. You may base your justification on how the study findings would improve patient care either directly or indirectly by informing clinical practice guidelines or public health policy.

It is important that the relationships between the study population (e.g. regarding dates), sample-size, and the use of linked datasets are clear within the protocol i.e., whether the entire study will be undertaken among practices which have consented to linkages or only part of it (e.g. in a sensitivity analysis). Applicants should consider how the time periods for availability of linked data might affect the study period and censoring of patients.

Research groups which have not previously accessed CPRD linked data resources must discuss access to these resources with a member of the CPRD Research team before submitting a CPRD RDG application. Requests for access to certain linked data resources must also be discussed with a member of the CPRD Research team and the evidence of this provided on the CPRD RDG application. See Question 14: Requests to access linked data for further details. Studies requesting linked data will not be approved unless these conditions have been met.

Studies proposing non-standard linkage of CPRD data to one or more external data sources must provide additional assurances about how the disclosure of patients and practices will be avoided in the form of a risk mitigation plan.

Requests for non-standard linkage must have approval from CPRD prior to CPRD RDG submission. Please email CPRD (enquiries@cprd.com) for further information. It is essential that any necessary legal/ethical approvals are in place for any non-standard linkage to take place before submission to the CPRD RDG process.

Definition of the Study population [max. 250 words]

Reviewer Assessment Criteria 

In this section, reviewers will assess whether the study population is clearly described and relevant in the context of the research; whether restricting/excluding certain patient groups from the research may disadvantage such patient groups and limit the benefits of the research; whether research to combine data from CPRD with other external non-CPRD data sources may present potential patient and/or practice re-identification risks or other information governance risks.

Application Requirements 

It is important to ensure that the protocol clearly defines the study population. The following areas listed below should be addressed in all research protocols:

  1. Describe the source/target population.
  2. State the indicative recruitment period and the definition of the start and end of follow-up for patients to allow an assessment of study feasibility.
  3. Describe the study population in terms of key inclusions, exclusions, and the data used for each (diagnosis, referrals, tests, therapy/drugs, immunisations, consultations). You should also provide justification for selecting the study population of interest. 
  4. Provide a clear definition of the index date and any minimum requirements for previous follow-up time.
    Any reference to incidence or prevalence should be accompanied by details on how this will be defined (first record in the study period, first ever record, any record before the study end, treatment naïve etc.).
  5. If any sampling from a base population is to be undertaken, provide details of sampling methods considering approaches that are likely to be free of selection bias.
  6. Also include information on the exposure window(s) of interest, where appropriate, clearly defining the time/period which will be considered "exposed" or "non-exposed".
  7. For studies requiring linked data, please make clear the restrictions imposed by the eligibility criteria and coverage periods.
  8. For studies using GP validation or patient questionnaires, please state how the questionnaire population will be defined and selected, including the number of patients that will be selected. For questionnaire development, please see the reviewer assessment criteria under ‘Definition of the Study population’ that will be used to assess questionnaires.  

While there is no specific limitation on the size of the study population, the size must be clearly justified in the protocol. This is important as proportionate data minimisation measures may be applied by CPRD when any Primary Care or linked data sets comprise >600k patients. The need for data minimisation will consider the following: feasibility counts, sample size calculation, data linkages requested (including study/coverage period), definition of the study population (including inclusion and exclusion criteria), comparison groups, exposure, outcomes and covariates definition. Please contact CPRD (enquiries@cprd.com) if you have any questions regarding data minimisation.

For all cohort studies, the protocol should clearly define when a patient enters the cohort and when they will leave it. If there is an index date, it is important to ensure that it is clearly specified.
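
One way to make cohort entry and exit unambiguous is to write the rule down as a small function. The sketch below is a hypothetical illustration, not a CPRD-specified definition: the function and parameter names are invented for this example and do not correspond to CPRD variable names, the 365-day prior registration requirement is an assumed default, and entry is assumed to coincide with the index date.

```python
from datetime import date, timedelta
from typing import Optional


def follow_up_window(
    registration: date,
    study_start: date,
    study_end: date,
    transfer_out: Optional[date] = None,
    death: Optional[date] = None,
    min_prior_days: int = 365,
) -> Optional[tuple[date, date]]:
    """Return (cohort entry, cohort exit) for one patient, or None if there is no eligible follow-up."""
    # Entry: the later of study start and the date the prior-registration requirement is met.
    entry = max(study_start, registration + timedelta(days=min_prior_days))
    # Exit: the earliest of study end, transfer out of the practice, and death.
    exit_candidates = [d for d in (study_end, transfer_out, death) if d is not None]
    exit_ = min(exit_candidates)
    return (entry, exit_) if entry < exit_ else None


# Hypothetical patient: registered mid-2009, transferred out in March 2015.
window = follow_up_window(
    registration=date(2009, 6, 15),
    study_start=date(2010, 1, 1),
    study_end=date(2019, 12, 31),
    transfer_out=date(2015, 3, 1),
)
print(window)
```

A protocol would state the same rule in words; the point is that each component date and the order of precedence between them is explicit.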

Please note that researchers are not permitted to combine CPRD data with external data sources without explicit and prior permission from the CPRD. Please contact the CPRD (enquiries@cprd.com) to discuss combining or pooling of CPRD data with external data for your research. If permission is obtained to combine CPRD data with external data, please reference the Query number associated with your discussion with the CPRD on this subject and provide justification for the combining of these data in this section.

Selection of comparison group(s) or controls [max. 250 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the selection of comparison groups or controls is clearly described, relevant and appropriate, and can be operationalised using the data sources and variables available in CPRD. Whether selection approaches will introduce biases that may limit the benefits of the research will also be evaluated.

Application Requirements

Where controls or comparison groups are needed to support a research question, please describe the following in the research protocol:

  1. How control groups differ from the main study population.
  2. The key inclusions, exclusions, and the data used for each (diagnosis, referrals, tests, therapy/drugs, immunisations, consultations).
  3. For studies requiring matching, the type of matching (index date, calendar time, frequency, incidence density sampling, high dimensional propensity score etc.) and the ratio/number of matches required should also be stated.

Applicants should also provide justification for the procedure for control selection. When making comparisons, calendar time should always be considered, e.g., through an index date. Care should be taken to avoid the possibility of "immortal time bias". When this is a potential issue, a diagram showing how periods of time will be handled, and how such bias will be avoided, is recommended.
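
One common way to avoid immortal time bias is to treat exposure as time-varying, so that follow-up before the first exposure is counted as unexposed rather than exposed. The sketch below illustrates that idea for a single patient. It is an assumption-laden example rather than a required method, and the function name and dates are hypothetical.

```python
from datetime import date


def split_person_time(cohort_entry: date, first_exposure: date, end_of_follow_up: date) -> tuple[int, int]:
    """Return (unexposed_days, exposed_days) for a patient who starts treatment during follow-up."""
    # Cap the exposure date within the follow-up window so neither count can be negative.
    exposure_start = min(max(first_exposure, cohort_entry), end_of_follow_up)
    unexposed_days = (exposure_start - cohort_entry).days
    exposed_days = (end_of_follow_up - exposure_start).days
    return unexposed_days, exposed_days


# Hypothetical patient: enters on 1 January 2015, first prescription on 1 July 2015.
print(split_person_time(date(2015, 1, 1), date(2015, 7, 1), date(2016, 1, 1)))
```

Counting all 365 days as exposed would credit the treatment with six months during which the patient had to survive, by definition, in order to receive the first prescription.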

Exposures, Outcomes and Covariates [max. 1000 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the exposures, outcomes and covariates of interest are clearly described, relevant in the context of the proposed research and can be operationalised in CPRD using the data sources requested. Reviewers will also assess whether there may be oversights in the selection of covariates or outcomes that may limit the public health value of the research. Potential risks to patient/practice confidentiality and/or privacy arising from the use of specific data variables or sensitive concepts will also be assessed.

Application Requirements

Defining Exposures and Outcomes
A clear description of the exposures and health outcomes of interest to the study should be provided. Operational definitions of these should also be provided to enable an assessment of feasibility. An operational definition is one that can be implemented independently using the data available in the proposed study. For example, "asthma episode" is not an operational definition; a better description would be “record of a Read code for asthma, as listed in Appendix A, and documented in the patient clinical or referral record”. 
If it is not possible at the time of the CPRD RDG application to provide operational definitions of exposures and/or outcomes because these will be elucidated during the study, an acceptable alternative is to describe the process by which these definitions will be reached.

Data source/s
Applicants should also describe the data sources, where applicable, for determining the main exposures and health outcomes relevant to the study. Data sources might include, for example, primary care clinical records, prescription drug files, test records, administrative linked exposure/disease registries and GP questionnaires. Steps to validate exposure and outcomes are encouraged and may be suggested for diseases not previously studied in the database or for which there is commonly diagnostic uncertainty.

Covariates
A list of covariates to be included in baseline tables and statistical models as potential confounding variables and effect modifiers should be stated, including the data source/s from which these will be derived. This demonstrates that reasonable steps to control for confounding will be taken.

Code lists
Applicants should provide preliminary code lists for the main exposures and outcomes to demonstrate that they have an awareness of the practical issues involved in defining these, where appropriate. Code lists should be provided as appendices and not included in the body of the protocol. 
Where relevant code lists are absent, the procedure for developing them has not been described, or the use of codes from a previous study has not been proposed, protocols will be regarded as deficient in this respect. Given the nature of the medical coding system in use in UK primary care, it is advised that, where possible, a named clinician with experience of UK primary care is involved in the process of code list development.

Note that code sets must include the actual codes (e.g., CPRD medical codes or product codes, Read/SNOMED-CT/ICD) and the text descriptors associated with the respective code e.g., the Read term associated with the Read code or ICD description for an ICD code. Code lists do not need to be finalised at the time of submission of the CPRD RDG application.
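
The requirement that every code is accompanied by its text descriptor can be checked before the appendices are attached. The sketch below assumes a code list held as CSV text with hypothetical column names (`code`, `coding_system`, `term`); the codes and terms shown are placeholders, not real codes, and CPRD does not mandate this file layout.

```python
import csv
import io

# Hypothetical code list contents; the codes and terms are placeholders, not real codes.
EXAMPLE_CODE_LIST = """code,coding_system,term
PLACEHOLDER_001,Read,Placeholder term for condition A
PLACEHOLDER_002,ICD-10,
"""


def rows_missing_descriptor(csv_text: str) -> list[str]:
    """Return the codes whose text descriptor (term) is missing or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["code"] for row in reader if not (row.get("term") or "").strip()]


print(rows_missing_descriptor(EXAMPLE_CODE_LIST))
```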

Data/ Statistical Analysis [max. 1000 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the analyses proposed are aligned with the research objectives, are broadly appropriate for the proposed research from an epidemiological and statistical point of view and minimise potential risks of patient and/or practice re-identification or other information governance risks. For example, reviewers will assess whether analytical/statistical methods may ‘single out’ patients through investigation of outliers, case review by clinicians, or by conducting stratified analyses.

Application Requirements

All data management and data analysis to be performed must be covered in this section. Applicants should ensure that the analytical methods proposed are consistent with all the specific study aims and objectives listed, and with the study design. It is also important to ensure that this section is clear and specific about any comparisons which will be made (e.g., whether drug classes or specific drugs will be compared). Approaches to address potential problems of misclassification, bias, confounding, and missing data should be described.

Applicants should also make it clear whether sensitivity analyses will be undertaken, and outline the provisions to account for reverse causality, where this is felt to be a potential issue.

The analysis should be presented according to whether the study is hypothesis generating or hypothesis testing but, in either case, the analytical methods to be used should be specified in the protocol. Please see below for suggestions on what may be included in your summary of statistical analyses for different types of studies.

Descriptive studies
Measures of central tendency (mean, median), variation, and correlation are often reported in these types of studies. Trend analysis is an important tool in descriptive studies.

Hypothesis Generating
Descriptive statistics to provide useful summaries about the sample and the outcome measures. Together with simple graphical analysis, descriptive statistics form the basis of virtually all quantitative analyses. Hypothesis generating analyses may include measures of disease frequency such as prevalence and incidence and time trend analyses.

Hypothesis Testing
Descriptive statistics to provide useful summaries about the sample and the outcome measures; measures of association to be derived and statistical tests to be conducted; pre-specified sub-group analyses including how the analysis will control for potential confounding. Where appropriate, specify the statistical modelling techniques to be used, giving some indication as to how models will be specified.

For studies proposing new methodologies such as machine learning or artificial intelligence (AI), specific techniques/algorithms proposed should be explained and inherent biases/limitations of the techniques should be duly considered. Internal validation must be proposed and plans for external validation should be considered and communicated in the protocol.

Multiple testing 
Applicants are advised to consider the implications of multiple testing as the interpretation of p-values <0.05 (5%) as “statistically significant” may be threatened when many tests are carried out in a single study (Bland, 1995). 

Approaches for handling multiple testing may include: 

  • Cautious interpretation of findings
  • Clear distinction between a pre-specified primary and several secondary hypotheses (with a commitment to caution regarding findings relating to secondary hypotheses) 
  • Bonferroni (Bland, 1995) or other formal statistical corrections
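
As an illustration of the last approach, the following is a minimal sketch of a Bonferroni correction, in which the significance threshold is divided by the number of tests carried out. The function name `bonferroni` and the p-values are hypothetical and are not taken from any CPRD study or tool.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the Bonferroni-adjusted threshold and a significance flag per test.

    Assumption: all tests in `p_values` belong to one family of hypotheses.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    threshold = alpha / m  # each test is judged against alpha / m
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values from five analyses in a single study
p_values = [0.004, 0.03, 0.011, 0.2, 0.049]
threshold, significant = bonferroni(p_values)
print(threshold, significant)
```

With five tests, only the first p-value falls below the adjusted threshold, although four of the five are below the unadjusted 0.05. The Bonferroni correction is conservative, and applicants may prefer another formal correction where this is justified.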

Information governance risks
Applicants must consider whether the analyses proposed may increase the risk of unintentional (deductive) disclosure of patients, for example, by singling out individuals or small groups of patients (<5). Where there is a potential risk of disclosure, please outline mitigation approaches to minimise the risk of inadvertently identifying patients or practices during the conduct and/or publication of the study. Mitigation may include, but is not limited to, secondary suppression methods.
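
One possible mitigation is sketched below: counts below 5 are masked (primary suppression) and, where only one cell in a row is masked, a second cell is masked so that the hidden value cannot be derived from a published row total (secondary suppression). This is an illustrative sketch only. The function name, the markers, the choice to mask zero counts and the rule for choosing the second cell are all assumptions, and applicants should confirm the rules that apply to their study with CPRD.

```python
def suppress_small_cells(counts, threshold=5, marker="<5"):
    """Mask small counts in one table row whose total is also published.

    `counts` maps a category label to a patient count.
    Assumption: every count below `threshold`, including zero, is masked.
    """
    masked = {k: (marker if v < threshold else v) for k, v in counts.items()}
    suppressed = [k for k, v in masked.items() if v == marker]
    if len(suppressed) == 1:
        # A single masked cell could be recovered as total minus the rest,
        # so also mask the smallest remaining cell (secondary suppression).
        remaining = [k for k in counts if k not in suppressed]
        if remaining:
            smallest = min(remaining, key=lambda k: counts[k])
            masked[smallest] = "suppressed"
    return masked


# Hypothetical counts of patients by age band
row = {"18-39": 120, "40-59": 87, "60-79": 3, "80+": 41}
masked_row = suppress_small_cells(row)
print(masked_row)
```

In this example the count for "60-79" is masked because it is below 5, and the count for "80+" is also masked so that the small count cannot be recovered by subtraction.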

Plan for addressing confounding [max. 200 words]

Reviewer Assessment Criteria

In this section, reviewers will assess whether methods for addressing confounding have been considered, and where needed, are relevant given the study aims and objectives, study type, design and analyses proposed. Reviewers will also assess whether methods to consider confounding may increase the risk of patient and/or practice re-identification or other information governance risks during the conduct and/or publication of research findings. For example, reviewers will assess whether analytical/statistical methods may result in strata with <5 patients or ‘single out’ patients through other investigations.

Application Requirements

Purely descriptive studies are exempt from this requirement and can list ‘Not applicable’ in this section. All other studies should here provide some discussion of what will be done in the design and/or analysis to control for confounding.

Where methods to consider confounding may increase the risk of patient and/or practice re-identification or other information governance risks during the conduct or publication of your research, please outline mitigation approaches to minimise the risk of inadvertently identifying patients or practices during the conduct and/or publication of the study.

Plans for addressing missing data [max. 200 words]

Reviewer Assessment Criteria

In this section and where applicable, reviewers will evaluate how missing data will be handled in the research and whether this may lead to spurious findings or incorrect conclusions that may undermine the benefits of the research.

Application Requirements

The potential for missing data is present in most studies and needs to be identified and addressed in this section of the protocol. In practice, missing data is most commonly of concern in relation to covariates, such as BMI and smoking, but would be of bigger concern if the relevant variable is an outcome or exposure.

Applicants should carefully consider their options in relation to how best to handle missing data in their research and expand on their choice and the resulting likely issues. Approaches should be considered that minimise the chance of bias, especially when data missingness is extensive and could result in a much reduced and potentially biased sample. The extent of missing data should be reported and recognised as a potentially important limitation, and applicants should state any assumptions made about the patterns of missingness for their analytical approach to be valid and outline any planned sensitivity analyses to further investigate potential selection biases due to missing exposure or covariate data.
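
As a starting point for reporting the extent of missing data, the following minimal sketch tabulates the number and percentage of missing values for each variable. The function name, the variable names and the records are hypothetical toy data, and a missing value is assumed to be recorded as `None`.

```python
def missingness_report(records, variables):
    """Return {variable: (n_missing, percent_missing)} for a list of dicts.

    Assumption: a value is missing if the key is absent or its value is None.
    """
    n = len(records)
    if n == 0:
        raise ValueError("no records supplied")
    report = {}
    for var in variables:
        n_missing = sum(1 for r in records if r.get(var) is None)
        report[var] = (n_missing, round(100 * n_missing / n, 1))
    return report


# Hypothetical toy records, not CPRD data
patients = [
    {"bmi": 27.1, "smoking": "never"},
    {"bmi": None, "smoking": "current"},
    {"bmi": 31.4, "smoking": None},
    {"bmi": None, "smoking": "ex"},
]
report = missingness_report(patients, ["bmi", "smoking"])
print(report)
```

A summary of this kind describes only how much data is missing. It does not show why the data is missing, so the assumptions about the patterns of missingness still need to be stated and justified separately.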

Patient or user group involvement [max. 150 words]

Reviewer Assessment Criteria

In this section, reviewers will evaluate whether patient or user group involvement has been considered during different stages of the research.

Application Requirements 

It is expected that many studies will benefit from the involvement of patient or user groups in the planning and refinement stages of the research, and/or in the interpretation and dissemination of findings. Patient/user involvement may also be useful in informing plans for further work involving patient contact and/or studies with an interest in quality of life.

Applicants must consider and indicate how patient/user groups have been/will be engaged in the research. Engagement with patient and/or user groups that may benefit from the proposed research nationally or internationally is encouraged. If no patient/user group engagement is planned in your research, applicants must provide strong justification to support why this is not indicated. Applications which simply state ‘Not applicable’ will be returned as invalid.

Some ways in which research applicants may include patient/user involvement in research using CPRD data are outlined below. 

  • Development and/or review of the lay summary explaining your research. 
  • Patient involvement in grant or other funding applications related to the proposed research.
  • Including patient partners in the research team as co-investigators, where relevant.
  • Including service users and carers in study advisory groups, where relevant.
  • Establishing links with existing groups (e.g., charities that represent the condition of interest) to provide expert patient input to the project and/or facilitate rapid dissemination of findings, or establishing such groups to support the research, where relevant.
  • Hosting workshops and focus groups with local patient groups to describe the concept of the work and elicit feedback, where relevant.
  • Researchers may also gain insights on patient views and experience from published sources e.g., patient/user group discussions (https://patient.info/forums), where available.

Given the wide range of research that is possible using CPRD data, please see Annex 1 for recommended ways to include patient/user engagement in CPRD research studies.

Exceptions
Please note that research that requires an urgent response to serious public health or other concerns may be exempt from patient/user group engagement due to the additional time constraints imposed in these situations. Such research may include, but is not limited to, the following:

  1. Serious public health concerns e.g., COVID-19, drug safety or other regulatory concerns.
  2. Ministerial requests i.e., research in support of security, intelligence, prosecution, international relations. 
  3. Urgent health policy formulation.
  4. Research requested by law or court order.

Plans for disseminating and communicating study results [max. 150 words]

Reviewer Assessment Criteria

In this section, reviewers will evaluate whether there are any restrictions on publication of the research findings that may impact its public health value. For instance, whether the funder has a role in writing up the research or in deciding to submit the paper for publication. An assessment of whether there are potential risks of inadvertently re-identifying patients or practices during publication and dissemination of the outputs of the research will also be considered.

Application Requirements

There is an ethical obligation to disseminate findings of potential scientific or public health importance (e.g., results pertaining to the safety of a marketed medication). Authorship should follow guidelines established by the International Committee of Medical Journal Editors.

Applicants must list the following acknowledgements in publications resulting from studies using CPRD data:

  • This study is based in part on data from the Clinical Practice Research Datalink obtained under licence from the UK Medicines and Healthcare products Regulatory Agency. The data is provided by patients and collected by the NHS as part of their care and support. The interpretation and conclusions contained in this study are those of the author/s alone.
  • Copyright © [YEAR], re-used with the permission of The Health & Social Care Information Centre. All rights reserved. 
  • Please also refer to the range of acknowledgements that should be listed in publications listed under the ‘Publication’ section of your CPRD licence.

When reporting, applicants are advised to follow the principles outlined in the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) and any other relevant guidelines in the Enhancing the Quality and Transparency of health research (EQUATOR) network. The Consolidated Standards of Reporting Trials (CONSORT) statement refers to randomised studies, but also provides useful guidance, the principles of which may be applicable to observational hypothesis-testing studies.

Where research is felt to provide important new evidence on the safety or effectiveness of a medicine or vaccine then pre-publication manuscripts may be sent by email to the MHRA at Pharmacovigilanceservice@mhra.gov.uk. Marketing Authorisation Holders should submit manuscripts for post authorisation safety studies, accepted for publication, as described in the Guideline on good pharmacovigilance practices (GVP) module VIII – Post-authorisation safety studies.

Conflict of interest statement [max. 150 words]

Reviewer Assessment Criteria

In this section, reviewers will evaluate applicants’ conflict of interest statements to determine whether these may influence publication and/or communication of the research findings.

Application Requirements

Each applicant must provide a conflict-of-interest statement. The statement should be transparent about any sources of funding not already listed on the application including relevant financial interests of investigators/collaborators, and any relevant paid or unpaid positions held by investigators/collaborators.

Limitations of the study design, data sources, and analytic methods [max. 200 words]

Reviewer Assessment Criteria

In this section, reviewers will evaluate whether there are important limitations of the study that have not been considered or adequately considered and which may affect the conclusions drawn from the research.

Application Requirements

Limitations of the study, such as issues relating to bias and confounding, misclassification, random error and generalisability, should be considered. Specific consideration of the potential impact on findings should be provided. For example, primary care databases contain little, if any, information about over-the-counter (OTC) drug usage. Applicants studying a class of drugs for which some products are available OTC should recognise which drug exposures are likely to be underestimated and discuss the expected impact on the findings.

Considerations about how important biases may arise from the study should also be addressed.

Researchers should consider situations in which certain prescriptions may not appear in the database. It should also be noted that presence of a prescription in a primary care database does not ensure that the prescription was then provided to the patient, issued by a pharmacy, and consumed by the patient. Applicants should contact enquiries@cprd.com with any queries.

References [max. 20 references]

Reviewer Assessment Criteria

In this section, reviewers will assess whether the supporting evidence about the research, e.g., study background and methods, is linked to published scientific or other literature.

Application Requirements

Please provide a numbered list of references at the end of the protocol. The reference list should include the titles of the papers, but it is not necessary to include all the authors; listing at least three authors is sufficient. The Vancouver format for referencing is preferred.

List of Appendices [max. of 5]

Reviewer Assessment Criteria

In this section, reviewers will inspect preliminary code lists for the main study exposure(s) and outcome(s) to assess their relevance and feasibility for conducting the research in CPRD data sources. Other documentation referred to in the application will also be assessed, where needed. For example, GP Questionnaires should be assessed as outlined under the section on ‘Question 4: GP or Patient Questionnaire/Contact’.

Application Requirements 

Please provide all appendices related to this research protocol as separate documents.

Applicants seeking access to NCRAS SACT or RTDS data must ensure that the final approved version of the NCRAS Data Selection Form is attached in the appendices on submission.

Grant ID (optional) [max. 255 characters]

Please provide a grant reference identifier or link to the funding award, where available.

Other information

Lapsed Applications

Research applications requiring resubmission (Amendment required) must be submitted within 6 months of the date of initial RDG reviewer feedback. Resubmissions not made within 6 months of the initial feedback date will be recorded as withdrawn, and applicants will be required to submit a new protocol application. The RDG Secretariat should be contacted if there are circumstances which may result in the resubmission exceeding the 6-month deadline.

Data deletion

CPRD Dataset Agreement Terms and Conditions state that applicants will need to provide evidence that any received datasets have been deleted no later than 12 months following receipt. Applicants are required to keep a register of any copies made and must provide data destruction certificates for all copies or backups at the end of the contract period.

Applicants may apply for extensions to the 12-month period and should email rdg@cprd.com to discuss any request for an extension.

Confidentiality of research protocols

All research applications to the CPRD RDG process are held securely and confidentially at the CPRD. No information about study applicants or protocol content is released to third parties, other than in accordance with CPRD’s Transparency Policy or for use by the CPRD licence administrators for data monitoring purposes, without first seeking the agreement of the Chief Investigator of the study. Only applicants named on the research protocol can make enquiries about the protocol.

Ethical review of protocols

CPRD has obtained ethical approval from a National Research Ethics Service Committee (NRES) for all purely observational research using anonymised CPRD data; namely, studies which do not include patient involvement (which is the vast majority of CPRD studies). CPRD RDG committees review protocols for feasibility and research team expertise/experience, public health benefits/risks and information governance risks, but may recommend that study-specific ethical approval is sought if ethical issues arise in relation to an individual study. Separate ethical approval will be required for any study which includes any form of direct patient involvement.

Information Governance Review

As part of the assessment of applications for access to CPRD primary-care and related datasets, CPRD has to be mindful of data protection legal requirements and best practice as well as contractual undertakings to other data providers. This may mean that, occasionally, requests that may ostensibly appear reasonable may need to be referred for an information governance (IG) review to check for possible compliance issues, including that all suitable data minimisation and other risk mitigation approaches have been applied and that it is safe and appropriate to release the data requested for the study. This may mean that some modifications to the study protocol are required or that the data provided may need to be additionally processed before release. If your application requires any amendments to address any IG concerns, you will be provided with protocol-specific feedback to enable you to do this. This additional review stage is not expected to significantly add to the overall review timescales.

Voluntary registration of CPRD RDG approved protocols

Epidemiological studies are increasingly being included in registries of research around the world, including those primarily set up for clinical trials. To increase awareness amongst researchers of ongoing research, CPRD encourages voluntary registration of epidemiological research conducted using MHRA databases. This will not replace information on CPRD RDG-approved protocols that may be published on the CPRD website. It is for the applicant to determine the most appropriate registry for their study.
Applicants should inform CPRD about voluntary registration of protocols by submitting a post-approval amendment of the protocol on eRAP, updating the section on ‘Plans for disseminating and communicating study results’.

Reporting findings

When reporting the findings of a CPRD RDG-approved protocol, authors are encouraged to indicate that the study was approved and should provide information on any deviations from the original protocol. For protocols approved from 01 April 2014 onwards, applicants are required to include the ISAC or RDG protocol number in journal submissions, with a statement in the manuscript declaring approval by the ISAC or RDG process, where applicable. If the protocol was subject to any amendments, the last amended version should be the one included in the publication.

Applicants are required to submit a copy of all peer-reviewed publications based on CPRD data to CPRD. Applicants should inform the CPRD of the publication outcome/s and, where appropriate, send a copy of, or link to, the publications or a copy of the funder’s report summarising the research. These can be sent to CPRD Enquiries (enquiries@cprd.com).

Please note that the CPRD reserves the right to audit the concordance between approved study protocols and published research.

It is essential that consideration is given to preserving confidentiality at the reporting stage. The possibility of unintentional (deductive) disclosure arises when cells with small numbers of patients are quoted. Applicants should note that, when reporting the data, CPRD policy is that no cell shall contain <5 events.

Page last reviewed