Determining the applicability and feasibility of using regression discontinuity in electronic health record data

Date of ISAC Approval: 
20/07/2020
Lay Summary: 
This study focuses on an analysis method - called regression discontinuity or "RD" - that aims to determine whether patients benefit from a given treatment. While clinical trials (that is, studies in which patients are randomly assigned to receive a treatment or not, often under highly controlled conditions) are considered the method of choice for testing whether a treatment is effective, they may not fully capture real-life effects occurring during routine care. Moreover, many clinical trials lack the long-run perspective necessary to evaluate treatments. RD addresses these limitations by exploiting thresholds (cut-offs) in variables that influence whether someone receives a treatment or not (e.g., blood pressure above 160/100 mmHg). When "zooming in" around the threshold, people just above and just below it can be assumed to be similar to each other and can, thus, be compared in the same fashion as treated and untreated patients in a clinical trial. The method could potentially be applied widely in clinical medicine because clinical decisions are frequently at least partially based on such thresholds. RD, however, has thus far not been used in electronic health record data. The objective of this study is to determine to what degree RD can be applied in electronic health record data. If we are able to show that the method is feasible for a wide variety of variables and thresholds used in clinical medicine, then RD could be used to study the effectiveness of clinical interventions in routine care (as opposed to research settings).
Technical Summary: 
Regression discontinuity (RD) design - a quasi-experimental method taking advantage of decision rules that assign patients to a clinical intervention if they fall above/below an arbitrary cut-off point - has potential to assess causal effects of clinical interventions. This study seeks to determine the applicability and feasibility of RD in electronic health record data. Specifically, we aim to (1) determine which (if any) laboratory or physical measurements contain thresholds that are associated with a substantial change in the probability of receiving a clinical intervention, (2) evaluate whether patient characteristics are balanced within a small bandwidth surrounding these thresholds, and (3) investigate whether associations between the clinical intervention and patient outcomes are robust to different choices of bandwidth around the threshold. Exposure variables include laboratory and physical measurements such as BMI, blood pressure, HbA1c, blood glucose, age, low-density lipoprotein, thyroid-stimulating hormone level, hemoglobin, and T-score and Z-score for bone mineral density. Patient outcomes primarily include future measurements of the exposure variables, mortality (overall and by cause of death, such as due to cardiovascular disease when examining the effects of statins), and hospitalization (overall and by cause of admission). We will estimate "fuzzy" RD models using local linear regression, to avoid overfitting the data, and triangular weights, to give more influence to observations close to the threshold. In addition, we will use an empirically derived mean squared error (MSE)-optimal bandwidth. We will assess the sensitivity of the results using alternative bandwidths (e.g. bandwidths that are 50%, 75%, 125%, and 150% of the MSE-optimal bandwidth).
If feasible and widely applicable, RD analyses in electronic health records could generate valuable insights into the real-life effects of clinical interventions on health and health care use, the unintended effects associated with these interventions, and potential heterogeneous treatment effects across detailed patient subgroups.
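The estimation approach described in the Technical Summary can be illustrated with a minimal sketch. This is not the study's analysis code. It assumes simulated data (a systolic blood pressure threshold of 160 mmHg, a treatment probability that jumps from 0.2 to 0.8 at the threshold, and a true treatment effect of -10 mmHg on the follow-up measurement), and all function names are hypothetical. The reference bandwidth `h_ref` is a hand-picked placeholder standing in for the empirically derived MSE-optimal bandwidth, which this sketch does not compute. The fuzzy RD estimate is formed as the jump in the outcome at the threshold divided by the jump in the probability of treatment, with both jumps estimated by local linear regression using triangular weights.

```python
import numpy as np


def triangular_weights(x, cutoff, h):
    """Triangular kernel: weight 1 at the cutoff, falling linearly to 0 at distance h."""
    u = np.abs(x - cutoff) / h
    return np.clip(1.0 - u, 0.0, None)


def local_linear_jump(x, v, cutoff, h):
    """Jump in v at the cutoff from a weighted linear fit with a separate
    intercept and slope on each side of the cutoff."""
    w = triangular_weights(x, cutoff, h)
    keep = w > 0  # observations outside the bandwidth get zero weight
    xc = x[keep] - cutoff
    above = (xc >= 0).astype(float)
    design = np.column_stack([np.ones_like(xc), above, xc, above * xc])
    sw = np.sqrt(w[keep])  # weighted least squares via row scaling
    beta, *_ = np.linalg.lstsq(design * sw[:, None], v[keep] * sw, rcond=None)
    return beta[1]  # coefficient on the "above cutoff" indicator


def fuzzy_rd(x, treated, y, cutoff, h):
    """Fuzzy RD estimate: outcome jump divided by treatment-probability jump."""
    first_stage = local_linear_jump(x, treated, cutoff, h)
    reduced_form = local_linear_jump(x, y, cutoff, h)
    return reduced_form / first_stage, first_stage


def bandwidth_sensitivity(x, treated, y, cutoff, h_ref,
                          multipliers=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Re-estimate the effect at multiples of a reference bandwidth."""
    return {m: fuzzy_rd(x, treated, y, cutoff, m * h_ref)[0] for m in multipliers}


def simulate(n=50_000, seed=0):
    """Simulated data only (an assumption for illustration, not study data)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(120.0, 200.0, n)            # baseline measurement
    p_treat = np.where(x >= 160.0, 0.8, 0.2)    # imperfect adherence to the rule
    treated = (rng.uniform(size=n) < p_treat).astype(float)
    y = 150.0 + 0.3 * (x - 160.0) - 10.0 * treated + rng.normal(0.0, 5.0, n)
    return x, treated, y


if __name__ == "__main__":
    x, treated, y = simulate()
    h_ref = 15.0  # placeholder, NOT an MSE-optimal bandwidth
    for m, est in bandwidth_sensitivity(x, treated, y, 160.0, h_ref).items():
        print(f"bandwidth = {m * h_ref:5.2f}  estimated effect = {est:6.2f}")
```

In an applied analysis, the bandwidth would be selected from the data and standard errors would be reported alongside each estimate. The sketch omits both in order to keep the weighting and the ratio structure of the fuzzy RD estimate visible.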
Health Outcomes to be Measured: 
The primary outcomes that we will measure are: future measurements of the exposure variables (e.g. BMI, blood pressure, HbA1c, blood glucose, low-density lipoprotein, thyroid-stimulating hormone level, hemoglobin, and T-score and Z-score for bone mineral density), mortality (overall and by cause of death, such as due to cardiovascular disease when examining the effects of statins), and hospitalization (overall and by cause of admission).
Collaborators: 

Professor Till Barnighausen - Chief Investigator - University of Heidelberg
Dr Anant Jani - Collaborator - University of Oxford
Dr Christian Bommer - Collaborator - University of Heidelberg
Dr Duy Do - Collaborator - University of Heidelberg
Ms Julia Lemp - Corresponding Applicant - University of Heidelberg
Professor Justine Davies - Collaborator - University of Birmingham
Mrs Michaela Theilmann - Collaborator - University of Heidelberg
Dr Pascal Geldsetzer - Collaborator - University of Heidelberg
Professor Sebastian Vollmer - Collaborator - Georg-August-Universitat Gottingen