"The emulation framework lends the design and conduct of observational analyses some of the rigor required for the planning of a randomized trial. In doing so, it provides an innovative and principled approach to improve the evidence base on the comparative effectiveness of interventions," write Boris Gershman, MD, and Aaron Fleishman, MPH.
Gershman is an assistant professor of surgery and Fleishman is an associate in surgery at Beth Israel Deaconess Medical Center, Boston, Massachusetts.
Clinical decision-making is contingent on high-quality evidence that informs patients as well as physicians about the effectiveness and trade-offs of different therapies. Well-designed, randomized clinical trials sit atop “the pyramid of evidence” and are regarded as the best means of evaluating comparative effectiveness.1 However, such “Level 1” evidence is often lacking in surgical disciplines, and urologic oncology is no exception. Although the scarcity of successful randomized trials in this specialty is multifactorial, it partly reflects the difficulty of randomizing patients to competing surgeries or to surgical vs nonsurgical approaches, such as observation or radiation. Moreover, surgeons and patients often hold strong personal beliefs about treatment, which makes equipoise hard to achieve and randomization hard to accept.
As a result, observational data often inform practice in urologic oncology exclusively or predominantly. For instance, despite the importance of choosing between partial and radical nephrectomy in cases of small renal tumors, only 1 randomized trial examining the question, EORTC-30904 (NCT00002473), has been completed, and its findings have essentially been discounted in favor of observational evidence.2-4 In other settings, feasibility trials have likewise highlighted the challenges of randomizing patients to surgery. For example, despite the United Kingdom’s rigorous clinical trial infrastructure, the authors of the BRAVO-Feasibility Study on the effectiveness of BCG vs radical cystectomy in high-risk non–muscle-invasive bladder cancer concluded that such a trial would be “challenging to recruit into.”5 Similarly, the authors of the SPARE trial, which attempted to randomize patients with muscle-invasive bladder cancer to selective bladder preservation or radical cystectomy, concluded that such an attempt was “not feasible in the UK health system.”6
Thus, when randomized data are limited or unobtainable, clinical decision-making must rely on observational evidence.7 Observational studies, however, have traditionally been considered inferior to randomized trials for causal inference because they are susceptible to bias and to confounding, whether by factors that are unknown or by known factors that the dataset does not capture.8 In contrast, randomization, at least in principle, eliminates the potential for confounding, allowing for a more reliable estimation of treatment effects.
Described by Hernán and Robins,7 the emulation framework treats an observational analysis that addresses a causal question as an emulation of a target clinical trial. The target trial may be real (ie, a successfully completed randomized trial) or hypothetical (ie, no such trial has been successfully completed). By requiring that every component of the target trial’s protocol be specified (Table),7 the emulation framework improves both the accuracy of the inferences made and the quality of analyses that use observational data.7,8
To better understand how explicitly specifying the components of the target trial can improve the accuracy of causal inference by avoiding study design flaws and reducing confounding, let us consider a specific example: the comparative effectiveness of radical cystectomy versus trimodality therapy for muscle-invasive bladder cancer.9 No randomized trial addressing this question has been successfully completed, and it is unlikely that one ever will be, given the conclusions of SPARE discussed above.6 Still, SPARE is an excellent real-world target trial upon which to base an emulation. In designing the target trial, specifying the eligibility criteria ensures that investigators include only those patients for whom genuine uncertainty as to optimal treatment exists (ie, there is a state of equipoise) and omit those whose inclusion would result in intractable confounding (ie, confounding that no statistical adjustment could correct).8 For example, a patient with a cT4 tumor or cN+ disease would generally be excluded from trimodality therapy protocols by design and therefore would not likely be considered for this intervention in an observational dataset. Similarly, very young patients may be far more likely to undergo radical cystectomy, whereas very old patients or those with comorbidities may be far more likely to undergo radiation. At the extremes of age and comorbidity, the potential for intractable confounding exists. Partly due to such considerations, randomized trials often require a threshold performance status, as measured by a scale such as the Eastern Cooperative Oncology Group (ECOG) score. Accordingly, eligibility criteria for the emulation should reflect a population for whom treatment equipoise exists and who could in principle be randomized.
In the SPARE emulation, the eligibility criteria are thus restricted to patients aged 40 to 79 years with stage cT2-3 cN0 cM0 disease.9 Although ECOG status is not captured in the observational data, restriction to patients with a Charlson Comorbidity Index of 0 to 1 may provide a reasonable surrogate for this measure.
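As a minimal sketch of how these eligibility criteria might be applied to an observational dataset, the Python snippet below filters a small made-up cohort. The `Patient` record, its field names, and the example patients are hypothetical and do not come from the published emulation; a real dataset would need its own variable mapping.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names; real registries encode these differently.
    age: int
    clinical_t: str  # e.g. "cT2"
    clinical_n: str  # e.g. "cN0"
    clinical_m: str  # e.g. "cM0"
    charlson: int    # Charlson Comorbidity Index, used as a surrogate for ECOG


def is_eligible(p: Patient) -> bool:
    """Apply the emulation's eligibility criteria as stated in the text."""
    return (
        40 <= p.age <= 79
        and p.clinical_t in {"cT2", "cT3"}
        and p.clinical_n == "cN0"
        and p.clinical_m == "cM0"
        and 0 <= p.charlson <= 1
    )


# Made-up patients for illustration only.
cohort = [
    Patient(65, "cT2", "cN0", "cM0", 0),  # meets every criterion
    Patient(85, "cT2", "cN0", "cM0", 1),  # outside the age range
    Patient(58, "cT4", "cN0", "cM0", 0),  # cT4 tumor
    Patient(70, "cT3", "cN0", "cM0", 3),  # comorbidity above the threshold
]

eligible = [p for p in cohort if is_eligible(p)]
print(f"{len(eligible)} of {len(cohort)} patients eligible")
```

Writing the criteria as a single explicit function makes each restriction visible and easy to vary in a sensitivity analysis.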
Although it may appear obvious at first glance, specifying the treatment strategies is essential to ensuring that treatments are well defined. It also prevents investigators from specifying treatments that may not be susceptible to a causal interpretation (ie, that could not be studied in a trial).8 In the case of the SPARE emulation, the treatment strategies (based on target trial treatments) are specified to include neoadjuvant chemotherapy prior to radical cystectomy, which serves to minimize confounding between the 2 interventions by ensuring that all patients are fit to receive chemotherapy. Still, some elements of the treatment definition require surrogates, such as what constitutes lymphadenectomy in an observational dataset. Although the target trial may define lymphadenectomy as the removal of specific nodal basins, the observational dataset requires that it be defined by a minimum number of lymph nodes removed. Once this compromise is identified at the design phase of the emulation, appropriate sensitivity analyses can be implemented to examine the impact of this choice on treatment effects.
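To illustrate the kind of sensitivity analysis described above, the following Python sketch shows how varying the minimum lymph node count changes which patients satisfy a surrogate definition of lymphadenectomy. The node counts and the candidate thresholds (1, 10, and 15) are assumptions chosen for illustration, not values taken from the emulation.

```python
# Made-up lymph node counts for patients who underwent radical cystectomy.
node_counts = [0, 4, 8, 10, 12, 15, 22]


def meets_lymphadenectomy_definition(nodes_removed: int, min_nodes: int) -> bool:
    """Surrogate definition: at least `min_nodes` lymph nodes removed."""
    return nodes_removed >= min_nodes


# Hypothetical thresholds to examine in a sensitivity analysis.
thresholds = (1, 10, 15)

patients_meeting_definition = {
    min_nodes: sum(
        meets_lymphadenectomy_definition(n, min_nodes) for n in node_counts
    )
    for min_nodes in thresholds
}

for min_nodes, count in patients_meeting_definition.items():
    print(f"min_nodes={min_nodes}: {count} of {len(node_counts)} patients qualify")
```

In a full analysis, the treatment effect would be re-estimated under each threshold to see whether conclusions depend on the surrogate definition.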
Although assignment in an observational analysis is necessarily per-protocol, specifying the causal contrast of interest (ie, intention-to-treat versus per-protocol) draws attention to potential nonadherence issues. Importantly, specifying the follow-up period encourages investigators to define time zero appropriately and thus reduce the potential for immortal time bias (a bias arising from a period of follow-up during which an outcome, such as death, cannot occur because of how a treatment is defined). In the case of SPARE, randomization is the obvious choice for time zero, but in the emulation, date of diagnosis is the best choice for minimizing immortal time bias. In this regard, it is helpful that both treatment arms require chemotherapy. Had the radical cystectomy arm not required neoadjuvant chemotherapy, the trimodality arm might have been subject to immortal time bias, because every patient receiving this treatment must, by definition, have survived to the completion of chemotherapy.
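One simple way to check for unbalanced immortal time is to compare, by arm, the interval between time zero and the point at which the treatment definition is satisfied. The Python sketch below does this for made-up records; the field names, the arm labels "RC" and "TMT", and the intervals are hypothetical, and the check is a descriptive diagnostic rather than a correction for the bias.

```python
from datetime import date, timedelta
from statistics import mean


def days_to_treatment_definition(time_zero: date, treatment_defined: date) -> int:
    """Days during which, by definition, the patient must have been alive."""
    return (treatment_defined - time_zero).days


diagnosis = date(2015, 1, 1)  # time zero in the emulation (date of diagnosis)

# Made-up records: both arms must complete chemotherapy to meet their definition.
records = [
    {"arm": "RC", "diagnosis": diagnosis, "chemo_complete": diagnosis + timedelta(days=120)},
    {"arm": "RC", "diagnosis": diagnosis, "chemo_complete": diagnosis + timedelta(days=140)},
    {"arm": "TMT", "diagnosis": diagnosis, "chemo_complete": diagnosis + timedelta(days=110)},
    {"arm": "TMT", "diagnosis": diagnosis, "chemo_complete": diagnosis + timedelta(days=150)},
]

gaps_by_arm: dict[str, list[int]] = {}
for r in records:
    gap = days_to_treatment_definition(r["diagnosis"], r["chemo_complete"])
    gaps_by_arm.setdefault(r["arm"], []).append(gap)

mean_gap_by_arm = {arm: mean(gaps) for arm, gaps in gaps_by_arm.items()}
print(mean_gap_by_arm)
```

If one arm's treatment definition were satisfied at diagnosis while the other's required surviving several months of chemotherapy, the mean intervals would differ sharply, which is the imbalance the text warns about.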
Finally, relevant estimands are identified and statistical analyses are outlined in a fashion similar to that of randomized clinical trials, including absolute and relative measures of treatment effects (eg, survival rates at various time points as well as hazard ratios for mortality). However, planning statistical analyses a priori requires that investigators consider the covariates, treatments, and outcomes available in the observational dataset, as well as potential sources of confounding and selection bias.8 Designing an emulation often requires compromises when certain features required for the target trial, like ECOG status, are not available in the observational dataset. Sensitivity analyses can be implemented to examine the effect of such design choices on treatment effect estimates.
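As an example of an absolute measure, survival rates at fixed time points are commonly estimated with the Kaplan-Meier method. The Python sketch below is a minimal, unadjusted implementation on made-up follow-up data (times in months; 1 indicates death, 0 indicates censoring). It is an illustration only: an actual emulation would also need to adjust for confounding, which this sketch does not attempt.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Made-up follow-up data for a single arm.
times = [6, 12, 12, 24, 30, 36]
events = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(times, events)
for months in (12, 24):
    print(f"Estimated survival at {months} months: {survival_at(curve, months):.2f}")
```

Applying the same function to each treatment arm would yield the arm-specific survival rates whose difference is one of the absolute treatment effect measures mentioned above.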
Emulating target trials using observational data has a number of other benefits. First, it provides real-world information, which may differ from the randomized trial evidence obtained in carefully controlled settings. Second, it can extend causal inferences to end points not captured or examined in a target clinical trial. Third, the emulation can be used to examine heterogeneity of treatment effects (ie, how treatment effects vary according to patient or disease characteristics).8 In this regard, randomized trials are usually powered to detect population average treatment effects, whereas there is often biologic plausibility that the effectiveness of interventions may vary according to patient or disease characteristics. The sample sizes of observational analyses are typically large enough to allow for examination of the heterogeneity of treatment effects.
Emulating a target trial using observational data is not meant to replace well-designed randomized clinical trials. Observational analyses are inherently subject to biases and confounding, and such limitations must be candidly acknowledged. However, when randomized evidence is lacking in urologic oncology, the emulation framework can improve the accuracy and reliability of causal inference and fill in critical evidence gaps.
It is important to consider the amount of planning required to develop a protocol for a randomized clinical trial. By contrast, observational analyses may be planned quickly and haphazardly. The emulation framework lends the design and conduct of observational analyses some of the rigor required for the planning of a randomized trial. In doing so, it provides an innovative and principled approach to improve the evidence base on the comparative effectiveness of interventions.
References
1. Ho PM, Peterson PN, Masoudi FA. Evaluating the evidence: is there a rigid hierarchy? Circulation. 2008;118(16):1675-1684. doi:10.1161/CIRCULATIONAHA.107.721357
2. Van Poppel H, Da Pozzo L, Albrecht W, et al. A prospective, randomised EORTC intergroup phase 3 study comparing the oncologic outcome of elective nephron-sparing surgery and radical nephrectomy for low-stage renal cell carcinoma. Eur Urol. 2011;59(4):543-552. doi:10.1016/j.eururo.2010.12.013
3. Kim SP, Campbell SC, Gill I, et al. Collaborative review of risk benefit trade-offs between partial and radical nephrectomy in the management of anatomically complex renal masses. Eur Urol. 2017;72(1):64-75. doi:10.1016/j.eururo.2016.11.038
4. Kim SP, Thompson RH, Boorjian SA, et al. Comparative effectiveness for survival and renal function of partial and radical nephrectomy for localized renal tumors: a systematic review and meta-analysis. J Urol. 2012;188(1):51-57. doi:10.1016/j.juro.2012.10.026
5. Catto JWF, Gordon K, Collinson M, et al. Radical cystectomy against intravesical BCG for high-risk high-grade nonmuscle invasive bladder cancer: results from the randomized controlled BRAVO-Feasibility Study. J Clin Oncol. 2021;39(3):202-214. doi:10.1200/JCO.20.01665
6. Huddart RA, Birtle A, Maynard L, et al. Clinical and patient-reported outcomes of SPARE - a randomised feasibility study of selective bladder preservation versus radical cystectomy. BJU Int. 2017;120(5):639-650. doi:10.1111/bju.13900
7. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758-764. doi:10.1093/aje/kwv254
8. Gershman B, Guo DP, Dahabreh IJ. Using observational data for personalized medicine when clinical trial evidence is limited. Fertil Steril. 2018;109(6):946-951. doi:10.1016/j.fertnstert.2018.04.005
9. Softness K, Kaul S, Fleishman A, et al. Radical cystectomy versus trimodality therapy for muscle-invasive urothelial carcinoma of the bladder. Urol Oncol. 2022;40(6):272.e271-272.e279. doi:10.1016/j.urolonc.2021.12.015