Good morning, my name is Marie Rienzo, and I want to welcome you to the NIH Office of Disease Prevention Mind the Gap webinar series. This series explores research design, measurement, intervention, data analysis, and other methods of interest to prevention science. Our goal is to engage the prevention research community in thought-provoking discussions to promote the use of the best available methods and to support the development of better methods. Before we begin, I have some housekeeping items. There are two options for submitting questions during the webinar. First, you may submit questions via WebEx by clicking on the question mark in the WebEx toolbar. Please direct your questions to all panelists. Second, you may participate via Twitter and submit questions using the hashtag #NIHMTG. At the conclusion of today’s talk, we will open the floor to questions that have been submitted via WebEx and Twitter. Lastly, upon closing the WebEx meeting, you will be directed to a website to complete an evaluation. We would appreciate your feedback, as it will help us improve this webinar series. At this time, I’d like to introduce Dr. David M. Murray, Associate Director for Prevention and Director of the Office of Disease Prevention.
Thank you, Marie. Today’s speaker is Dr. Jacob Bor. Dr. Bor is an Assistant Professor and Peter T. Paul Career Development Professor in the Departments of Global Health and Epidemiology at Boston University. His research applies the analytical tools of economics and data science to the study of population health, with a focus on HIV treatment and prevention in southern Africa. Dr. Bor was also ODP’s 2018 Early Stage Investigator Lecture winner, and we’re delighted to have him back with us today. It’s my pleasure to welcome Dr. Bor.
Thank you very much, David.
It’s really a pleasure to be here and to share a little bit of my research, along with a broad overview of regression discontinuity designs in public health research. I don’t know whether the cartoon image on the title slide is used for all the lectures in this series, but it’s the perfect image for what I’m going to talk about today. You have a person on a subway platform who is about to step into the abyss, and whether or not he crosses this threshold has very important implications for his fate. Take that with you as we go through the next half hour or so of this presentation. I’d like to start with a quote: “As is the case for any observational study, our results might be affected by unknown, unmeasured confounding factors.” Who has said that? I’ve said that. Perhaps you have said that. Perhaps every epidemiologist and social scientist has said it. In my doctoral studies, I remember this being drummed into us as the required sentence in the discussion section, as if to say, “Whatever you just read, take it with a grain of salt. It might not be true.” This particular quote comes from the When To Start Consortium, a big paper on the timing of HIV treatment that was published in The Lancet in 2009, but it could have been taken from any number of papers. When we think about causality in health research, we typically have two ideal cases: randomized controlled trials on the one hand and observational studies on the other. These two ideal cases are what you see, for example, in the WHO GRADE guidelines, which grade the quality of evidence considered in WHO recommendations. Randomized controlled trials, of course, have high internal validity because of randomization: they achieve balance on both observed and unobserved covariates.
But RCTs face some constraints in terms of when they can actually be done. They are often more costly. There may be ethical constraints if equipoise can’t be established, and there’s a limit on the range of exposures that one can study in a clinical trial. On the other hand, observational studies may have better external validity, but they achieve balance only on observables and require the strong assumption behind the disclaimer that I mentioned a slide ago. Quasi-experimental studies seek to draw on the strengths of both of these approaches. Quasi-experiments exploit quasi-random variation that occurs naturally in the world, or in an administrative rule, in order to estimate causal effects. In quasi-experiments there’s potential for balance on both observed and unobserved factors, similar to an RCT. But because the data are often observational, there are fewer ethical and financial constraints on analyzing quasi-experiments, and there’s a wider range of exposures of interest to prevention science that one can look at. Finally, quasi-experiments allow you to evaluate programs at scale, as implemented in real-life, non-trial settings, which is important for understanding some of the behavioral impacts of interventions that I’ll discuss later in the presentation. So today I’ll talk about one particular quasi-experimental design, the regression discontinuity design (RDD). I’ll provide an overview of RDD, go through a little bit of the theory behind causal inference in regression discontinuity designs, show how you would estimate a treatment effect in an RD design, talk through a few examples of regression discontinuity in the health literature, and finally deal with a very common type of regression discontinuity design: RDD with non-compliance, or fuzzy RDD.
So regression discontinuity can be implemented whenever treatment is assigned, at least in part, by a threshold rule on a continuous baseline variable. It’s easiest to think about this with an example. For a long time in South Africa, prior to 2011, there was a CD4 count threshold for eligibility for HIV treatment. If a person’s CD4 count was below 200 cells, they were eligible to initiate HIV therapy; if their CD4 count was above 200 cells, they were instructed to return in six months for reassessment of eligibility. The intuition behind regression discontinuity is that, because of this threshold rule, patients presenting just above and just below 200 cells are essentially identical. They are expected to be similar on both observed and unobserved characteristics, similar to an RCT, but they’re assigned different exposures: treatment eligibility on the one hand and deferred eligibility on the other. In the study that my co-authors and I published in 2014 in Epidemiology, we provided a primer on RDD for epidemiologists, and we also used the example I just described to show how RDD can be used in the clinical and epidemiological sciences. To our knowledge, this was the first application of RDD to a clinical threshold rule in an epidemiology, public health, or clinical sciences journal. What we looked at was the effect of immediate versus deferred ART eligibility on treatment uptake and survival. This plot shows you the distribution of first CD4 counts when a person presents for care. These are data from rural South Africa, from the Africa Health Research Institute, about two hours north of Durban, if people are familiar with it. The way care works is that someone comes into the clinic and tests positive for HIV, and then blood is immediately drawn and sent to the lab for a CD4 count. So this first CD4 count is really the CD4 count at diagnosis, at entry into care.
This is just the distribution of CD4 counts among people presenting for care, and you can see it is quite smooth. It’s a histogram showing that people present at various CD4 counts, both above and below the threshold. There’s substantial noise in these CD4 counts due to measurement error and random fluctuations, so where you are right around 200, whether you’re just above or just below the threshold, is substantially random. And yet it has really important implications for whether you start treatment. This plot shows the probability of starting treatment (ART is antiretroviral therapy) within six months after that first CD4 count. I’ve lost my cursor, but what you can see is that just below the 200-cell threshold, about 70 percent of people, or roughly two-thirds, were starting treatment within six months. Just above the 200-cell threshold, only about a third of people were starting treatment within six months. The impact of immediate eligibility was to increase the chances of starting treatment by 32 percentage points. This is called regression discontinuity because at the threshold there’s a discontinuity in the likelihood of being exposed, and that discontinuity is exploited as a natural experiment in order to identify impacts on outcomes. Causal effects can be estimated simply by comparing patients presenting just above and just below the threshold, and we can identify effect measures either as ratios or as differences in uptake or in other outcomes at the threshold. In certain settings, as I’ll discuss, no assumptions are actually required about unmeasured confounding factors, which distinguishes regression discontinuity designs from other observational studies. This is now the effect of HIV treatment eligibility on survival. This is a unique setting in which to do the study.
We were able to link patients to longitudinal demographic surveillance data, so we actually had very good, gold-standard information on survival. What we saw was that patients presenting just below the 200 threshold, who had slightly lower CD4 counts and were in slightly worse health on average, actually had better survival, or lower mortality, than patients presenting just above the threshold. The intent-to-treat impact of HIV treatment eligibility on mortality was a 35 percent reduction in the hazard of death. This is what we published in a 2014 article in Epidemiology. So that’s the overview of how this works, and now we’ll unpack it a bit. The original regression discontinuity study actually came from the educational psychology literature. It’s the study by Thistlethwaite and Campbell, where they looked at PSAT scores. They wanted to know whether scoring above a certain threshold and receiving a certificate of merit led to changes in later educational outcomes. Since its inception in the 60s, RDD has had an interesting history. It was practiced and promoted primarily in the program evaluation world by scholars such as Shadish, Cook, and Campbell, in the books that I list here at the top. Since the late 90s, economists have taken quite an interest in regression discontinuity designs, and a number of papers, including some of the ones listed here, established the theoretical and methodological underpinnings of regression discontinuity designs and form the basis for how people think about and analyze regression discontinuity designs today.
In terms of clinical and public health research, in 2014-2015, around the same time that we published another article, we were interested in how many existing papers there had been on RDD in the health literature. We found just 32 empirical RDD papers in PubMed, and just two of those studies looked at clinical thresholds with physical health outcomes: Almond’s study on low birth weight, and our study on CD4 counts and HIV treatment. So this was still nascent just a few years ago. It has now grown substantially, and the highest data point there, in terms of PubMed results for regression discontinuity designs, is this year, at 48. And this year isn’t even done. So we really see increasing interest in, acceptance of, and use of RDD in the health literature. So how are we able to identify causal effects in regression discontinuity designs? I’m going to introduce a little bit of notation here. We can imagine two potential outcomes for an individual i: outcome Y1 if that person is treatment eligible, and Y0 in a completely counterfactual, unobserved state of the world in which that person had not been eligible for treatment. This framework, which was developed by Rubin and others, has been the primary way that people have thought about regression discontinuity designs and other quasi-experiments. So our goal is to make comparisons between Y1 and Y0.
The problem is that Y1 and Y0 are never observed for the same individual. So we need to identify two comparable populations: one that’s treated, for whom we observe Y1, and one that’s not treated, for whom we observe Y0. If those populations are comparable, then the difference between outcomes in those populations can be interpreted as a causal treatment effect. If they’re not comparable, then the difference is confounded and cannot be interpreted as a causal effect. The setup for regression discontinuity is that we imagine we have these different potential outcomes, if eligible and if not eligible, and then we have a continuous treatment assignment variable Z: CD4 count in our example, or distance from the yellow line on the subway platform on the intro slide. The threshold rule says that someone is eligible for the treatment or the exposure if the assignment variable is less than (or, in other settings, greater than) some threshold. What we’re going to be working with here are objects called potential outcome conditional expectation functions, or POCEFs. These are, on the one hand, the average outcomes that one would observe if treatment eligible at different values of Z and, on the other hand, the average outcomes one would observe if not eligible at different values of Z. So here’s a picture of that. You can imagine this as corresponding to our study on CD4 counts and mortality. The top line shows the potential outcomes if not treatment eligible: the expected mortality rate at different CD4 counts.
The bottom line shows the potential outcomes if treatment eligible at different CD4 counts. The idea behind the potential outcomes framework is that, theoretically, both of these lines could exist in full, through both the solid and the dotted parts. But in the observed data, we only observe the solid lines. We might like to compare patients at different CD4 counts who are eligible or not eligible, but we don’t observe that across the whole distribution of CD4 counts. We do, however, observe it at this threshold c. The theory behind RD is that, in the limit as we approach the threshold from above and below, the observed mean outcomes just above the threshold and just below the threshold are estimates of the potential outcome conditional expectation functions at the threshold, that is, the potential outcomes if treated and if not treated. This allows us to identify a causal effect at the threshold, which is the comparison of people in the neighborhood around the threshold who just happened to be above or just happened to be below it, and who received different treatment assignments but were otherwise similar.
In order to implement this design, we need a few assumptions. The first is that the threshold rule exists and that the threshold is known. The second is that the assignment variable is continuous near the threshold, so that we can imagine getting infinitely close to the threshold and taking limits. The third, which is really the key assumption, is that we have continuity in these lines. Going back for a moment: if there were a jump at the threshold in either the blue line or the red line, then they wouldn’t be a good counterfactual for each other at the threshold. So the key assumption is that the blue line, as you approach from above, is a good counterfactual for the red line, even though the red line above the threshold is not observed. Is this continuity assumption a strong assumption? Let’s take one case, which is a geographical boundary. New Hampshire has lower cigarette taxes than we have in Massachusetts, so I might be interested in using distance from the New Hampshire boundary as a way to look at the impact of cigarette taxes on smoking. Do we believe that there’s continuity in potential smoking rates at the boundary, so that any jump can be linked to the cigarette tax? This is a case where we might think about other possible explanations. Taxes and policies are different for a whole range of things beyond cigarettes, so this geographic boundary may not work so well in terms of justifying the assumption. On the other hand, what if we have a laboratory measure of a clinical biomarker that determines treatment?
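The identification argument and the continuity assumption described above can be written compactly. This is a sketch in assumed notation, not taken from the slides: Z is the assignment variable, c the threshold, D the eligibility indicator (eligible below the threshold, as in the CD4 example), and Y1, Y0 the two potential outcomes.

```latex
\begin{align*}
D_i &= \mathbf{1}[Z_i < c], \qquad
Y_i = D_i\,Y_{1i} + (1 - D_i)\,Y_{0i} \\[4pt]
\text{Continuity:}\quad
& E[Y_{1i} \mid Z_i = z] \ \text{and}\ E[Y_{0i} \mid Z_i = z]
  \ \text{are continuous in } z \text{ at } z = c \\[4pt]
\tau_{RD} &= E[Y_{1i} - Y_{0i} \mid Z_i = c] \\
          &= \lim_{z \uparrow c} E[Y_i \mid Z_i = z]
           - \lim_{z \downarrow c} E[Y_i \mid Z_i = z]
\end{align*}
```

The last line uses only observed outcomes: eligible patients just below c supply the first limit and ineligible patients just above c supply the second. Continuity is what allows each limit to stand in for the corresponding potential outcome mean at c.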
In this case we can actually identify off the random noise inherent in that laboratory measure. When you take CD4 counts, there’s random noise. There are random daily and hourly fluctuations in CD4 counts. When you take a blood sample, it’s a sample from an underlying population, and there’s sampling variability in that sample. There’s also measurement error in the machines used at the laboratory to measure CD4 counts. When you look at that variability in measured CD4 counts, it’s quite substantial. And that measurement error in the assignment variable actually guarantees continuity in potential outcomes at the threshold, so long as people aren’t able to directly manipulate the values of their CD4 counts. Manipulation of those values can be assessed for plausibility. In our case, the data came directly from the labs, before the providers or patients even knew the results of those CD4 tests, so there really wasn’t any scope for manipulation. But it can also be assessed in the data by looking at gaps in the density of the assignment variable around the threshold. This is a test that Justin McCrary pioneered, where the intuition is that patients or providers might change their values of Z, the assignment variable, in order to gain or avoid access to the treatment.
This will result in a bunching of values, a higher density of the assignment variable on one side or the other of the threshold, and so we would see a discontinuity in the density of the assignment variable. That is one test, and I showed you the histogram at the very beginning to show that the density was continuous across the threshold. The second test we can do, which is similar to a balance table in an RCT, is to look at continuity at the threshold in observables. I just said that regression discontinuity gives you continuity both in observed factors and in unobserved factors. Just like in an RCT, we can’t prove that there’s balance on unobservables, but we can show balance on observed factors as evidence that the random mechanism for treatment assignment that we think occurred was actually the mechanism that did occur. This is one example of showing continuity in a baseline covariate at the threshold. Again, the intuition is that patients just above and just below the threshold are similar on all baseline covariates, both observed and unobserved, and we can show that continuity in observed covariates. So that’s the intuition behind causal inference. We’re identifying a treatment effect at the threshold, and we’re able to identify it because of this assumption of continuity at the threshold in the potential outcomes. So how do we estimate these treatment effects?
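As a rough illustration of the bunching idea described above, here is a minimal sketch in Python. It is not McCrary's test, which smooths the density on each side of the threshold; it is a crude stand-in that counts observations in a narrow window on each side and asks whether the split is far from 50/50. The function name, the window width, and the simulated data are all assumptions made for illustration, and a sloped density can push the split away from 50/50 even without manipulation, so the window should be kept narrow.

```python
import math

import numpy as np


def bunching_check(z, cutoff, width):
    """Crude density check around a threshold (not McCrary's estimator).

    Counts observations in [cutoff - width, cutoff) and [cutoff, cutoff + width)
    and returns (n_below, n_above, p_value), where p_value comes from a normal
    approximation to a two-sided binomial test of an even split.
    """
    z = np.asarray(z, dtype=float)
    n_below = int(np.sum((z >= cutoff - width) & (z < cutoff)))
    n_above = int(np.sum((z >= cutoff) & (z < cutoff + width)))
    n = n_below + n_above
    if n == 0:
        raise ValueError("no observations within the window")
    share_below = n_below / n
    z_stat = (share_below - 0.5) / math.sqrt(0.25 / n)
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return n_below, n_above, p_value


# Hypothetical data: a smooth CD4-like distribution, and a copy in which every
# value in [200, 215) has been pushed just below the threshold.
rng = np.random.default_rng(0)
smooth = rng.gamma(shape=2.0, scale=120.0, size=20000)
manipulated = np.where((smooth >= 200) & (smooth < 215), smooth - 15, smooth)

print("smooth:     ", bunching_check(smooth, cutoff=200, width=15))
print("manipulated:", bunching_check(manipulated, cutoff=200, width=15))
```

The same window comparison can be applied to a baseline covariate (for example, comparing mean age just above and just below the threshold) as an informal version of the balance check.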
Well, we’d like these two values of the observed conditional expectation function, i.e., the observed mean values of the outcome as Z approaches c from above and as Z approaches c from below. Those are the two points at the threshold shown with the arrows. They can be estimated in the data, and from them we can make inferences about the two potential outcome means at the threshold. So how do we estimate them in the data? The traditional approach is to run a very simple linear regression model in which we have an intercept, a continuous term for the assignment variable that allows for different slopes on either side of the threshold, and an intercept shift at the threshold. I’ve written out the equation there at the top. If we look at the graph at the bottom, we can see what these different coefficients represent. The intercept, Beta 0, shows us the value just above the threshold, here without the treatment, so that’s the mean in the control group, as it were. Beta 1 shows you the slope above the threshold, in this case where people are not eligible for treatment. The slope below the threshold, where people are eligible for treatment, is the sum of the coefficients Beta 1 and Beta 3. And Beta 2 is the treatment effect, which is the difference in means at the threshold. Importantly, in this equation you can see that we’ve centered the assignment variable at c. We’ve subtracted 200 from the CD4 count value, so that the assignment variable is essentially a plus or minus distance from the threshold. When Z equals c, those terms are zero, and both Beta 1 and Beta 3 drop out of the model. This is how people estimate RD treatment effects, and the main object of interest here is Beta 2. But we can also look at Beta 0 and Beta 0 plus Beta 2 as the control and treated means at the threshold.
There are a bunch of reasons to favor local linear regression, which are listed here. One common question is how big a bandwidth, that is, how big a window of data, one should use around the threshold. There are data-driven bandwidth selection routines that are the best way of doing that. The alternative, which is for the researcher to pick whatever they think is the nicest bandwidth, doesn’t really hold up to scrutiny, and there’s potential for cherry-picking. So the recommendation is to declare a particular data-driven bandwidth selection method a priori; that method will then generate the bandwidth that’s used in the analysis. The other guidance throughout the RDD literature is that it’s important to show robustness to lots of different bandwidths. So how should we interpret these effect estimates? The RDD effect is a causal effect at the threshold. Well, is that so useful? When I hear criticism of the design, the criticism is primarily: “Great, you’ve identified the causal effect at one point. Why do we care about the causal effect at that one point?”
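Returning to the estimation step, the regression described above can be written as Y = b0 + b1 (Z - c) + b2 D + b3 (Z - c) D, where D = 1 when Z is below the threshold. The sketch below fits that model by ordinary least squares within a window around the threshold and repeats the fit at several bandwidths. It is an illustration under stated assumptions, not the analysis from the study: the data are simulated, the function name is hypothetical, the window uses equal weights (a rectangular kernel), and the bandwidths are hand-picked for display rather than chosen by a data-driven routine.

```python
import numpy as np


def rd_local_linear(y, z, cutoff, bandwidth):
    """Fit Y = b0 + b1*(Z-c) + b2*D + b3*(Z-c)*D within |Z - c| <= bandwidth.

    D = 1 when Z is below the cutoff (treatment eligible, as in the CD4
    example). Returns the coefficient array [b0, b1, b2, b3]; b2 is the
    estimated jump at the threshold.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    keep = np.abs(z - cutoff) <= bandwidth       # equal weights inside window
    zc = z[keep] - cutoff                        # centered assignment variable
    d = (zc < 0).astype(float)                   # eligibility indicator
    design = np.column_stack([np.ones_like(zc), zc, d, zc * d])
    beta, *_ = np.linalg.lstsq(design, y[keep], rcond=None)
    return beta


# Simulated data with made-up parameters (not the study data).
rng = np.random.default_rng(0)
cd4 = rng.uniform(50, 350, size=5000)
eligible = (cd4 < 200).astype(float)
true_jump = 0.30
p_start = 0.35 + 0.0005 * (cd4 - 200) + true_jump * eligible
started_art = rng.binomial(1, p_start).astype(float)

# Robustness to bandwidth: report the estimated jump at several window sizes.
for bw in (50, 100, 150):
    b0, b1, b2, b3 = rd_local_linear(started_art, cd4, cutoff=200, bandwidth=bw)
    print(f"bandwidth={bw:>3}: control mean={b0:.3f}, "
          f"treated mean={b0 + b2:.3f}, jump={b2:.3f}")
```

Because the assignment variable is centered at the threshold, b0 and b0 + b2 can be read directly as the control and treated means at the threshold. In practice one would also report standard errors, which this sketch omits.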
And so I want to provide a little bit of interpretation about how we should think about that causal effect. The first thing to say is that if we were to assume constant treatment effects, then that causal effect is the same causal effect that we would have observed in an RCT. Constant treatment effects is a strong assumption, but it’s a common assumption in epidemiological research. Secondly, if treatment effects were heterogeneous (non-constant) but independent of the assignment variable, then RDD would also identify the same causal effect as an RCT. So we only really have a problem if treatment effects are heterogeneous and the size of the treatment effect is correlated with the assignment variable. In that case, the RDD identifies a local average causal effect, which can be thought of as local to the area around the threshold. But it’s not necessarily just the effect at the threshold because, as described earlier, if there’s substantial noise or measurement error in values of the assignment variable, then we can think of that causal effect as a weighted average of causal effects across a region of true CD4 counts, for patients whose measured CD4 counts are at 200. There’s quite a wide range of true CD4 counts that give rise to a measured CD4 count of 200, and so to think of this as a hyperlocal effect with very limited generalizability isn’t necessarily correct. The other point to make is that the local effect, i.e., the effect at the threshold, is exactly the effect that we would anticipate if we marginally changed the threshold. Often that’s the policy option that’s available to us: should we change the threshold, should we raise it a little bit, should we lower it a little bit? The local effect is precisely that effect estimate, and that’s something that’s actually not available in most randomized controlled trials, unless they are very large and have enough power to identify treatment effect heterogeneity with respect to the assignment variable.
So I want to run through a few examples. I’ve given our example, which is a clinical example, but there are many more in public health that may pique your imagination and curiosity. Almond et al. looked at the low birth weight cutoff of 1,500 grams, which is the threshold for very low birth weight. Infants born just below the very low birth weight threshold were identified as being very low birth weight, were more likely to receive intensive neonatal intervention, and actually had lower mortality than babies born just above the threshold. The authors used this to calculate the cost-effectiveness of the neonatal care provided to very low birth weight infants. Carpenter and Dobkin looked at the impact of drinking age rules on mortality. This is a bit of a dense plot, but the primary panel to look at is the top one, which shows motor vehicle accidents (MVA). There’s a generally declining rate of motor vehicle accidents as people get older, but there’s a sharp increase at age 21, when people become eligible to drink legally. Chen et al.
looked at a very interesting policy in China called the Huai River policy, which provided subsidies for coal heating in the north of China but not in the south. They used distance from the Huai River (north of the river you’re eligible for the subsidy, and south of it you’re not) to identify differences in air pollution, which is the TSP here on the bottom left, and then differences in life expectancy, and they used this as a way to estimate the impact of air pollution on life expectancy in China. Ludwig and Miller looked at the rollout of Head Start programs in the 60s. As is common in these RDD studies, a lot of the hard work is finding the natural experiment, figuring out precisely how treatment was assigned, and obtaining the data necessary to recreate that. It turns out that when Head Start was first rolled out in the 60s, there was additional federal funding, but also hand-holding, for particular low-income counties to help them get Head Start programs started. This was in recognition that such counties needed extra help in order to implement these programs. The determination of whether a locality was a high-poverty county was based on the poverty rate in the 1960 census. So what Ludwig and Miller did was extract the data on county poverty from the 1960 census and map it onto Head Start funding for different counties. They show that there was a discontinuous increase in Head Start funding per four-year-old by the late 60s in those higher-poverty counties, and that this led to reductions in child mortality in later years. There is also a very interesting study that came out this past year by Anderson, who looks at women’s property rights and HIV infection risk.
There has long been a theory that lack of women’s empowerment is substantially responsible for the very high rates of HIV prevalence among women in sub-Saharan Africa. What Anderson did was to identify ethnic groups with members on both sides of a national boundary, and then to identify those ethnic groups split by national boundaries where the two countries had different legal systems, one based in common law and one based in civil law. These different legal systems offered different protections of property rights for women. Using distance to the border as the assignment variable, and using demographic and health surveillance data, Anderson found markedly lower female HIV prevalence in the civil law countries, where there were stronger property rights protections for women, compared with the common law countries. This was consistent with behavioral data on condom use as well. In a study from Toronto, Chen and colleagues looked at air pollution alerts. If the daily maximum air quality indicator is above a threshold of 48, then an air pollution alert is announced, and people may have different behavioral responses to those alerts, for example not going outside if the air quality is very bad and you have asthma. What they show is that these air pollution alerts were actually associated with fewer asthma ER visits, ostensibly because kids with asthma weren’t going outside and being exposed to the higher pollution levels. You can see a general increase in asthma ER visits with higher pollution levels, but there’s a sharp drop when people are made aware that air pollution is high through these alerts. These are just a few examples of how RDD has been used.
It’s also been used in in data on on Elections, so if In a two-party election, whoever has more than 50% of the vote is the is the winner. And so there’s been a whole series of papers looking at looking at political elections as well as union elections as a threshold rule There are also a number of papers that look at eligibility cut-offs and this is the paper by Laura or I Don’t know if it’s Digg or dad or Dague Who looked at Medicaid premiums and rates enrollment Adults were more likely to stay enrolled if they didn’t have to pay a premium so this shows income as a percent of federal poverty line and the length of time someone was on Medicaid and at one hundred fifty percent of poverty line The premium goes from zero to ten dollars a month. And even though this is a very this isn’t very much money It has a huge impact on enrollment and so this is an important paper because it shows that that it’s it’s probably not the amount of money and much more likely the sort of the hassle costs of paying the premium or Thinking about insurance as as a monetary cost. So this is this these examples give you a sense of the range of That of the range of examples that RDD can be used for I’m going to quickly go through a last last point about RDD non-compliance most of these studies that I’ve shown you Involve a threshold rule that influences treatment assignment but does not determine it deterministically Say whether treatment is on or off Right. So what if the threshold rule only applies to some patients what if there are other indications for treatment in our HIV treatment example? stage four illness, or contraindications for treatment Or what if some patients opt out? 
They might do so despite being eligible, for religious reasons or because they don’t have the money to come back to the clinic. What if some patients opt in despite being not eligible, because they’re really strongly motivated and are going to advocate for themselves no matter what? These cases make RDD very similar to clinical encouragement trials in which you have non-compliance on both sides. This is very common; really, the vast majority of RD designs are of this type, and it’s known as fuzzy RDD, which I think is sort of a misnomer, but that’s what it’s called.
We extended the previous work that we did at the 200 threshold to look at a higher 350 threshold for HIV treatment, and this was in PLOS Medicine last year, where we specifically implemented a fuzzy regression discontinuity design. So here’s a similar story to previously: being eligible for treatment substantially and discontinuously increased the probability of starting treatment within six months. Here the risk difference is 25 percentage points. And we can think of this as describing three groups. Even though all patients at the threshold are either above or below, some of those patients would have initiated ART regardless. We call those always-takers, and that’s that 15% who would have initiated even if they were above the threshold. Most patients, unfortunately, were not going to initiate ART even if they were eligible, and we call those never-takers; those are the people above the top line. The people for whom the threshold binds are called compliers.
And so one way to interpret this plot is to say that the threshold binds for 25% of the population. For 25% of the population, their treatment decision was based on which side of this threshold they were on. In this situation, we’re going to interpret the overall effect of being below the threshold as an intent-to-treat effect. And similar to randomized trials or randomized encouragement designs where we have non-compliance, the effect of the treatment itself can be recovered using the threshold rule, or randomization in the trial, as an instrumental variable. That is, we’re going to scale the intent-to-treat effect by the share of patients whose treatment status was determined by the threshold rule, these compliers, assuming that there was no effect of eligibility on outcomes for people who either would have initiated regardless of eligibility or would not have initiated regardless of eligibility. So the intent-to-treat effect is a 17 percentage point increase in retention in care; this is the key outcome we’re looking at. And looking at this effect specifically on patients whose treatment decision was based on the eligibility threshold, we found that immediate treatment eligibility increased 12-month retention by 70 percentage points amongst these patients for whom the threshold binds. This is a valid causal estimate under additional assumptions of excludability and monotonicity, which we don’t have time to discuss right now, but I can talk about in the discussion section if that would be helpful. Further analysis revealed that amongst this group of compliers, immediate eligibility increased retention from 21% to 91%. That was the gap. That was the 70 percentage point gap.
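The scaling step described above can be written as a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the rounded figures quoted in the talk (a 17 percentage point intent-to-treat effect, a 25 percentage point jump in treatment uptake, and 15% always-takers), not the study data. With these rounded inputs the result lands near, rather than exactly on, the reported 70 percentage points.

```python
def complier_effect(itt_effect: float, uptake_jump: float) -> float:
    """Wald (instrumental variable) estimate: the intent-to-treat effect
    divided by the share of compliers at the threshold."""
    if uptake_jump <= 0:
        raise ValueError("uptake_jump must be positive")
    return itt_effect / uptake_jump


# Rounded figures quoted in the talk (assumed inputs, not the study data).
itt_effect = 0.17     # jump in 12-month retention at the threshold
uptake_jump = 0.25    # jump in ART initiation at the threshold (complier share)
always_takers = 0.15  # would start ART even if not eligible

# Whoever is neither an always-taker nor a complier is a never-taker
# (this relies on the monotonicity assumption: no "defiers").
never_takers = 1.0 - always_takers - uptake_jump

late = complier_effect(itt_effect, uptake_jump)

print(f"never-takers: {never_takers:.2f}")
print(f"effect among compliers: {late:.2f}")
```

The division is only meaningful under the excludability and monotonicity assumptions mentioned in the talk; the code does not check them.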
So this is a huge increase in retention amongst people for whom the threshold binds. What’s interesting about this is that WHO guidelines have been made primarily based on clinical trials. The three important clinical trials for this question of immediate versus deferred therapy, the HPTN-052, START, and Temprano trials, all made great efforts to make sure that the control group in those trials was retained in care. The red and blue bars here show the proportion retained in those trials by treatment arm, and it’s essentially the same, very high in both arms. Our study in Hlabisa found, in contrast, that for those patients who would be willing to take up treatment, which are those patients that we would think would consent to be in one of these trials, immediate treatment eligibility dramatically improved retention, and that without immediate treatment eligibility, and without the additional hand-holding that we saw in these trials, the rate of retention would have been much, much lower. So whereas most of the benefits of immediate ART eligibility in the clinical trials literature have been conceived as biologic benefits, there’s this other, completely separate behavioral pathway that may be really, really important, and that leads people to return to care only much, much later, when they’re very sick and when they’ve had lots of opportunity to potentially transmit the virus to other people.
So as a recap: RDD offers a rigorous approach to causal inference when an exposure or treatment is assigned by a threshold rule. It’s been described as second only to an RCT in terms of its internal validity, and this comes from this notion of local randomization, where noise in the assignment variable randomly allocates patients or units to being just above or just below this threshold and guarantees continuity in the potential outcome conditional expectation functions. There’s been increasing use of RDD in public health and medicine, and there
are lots of potential use cases that haven’t been exploited yet. Clinical thresholds are some of the classic cases, and there are still very few studies that use clinical thresholds, but there are many other applications too. RDDs have obvious benefits over observational studies in terms of the ability to make causal statements without the very strong assumption of no residual confounding. But they also have benefits over RCTs in some cases. They’re lower cost. They enable evaluations of difficult-to-randomize interventions, for example legal systems, as shown in the Anderson paper. They typically enable analysis of population-representative data, rather than trials, which require opt-in consent and often have very selected samples. And they enable analyses of interventions as implemented in real-world settings, where the control group really is receiving the true standard of care. As we saw from the PLOS Medicine example, there’s a real difference between true standard of care and what standard of care is in many trials.
The key limitation is that regression discontinuity designs are not always available as an option. Threshold rules don’t always exist, but they’re more common than you think, and a lot of the fun in this line of work is trying to identify creative natural experiments and finding threshold rules that may be just over the horizon. There’s some detective work in that, which is very interesting and complements the statistical, empirical work. So I’ll leave it there. We have about 13 minutes for discussion.
I look forward to your questions. Thanks.
Thank you, Jacob. Terrific presentation. Let’s start with a question about the nature of these studies, some of the examples that you gave. You listed a bunch of references on one slide that we saw just briefly. How many of these were established as RDD studies from the outset, that is, these were prospective studies where they were planned as RDDs, and how many of these were retrospective studies where you or someone dug into the situation and the data, discovered that you could analyze it as an RDD, and did that?
Yeah, that’s a great question. There’s nothing that says that you cannot design a prospective RDD. An advantage in doing so would be that you really are aware of the data-generating process. That’s not a part of the inferences that you’re making, what the true data-generating process was. You can imagine doing that in a situation in which you had quite a clear ethical case. Maybe you’ve got a risk score, and it’s quite a clear case that people with higher levels of this risk score you want to allocate to some intervention. We don’t feel comfortable randomizing people with high levels of the risk score to that intervention, but we could use a sharp cutoff and then analyze, at that threshold, whether people just above and below that eligibility threshold benefited from the intervention. So from an ethical standpoint, you can imagine that being a very reasonable way to conduct a prospective study. It would have less statistical power than the RCT, and it would also not allow you to identify treatment effects at other points along that risk score. So that’s a drawback in that situation, but it’s certainly possible. I think all of the examples that I described were retrospective analyses of large administrative or household survey data sets. And, you know, I think it’s certainly possible to do prospective RDDs.
And actually I have some work with an intensive-care implementation science expert on using prospective RDD for continuous quality improvement. Although this is possible, I think a lot of the power of RDD is being able to find these experiments that exist in the world, with the plethora of data sources that we have. There’s all this data hanging around waiting to be used, administrative data, large household surveys, and if you can find an experiment that you can then link to those administrative or survey data, then it can be really powerful. And it can be really powerful for the reasons that I described: you can identify interventions implemented in real-world settings, without investigator interference, at scale, and in population-representative and population-relevant samples, which is not always the case in prospective programs because, you know, studies are often smaller.
I’ll mention that one of my master’s students when I was at Ohio State actually did a prospective RDD. It was an interesting example, and as we talked about the study that he wanted to do for his MPH project, it seemed like a natural. I’m not sure that he’s ever published that. But I agree with you that it’s possible to do it. It’s interesting that most or all of the examples that you’ve shown us have been retrospective. We have a question from our listeners about whether the assignment variable should always be some kind of a hard, you know, biological measure, or whether you can use self-report measures as the assignment score. Does it matter?
That’s a great question. So the assignment variable needs to be used in the treatment decision or in determining the exposure. Say, for example, that there was a blood pressure cutoff and I was not able to obtain the originally written-down blood pressure numbers from charts.
So instead I asked people what their blood pressure was at their last visit. Well, whether or not they were on antihypertensives is really a function of what was written down in their chart. What people self-report their blood pressure to be is a function of that, you know, with a whole bunch of other stuff in it. People might, for example, remember their blood pressure more accurately if they were on antihypertensives. So there are things like that where, you know, it’s not that it has to be a hard measure, but it has to be the measure that was used in treatment assignment.
I’m trying to think of an example where self-report could be used. The challenge with self-report is that if someone knows what the threshold rule is, then there’s a lot of scope for manipulation. In studies that look at self-reported income, this comes up sometimes, where you have people reporting income to get just below, you know, a tax threshold or an eligibility threshold for something, and sometimes the density test that I described doesn’t quite hold. Even if that’s the actual income value used to determine eligibility, people may not be reporting their income to authorities quite as accurately, because they want to get just below some threshold. And there’s a whole other literature on that kind of manipulative behavior, which is interesting in itself in terms of understanding people’s motivations. My apologies for the circuitous answer.
It’s possible to use a self-report, but it would be difficult to do so because of the potential for manipulation. But one would have to assess that individual circumstance.
I know that it’s important in the analysis of the data to model accurately the relationship between the assignment variable and the outcome. In the examples that you showed, in some cases they were linear models, in some cases it looked like they were polynomial models. How do you know when you’ve modeled that relationship correctly?
So, two answers to that. The first is that there are different approaches to causal inference in RD. The approach that I’ve described is looking at just a treatment effect at the threshold, exploiting possible random noise in the assignment variable to give you local randomization on either side of the threshold, and identifying off of continuity at the threshold. For that approach, the paper by Hahn, Todd, and Van der Klaauw (2001), which is in the references in the slides, which I guess will be posted, showed that local linear regression is consistent for the risk difference at the threshold. There have been various steps to improve the approaches to choosing the right bandwidth for that local linear regression. But you can imagine, if you have a curvilinear shape, then the smaller the bandwidth, the closer that curve is to linear, and so as the bandwidth gets smaller and smaller, the linear model provides a better and better approximation. And in the latest optimal bandwidth approaches, you can actually then correct the estimate for the bias from improperly fitting the curvature of the line. People have shied away in more recent years from the polynomial approach, and the reason is that it’s sensitive to the behavior of the data away from the threshold. So, you know, what happens
far from the threshold can shape the overall curve of the line, and that ends up shaping the fit of the line at the threshold. So some people have shied away from that approach and favored the local linear regression approach. I think the standard practice is to show robustness to a lot of different specifications, and if the result is not robust, then you start to have questions about whether this is a real finding or not.
Several people have asked, how do you pick a good assignment variable? Now, obviously, that would be for a prospective study, not a retrospective study where the assignment variable was chosen presumably by somebody else. But if you’re planning one, how do you pick a good assignment variable?
Well, you’d like the assignment variable to be continuous, and you’d like the threshold to matter. So, you know, I’ve been approached by students or other collaborators who have something like a seven-point depression score or a nine-point depression score and want to use a cut-off on that depression score for RDD. Unfortunately, the sparseness in just those nine points means it’s not a great application for RDD. You could combine it with pre-intervention data to model the shape of that relationship between that score and the outcome, and then do a sort of difference-in-differences type approach: at different values of that score, you difference out the pre-existing relationship. That’s the direction I encourage people to go. But if you have just a few discrete units in your assignment variable, you won’t be able to fit stable regression lines on either side, and also the theory about taking limits on this continuous variable doesn’t quite hold in the same way.
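The local linear regression approach discussed in the previous answer, together with the practice of checking robustness across bandwidths, can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the method used in any of the studies mentioned: the data are synthetic, the function name `local_linear_rdd` is hypothetical, units at or above the cutoff are treated, the kernel is uniform, and no optimal bandwidth selection or bias correction is applied.

```python
import numpy as np


def local_linear_rdd(x, y, cutoff, bandwidth):
    """Estimate the jump in y at the cutoff by fitting a separate straight
    line on each side, using only points within `bandwidth` of the cutoff."""
    xc = x - cutoff  # center so each fitted intercept is the value at the cutoff
    below = (xc < 0) & (xc >= -bandwidth)
    above = (xc >= 0) & (xc <= bandwidth)
    # np.polyfit with degree 1 returns [slope, intercept]
    _, at_cutoff_below = np.polyfit(xc[below], y[below], 1)
    _, at_cutoff_above = np.polyfit(xc[above], y[above], 1)
    return at_cutoff_above - at_cutoff_below


# Synthetic data: a curved relationship plus a known jump of 0.3 at x = 0.
rng = np.random.default_rng(0)
n = 20_000
x = rng.uniform(-1.0, 1.0, n)
true_jump = 0.3
y = 0.5 * x + 0.8 * x**2 + true_jump * (x >= 0) + rng.normal(0.0, 0.1, n)

# Robustness check: the estimate should be similar across bandwidths.
estimates = {h: local_linear_rdd(x, y, 0.0, h) for h in (0.1, 0.2, 0.5)}
for h, est in estimates.items():
    print(f"bandwidth {h}: estimated jump {est:.3f}")
```

In applied work one would typically rely on a dedicated RDD package that handles kernel weighting, bandwidth choice, and inference; the point here is only that the estimate is a difference between two fitted lines evaluated at the threshold.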
So I think one of the interesting things is that often the way that risk scores are generated is in a continuous framework, but then they’re coarsened. You can imagine a student who was working on a hospital readmissions intervention, where there was a risk score that was used as a basis for assigning patients to some additional hand-holding: some phone calls, some follow-up, some case management. What we encountered was that the underlying risk score was based on a regression equation. If we had had the actual regression equation, we could have had a much more continuous measure of this risk score, but it was then coarsened into these discrete categories, which made the analysis much harder. I’m aware I’m answering a slightly different question, which is not what’s a good threshold rule, but what data do we want, and do we need to try to get, so that we can implement RDD for a given threshold rule. But I think that’s an important consideration.
It needs to be continuous. The threshold needs to be known. We’d like enough data on either side of the threshold. And if there’s noise in the value of the assignment variable, that’s good for inference. It’s not ideal if you’re trying to have a clinical cutoff, right? And so there’s actually an interesting case where the clinically optimal risk score, you know, which would very clearly identify high-risk and low-risk patients, is... So in some ways there’s a tension between that on the one hand and the best risk score for an RDD design, which would be a random number. In fact, RDD on a random number is identical to a randomized trial, right.
If you just sort everybody randomly and choose some threshold, that’s a randomized trial. So you do want some noise, but in the absence of noise you want good data on how precisely treatment was allocated, how the decision was made, what data were collected, and how they were used. You want access to those original data.
Jacob, I’ve got lots more questions, but unfortunately it’s noon and we promised our audience that we would stop, so I want to thank you for your presentation. We will share some of these questions with you and encourage you to draft brief answers that we could post on our website, so that those who were asking them, and I didn’t get a chance to get to their question, would have an opportunity to see an answer.
I’d be happy to do that.
Thank you, Dr. Murray, and thank you to everyone who participated in today’s webinar. On the Mind the Gap website, prevention.nih.gov/mindthegap, you will find several resources for this talk, including the slides and a list of references. We will also be posting a recording of today’s webinar on our website. Next week, you’ll receive an email with a link to the recording when it is available. Thank you.