Apr 28, 2023 This Week in Cardiology Podcast

John M. Mandrola, MD


April 28, 2023

Please note that the text below is not a full transcript and has not been copyedited. For more insight and commentary on these stories, subscribe to the This Week in Cardiology podcast, download the Medscape app or subscribe on Apple Podcasts, Spotify, or your preferred podcast provider. This podcast is intended for healthcare professionals only.

In This Week’s Podcast

For the week ending April 28, 2023, John Mandrola, MD, comments on the following news and feature stories.

First, I am recording from San Antonio. Thanks to Anand Prasad and Shweta Bansal from the University of Texas for the invitation to spread the message of medical conservatism at the 6th Annual Cardio Renal Connections conference.

My friends, this is a special podcast. I normally cover four or five topics. Today I focus heavily on two topics. Next week and in subsequent weeks, This Week in Cardiology will return to its typical format. Let me know what you think.

Cardiac Device Infection

Two studies and one editorial published this month address the issue of cardiac implantable electronic device (CIED) infections. Device infections are another growing problem of modernity.

That is, people live longer, and with more co-morbidity; there is more time to require a pacemaker because longer life equals more time to get age-related conduction disease. Also, cardiac resynchronization therapy (CRT) and implanted cardioverter defibrillators (ICDs) have become valuable additions in the treatment of heart failure.

And now, in 2023, one of the most common causes of bad CIED infections is generator changes (more on that later). More devices mean more infection.

One study, published in JAMA-Cardiology, was a secondary analysis of the Canadian PADIT trial, originally published in 2018, which compared different antibiotic regimens in nearly 20,000 patients and found no significant difference between incremental and conventional antibiotics.

The purpose of this observational study using data from the randomized controlled trial (RCT) was to evaluate the association of the extent and timing of a CIED infection with mortality. Think early vs later CIED infection.

The more interesting paper came from the Journal of the American College of Cardiology (JACC) and it was an observational analysis of device infections and lead extraction. Its findings have generated provocative headlines and an editorial that implies doctors are not doing things right.

Background on CIED Infection

Before I tell you about these studies, let’s do some background on CIED infections.

  • I love cardiac devices because pacing for heart block or heart failure (HF) is one of the purest things we do. Pure in that sick patients are asking for help, and we have proven therapies that extend and enhance life.

  • Yet, even when we act on the strongest of grounds, complications can occur. One of the most feared complications of a cardiac device is bacterial infection.

  • When it happens right after placing a new device, the decision is easy: you explant the system, treat with antibiotics, and try again after the infection has cleared.

  • But a lot of bacterial infections occur late after device implant. A cellulitis causes transient bacteremia that then attaches to leads. Infection on leads can then affect cardiac valves, and cause endocarditis. Staphylococcus aureus is the worst of the pathogens.

  • Once this process happens, we have a super-serious problem because pacing or ICD leads that have been implanted for years cannot easily be pulled out. The body reacts to foreign material by forming scar around the leads. This causes the lead to attach to the superior vena cava, atria, tricuspid valve, or right ventricle.

Thankfully, iterative technology has advanced the field of transvenous lead extraction (TLE). Yet TLE is not for the meek of heart. TLE is hard. Very hard. It requires extensive training, a back-up surgical team, and — wait for it — experience. Lots of experience. You can’t expect to master TLE at an industry-sponsored course. Master extractors are made over years.

Here is the problem: Infection of leads requiring extraction is uncommon, so it’s hard to build up experience. American healthcare makes it even harder to get that experience because, too often, doctors fear missing out; instead of designating one or two docs in a system to be the extractors, hospitals may have many low-volume operators rather than one or two high-volume operators. Industry, too, supports more, not fewer, operators.

Now to the Papers

JAMA-Cardiology published the timing-of-infection paper. The authors took the nearly 20,000 patients in the PADIT trial and focused on the 177 who had infections. They broke these down along two axes: timing, early (<3 months) vs late (3-12 months), and site, localized (pocket) vs systemic (bloodstream).

  • Finding one: The cumulative incidence of infection was 0.6%, 0.7%, and 0.9% within 3, 6, and 12 months, respectively. But the absolute rate of infection was highest in the first 3 months (0.21% per month), falling significantly thereafter.

  • Finding two: Any infection doubled the rate of mortality vs no infection.

  • Finding three: Early localized (pocket) infection was not associated with increased mortality; however, early systemic infection was associated with a large, 3-fold increase in mortality.

  • But surprisingly, the fourth finding was that late local infection carried 3.5 times the risk of death vs no infection, and late systemic infection carried 9 times the risk of death vs no infection.

  • The authors concluded that while CIED infections are most common early (in the first 3 months), delayed infections, even localized infections, within the first year are associated with substantial risk.

They write that early detection and treatment of CIED infections may be important. They wonder about the common policy of checking the incision at one week and then not seeing patients again until 12 months.

Good on them: they make relevant observations, don’t make causal conclusions, and suggest considering a slightly different follow-up protocol. The first author was Hui-Chen Han, from UBC.

Contrast This With the JACC Paper

This was an analysis of the Nationwide Readmissions Database (NRD) covering 25,000 admissions for patients with CIEDs and endocarditis from 2016 to 2019. The authors used ICD-10 codes to identify patients.

The goals of this study were to document use and trends of TLE over the 3-year period, find predictors of use of TLE, and describe the association of TLE with all-cause death.

The NRD is a database of de-identified hospital inpatient discharges and readmissions that allows for national estimates of hospital utilization. I will not object to parts of this study, but regarding others I will have strong objections.

  • First, the rate of TLE use was 11%. That’s interesting in that 89% of patients with devices and endocarditis did not receive lead extractions. It is so low that it makes me wonder about the validity of the data.

  • The second non-controversial finding was that the rate of TLE use increased over the 3-year period. Procedural complications were around 3%.

  • Predictors of having TLE are also not too surprising. S aureus was the strongest predictor (3 times). Having dementia, cerebrovascular disease, kidney disease, or drug use were all predictors of not having extraction.

Again, these findings are good uses of observational data. Science tells us what we can do, trials tell us what we should do, and registries tell us what we are doing.

The problem comes in the temptation to compare patients who had vs did not have TLE. Of course, these are non-random comparisons in which a clinician made a decision to extract or not.

  • Table 1 (patient characteristics) shows massive differences. Patients who underwent TLE were younger, more often male, and more likely to have ICDs. Patients who did not undergo TLE had more cerebrovascular disease, dementia, and kidney disease.

  • You know what comes next: Attempts at matching. They used propensity matching.

  • But this cannot easily match randomization, especially given the patient variation, hospital variation, and likely different levels of TLE expertise.

  • Index mortality was significantly lower among patients managed with TLE (6.0% vs 9.5%; P < 0.001).

  • After adjustment for comorbidities, TLE was still independently associated with significantly lower odds of mortality (adjusted odds ratio [OR]: 0.47; 95% confidence interval [CI]: 0.37-0.60).

  • The authors concluded that use of TLE among patients with devices and endocarditis was low, even in the presence of low rates of procedural complications.

Lead extraction is associated with significantly lower mortality.  Barriers to TLE for patients with CIEDs and endocarditis require investigation.

The authors write in their discussion: “To our knowledge, our study is the largest to date to demonstrate reduction in mortality associated with TLE management of patients with CIEDs and infective endocarditis.” Recall that if your sample is biased, it doesn’t matter if you have a million patients.

The editorialists also leap to causal language:

“The messages from the current and prior reports on the issue of CIED infections are clear: 1) CIED infections increase mortality; 2) extractions reduce mortality; and 3) extraction-related complications and mortality are low at experienced centers.”

They write: “It is imperative for all health care providers managing patients with CIEDs to recognize CIED-related infections and make early referrals for extraction.”


  • I agree that CIED infections, especially those that cause bacteremia, are terrible. Having seen their consequences is one of the main reasons I am conservative about who should get an implant. (In my opinion, doctors discount the seriousness of having a device with leads in the heart.)

  • I also believe strongly in meticulously evaluating the harm/benefit calculus in every generator change. Generator changes are a major source of infection, and many can be avoided with good doctoring, including frank discussions with patients and families. If you don’t open an incision to change a device, you don’t incur the risk of infection.

  • I also believe there is probably some degree of under-treatment of CIED infections, especially in rural settings with limited access to TLE centers of excellence.

  • But we can raise awareness, educate, and improve access, without using flawed studies and flawed causal conclusions. Like those in this paper. The limitations of this analysis are many.

  • The NRD is problematic.

As the authors write, the NRD only includes information from 22 US states. It is also derived from ICD-10-CM codes, which have well-known limitations, and missing data are likely. Either issue is almost disqualifying in and of itself.

Mostly, though, the non-random comparison of two groups (those who underwent TLE vs those who did not) is a deeply flawed analytic method. Propensity matching from this sort of data is extremely unlikely to approximate randomization. The 11% rate of TLE should have signaled that something was amiss with these data.
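To see why I am skeptical, consider what propensity matching actually does. Below is a minimal, hypothetical sketch of 1:1 greedy nearest-neighbor matching; the patients, scores, and caliper are all invented for illustration. The key point is that patients are paired on a score estimated from measured covariates only, so the unmeasured reasons a clinician chose not to extract (frailty, goals of care, local expertise) remain unbalanced.

```python
# Hypothetical sketch of 1:1 greedy nearest-neighbor propensity matching.
# In a real analysis, the propensity score (probability of receiving TLE)
# would be estimated from measured covariates; here scores are made up.

def greedy_match(treated, controls, caliper=0.05):
    """Pair each treated patient with the nearest unused control.

    treated, controls: lists of (patient_id, propensity_score).
    Returns a list of (treated_id, control_id) pairs within the caliper.
    """
    available = list(controls)
    pairs = []
    for tid, tscore in sorted(treated, key=lambda pc: pc[1]):
        if not available:
            break
        # nearest available control by absolute score distance
        best = min(available, key=lambda pc: abs(pc[1] - tscore))
        if abs(best[1] - tscore) <= caliper:
            pairs.append((tid, best[0]))
            available.remove(best)  # each control used at most once
    return pairs

# Invented patients: the TLE group skews toward higher propensity scores.
tle = [("T1", 0.62), ("T2", 0.74), ("T3", 0.91)]
no_tle = [("C1", 0.60), ("C2", 0.33), ("C3", 0.71), ("C4", 0.12)]

print(greedy_match(tle, no_tle))
# → [('T1', 'C1'), ('T2', 'C3')]; T3 (0.91) has no control within the caliper
```

Note that even perfect score matching only balances what went into the score; nothing in this machinery touches the unmeasured confounders that drive the treatment decision.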

Real World Evidence vs RCT

Speaking of causal inference, the Journal of the American Medical Association has published an incredibly important paper. One of the most important of 2023.

This was from a group of expert epidemiologists who are part of the RCT-Duplicate initiative. Their project, which is supported by the US Food and Drug Administration (FDA), aims to better understand real-world evidence studies.

Specifically, these authors tried to emulate RCT designs using insurance databases. They did this by identifying and implementing observational analogues of the RCT design elements: population, intervention, comparator, outcome, and time (PICOT).

  • Population (say, patients with HF)

  • Intervention (say, a drug)

  • Comparator (say, placebo or standard of care)

  • Outcome (say, mortality)

  • Time frame

Then they applied confounding adjustments to the non-random comparisons. Much as an RCT does, these calculations output an effect size, usually an odds ratio.
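For the curious, the effect-size arithmetic behind such an output can be sketched in a few lines. This is the generic Wald odds-ratio calculation from a 2x2 table, with made-up counts, not numbers from any study discussed here.

```python
import math

# Odds ratio with a Wald 95% CI from 2x2 event counts (illustrative only).
def odds_ratio_ci(a, b, c, d, z=1.96):
    """a/b = events/non-events in treated; c/d = events/non-events in control."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se_log_or)
    hi = math.exp(math.log(or_) + z * se_log_or)
    return or_, lo, hi

# Hypothetical counts: 40/460 events in treated vs 60/440 in control
or_, lo, hi = odds_ratio_ci(40, 460, 60, 440)
print(f"OR {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

The adjustment machinery in a real emulation is far more elaborate, but the end product is this kind of point estimate with a confidence interval, which is what makes the trial-vs-emulation comparison possible.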

The question is, how good are these estimates? So, for the final and ultimate comparison, they compared the outcomes of these emulated trials to the actual RCTs.

You can see the massive implications of such research, right? I see two right off the bat.

  • One is that we know that RCTs are the best way to make causal inferences about a therapy and an outcome. Randomization balances confounding variables, known and unknown. Both groups in RCTs start at the same time. Outcomes are adjudicated.

  • But even when the internal validity of an RCT is impeccable, the problem comes in applying that evidence outside the special confines of a trial to your patient in the real world.

  • The first implication, therefore, is the need to test the external validity of trial results. How does treatment A work outside the trial?

  • The second implication is that RCTs are hard to do and expensive, and some (perhaps most) things we do in medicine are not amenable to an RCT. So, if we could extract data from an insurance database, for instance, and emulate a trial, that might inform care.

Here is what these researchers did. They selected 30 completed RCTs and two ongoing trials to attempt emulation from real world data.

Selection is important. The authors emphasize that these were highly selective choices. There had to be similar study parameters in the insurance databases, and the emulation had to be feasible in that the PICOT elements were all available in the databases. Many, but not all, of the trials were cardiac.

The emulations stemmed from two insurance databases and Medicare. The authors have previously outlined how they did the emulation. Suffice it to say, this was 100 times more involved than propensity matching two groups of patients.

  • They extracted the study parameters (PICOT)

  • They then checked for feasibility (power, balance of the groups), and if it was not feasible to emulate a trial, they did not try.

  • They then documented primary and secondary analyses, including looking at control outcomes. That’s important because you’d want the control outcomes in the real world data to be similar to those in the trial.

  • They then registered their protocol on ClinicalTrials.gov.

  • Then they did the analysis and compared results of the real world evidence with the RCT results.

Imagine pairs. One part of each pair was the effect size and CI from the RCT; the other part was the effect size and CI from the emulation.

They assessed the emulations in four ways. One was an overall correlation, taking all the studies together. The pairs were then assessed for: 1) full statistical significance agreement, if the estimates of both, including the CIs, were on the same side of the null; 2) estimate agreement, if the real world evidence estimate fell within the 95% CI of the trial; and 3) standardized difference agreement, defined by less than a two standard deviation difference between estimates.
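To make the three agreement checks concrete, here is a small sketch applying them to one hypothetical (RCT, emulation) pair. The estimates and CIs are invented, and the standard errors are back-calculated from each 95% CI, a common simplifying assumption.

```python
import math

def agreement(rct, rwe, z=1.96):
    """rct, rwe: (point_estimate, ci_lo, ci_hi) on the odds-ratio scale."""
    def log_parts(est):
        point, lo, hi = (math.log(x) for x in est)
        return point, (hi - lo) / (2 * z)   # log point estimate, SE from the CI

    def side(est):
        # which side of the null (OR = 1) the CI sits on
        return "below" if est[2] < 1 else "above" if est[1] > 1 else "crosses"

    p1, se1 = log_parts(rct)
    p2, se2 = log_parts(rwe)

    sig_agree = side(rct) == side(rwe)                 # 1) same side of null
    est_agree = rct[1] <= rwe[0] <= rct[2]             # 2) RWE point inside RCT CI
    std_diff = abs(p1 - p2) / math.sqrt(se1**2 + se2**2)
    std_agree = std_diff < z                           # 3) < ~2 SD apart
    return sig_agree, est_agree, std_agree

# Invented pair loosely shaped like the PARADIGM-HF discordance:
# the trial shows benefit, the emulation shows a null effect.
print(agreement((0.80, 0.71, 0.89), (1.00, 0.88, 1.14)))
# → (False, False, False): all three checks fail for this pair
```

Note how the checks get progressively stricter in different ways: two estimates can agree on statistical significance while being numerically far apart, which is why the authors report all three.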

And the results:

  • The overall agreement had a correlation of 0.82 (0.64-0.91).

  • 75% of real world evidence emulations met statistical significance (but only 56% full statistical agreement, 19% partial).

  • 66% of real world evidence emulations met estimate agreements.

  • 75% of real world evidence emulations met standardized difference agreements.

Pretty good, right?

Well, then the authors did a post-hoc analysis wherein they looked at the pairs that had closer emulation to the trial. In other words, the data in the database more closely approximated the trial. Here it was better.

  • 94% of emulations met statistical significance;

  • 88% met estimate agreements and standardized difference agreement.

However, the flip side of these good results was that in the 16 emulations that were not close to the trial, the concordance was quite weak; only 50% of pairs were similar.

One barrier: 10 trials used placebo, but of course, there are no placebos in real world evidence. So, they emulated placebo with “new use of an active comparator that was strongly expected to have no effect.”

Other problematic issues with emulation included dose titration during follow-up, run-in window, drugs that were initiated in hospitals, delayed effects over longer follow-up, and of course confounding.

Comments. I realize some of this borders on technical, but the issue of how we know things and with what certainty has to be one of the most important things in all of medicine. Perhaps in all of science.

Healthcare data is ubiquitous, especially now with electronic health records. RCTs are expensive and hard to do. The idea of being able to harness all that data to increase our knowledge is enticing.

I find this report nuanced but sobering. Here is my thinking.

This is best-case scenario. These are clearly top people in causal inference. Their methods are rigorous, well documented, pre-registered, and they only attempted trial emulations when it was feasible. They carefully selected the trials to compare. They note that many trials simply cannot be emulated with real world evidence because the data isn’t similar enough.

And yet, in this best-case scenario, full emulation of the RCTs was modest. Let’s say it’s about 75%. That’s good, but the flip side is that 1 in 4 carefully done trial emulations, from top people, using rigorous methods, did not emulate the RCT.

One example stands out: PARADIGM-HF, a trial they chose to emulate. Recall that PARADIGM-HF compared an ARNI (sacubitril/valsartan) with enalapril in patients with HF with reduced ejection fraction. Everyone knows the outcome: sacubitril/valsartan was much better, and the drug is now established as first-line therapy. There is controversy surrounding this approval because the FDA often requires two trials, and there was the issue of enalapril being a soft comparator.

But the real world emulation came up with an OR of 1.0. No effect. Pause a moment. Let’s consider why PARADIGM doesn’t emulate.

  • One take is that the trial is correct and the real world emulation is wrong. Purists will argue this way. I suspect some of my HF colleagues would say, “Come on, Mandrola, the RCT trumps any trial emulation.” And maybe that is true. In support of this view are the overall results of the emulation paper, which found that full statistical agreement was nearly a coin toss.

  • Another possibility to explain this discordance is that PARADIGM-HF was flawed and the null real world effect is correct.

  • Here the argument is, as the editorialist suggests, that RCT evidence is just one component of causation. To confirm causal effects in patients, you’d like confirmation (external validity) in real world evidence.

  • The discordant effects suggest that real world trial emulation of sacubitril/valsartan does not complete what he calls the “complete causal mechanism.” In other words, if sacubitril/valsartan were truly beneficial, it would be positive in both the trial and the real world.

  • But here is the core problem: I don’t know what to believe. Is the real world emulation that finds no effect correct, or is the trial correct, or are both correct, and the trial enrolled such special patients under special circumstances that it isn’t duplicated?

I was struck by this comment in the editorial: “After all, emulation of a single trial requires hundreds of subjective decisions.” And, of course, so does doing an RCT.

I am left with Dr David Cohen’s conclusion on observational studies: some observational studies are correct; I just don’t know which ones.

Another caveat: it is absolutely true that many RCTs do not replicate. Rheumatologist Mike Putnam cited two RCTs of a drug for systemic lupus that were almost identical trials: same primary endpoint, same kitschy acronym, and similar methods, published the same day in the Lancet. One was significant, one null.

And I will cite a JAMA paper from a Stanford group finding that re-analysis of trials, often by the same authors, fails to replicate 1 in 3 times.

  • To me, this means that when there are small effect sizes, which describes most of modern-day cardiology, we should have low confidence in benefits.

  • And my final statement here concerns the notion that this sort of good, but imperfect, matching of real world data and trials shows that we may be able to use these techniques in situations where there is a) no RCT data, or b) a question not amenable to RCTs.

  • I strongly oppose this notion, and this is why: here are experts using the highest-level techniques, and while they often matched the trials (that is, they were correct), they were mismatched often enough to be worrisome.

  • To make matters worse, if there was no RCT data in an area, how would we even know what to think about the emulation? Was it like those that matched trials or was it like the PARADIGM-HF emulation?

Being wrong about causation can lead to disastrous consequences. The downside risk is enormous. Doctors can be easily swayed. Therapeutic fashions, once established, are hard to break. Think antiarrhythmic drugs after myocardial infarction, hormone replacement therapy for cardiovascular protection in women, etc. I would also ask my colleagues to consider, what if they are wrong about left atrial appendage occlusion? We’ve done 100,000 cases in the absence of RCT level data.

Finally, consider this scenario: imagine an observational study that is now called an emulation study, and it finds a benefit of a costly drug or device. Editorialists, key opinion leaders, and professional societies, along with huge marketing forces, could then be set into motion, because this was no readmissions database study with propensity matching; it was a trial emulation!

But in the absence of RCT evidence, we would have no idea whether the emulation study was correct, as in the case of the PARADIGM-HF example.

My thoughts on knowledge remain unchanged.

  • There is a heck of a lot more equipoise than we think.

  • We need to randomize more often.

  • Trial emulation efforts should continue because this is a decent start.

  • But I would place more emphasis on creating networks wherein trials are more feasible to do.

Please do read this paper. Let me know what you think.

Multi-Morbidity Effects in Trials

Another feature you may like, which is related to the translation of evidence, is a discussion I had with my friend Andrew Foy about applying trial results to patients.

One of Andrew’s areas of research involves the way multimorbidity interacts with treatment effects. The novel idea he and his group have is to create a co-morbidity score using the Charlson Comorbidity Index (CCI), which is a way to quantify total co-morbidity. This is different from the typical subgroup analysis you see in papers, which considers the effect of, say, diabetes or chronic kidney disease. But it’s hard to apply that to patients in the clinic.

You can listen or read the transcript.

