
Thursday, September 8, 2016

Hiding the Evidence in Plain Sight: One-sided Confidence Intervals and Noninferiority Trials

In the last post, I linked a video podcast of me explaining non-inferiority trials and their inherent biases.  In this videocast, I revisit noninferiority trials and the use of one-sided confidence intervals.  I review the Salminen et al noninferiority trial of antibiotics versus appendectomy for the treatment of acute appendicitis in adults.  This trial uses a very large delta of 24%.  The criteria for non-inferiority were not met even with this promiscuous delta.  But the use of a 1-sided 95% confidence interval concealed a more damning revelation in the data.  Watch the 13-minute videocast to learn what was hidden in plain sight!
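For the arithmetically inclined, here is a minimal sketch of the one-sided-versus-two-sided issue, assuming illustrative success rates and arm sizes roughly in the spirit of the Salminen trial (the per-arm count is my assumption for illustration, not the published one):

```python
from math import sqrt

# Illustrative numbers only, roughly in the spirit of Salminen et al:
# success ~72.7% with antibiotics vs ~99.6% with appendectomy, and a
# hypothetical ~256 evaluable patients per arm (assumed, not published).
p_abx, p_surg, n = 0.727, 0.996, 256
diff = p_abx - p_surg                                   # absolute risk difference
se = sqrt(p_abx*(1 - p_abx)/n + p_surg*(1 - p_surg)/n)  # Wald standard error

margin = -0.24   # the pre-specified delta of 24 percentage points

# One-sided 95% CI: only a lower bound is reported (z = 1.645)
lower_1s = diff - 1.645*se
print(f"ARD = {diff:.3f}, one-sided 95% lower bound = {lower_1s:.3f}")
print("Non-inferior?", lower_1s > margin)   # False - the bound crosses -0.24

# The two-sided 95% CI (z = 1.96) shows what the one-sided interval hides:
lo, hi = diff - 1.96*se, diff + 1.96*se
print(f"Two-sided 95% CI = ({lo:.3f}, {hi:.3f})")
print("Excludes an ARD of zero (significantly WORSE)?", hi < 0)   # True
```

The one-sided interval invites you to compare a single lower bound against delta; the two-sided interval computed from the very same data also has an upper bound, and where that upper bound sits relative to an ARD of zero is the evidence hiding in plain sight.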

Erratum:  at 1:36 I say "excludes an absolute risk difference of 1" and I meant to say "excludes an absolute risk difference of ZERO."  Similarly, at 1:42 I say "you can declare non-inferiority".  Well, that's true - you can declare noninferiority if your entire 95% confidence interval falls to the left of an ARD of 0 or an HR of 1 - but what I meant to say is that if that is the case, "you can declare superiority."

Also, at 7:29, I struggle to remember the numbers (woe is my memory!) and I place the point estimate of the difference, 0.27, to the right of the delta dashed line at .24.  This was a mistake which I correct a few minutes later at 10:44 in the video.  Do not let it confuse you: the 0.27 point estimates were just drawn slightly to the right of delta when they should have been marked slightly to the left of it.  I would re-record the video (labor intensive) or edit it, but I'm a novice with this technological stuff, so please do forgive me.

Finally, at 13:25 I say "within which you can hide evidence of non-inferiority" and I meant "within which you can hide evidence of inferiority."

Again, I apologize for these gaffes.  My struggle (and I think about this stuff a lot) in speaking about and accurately describing these confidence intervals and the conclusions that derive from them results from the arbitrariness of the CONSORT "rules" about interpretation and the arbitrariness of the valences (some articles use a negative valence for differences favoring "new"; some journals use a positive valence to favor "new").  If I struggle with it, many other readers, I'm sure, also struggle in keeping things straight.  This is fodder for the argument that these "rules" ought to be changed and made more uniform, for equity and ease of understanding and interpretation of non-inferiority trials.

It made me feel better to see this diagram in Annals of Internal Medicine (Perkins et al, July 3, 2012, online ACLS training) where they incorrectly place the point estimate at slightly less than -6% (to the left of the dashed delta line in Figure 2), when it should have been placed slightly greater than -6% (to the right of the dashed delta line).  Clicking on the image will enlarge it.






Saturday, June 11, 2016

Non-inferiority Trials Are Inherently Biased: Here's Why

Debut VideoCast for the Medical Evidence Blog, explaining non-inferiority trial design and exposing its inherent biases:

In this related blog post, you can find links to the CONSORT statement in the Dec 26, 2012 issue of JAMA and a link to my letter to the editor.

Addendum:  I should have included this in the video.  See the picture below.  In the first example, top left, the entire 95% CI favoring "new" therapy lies in the "zone of indifference," that is, the pre-specified margin of superiority, a mirror image of the pre-specified margin of noninferiority, in this case delta = +/- 0.15.  Next down, the majority of the 95% CI of the point estimate favoring "new" therapy lies in the "margin of superiority" - so even though the lower end of the 95% CI crosses "mirror delta," the best guess is that the effect of therapy falls in the zone of indifference.  In the lowest example, labeled "Truly Superior," the entire 95% confidence interval falls to the left of "mirror delta," thus reasonably excluding all point estimates in the "zone of indifference" (i.e., +/- delta) and all point estimates favoring the "old" therapy.  This would, in my mind, represent "true superiority" in a logical, rational, and symmetrical way that would be very difficult to mount arguments against.
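Here is a minimal sketch of that symmetric classification scheme, assuming the +/- 0.15 delta from the picture and the convention that negative differences favor "new" (the function and the example intervals are mine, purely for illustration, not any published standard):

```python
def classify(ci_lo, ci_hi, delta=0.15):
    """Symmetric reading of a 95% CI for (new minus old), where negative
    differences favor "new"; +/- delta bounds the zone of indifference.
    My own sketch of the scheme in the picture, not a published standard."""
    if ci_hi < -delta:
        return "truly superior: entire CI beyond mirror delta"
    if -delta <= ci_lo and ci_hi <= delta:
        return "zone of indifference: practical equivalence"
    if ci_lo > delta:
        return "truly inferior: entire CI beyond delta"
    if ci_hi <= delta:
        return "non-inferior, but symmetric superiority NOT established"
    return "inconclusive: CI spans the whole zone"

# The three examples from the picture, top to bottom (CIs are made up):
print(classify(-0.12, -0.02))   # whole CI inside the zone of indifference
print(classify(-0.20, -0.02))   # lower end crosses mirror delta
print(classify(-0.35, -0.18))   # whole CI beyond mirror delta: truly superior
```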


Added 9/20/16:  For those who question my assertion that the designation of "New" versus "Old" or "comparator" therapy is arbitrary, here is the proof:  In this trial, the "New" therapy is DMARDs and the comparator is anti-tumour necrosis factor agents for the treatment of rheumatoid arthritis.  The rationale for this trial is that the chronologically newer anti-TNF agents are very costly, and the authors wanted to see if similar improvements in quality of life could be obtained with the chronologically older DMARDs.  So what is "new" is certainly in the eye of the beholder.  Imagine colistin 50 years ago, being tested against, say, a newer, broader-spectrum penicillin.  The penicillin would have been found to be non-inferior, but with a superior side effect profile.  Fast forward 50 years, and now colistin could be the "new" resurrected agent and be tested against what 10 years ago was the standard penicillin but is now "old" because of the development of resistance.  Clearly, "new" and "old" are arbitrary and flexible designations.

Monday, May 20, 2013

It All Hinges on the Premises: Prophylactic Platelet Transfusion in Hematologic Malignancy


A quick update before I proceed with the current post:  The Institute of Medicine has met and they agree with me that sodium restriction is for the birds.  (Click here for a New York Times summary article.)  In other news, the oh-so-natural Omega-3 fatty acid panacea did not improve cardiovascular outcomes as reported in the NEJM on May 9th, 2013.

An article by the TOPPS investigators in the May 9th NEJM is very useful to remind us not to believe everything we read, to always check our premises, and that some data are so dependent on the perspective from which they're interpreted or the method or stipulations of analysis that they can be used to support just about any viewpoint.

The authors sought to determine if a strategy of withholding prophylactic platelet transfusions for platelet counts below 10,000 per microliter in patients with hematologic malignancy was non-inferior to giving prophylactic platelet transfusions.  I like this idea, because I like "less is more" and I think the body is basically antifragile.  But non-inferior how?  And what do we mean by non-inferior in this trial?

Friday, April 19, 2013

David versus Goliath on the Battlefield of Non-inferiority: Strangeness is in the Eye of the Beholder

In this week's JAMA is my letter to the editor about the CONSORT statement revision for the reporting of non-inferiority trials, and the authors' responses.  I'll leave it to interested readers to view for themselves the revised CONSORT statement, and the letter and response.

In sum, my main argument is that Figure 1 in the article is asymmetric, such that inferiority is stochastically less likely than superiority and an advantage is therefore conferred to the "new" [preferred; proprietary; profitable; promulgated] treatment in a non-inferiority trial.  Thus the standards for interpretation of non-inferiority trials are inherently biased.  There is no way around this, save for revising the standards.

The authors of CONSORT say that my proposed solution is "strange" because it would require revision of the standards of interpretation for superiority trials as well.  For me it is "strange" that we would endorse asymmetric and biased standards of interpretation in any trial.  The compromise solution, as I suggested in my letter, is that we force different standards for superiority only in the context of a non-inferiority trial.  Thus, superiority trial interpretation standards remain untouched.  It is only if you start with a non-inferiority trial that you have a higher hurdle to claiming superiority, one that is contingent on evidence of non-inferiority in the trial that you designed.  This would disincentivise the conduct of non-inferiority trials for a treatment that you hope/think/want to be superior.  In the current interpretation scheme, it's a no-brainer - conduct a non-inferiority trial and pass the low hurdle for non-inferiority, and then if you happen to be superior too, BONUS!

In my proposed scheme, there is no bonus superiority that comes with a lower hurdle than inferiority.  As I said in the last sentence, "investigators seeking to demonstrate superiority should design a superiority trial."  Then, there is no minimal clinically important difference (MCID) hurdle that must be cleared, and a statistical difference favoring new therapy by any margin lets you declare superiority.  But if you fail to clear that low(er) hurdle, you can't go back and declare non-inferiority.  

Which leads me to something that the word limit of the letter did not allow me to express:  we don't let unsuccessful superiority trials test for non-inferiority contingently, so why do we let successful non-inferiority trials test for superiority contingently?

Symmetry is beautiful;  Strangeness is in the eye of the beholder.

(See also:  Dabigatran and Gefitinib especially the figures, analogs of Figure 1 of Piaggio et al, on this blog.)

Monday, September 21, 2009

The unreliable asymmetric design of the RE-LY trial of dabigatran: Heads I win, tails you lose



I'm growing weary of this. I hope it stops. We can adapt the diagram of non-inferiority shenanigans from the Gefitinib trial (see http://medicalevidence.blogspot.com/2009/09/theres-no-such-thing-as-free-lunch.html ) to last week's trial of dabigatran, which came on the scene of the NEJM with another ridiculously designed non-inferiority trial (see http://content.nejm.org/cgi/content/short/361/12/1139 ). Here we go again.

These jokers, lulled by the corporate siren song of Boehringer Ingelheim, had the utter unmitigated gall to declare a delta of 1.46 (relative risk) as the margin of non-inferiority! Unbelievable! To say that a 46% difference in the rate of stroke or arterial clot is clinically non-significant! Seriously!?

They justified this felonious choice on the basis of trials comparing warfarin to PLACEBO as analyzed in a 10-year-old meta-analysis. It is obvious (or should be to the sentient) that an ex-post difference between a therapy and placebo in superiority trials does not apply to non-inferiority trials of two active agents. Any ex-post finding could be simply fortuitously large and may have nothing to do with the MCID (minimal clinically important difference) that is SUPPOSED to guide the choice of delta in a non-inferiority trial (NIT). That warfarin SMOKED placebo in terms of stroke prevention does NOT mean that something that does not SMOKE warfarin is non-inferior to warfarin. This kind of duplicitous justification is surely not what the CONSORT authors had in mind when they recommended a referenced justification for delta.

That aside, on to the study and the figure. First, we're testing two doses, so there are multiple comparisons, but we'll let that slide for our purposes. Look at the point estimate and 95% CI for the 110 mg dose in the figure (let's bracket the fact that they used one-sided 97.5% CIs - it's immaterial to this discussion). There is a non-statistically significant difference between dabigatran and warfarin for this dose, with a P-value of 0.34. But note that in Table 2 of the article, they declare that the P-value for "non-inferiority" is <0.001 [I've never even seen this done before, and I will have to look to see if we can find a precedent for reporting a P-value for "non-inferiority"]. Well, apparently this just means that the RR point estimate for 110 mg versus warfarin is statistically significantly different from a RR of 1.46. It does NOT mean that the comparison between the two drugs on stroke and arterial clot is highly clinically significant, though that is what it misleadingly suggests. This "P-value for non-inferiority" is just an artificial comparison: had we set the margin of non-inferiority at an even more ridiculously high value, we could have made the "P-value for non-inferiority" as small as we like, just by inflating the margin of non-inferiority! So this is a useless number, unless your goal is to create an artificial and exaggerated impression of the difference between these two agents.
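For the curious, here is roughly how such a "P-value for non-inferiority" is computed, and why it is so easily gamed. This is a hedged reconstruction assuming the RR of 0.91 with a 95% CI of 0.74 to 1.11 that I read in Table 2 for the 110 mg dose; the calculation itself is standard normal-approximation arithmetic on the log scale:

```python
from math import erf, log, sqrt

def phi(z):   # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

# The 110 mg comparison as I read Table 2: RR 0.91, 95% CI 0.74 to 1.11.
rr, lo, hi = 0.91, 0.74, 1.11
se = (log(hi) - log(lo)) / (2 * 1.96)   # back out the SE on the log scale

def p_noninferiority(margin):
    """One-sided p-value for H0: true RR >= margin vs H1: true RR < margin."""
    z = (log(rr) - log(margin)) / se
    return phi(z)

for margin in (1.1, 1.46, 2.0):
    print(f"margin {margin}: P for 'non-inferiority' = {p_noninferiority(margin):.1e}")
# margin 1.1 gives p ~0.03; the chosen 1.46 gives p ~2e-6 (the reported
# <0.001); a margin of 2.0 would give p ~1e-14. Inflate the margin, shrink
# the 'non-inferiority p-value' - without the drugs differing one whit more.
```

In other words, the <0.001 tells you only that 0.91 is statistically far from 1.46; choose a sillier margin and you can make it arbitrarily small.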

Now let's look at the 150 mg dose. Indeed, it is statistically significantly different from warfarin (I shall resist using the term "superior" here), and thus the authors claim superiority. But here again, the 95% CI is narrower than the margin of non-inferiority, and had the results gone the other direction, as in Scenarios 3 and 4 (in favor of warfarin), we would have still claimed non-inferiority, even though warfarin would have been statistically significantly "better than" dabigatran! So it is unfair to claim superiority on the basis of a statistically significant result favoring dabigatran, but that's what they do. This is the problem that is likely to crop up when you make your margin of non-inferiority excessively wide, which you are wont to do if you wish to stack the deck in favor of your therapy.

But here's the real rub. Imagine if the world were the mirror image of what it is now and dabigatran were the existing agent for prevention of stroke in A-fib, and warfarin were the new kid on the block. If the makers of warfarin had designed this trial AND GOTTEN THE EXACT SAME DATA, they would have said (look at the left of the figure and the dashed red line there) that warfarin is non-inferior to the 110 mg dose of dabigatran, but that it was not non-inferior to the 150 mg dose of dabigatran. They would NOT have claimed that dabigatran was superior to warfarin, nor that warfarin was inferior to dabigatran, because the 95% CI of the difference between warfarin and dabigatran 150 mg crosses the pre-specified margin of non-inferiority. And to claim superiority of dabigatran, the 95% CI of the difference would have to fall all the way to the left of the dashed red line on the left. (See Piaggio, JAMA, 2006.)

The claims that result from a given dataset should not depend on who designs the trial, and which way the asymmetry of interpretation goes. But as long as we allow asymmetry in the interpretation of data, they shall. Heads they win, tails we lose.

Tuesday, September 15, 2009

Plavix (clopidogrel), step aside, and prasugrel (Effient), watch your back: Ticagrelor proves that some "me-too" drugs are truly superior

Another breakthrough is reported in last week's NEJM: http://content.nejm.org/cgi/content/abstract/361/11/1045 . Wallentin et al report the results of the PLATO trial showing that ticagrelor, a new reversible inhibitor of P2Y12, is superior to Plavix in just about every imaginable way. Moreover, when you compare the results of this trial to the trial of prasugrel (Effient, recently approved, about which I blogged here: http://medicalevidence.blogspot.com/2007/11/plavix-defeated-prasugrel-is-superior.html ), it appears that ticagrelor is going to be preferable to prasugrel in at least 2 ways: 1.) a larger population can benefit (AMI versus just patients undergoing PCI); and 2.) less bleeding, which may be a result of reversible rather than irreversible inhibition of P2Y12.

I will rarely be using either of these drugs or Plavix because I rarely treat AMI or patients undergoing PCI. My interest in this trial and that of prasugrel stems from the fact that in the cases of these two agents, the sponsoring company indeed succeeded in making what is in essence a "me-too" drug that is superior to an earlier-to-market agent(s). They did not monkey around with this non-inferiority trial crap like anidulafungin and gefitinib and just about every antihypertensive that has come to market in the past 10 years; they actually took Plavix to task and beat it, legitimately. For this, and for the sheer size of the trial and its superb design, they deserve to be commended.


One take-home message here, and from other posts on this blog, is "beware the non-inferiority trial". There are a number of reasons that a company will choose to do a non-inferiority trial (NIT) rather than a superiority trial. First, as in the last post (http://medicalevidence.blogspot.com/2009/09/theres-no-such-thing-as-free-lunch.html ), running a NIT often allows you to have your cake and eat it too - you can make it easy to claim non-inferiority (wide delta) AND make the criterion for superiority (of your agent) more lenient than the inferiority criterion, a conspicuous asymmetry that just ruffles my feathers again and again. Second, you don't run the risk of people saying after the fact "that stuff doesn't work," even though absence of evidence does not constitute evidence of absence. Third, you have great latitude with delta in a NIT, and that's appealing from a sample size standpoint. Fourth, you don't actually have to have a better product - a better product might not even be your goal, which is rather to get market share for an essentially identical product. Fifth, maybe you can't recruit enough patients to do a superiority trial.

The ticagrelor trial recruited over 18,000 patients. You can look at this in two ways. One is that the difference they're trying to demonstrate is quite small, so what does it matter to you? (If you take this view, you should be especially dismissive of NITs, since they're not trying to show any difference at all.) The other is that if you can recruit 18,000 patients into a trial, even a multinational trial, the problem that is being treated must be quite prevalent, and thus the opportunity for impact from a superior treatment, even one with a small advantage, is much greater. It is much easier and more likely, in a given period of time, to treat 50 acute MIs and save a life with ticagrelor (compared to Plavix - NNT = 1/0.02 = 50) than it is to find 8 patients with invasive candidiasis and treat them with anidulafungin (compared to fluconazole; 1/0.12 ≈ 8; see Reboli et al: http://content.nejm.org/cgi/reprint/356/24/2472.pdf ) - and in the latter case, you're not saving a life but rather just preventing a treatment failure. Thus, compared to anidulafungin, with its limited scope of application and limited impact, a drug like ticagrelor has much more public health impact. You should simply pay more attention to larger trials; there's more likely to be something important going on there. By inference, the conditions they are treating are likely to be a "bigger deal".
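The back-of-the-envelope numbers in that comparison, spelled out (the absolute risk reductions are the ones quoted above; the yearly encounter rates are hypothetical, just to make the public-health point):

```python
# Back-of-envelope NNTs from the text above: NNT = 1 / absolute risk reduction.
arr_ticagrelor    = 0.02   # ~2% absolute mortality reduction vs clopidogrel
arr_anidulafungin = 0.12   # ~12% absolute reduction in treatment failure

nnt_tic = 1 / arr_ticagrelor      # 50 AMI patients treated to save one life
nnt_ani = 1 / arr_anidulafungin   # ~8 candidiasis patients per failure averted
print(f"ticagrelor NNT ~ {nnt_tic:.0f}; anidulafungin NNT ~ {nnt_ani:.0f}")

# The public-health point: impact depends on how many eligible patients you
# can actually find, not on NNT alone. Encounter rates here are hypothetical.
ami_per_year, candidiasis_per_year = 500, 10
print("lives saved per year:", ami_per_year / nnt_tic)              # 10.0
print("failures averted per year:", candidiasis_per_year / nnt_ani) # ~1.2
```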

Of course, perhaps I'm giving the industry too much credit in the cases of prasugrel and ticagrelor. Did they really have much of a choice? Probably not. Generally, when you do a non-inferiority trial, you try to show non-inferiority and also something like preferable dosing schedules, reduced cost or side effects. That way, when the trial is done (if you have shown non-inferiority), you can say, "yeah, they have basically the same effect on xyz, but my drug has better [side effects, dosing, etc.]". Because of the enhanced potency of prasugrel and ticagrelor, they knew there would be more bleeding and that this would cause alarm. So they needed to show improved mortality (or similar) to show that that bleeding cost is worth paying. Regardless, it is refreshing to see that the industry is indeed designing drugs with demonstrable benefits over existing agents. I am highly confident that the FDA will find ticagrelor to be approvable, and I wager that it will quickly supplant prasugrel. I also wager that when clopidogrel goes generic (soon), it will be a boon for patients who can know that they are sacrificing very little (2% efficacy compared to ticagrelor or prasugrel) for a large cost savings. For most people, this trade-off will be well worth it. For those fortunate enough to have insurance or another way of paying for ticagrelor, more power to them.

Sunday, September 6, 2009

There's no such thing as a free lunch - unless you're running a non-inferiority trial. Gefitinib for pulmonary adenocarcinoma


A 20% difference in some outcome is either clinically relevant, or it is not. If A is worse than B by 19% and that's NOT clinically relevant and significant, then A being better than B by 19% must also NOT be clinically relevant and significant. But that is not how the authors of trials such as this one see it: http://content.nejm.org/cgi/content/short/361/10/947 . According to Mok and co-conspirators, if gefitinib is no worse in regard to progression free survival than Carboplatin-Paclitaxel based on a 95% confidence interval that does not include 20% (that is, it may be up to 19.9% worse, but no more), then they call the battle a draw and say that the two competitors are equally efficacious. However, if the trend is in the other direction, that is, in favor of gefitinib BY ANY AMOUNT HOWEVER SMALL (as long as it's statistically significant), they declare gefitinib the victor and call it a day. It is only because of widespread lack of familiarity with non-inferiority methods that they can get away with a free lunch like this. A 19% difference is either significant, or it is not. I have commented on this before, and it should come as no surprise that these trials are usually used to test proprietary agents (http://content.nejm.org/cgi/content/extract/357/13/1347 ). Note also that in trials of adult critical illness, the most commonly sought mortality benefit is about 10% (more data on this forthcoming in an article soon to be submitted and hopefully published). So it's a difficult argument to sustain, that something is "non-inferior" if it is less than 20% worse than something else. Designers of critical care trials will tell you that a 10% difference, often much less, is clinically significant.

I have created a figure to demonstrate the important nuances of non-inferiority trials using the gefitinib trial as an example. (I have adapted this from the Piaggio 2006 JAMA article on the CONSORT statement for the reporting of non-inferiority trials - a statement that has been largely ignored: http://jama.ama-assn.org/cgi/content/abstract/295/10/1152?lookupType=volpage&vol=295&fp=1152&view=short .) The authors specified delta, or the margin of non-inferiority, to be 20%. I have already made it clear that I don't buy this, but we needn't challenge this value to make war with their conclusions, although challenging it is certainly worthwhile, even if it is not my current focus. This 20% delta corresponds to a hazard ratio of 1.2, as seen in the figure demarcated by a dashed red line on the right. If the hazard ratio (for progression or death) demonstrated by the data in the trial were 1.2, that would mean that gefitinib is 20% worse than comparator. The purpose of a non-inferiority trial is to EXCLUDE a difference as large as delta, the pre-specified margin of non-inferiority. So, to demonstrate non-inferiority, the authors must show that the 95% confidence interval for the hazard ratio falls entirely to the left of that dashed red line at an HR of 1.2. They certainly achieved this goal. Their data, represented by the lowermost point estimate and 95% CI, fall entirely to the left of the pre-specified margin of non-inferiority (the right red dashed line). I have no arguments with this. Accepting ANY margin of non-inferiority (delta), gefitinib is non-inferior to the comparator. What I take exception to is the conclusion that gefitinib is SUPERIOR to comparator, a conclusion that is predicated in part on the chosen delta, to which we are beholden as we make such conclusions.
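To make the asymmetry concrete, here is a minimal sketch of the symmetric reading argued for here, assuming the trial's approximate result of HR 0.74 (95% CI about 0.65 to 0.85, as I read the paper) and the figure's margin of 1.2 with its mirror image at 0.8 (the function and thresholds are my own illustration, not any published standard):

```python
def verdict(ci_lo, ci_hi):
    """Symmetric interpretation sketch: non-inferiority requires the whole
    CI below delta (HR 1.2); superiority should require the whole CI below
    the mirror image of delta (HR 0.8, per the figure). My illustration."""
    delta, mirror = 1.2, 0.8
    out = []
    if ci_hi < delta:
        out.append("non-inferior: CI entirely below 1.2")
    if ci_hi < mirror:
        out.append("superior by the symmetric standard: CI entirely below 0.8")
    elif ci_hi < 1.0:
        out.append("statistically 'better', but the CI overlaps the zone of "
                   "indifference (0.8 to 1.2) - symmetric superiority NOT met")
    return out or ["inconclusive"]

# The observed result, approximately, as I read the paper: HR 0.74 (0.65-0.85)
print(verdict(0.65, 0.85))
```

By this symmetric standard the observed CI, which crosses 0.8, earns non-inferiority but not superiority - exactly the point made with Scenario 5.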

First, let's look at [hypothetical] Scenario 1. Because the chosen delta was 20% wide (and that's pretty wide - coincidentally, that's the exact width of the confidence interval of the observed data), it is entirely possible that the point estimate could have fallen as pictured for Scenario 1 with the entire CI between an HR of 1 and 1.2, the pre-specified margin of non-inferiority. This creates the highly uncomfortable situation in which the criterion for non-inferiority is fulfilled, AND the comparator is statistically significantly better than gefitinib!!! This could have happened! And it's more likely to happen the larger you make delta. The lesson here is that the wider you make delta, the more dubious your conclusions are. Deltas of 20% in a non-inferiority trial are ludicrous.

Now let's look at Scenarios 2 and 3. In these hypothetical scenarios, comparator is again statistically significantly better than gefitinib, but now we cannot claim non-inferiority because the upper CI falls to the right of delta (red dashed line on the right). But because our 95% confidence interval includes values of HR less than 1.2 and our delta of 20% implies (or rather states) that we consider differences of less than 20% to be clinically irrelevant, we cannot technically claim superiority of comparator over gefitinib either. The result is dubious. While there is a statistically significant difference in the point estimate, the 95% CI contains clinically irrelevant values and we are left in limbo, groping for a situation like Scenario 4, in which comparator is clearly superior to gefitinib, and the 95% CI lies all the way to the right of the HR of 1.2.

Pretend you're in Organic Chemistry again, and visualize the mirror image (enantiomer) of scenario 4. That is what is required to show superiority of gefitinib over comparator - a point estimate whose 95% CI falls entirely beyond negative delta (-20%), that is, entirely below an HR of 0.8. The actual results come close to Scenario 5, but not quite, and therefore the authors are NOT justified in claiming superiority. To do so is to try to have a free lunch, to have their cake and eat it too.

You see, the larger you make delta, the easier it is to achieve non-inferiority. But the more likely it is also that you might find a statistically significant difference favoring comparator rather than the preferred drug, which creates a serious conundrum and paradox for you. At the very least, if you're going to make delta large, you should be bound by your honor and your allegiance to logic and science to make damned sure that to claim superiority, your 95% confidence interval falls entirely beyond negative delta. If not, shame on you. Eat your free lunch if you will, but know that the ireful brow of logic and reason is bent unfavorably upon you.


Tuesday, March 10, 2009

PCI versus CABG - Superiority is in the heart of the angina sufferer

In the current issue of the NEJM, Serruys et al describe the results of a multicenter RCT comparing PCI with CABG for severe coronary artery disease: http://content.nejm.org/cgi/content/full/360/10/961. The trial, which was designed by the [profiteering] makers of drug-coated stents, was a non-inferiority trial intended to show the non-inferiority (NOT the equivalence) of PCI (new treatment) to CABG (standard treatment). Alas, the authors appear to misunderstand the design and reporting of non-inferiority trials, and mistakenly declare CABG as superior to PCI as a result of this study. This error will be the subject of a forthcoming letter to the editor of the NEJM.

The findings of the study can be summarized as follows: compared to PCI, CABG led to a 5.6% reduction in the combined endpoint of death from any cause, stroke, myocardial infarction, or repeat revascularization (P=0.002). The caveats regarding non-inferiority trials notwithstanding, there are other reasons to call into question the interpretation that CABG is superior to PCI, and I will enumerate some of these below.

1.) The study used a ONE-SIDED 95% confidence interval - shame, shame, shame. See: http://jama.ama-assn.org/cgi/content/abstract/295/10/1152 .
2.) Table 1 is conspicuous for the absence of cost data. The post-procedural hospital stay was 6 days longer for CABG than PCI, and the procedural time was twice as long - both highly statistically and clinically significant. I recognize that it would be somewhat specious to provide means for cost because it was a multinational study and there would likely be substantial dispersion of cost among countries, but it seems like neglecting the data altogether is a glaring omission of a very important variable if we are to rationally compare these two procedures.
3.) Numbers needed to treat are mentioned in the text for variables such as death and myocardial infarction that were not individually statistically significant. This is misleading. The significance of the composite endpoint does not allow one to infer that the individual components are significant (they were not) and I don't think it's conventional to report NNTs for non-significant outcomes.
4.) Table 2 lists significant deficiencies and discrepancies in pharmacological medical management at discharge, which are inadequately explained, as mentioned by the editorialist.
5.) Table 2 also demonstrates a five-fold increase in amiodarone use and a three-fold increase in warfarin use at discharge among patients in the CABG group. I infer this to represent an increase in the rate of atrial fibrillation in the CABG patients, but because the rates are not reported, I am kept wondering.
6.) Neurocognitive functioning and the incidence of deficits (if measured), known complications of bypass, are not reported.
7.) It is mentioned in the discussion that after consent, more patients randomized to CABG compared to PCI withdrew consent, a tacit admission of the wariness of patients to submit to this more invasive procedure.

In all, what this trial does for me is to remind me to be wary of an overly-simplistic interpretation of complex data and a tendency toward dichotomous thinking - superior versus inferior, good versus bad, etc.

One interpretation of the data is that a 3.4 hour bypass surgery and 9 days in the hospital !MIGHT! save you from an extra 1.7 hour PCI and another 3 days in the hospital on top of your initial commitment of 1.7 hours of PCI and 3 days in the hospital, if you wind up requiring revascularization, the primary [only] driver of the composite endpoint. And in payment for this dubiously useful exchange, you must submit to a ~2% increase in the risk of stroke, have a cracked chest, risk surgical wound infection (the rate of which is also not reported), pay an unknown (but probably large) increased financial cost, accept a probably large increased risk of atrial fibrillation and therefore be discharged on amiodarone and coumadin with their high rates of side effects and drug-drug interactions, and coincidentally risk being discharged on inadequate medical pharmacological management.

Looked at from this perspective, one sees that beauty is truly in the eye of the beholder.

Saturday, September 15, 2007

Idraparinux, the van Gogh investigators, and clinical trials pointillism: connecting the dots shows that Idraparinux increases the risk of death

It eludes me why the NEJM continues to publish specious, industry-sponsored, negative non-inferiority trials. Perhaps they do it for my entertainment. And this past week, entertained I was indeed.

Idraparinux is yet another drug looking for an indication. Keep looking, Sanofi. Your pipeline problems will not be solved by this one.

First, let me dismiss the second article out of hand: it is not fair to test idraparinux against placebo (for the love of Joseph!) for the secondary prevention of VTE after a recent episode! (http://content.nejm.org/cgi/content/short/357/11/1105).

It is old news that one can reduce the recurrence of VTE after a recent episode by either using low intensity warfarin (http://content.nejm.org/cgi/content/abstract/348/15/1425) or by extending the duration of warfarin anticoagulation (http://content.nejm.org/cgi/content/abstract/345/3/165). Therefore, the second van Gogh study does not merit further consideration, especially given the higher rate of bleeding in this study.


Now for the first study and its omissions and distortions. It is important to bear in mind that the only outcome that cannot be associated with ascertainment bias (assuming a high follow-up rate) is mortality, AND that the ascertainment of DVT and PE is fraught with numerous difficulties and potential biases.

The Omission: Failure to report in the abstract that Idraparinux use was associated with an increased risk of death in these studies, which was significant in the PE study and which trended strongly in the DVT study. The authors attempt to explain this away by suggesting that the increased death rate was due to cancer, but of course we are not told how causes of death were ascertained (a notoriously difficult and messy task), and cancer is associated with DVT/PE, which is among the final common pathways of death from cancer. This alone, this minor factoid that Idraparinux was associated with an increased risk of death, should doom this drug and should be the main headline related to these studies.

Appropriate headline: "Idraparinux increases the risk of death in patients with PE and possibly DVT."

If we combine the deaths in the DVT and PE studies, we see that the 6-month death rates are 3.4% in the comparator (standard therapy) group and 4.5% in the idraparinux group, with an overall (chi-square) p-value of 0.035 - significant!
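For those who want to check the arithmetic, here is the pooled comparison using scipy; the 3.4% and 4.5% rates are the ones above, while the denominators are my back-calculated assumption of roughly 2,800 patients per pooled arm, used only to illustrate the comparison:

```python
from scipy.stats import chi2_contingency

# The 3.4% vs 4.5% death rates are from the papers; the denominators are an
# assumption (~2,800 patients per pooled arm), back-calculated to illustrate.
n_per_arm = 2800
deaths_std  = round(0.034 * n_per_arm)   # standard therapy arm: 95 deaths
deaths_idra = round(0.045 * n_per_arm)   # idraparinux arm: 126 deaths

table = [[deaths_std,  n_per_arm - deaths_std],
         [deaths_idra, n_per_arm - deaths_idra]]
chi2, p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")   # p ~0.03 - significant
```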

This is especially worrisome from a generalizability perspective - if this drug were approved and the distinction between DVT and PE is blurred in clinical practice as it often is, we would have no way of being confident that we're using it in a DVT patient rather than a PE patient. Who wants such a messy drug?

The Obfuscations and Distortions: Where to begin? First of all, no justification of an Odds Ratio of 2.0 as a delta for non-inferiority is given. Is twice the odds of recurrent DVT/PE insignificant? It is not. This Odds Ratio is too high. Shame.

To give credit where it is due, the investigators at least used a one-sided 0.025 alpha for the non-inferiority comparison.

Second, regarding the DVT study, many if not the majority of patients with DVT also have PE, even if it is subclinical. Given that ascertainment of events (other than death) in this study relied on symptoms and was poorly described, that patients with DVT were not routinely tested for PE in the absence of symptoms, and that the risk of death was increased with idraparinux in the PE study, one is led to an obvious hypothesis: that the trend toward an increased risk of death in the DVT study patients who received idraparinux was due to unrecognized PE in some of these patients. The first part of the conclusion in the abstract, "in patients with DVT, once weekly SQ idraparinux for 3 or 6 months had an efficacy similar to that of heparin and vitamin K antagonists," obfuscates and conceals this worrisome possibility. Many patients with DVT probably also had undiagnosed PE and might have been more likely to die given the drug's failure to prevent recurrences in the PE study. The increased risk of death in the DVT study might have been simply muted and diluted by the lower frequency of PE in the patients in the DVT study.
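A toy calculation makes this dilution hypothesis concrete; every number here is hypothetical, chosen only to show the mechanism:

```python
# Toy dilution model - all numbers are hypothetical, chosen only to show the
# mechanism: suppose idraparinux adds excess mortality only when PE is present.
excess_death_if_pe = 0.02   # assumed absolute excess mortality with PE
frac_silent_pe     = 0.30   # assumed fraction of 'DVT' patients with silent PE

excess_in_pe_study  = excess_death_if_pe                    # everyone has PE
excess_in_dvt_study = excess_death_if_pe * frac_silent_pe   # diluted 3-fold
print(f"PE study excess mortality:  {excess_in_pe_study:.1%}")
print(f"DVT study excess mortality: {excess_in_dvt_study:.1%} (a muted trend)")
```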

Then there is the annoying inability to reverse the effects of this drug, which has a very long half-life.

Scientific objectivity and patient safety mandate that this drug not receive further consideration for clinical use. Persistence with the study of this drug will most likely represent "sunk cost bias" on the part of the manufacturer. It's time to cut bait and save patients in the process.