Showing posts with label NEJM. Show all posts

Thursday, September 27, 2012

True Believers: Faith and Reason in the Adoption of Evidence

In last week's NEJM, in an editorial response to an article demonstrating that physicians, in essence, probability-adjust (a la Expected Utility Theory) the likelihood that data are true based on the funding source of a study, Editor-in-Chief Jeffrey M. Drazen implored the journal's readership to "believe the data." Unfortunately, he did not answer the obvious question: which data? A perusal of the very issue in which his editorial appears, as well as this week's journal, considered in the context of more than a decade of related research, demonstrates just how ironic and ludicrous his invocation is.

This November marks the eleventh year since the publication, with great fanfare, of Van den Berghe's trial of intensive insulin therapy (IIT) in the NEJM.  That article was followed by what I have called a "premature rush to adopt the therapy" (I should have called it a stampede), creation of research agendas in multiple countries and institutions devoted to its study, amassing of reams of robust data failing to confirm the original results, and a reluctance to abandon the therapy that is rivaled in its tenacity only by the enthusiasm that drove its adoption.  In light of all the data from the last decade, I am convinced of only one thing - that it remains an open question whether control of hyperglycemia within ANY range is of benefit to patients.
Suffice it to say that the Van den Berghe data have not suffered from a lack of believers - the Brunkhorst, NICE-SUGAR, and Glucontrol data have - and it would seem that in many cases what we have is not a lack of faith so much as a lack of reason when it comes to data. The publication of an analysis of hypoglycemia using the NICE-SUGAR database in the September 20th NEJM, and a trial in this week's NEJM involving pediatric cardiac surgery patients by Agus et al, give researchers and clinicians yet another opportunity to apply reason and reconsider their belief in IIT, and for that matter the treatment of hyperglycemia in general.

Monday, September 21, 2009

The unreliable asymmetric design of the RE-LY trial of Dabigatran: Heads I win, tails you lose



I'm growing weary of this. I hope it stops. We can adapt the diagram of non-inferiority shenanigans from the Gefitinib trial (see http://medicalevidence.blogspot.com/2009/09/theres-no-such-thing-as-free-lunch.html ) to last week's trial of dabigatran, which came on the scene of the NEJM with another ridiculously designed non-inferiority trial (see http://content.nejm.org/cgi/content/short/361/12/1139 ). Here we go again.

These jokers, lulled by the corporate siren song of Boehringer Ingelheim, had the utter unmitigated gall to declare a delta of 1.46 (relative risk) as the margin of non-inferiority! Unbelievable! To say that a 46% difference in the rate of stroke or arterial clot is clinically non-significant! Seriously!?

They justified this felonious choice on the basis of trials comparing warfarin to PLACEBO as analyzed in a 10-year-old meta-analysis. It is obvious (or should be to the sentient) that an ex-post difference between a therapy and placebo in superiority trials does not apply to non-inferiority trials of two active agents. Any ex-post finding could be simply fortuitously large and may have nothing to do with the MCID (minimal clinically important difference) that is SUPPOSED to guide the choice of delta in a non-inferiority trial (NIT). That warfarin SMOKED placebo in terms of stroke prevention does NOT mean that something that does not SMOKE warfarin is non-inferior to warfarin. This kind of duplicitous justification is surely not what the CONSORT authors had in mind when they recommended a referenced justification for delta.

That aside, on to the study and the figure. First, we're testing two doses, so there are multiple comparisons, but we'll let that slide for our purposes. Look at the point estimate and 95% CI for the 110 mg dose in the figure (let's bracket the fact that they used one-sided 97.5% CIs - it's immaterial to this discussion). There is a non-statistically significant difference between dabigatran and warfarin for this dose, with a P-value of 0.34. But note that in Table 2 of the article, they declare that the P-value for "non-inferiority" is <0.001 [I've never even seen this done before, and I will have to look to see if I can find a precedent for reporting a P-value for "non-inferiority"]. Well, apparently this just means that the RR point estimate for 110 mg versus warfarin is statistically significantly different from a RR of 1.46. It does NOT mean - though it misleadingly suggests - that the comparison between the two drugs on stroke and arterial clot is highly clinically significant. This "P-value for non-inferiority" is just an artificial comparison: had we set the margin of non-inferiority at an [even more ridiculously] wide value, we could have made the "P-value for non-inferiority" as small as we liked simply by inflating the margin of non-inferiority! So this is a useless number, unless your goal is to create an artificial and exaggerated impression of the difference between these two agents.
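To see how easily a "P-value for non-inferiority" can be manufactured, here is a minimal sketch using a normal approximation on the log scale. The point estimate and standard error below are invented for illustration, not taken from the RE-LY data; the point is that the same data yield an arbitrarily small p-value if you simply widen the margin:

```python
from math import erf, log, sqrt

def p_noninferiority(rr_hat, se_log_rr, margin):
    """One-sided p-value for H0: true RR >= margin -- the test behind a
    'P-value for non-inferiority'. Normal approximation on the log scale;
    inputs here are illustrative, not the trial's published estimates."""
    z = (log(rr_hat) - log(margin)) / se_log_rr
    return 0.5 * (1 + erf(z / sqrt(2)))  # P(Z < z)

# Same (invented) data, increasingly generous margins:
for margin in (1.10, 1.46, 2.00):
    print(f"margin {margin:.2f}: p = {p_noninferiority(0.91, 0.09, margin):.2e}")
```

The p-value falls monotonically as the margin inflates, which is why the number says more about the chosen delta than about the drugs.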

Now let's look at the 150 mg dose. Indeed, it is statistically significantly different from warfarin (I shall resist using the term "superior" here), and thus the authors claim superiority. But here again, the 95% CI is narrower than the margin of non-inferiority, and had the results gone the other direction, as in Scenarios 3 and 4 (in favor of warfarin), we would have still claimed non-inferiority, even though warfarin would have been statistically significantly "better than" dabigatran! So it is unfair to claim superiority on the basis of a statistically significant result favoring dabigatran, but that's what they do. This is the problem that is likely to crop up when you make your margin of non-inferiority excessively wide, which you are wont to do if you wish to stack the deck in favor of your therapy.

But here's the real rub. Imagine if the world were the mirror image of what it is now and dabigatran were the existing agent for prevention of stroke in A-fib, and warfarin were the new kid on the block. If the makers of warfarin had designed this trial AND GOTTEN THE EXACT SAME DATA, they would have said (look at the left of the figure and the dashed red line there) that warfarin is non-inferior to the 110 mg dose of dabigatran, but that it was not non-inferior to the 150 mg dose of dabigatran. They would NOT have claimed that dabigatran was superior to warfarin, nor that warfarin was inferior to dabigatran, because the 95% CI of the difference between warfarin and dabigatran 150 mg crosses the pre-specified margin of non-inferiority. And to claim superiority of dabigatran, the 95% CI of the difference would have to fall all the way to the left of the dashed red line on the left. (See Piaggio, JAMA, 2006.)
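The mirror-image thought experiment can be made concrete with a toy classifier. This is a sketch of the symmetric reading advocated here (superiority requires the whole CI to clear the mirror of the margin, per Piaggio), and the CI endpoints below are invented for illustration, not the trial's published numbers:

```python
def classify(lo, hi, margin):
    """Classify a two-sided 95% CI for the RR of a 'new' drug vs. the
    standard, under a symmetric rule: superiority requires the entire CI
    to fall beyond the MIRROR of the non-inferiority margin, not merely
    beyond 1.0. A sketch, not the trial's actual analysis."""
    mirror = 1.0 / margin  # the margin reflected across RR = 1
    if hi < mirror:
        return "superior (symmetric rule)"
    if hi < margin:
        return "non-inferior, not superior"
    if lo > margin:
        return "inferior"
    return "inconclusive"

# Invented CI for a new drug that significantly beats the standard:
print(classify(0.53, 0.82, 1.46))          # non-inferior, not superior
# The same data with the roles reversed (reciprocals of the CI limits):
print(classify(1 / 0.82, 1 / 0.53, 1.46))  # inconclusive
```

Note that the same dataset supports a non-inferiority claim for one sponsor's "new" drug but no claim at all when the roles are reversed, which is precisely the mirror-image reading described above.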

The claims that result from a given dataset should not depend on who designs the trial, and which way the asymmetry of interpretation goes. But as long as we allow asymmetry in the interpretation of data, they shall. Heads they win, tails we lose.

Sunday, September 6, 2009

There's no such thing as a free lunch - unless you're running a non-inferiority trial. Gefitinib for pulmonary adenocarcinoma


A 20% difference in some outcome is either clinically relevant, or it is not. If A is worse than B by 19% and that's NOT clinically relevant and significant, then A being better than B by 19% must also NOT be clinically relevant and significant. But that is not how the authors of trials such as this one see it: http://content.nejm.org/cgi/content/short/361/10/947 . According to Mok and co-conspirators, if gefitinib is no worse in regard to progression-free survival than carboplatin-paclitaxel based on a 95% confidence interval that does not include 20% (that is, it may be up to 19.9% worse, but not more worse), then they call the battle a draw and say that the two competitors are equally efficacious. However, if the trend is in the other direction, that is, in favor of gefitinib BY ANY AMOUNT HOWEVER SMALL (as long as it's statistically significant), they declare gefitinib the victor and call it a day. It is only because of widespread lack of familiarity with non-inferiority methods that they can get away with a free lunch like this. A 19% difference is either significant, or it is not. I have commented on this before, and it should come as no surprise that these trials are usually used to test proprietary agents (http://content.nejm.org/cgi/content/extract/357/13/1347 ). Note also that in trials of adult critical illness, the most commonly sought mortality benefit is about 10% (more data on this forthcoming in an article soon to be submitted and hopefully published). So it's a difficult argument to sustain to say that something is "non-inferior" if it is less than 20% worse than something else. Designers of critical care trials will tell you that a 10% difference, often much less, is clinically significant.

I have created a figure to demonstrate the important nuances of non-inferiority trials using the gefitinib trial as an example. (I have adapted this from the Piaggio 2006 JAMA article of the CONSORT statement for the reporting of non-inferiority trials - a statement that has been largely ignored: http://jama.ama-assn.org/cgi/content/abstract/295/10/1152?lookupType=volpage&vol=295&fp=1152&view=short .) The authors specified delta, or the margin of non-inferiority, to be 20%. I have already made it clear that I don't buy this, but we needn't challenge this value to make war with their conclusions, although challenging it is certainly worthwhile, even if it is not my current focus. This 20% delta corresponds to a hazard ratio of 1.2, as seen in the figure demarcated by a dashed red line on the right. If the hazard ratio (for progression or death) demonstrated by the data in the trial were 1.2, that would mean that gefitinib is 20% worse than comparator. The purpose of a non-inferiority trial is to EXCLUDE a difference as large as delta, the pre-specified margin of non-inferiority. So, to demonstrate non-inferiority, the authors must show that the 95% confidence interval for the hazard ratio falls all the way to the left of that dashed red line at HR of 1.2 on the right. They certainly achieved this goal. Their data, represented by the lowermost point estimate and 95% CI, falls entirely to the left of the pre-specified margin of non-inferiority (the right red dashed line). I have no arguments with this. Accepting ANY margin of non-inferiority (delta), gefitinib is non-inferior to the comparator. What I take exception to is the conclusion that gefitinib is SUPERIOR to comparator, a conclusion that is predicated in part on the chosen delta, to which we are beholden as we make such conclusions.

First, let's look at [hypothetical] Scenario 1. Because the chosen delta was 20% wide (and that's pretty wide - coincidentally, that's the exact width of the confidence interval of the observed data), it is entirely possible that the point estimate could have fallen as pictured for Scenario 1 with the entire CI between an HR of 1 and 1.2, the pre-specified margin of non-inferiority. This creates the highly uncomfortable situation in which the criterion for non-inferiority is fulfilled, AND the comparator is statistically significantly better than gefitinib!!! This could have happened! And it's more likely to happen the larger you make delta. The lesson here is that the wider you make delta, the more dubious your conclusions are. Deltas of 20% in a non-inferiority trial are ludicrous.

Now let's look at Scenarios 2 and 3. In these hypothetical scenarios, comparator is again statistically significantly better than gefitinib, but now we cannot claim non-inferiority because the upper CI falls to the right of delta (red dashed line on the right). But because our 95% confidence interval includes values of HR less than 1.2 and our delta of 20% implies (or rather states) that we consider differences of less than 20% to be clinically irrelevant, we cannot technically claim superiority of comparator over gefitinib either. The result is dubious. While there is a statistically significant difference in the point estimate, the 95% CI contains clinically irrelevant values and we are left in limbo, groping for a situation like Scenario 4, in which comparator is clearly superior to gefitinib, and the 95% CI lies all the way to the right of the HR of 1.2.

Pretend you're in Organic Chemistry again, and visualize the mirror image (enantiomer) of Scenario 4. That is what is required to show superiority of gefitinib over comparator - a point estimate for the HR whose 95% CI lies entirely beyond negative delta (-20%), that is, below an HR of 0.8. The actual results come close to Scenario 5, but not quite, and therefore the authors are NOT justified in claiming superiority. To do so is to try to have a free lunch, to have their cake and eat it too.
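The scenarios above reduce to a few comparisons between CI endpoints and the dashed lines at 1.2 and 0.8. Here is a small sketch; the CI endpoints are invented to illustrate each scenario (the last pair is merely in the neighborhood of the reported result, not an exact quotation of it):

```python
DELTA, MIRROR = 1.2, 0.8  # margin of non-inferiority and its mirror image

def verdict(lo, hi):
    """Interpret a 95% CI for the HR (gefitinib vs. comparator) under the
    symmetric reading argued for in this post. Illustrative sketch only."""
    if hi < MIRROR:
        return "gefitinib superior"
    if lo > DELTA:
        return "comparator superior"
    if hi < DELTA and lo > 1.0:
        return "comparator significantly better, yet 'non-inferior' (paradox)"
    if hi < DELTA:
        return "gefitinib non-inferior, not superior"
    if lo > 1.0:
        return "comparator significantly better, but neither verdict (limbo)"
    return "inconclusive"

examples = {
    "Scenario 1":            (1.02, 1.18),
    "Scenarios 2/3":         (1.05, 1.30),
    "Scenario 4":            (1.25, 1.60),
    "Mirror of Scenario 4":  (0.62, 0.78),
    "Near the actual data":  (0.65, 0.85),
}
for name, (lo, hi) in examples.items():
    print(f"{name}: {verdict(lo, hi)}")
```

The last case is the crux: a CI that sits entirely below 1.0 but straddles 0.8 earns "non-inferior, not superior" under the symmetric rule, even though a naive reading would trumpet superiority.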

You see, the larger you make delta, the easier it is to achieve non-inferiority. But the more likely it also becomes that you find a statistically significant difference favoring the comparator rather than the preferred drug, which creates a serious conundrum and paradox for you. At the very least, if you're going to make delta large, you should be bound by your honor and your allegiance to logic and science to make damned sure that, to claim superiority, your 95% confidence interval does not include negative delta. If not, shame on you. Eat your free lunch if you will, but know that the ireful brow of logic and reason is bent unfavorably upon you.


Monday, June 2, 2008

"Off-Label Promotion By Proxy": How the NEJM and Clinical Trials are Used as an Advertising Apparatus. The Case of Aliskiren

In the print edition of the June 5th NEJM (mine is delivered almost a week early sometimes), readers will see on the front cover the lead article entitled "Aliskiren Combined with Losartan in Type 2 Diabetes and Nephropathy," and on the back cover a sexy advertisement for Tekturna (aliskiren), an approved antihypertensive agent, which features "mercury-man", presumably a former hypertensive patient metamorphosed into an elite biker (and perhaps superhero) by the marvels of Tekturna. Readers who lay the open journal face down may experience the same irony I did when they see the front-cover lead article juxtaposed with the back-cover advertisement.

The article describes how aliskiren, in the AVOID trial, reduced the mean urinary albumin-to-creatinine ratio as compared to losartan alone. There are several important issues here. First, if one wants to use a combination of agents, s/he can use losartan with a generic ACE-inhibitor (ACEi). A more equitable comparison would have pitted aliskiren plus losartan against [generic] ACEi plus losartan. The authors would retort of course that losartan alone is a recommended agent for the condition studied, but that is circular logic. If we were not in need of more aggressive therapy for this condition, then why study aliskiren in combination for it at all? If you want to study a new aggressive combination, it seems only fair to compare it to existing aggressive combinations.

Which brings me to another point - should aliskiren be used for ANY condition? No, it should not. It is a novel [branded] agent which is expensive, for which there is little experience, which may have important side effects that are only discovered after it is used in hundreds of thousands of patients, and, more importantly, alternative effective agents exist which are far less costly and for which more experience exists. A common error in decision making occurs when decision makers focus only on the agent or choice at hand and fail to consider the range of alternatives and how the agent under consideration fares when compared to the alternatives. Because aliskiren has only been shown to lower blood pressure, a surrogate endpoint, we would do well to stick with cheaper agents for which there are more data and more experience, and reserve use of aliskiren until a study shows a long-term mortality or meaningful morbidity benefit.

But here's the real rub - after an agent like this gets approved for one [common] indication (hypertension), the company is free to conduct little studies like this one, for off-label uses, to promote its sale [albeit indirectly] in patients who do not need it for its approved indication (BP lowering). And what better advertising to bring the drug into the sight of physicians than a lead article in the NEJM, with a complementary full page advertisement on the back cover? This subversive "off-label promotion by proxy", effected by study of off-label indications for which FDA approval may or may not ultimately be sought, has the immediate benefit of misleading the unwary, who may increase prescriptions of this medication based on this study (which they are free to do) without considering the full range of alternatives.

My colleague David Majure, MD, MPH has commented to me about an equally insidious but perhaps more nefarious practice that he noticed may be occurring while attending this year's meeting of the American College of Cardiology (ACC). There, "investigators" and corporate cronies are free to present massive amounts of non-peer-reviewed data in the form of abstracts and presentations, much of which will not and should not withstand peer review, or which will be relegated to the obscurity of low-tier journals (where it likely belongs). But eager audience members, lulled by the presumed credibility of data presented at a national meeting of [company-paid] experts, will likely never see the data in peer-reviewed form, and instead will carry away the messages as delivered: "Drug XYZ was found to do 1-2-3 to [surrogate endpoint/off-label indication] ABC." By sheer force of repetition alone, these abstracts and presentations serve to increase product recognition, and, almost certainly, prescriptions. Whether the impact of the data presented is meaningful or not need not be considered, and probably cannot be considered without seeing the data in printed form - and this is just fine - for sales, that is.

(Added 6/11/2008: this pre-publication changing of practice patterns has been described before - see http://jama.ama-assn.org/cgi/content/abstract/284/22/2886 .)

The novel mechanism of action of this agent and the scientific validity of the AVOID trial notwithstanding, the editorialship of the NEJM and the medical community should realize that science and the profit motive are inextricably interwoven when companies study these branded agents. The full page advertisement on the back cover of this week's NEJM was just too much for me.