Medical Evidence Blog: CONSORT


Thursday, September 8, 2016

Hiding the Evidence in Plain Sight: One-sided Confidence Intervals and Noninferiority Trials

In the last post, I linked a video podcast in which I explain non-inferiority trials and their inherent biases. In this videocast, I revisit non-inferiority trials and the use of one-sided confidence intervals. I review the Salminen et al non-inferiority trial of antibiotics versus appendectomy for the treatment of acute appendicitis in adults. This trial uses a very large delta of 24%. The criteria for non-inferiority were not met even with this promiscuous delta. But the use of a 1-sided 95% confidence interval concealed a more damning revelation in the data. Watch the 13-minute videocast to learn what was hidden in plain sight!
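For readers who prefer numbers to video, here is a minimal Python sketch of how a one-sided interval can leave half the story untold. Everything in it is an assumption for illustration: the point estimate and standard error are hypothetical (they are not the Salminen data), the normal approximation is assumed, and the sign convention (difference = new minus standard, negative means the new treatment is worse) is simply the one I picked. Only the delta of 0.24 comes from the trial.

```python
from statistics import NormalDist


def bounds(diff, se, level=0.95):
    """Return (one_sided_lower, two_sided_lower, two_sided_upper).

    Normal approximation; `diff` is new minus standard, so negative
    values mean the new treatment did worse.
    """
    z = NormalDist()
    z_one = z.inv_cdf(level)                 # about 1.645 for a 1-sided 95% bound
    z_two = z.inv_cdf(1 - (1 - level) / 2)   # about 1.960 for a 2-sided 95% CI
    return diff - z_one * se, diff - z_two * se, diff + z_two * se


# Hypothetical trial: NOT real data, chosen only to illustrate the point.
delta = 0.24
diff, se = -0.30, 0.02

one_lo, two_lo, two_hi = bounds(diff, se)

# The one-sided interval is (one_lo, +infinity): it reports only this bound.
noninferior = one_lo > -delta

# The two-sided interval also has an upper bound. If even that bound lies
# beyond -delta, the whole interval sits in inferiority territory.
inferior = two_hi < -delta

print(f"1-sided 95% CI: ({one_lo:.3f}, +inf)  non-inferior: {noninferior}")
print(f"2-sided 95% CI: ({two_lo:.3f}, {two_hi:.3f})  inferior: {inferior}")
```

With these made-up numbers, the one-sided interval tells you only that non-inferiority was not shown. The two-sided interval shows that the entire interval lies beyond delta, which is a much stronger statement.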

Erratum: at 1:36 I say "excludes an absolute risk difference of 1" and I meant to say "excludes an absolute risk difference of ZERO." Similarly, at 1:42 I say "you can declare non-inferiority". Well, that's true, you can declare noninferiority if your entire 95% confidence interval falls to the left of an ARD of 0 or a HR of 1, but what I meant to say is that if that is the case "you can declare superiority."

Also, at 7:29, I struggle to remember the numbers (woe is my memory!) and I place the point estimate of the difference, 0.27, to the right of the delta dashed line at 0.24. This was a mistake which I correct a few minutes later, at 10:44 in the video. Do not let it confuse you: the 0.27 point estimates were drawn slightly to the right of delta when they should have been marked slightly to the left of it. I would re-record the video (labor intensive) or edit it, but I'm a novice with this technological stuff, so please do forgive me.

Finally, at 13:25 I say "within which you can hide evidence of non-inferiority" and I meant "within which you can hide evidence of inferiority."

Again, I apologize for these gaffes. My struggle (and I think about this stuff a lot) in speaking about and accurately describing these confidence intervals and the conclusions that derive from them results from the arbitrariness of the CONSORT "rules" about interpretation and the arbitrariness of the valences (some articles use negative values for differences favoring "new"; others use positive values to favor "new"). If I struggle with it, many other readers, I'm sure, also struggle to keep things straight. This is fodder for the argument that these "rules" ought to be changed and made more uniform, for equity and ease of understanding and interpretation of non-inferiority trials.

It made me feel better to see this diagram in Annals of Internal Medicine (Perkins et al, July 3, 2012, online ACLS training), where they incorrectly place the point estimate at slightly less than -6% (to the left of the dashed delta line in Figure 2), when it should have been placed at slightly greater than -6% (to the right of the dashed delta line).

Friday, April 19, 2013

David versus Goliath on the Battlefield of Non-inferiority: Strangeness is in the Eye of the Beholder

In this week's JAMA is my letter to the editor about the CONSORT statement revision for the reporting of non-inferiority trials, and the authors' responses. I'll leave it to interested readers to view for themselves the revised CONSORT statement, and the letter and response.

In sum, my main argument is that Figure 1 in the article is asymmetric, such that inferiority is stochastically less likely than superiority and an advantage is therefore conferred to the "new" [preferred; proprietary; profitable; promulgated] treatment in a non-inferiority trial. Thus the standards for interpretation of non-inferiority trials are inherently biased. There is no way around this, save for revising the standards.

The authors of CONSORT say that my proposed solution is "strange" because it would require revision of the standards of interpretation for superiority trials as well. For me it is "strange" that we would endorse asymmetric and biased standards of interpretation in any trial. The compromise solution, as I suggested in my letter, is that we force different standards for superiority only in the context of a non-inferiority trial. Thus, superiority trial interpretation standards remain untouched. It is only if you start with a non-inferiority trial that you face a higher hurdle to claiming superiority, one that is contingent on evidence of non-inferiority in the trial that you designed. This would disincentivise the conduct of non-inferiority trials for a treatment that you hope/think/want to be superior. In the current interpretation scheme, it's a no-brainer - conduct a non-inferiority trial and pass the low hurdle for non-inferiority, and then if you happen to be superior too, BONUS!

In my proposed scheme, there is no bonus superiority that comes with a lower hurdle than inferiority. As I said in the last sentence, "investigators seeking to demonstrate superiority should design a superiority trial." Then, there is no minimal clinically important difference (MCID) hurdle that must be cleared, and a statistical difference favoring new therapy by any margin lets you declare superiority. But if you fail to clear that low(er) hurdle, you can't go back and declare non-inferiority.

Which leads me to something that the word limit of the letter did not allow me to express: we don't let unsuccessful superiority trials test for non-inferiority contingently, so why do we let successful non-inferiority trials test for superiority contingently?

Symmetry is beautiful; Strangeness is in the eye of the beholder.

(See also: Dabigatran and Gefitinib especially the figures, analogs of Figure 1 of Piaggio et al, on this blog.)

Friday, August 20, 2010

Heads I Win, Tails it's a Draw: Rituximab, Cyclophosphamide, and Revising CONSORT

The recent article by Stone et al in the NEJM (see: http://www.nejm.org/doi/full/10.1056/NEJMoa0909905 ), which appears to [mostly] conform to the CONSORT recommendations for the conduct and reporting of NIFTs (non-inferiority trials, often abbreviated NIFs, but I think NIFTs ["Nifties"] sounds cooler), made me realize that I fundamentally disagree with the CONSORT statement on NIFTs (see JAMA, http://jama.ama-assn.org/cgi/content/abstract/295/10/1152 ) and indeed with the entire concept of NIFTs. I have previously discussed in this blog my disapproval of the asymmetry with which NIFTs are designed such that they favor the new (and often proprietary) agent, but I will use this current article to illustrate why I think NIFTs should be done away with altogether and supplanted by equivalence trials.

This study rouses my usual and tired gripes about NIFTs: too large a delta, no justification for delta, use of intention-to-treat rather than per-protocol analysis, etc. It also describes a suspicious statistical maneuver which I suspect is intended to infuse the results (in favor of Rituximab/Rituxan) with extra legitimacy in the minds of the uninitiated: instead of simply stating (or showing with a plot) that the 95% CI excludes delta, thus making Rituxan non-inferior, the authors tested the hypothesis that the lower 95.1% CI boundary is different from delta, a test which yields a very small P-value (<0.001). This procedure adds nothing to the confidence interval in terms of interpretation of the results, but seems to imbue them with an unassailable legitimacy - the non-inferiority hypothesis is trotted around as if iron-clad by this minuscule P-value, which is really just superfluous and gratuitous.
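To see why such a P-value is redundant with the confidence interval, consider this rough Python sketch. It assumes a normal approximation and a sign convention of my choosing (difference = new minus standard, so non-inferiority means the lower bound lies above -delta). The delta, differences, and standard errors are all hypothetical, and the function name is just for illustration. The point is only that "lower bound above -delta" and "one-sided P below alpha/2" are the same inequality rearranged.

```python
from statistics import NormalDist


def noninferiority_check(diff, se, delta, alpha=0.05):
    """Return (lower CI bound, one-sided P-value against the margin).

    Hypothetical helper: normal approximation, diff = new minus standard.
    H0 for the test: the true difference is <= -delta (new is unacceptably worse).
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)          # about 1.96 for alpha = 0.05
    lower = diff - crit * se                  # lower bound of the two-sided CI
    z_stat = (diff + delta) / se              # distance of the estimate from -delta, in SEs
    p_value = 1 - z.cdf(z_stat)               # one-sided P-value
    return lower, p_value


# Hypothetical inputs, not taken from any trial.
delta = 0.20
alpha = 0.05
results = []
for diff, se in [(0.10, 0.07), (0.00, 0.12), (-0.05, 0.05)]:
    lower, p = noninferiority_check(diff, se, delta, alpha)
    ci_says_noninferior = lower > -delta
    p_says_noninferior = p < alpha / 2
    results.append((ci_says_noninferior, p_says_noninferior))
    print(f"diff={diff:+.2f} se={se:.2f} lower={lower:+.3f} p={p:.4g} "
          f"CI verdict={ci_says_noninferior} P verdict={p_says_noninferior}")
```

In each row the two verdicts agree, because lower > -delta is algebraically the same as z_stat > crit. The P-value can look far more impressive than the interval, but it carries no additional information about whether the margin was cleared.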

But I digress - time to focus on the figure. Under the current standards for conducting a NIFT, in order to be non-inferior, you simply need a 95% CI for the preferred [and usually proprietary] agent with an upper boundary which does not include delta in favor of the comparator (scenario A in the figure). For your preferred agent to be declared inferior, the LOWER 95% CI for the difference between the two agents must exclude the delta in favor of the comparator (scenario B in the figure). For that to ever happen, the preferred/proprietary agent is going to have to be WAY worse than standard treatment. It is no wonder that such results are very, very rare, especially since deltas are generally much larger than is reasonable. I am not aware of any recent trial in a major medical journal where inferiority was declared. The figure shows you why this is the case.

Inferiority is very difficult to declare (the deck is stacked this way on purpose), but superiority is relatively easy to declare, because for superiority your 95% CI doesn't have to exclude an obese delta, but rather must just exclude zero with a point estimate in favor of the preferred therapy. That is, you don't need a mirror image of the 95% CI that you need for inferiority (scenario C in the figure); you simply need a point estimate in favor of the preferred agent with a 95% CI that does not include zero (scenario D in the figure). Looking at the actual results (bottom left in the figure), we see that they are very close to scenario D and that they would only have had to go a little bit more in favor of Rituxan for superiority to be declared. Under my proposal for symmetry (and fairness, justice, and logic), the results would have had to be similar to scenario C, and Rituxan came nowhere near meeting criteria for superiority.

The reason it makes absolutely no sense to allow this asymmetry can be demonstrated by imagining a counterfactual (or two). Suppose that the results had been exactly the same, but they had favored Cytoxan (cyclophosphamide) rather than Rituxan, that is, Cytoxan was associated with an 11% improvement in the primary endpoint. This is represented by scenario E in the figure; and since the 95% CI includes delta, the result is "inconclusive" according to CONSORT. So how can it be that the classification of the result changes depending on what we arbitrarily (a priori, before knowing the results) declare to be the preferred agent? That makes no sense, unless you're more interested in declaring victory for a preferred agent than you are in discovering the truth, and of course, you can guess my inferences about the motives of the investigators and sponsors in many/most of these studies. In another counterfactual example, scenario F in the figure represents the mirror image of scenario D, which represented the minimum result that would have allowed Stone et al to declare that Rituxan was superior. But if the results had favored Cytoxan by that much, we would have had another "inconclusive" result, according to CONSORT. Allowing this is just mind-boggling, maddening, and unjustifiable!
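The asymmetry between scenarios D and F can be sketched in a few lines of Python. This is my own paraphrase of the interpretation rules as I have described them here, not CONSORT's wording, and the delta and interval endpoints are hypothetical. The sign convention is assumed: difference = preferred agent minus comparator, positive favors the preferred agent. The "equivalent" category in the symmetric version is likewise my assumption about how an equivalence-style reading would be labeled.

```python
def consort_style(lower, upper, delta):
    """Asymmetric reading: superiority needs only to exclude zero,
    inferiority needs the whole interval beyond -delta."""
    if lower > 0:
        return "superior"
    if lower > -delta:
        return "non-inferior"
    if upper < -delta:
        return "inferior"
    return "inconclusive"


def symmetric(lower, upper, delta):
    """Symmetric reading: superiority and inferiority face the same hurdle."""
    if lower > delta:
        return "superior"
    if upper < -delta:
        return "inferior"
    if lower > -delta and upper < delta:
        return "equivalent"
    return "inconclusive"


def mirror(lower, upper):
    """Same data, but with the other agent labeled as 'preferred'."""
    return -upper, -lower


# Hypothetical numbers, not the trial's results.
delta = 0.20
scenario_d = (0.01, 0.25)          # barely excludes zero in favor of the preferred agent
scenario_f = mirror(*scenario_d)   # identical result favoring the comparator

for name, ci in [("D", scenario_d), ("F", scenario_f)]:
    print(name, ci,
          "asymmetric:", consort_style(*ci, delta),
          "symmetric:", symmetric(*ci, delta))
```

Under the asymmetric rules, D is "superior" while its mirror image F is merely "inconclusive". Under the symmetric rules, both get the same label, which is the whole point: the verdict should not depend on which agent we decided to root for before the trial began.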

Given this "heads I win, tails it's a draw", it's no wonder that NIFTs are proliferating. It's time we stop accepting them, and require that non-inferiority hypotheses be symmetrical - in essence, making equivalence trials the standard operating procedure, and requiring the same standards for superiority as we require for inferiority.