
Thursday, January 5, 2017

RCT Autopsy: The Differential Diagnosis of a Negative Trial

At many institutions, Journal Clubs meet to dissect a trial after its results are published, looking for flaws, biases, shortcomings, and limitations. Beyond disseminating the informational content of the articles that are reviewed, Journal Clubs serve as a reiteration and extension of the limitations section of the article's discussion. Unless they result in a letter to the editor, or a new peer-reviewed article about the limitations of the trial that was discussed, the debates of Journal Club begin a headlong recession into obscurity soon after the meeting adjourns.

The proliferation and popularity of online media has led to what amounts to a real-time, longitudinally documented Journal Club.  Named “post-publication peer review” (PPPR), it consists of blog posts, podcasts and videocasts, comments on research journal websites, remarks on online media outlets, and websites dedicated specifically to PPPR.  Like a traditional Journal Club, PPPR seeks to redress any deficiencies in the traditional peer review process that lead to shortcomings or errors in the reporting or interpretation of a research study.

PPPR following publication of a “positive” trial, that is, one where the authors conclude that their a priori criteria for rejecting the null hypothesis were met, is oftentimes directed at the identification of a host of biases in the design, conduct, and analysis of the trial that may have led to a “false positive” trial. False positive trials are those in which either a type I error has occurred (the null hypothesis was rejected even though it is true and no difference between groups exists), or the structure of the experiment was biased in such a way that the experiment and its statistics cannot be informative. The biases that cause structural problems in a trial are manifold, and I may attempt to delineate them at some point in the future. Because it is a simpler task, I will here attempt to list a differential diagnosis that people may use in PPPRs of “negative” trials.
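
To make the arithmetic behind that definition concrete, here is a minimal sketch; the priors, power, and alpha are numbers I have invented for illustration, not values from any particular trial. It shows how the probability that a “positive” trial reflects a real effect depends on the prior probability of the hypothesis:

```python
# How often is a "positive" trial a false positive?
# All numbers here are invented for illustration, not from any particular trial.
def prob_effect_is_real(prior, power=0.80, alpha=0.05):
    """P(real effect | null rejected), by Bayes' rule."""
    true_pos = prior * power            # real effect, and the trial detects it
    false_pos = (1 - prior) * alpha     # no real effect: a type I error
    return true_pos / (true_pos + false_pos)

for prior in (0.50, 0.20, 0.05):
    print(f"prior = {prior:.2f} -> P(real | positive trial) = "
          f"{prob_effect_is_real(prior):.2f}")
# prior = 0.50 -> 0.94; prior = 0.20 -> 0.80; prior = 0.05 -> 0.46
```

Run in reverse, the same machinery underlies the differential diagnosis of the negative trial.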

Saturday, June 11, 2016

Non-inferiority Trials Are Inherently Biased: Here's Why

Debut VideoCast for the Medical Evidence Blog, explaining non-inferiority trial design and exposing its inherent biases:

In this related blog post, you can find links to the CONSORT statement in the Dec 26, 2012 issue of JAMA and a link to my letter to the editor.

Addendum: I should have included this in the video. See the picture below. In the first example, top left, the entire 95% CI favoring "new" therapy lies in the "zone of indifference," that is, the pre-specified margin of superiority, a mirror image of the pre-specified margin of noninferiority; in this case delta = +/- 0.15. Next down, the majority of the 95% CI of the point estimate favoring "new" therapy lies in the "margin of superiority," so even though the lower end of the 95% CI crosses "mirror delta," the best guess is that the effect of therapy falls in the zone of indifference. In the lowest example, labeled "Truly Superior," the entire 95% confidence interval falls to the left of "mirror delta," thus reasonably excluding all point estimates in the "zone of indifference" (i.e., +/- delta) and all point estimates favoring the "old" therapy. This would, in my mind, represent "true superiority" in a logical, rational, and symmetrical way that would be very difficult to mount arguments against.
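
To make the symmetry concrete, here is a toy sketch of the classification in the picture; the sign convention (negative differences favor the "new" therapy) and the example CIs are my own assumptions:

```python
# Toy classifier for the figure's symmetric reading of a noninferiority trial.
# Convention assumed here: effect = new minus old, negative values favor "new";
# delta is the prespecified noninferiority margin and -delta ("mirror delta")
# is its reflection, the margin of true superiority. The upper CI bound drives
# each verdict; ci_lo is carried along for completeness.
def classify(ci_lo, ci_hi, delta=0.15):
    if ci_hi < -delta:
        return "truly superior: whole CI beyond mirror delta"
    if ci_hi < 0:
        return "conventionally 'superior', but CI overlaps the zone of indifference"
    if ci_hi < delta:
        return "noninferior: CI excludes +delta"
    return "inconclusive or inferior"

print(classify(-0.30, -0.18))   # truly superior
print(classify(-0.25, -0.05))   # 'superior' by convention only
print(classify(-0.10, 0.10))    # noninferior, effect in the zone of indifference
```

The middle branch is the asymmetry at issue: convention awards "superiority" to any CI that excludes zero, even when the best guess of the effect sits in the zone of indifference.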


Added 9/20/16: For those who question my assertion that the designation of "New" versus "Old" or "comparator" therapy is arbitrary, here is the proof: in this trial, the "New" therapy is DMARDs and the comparator is anti-tumour necrosis factor agents for the treatment of rheumatoid arthritis. The rationale for this trial is that the chronologically newer anti-TNF agents are very costly, and the authors wanted to see if similar improvements in quality of life could be obtained with the chronologically older DMARDs. So what is "new" is certainly in the eye of the beholder. Imagine colistin 50 years ago being tested against, say, a newer broad-spectrum penicillin. The penicillin would have been found to be non-inferior, but with a superior side effect profile. Fast forward 50 years, and now colistin could be the "new" resurrected agent, tested against what 10 years ago was the standard penicillin but is now "old" because of the development of resistance. Clearly, "new" and "old" are arbitrary and flexible designations.

Wednesday, October 7, 2015

Early Mobility in the ICU: The Trial That Should Not Be

I learned via Twitter yesterday that momentum is building to conduct a trial of early mobility in critically ill patients. While I greatly respect many of the investigators headed down this path, forthwith I will tell you why this trial should not be done, based on principles of rational decision making.

A trial is a diagnostic test of a hypothesis, a complicated and costly test of a hypothesis, and one that entails risk.  Diagnostic tests should not be used indiscriminately.  That the RCT is a "Gold Standard" in the hierarchy of testing hypotheses does not mean that we should hold it sacrosanct, nor does it follow that we need a gold standard in all cases.  Just like in clinical medicine, we should be judicious in our ordering of diagnostic tests.

The first reason that we should not do a trial of early mobility (or any mobility) in the ICU is that, in the opinion of this author, experts in critical care, and many others, early mobility works. We have a strong prior probability that this is a beneficial thing to be doing (which is why prominent centers have been doing it for years, sans RCT evidence). When the prior probability is high enough, additional testing has decreasing yield and risks false negative results if people are not attuned to the prior. Here's my analogy: a 35-year-old woman with polycystic kidney disease who is taking birth control presents to the ED after collapsing with syncope. She had shortness of breath and chest pain for 12 hours prior to syncope. Her chest x-ray is clear and bedside ultrasound shows a dilated right ventricle. The prior probability of pulmonary embolism is high enough that we don't really need further testing; we give anticoagulants right away. Even if a V/Q scan (creatinine precludes CT) is "low probability" for pulmonary embolism, we still think she has it, because the prior probability is so high. Indeed, the prior probability is so high that we're willing to make decisions without further testing, hence we gave heparin. This process follows the very rational Threshold Approach to Decision Making proposed by Pauker and Kassirer in the NEJM in 1980, which is basically a reformulation of von Neumann and Morgenstern's Expected Utility Theory to adapt it to medical decisions. Distilled, it states in essence: "when you reach a threshold probability of disease at which the benefits of treatment exceed the risks, you treat." And so let it be with early mobility. We already think the benefits exceed the risks, which is why we're doing it. We don't need an RCT. As I used to ask the housestaff over and over until I was cyanotic: "How will the results of that test influence what you're going to do?"
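
For concreteness, here is a back-of-the-envelope version of that threshold logic; the harm-to-benefit ratio, the prior, and the likelihood ratio of a "low probability" V/Q scan are numbers I have assumed purely for illustration, not figures from Pauker and Kassirer or from the V/Q literature:

```python
# Back-of-the-envelope threshold logic for the PE vignette.
# Harms, benefits, prior, and LR are my assumptions, purely for illustration.
def treatment_threshold(net_harm, net_benefit):
    """Pauker-Kassirer: treat when P(disease) > harm / (harm + benefit)."""
    return net_harm / (net_harm + net_benefit)

def posterior(prior, lr):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    odds = (prior / (1 - prior)) * lr
    return odds / (1 + odds)

p_star = treatment_threshold(net_harm=1, net_benefit=9)    # -> 0.10
p_pe = posterior(prior=0.90, lr=0.2)   # assumed LR for a "low probability" V/Q
print(f"treat above P = {p_star:.2f}; post-scan P(PE) = {p_pe:.2f}")
# The "negative" scan drops P(PE) only to 0.64, still far above the 0.10
# treatment threshold, so the scan cannot change the decision: give the heparin.
```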

Notice that this logical approach to clinical decision making shines a blinding light upon "evidence based medicine" and the entire enterprise of testing hypotheses with frequentist methods that are deaf to prior probabilities. Can you imagine using V/Q scanning to test for PE without prior probabilities? Can you imagine what a mess you would find yourself in with regard to false negatives and false positives? You would be the neophyte medical student who thinks "test positive, disease present; test negative, disease absent." So why do we continue ad nauseam in critical care medicine to dismiss prior probabilities and decision thresholds and blindly test hypotheses in a purist vacuum?

The next reasons this trial should not be conducted flow from the first.  The trial will not have a high enough likelihood ratio to sway the high prior below the decision threshold; if the trial is "positive" we will have spent millions of dollars to "prove" something we already knew at a threshold above our treatment threshold; if the trial is positive, some will squawk "It wasn't blinded" yada yada yada in an attempt to dismiss the results as false positives; if the trial is negative, some will, like the tyro medical student, declare that "there is no evidence for early mobility" and similar hoopla and poppycock; or the worst case:  the trial shows harm from early mobility, which will get the naysayers of early mobility very agitated.  But of course, our prior probability that early mobility is harmful is hopelessly low, making such a result highly likely to be spurious.  When we clamor about "evidence" we are in essence clamoring about "testing hypotheses with RCTs" and eschewing our responsibility to use clinical judgment, recognize the limits of testing, and practice in the face of uncertainty using our "untested" prior probabilities.
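
The first of these claims can be made quantitative by treating the RCT itself as the diagnostic test. A sketch, assuming a prior of 0.80 that early mobility works and conventional trial operating characteristics (80% power, alpha of 0.05), both numbers mine:

```python
# The RCT itself as a diagnostic test of the hypothesis.
# The prior of 0.80 and the trial's operating characteristics are assumptions.
def post_negative(prior, power=0.80, alpha=0.05):
    """P(therapy truly works | the trial was 'negative')."""
    false_neg = prior * (1 - power)         # it works, but the trial misses it
    true_neg = (1 - prior) * (1 - alpha)    # it doesn't work, and the trial agrees
    return false_neg / (false_neg + true_neg)

print(f"P(early mobility works | negative trial) = {post_negative(0.80):.2f}")
# A negative trial has LR = (1 - power)/(1 - alpha) = 0.21: it drags a prior
# of 0.80 down only to 0.46, plausibly still above the treatment threshold
# for a cheap, low-risk intervention.
```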

Consider a trial of exercise on cardiovascular outcomes in community-dwelling adults: what good can possibly come of such a trial? Don't we already know that exercise is good for you? If so, a positive trial reinforces what we already know (but does little to convince sedentary folks to exercise, as they too already know they should), while a negative trial risks sending the message that exercise is of no use, or that the number needed to treat is too large to be worth worrying about.

Or consider the recent trials of EGDT, which "refuted" the Rivers trial from 14 years ago. Now everybody is saying, "Well, we know it works; maybe not the catheters and the ScvO2 and all those minutiae, but in general, rapid early resuscitation works. And the trials show that we've already incorporated what works into general practice!"

I don't know the solutions to these difficult quandaries that we repeatedly find ourselves in, trial after trial, in critical care medicine. I'm confused too. That's why I'm thinking very hard and very critically about the limits of our methods and our models and our routines. But if we can anticipate not only the results of the trials, but also the community reaction to them, then we have guidance about how to proceed in the future. Because what value does a mega-trial have, if not to guide care after its completion? And even if that is not its goal (maybe its goal is just to inform the science), can we turn a blind eye to the fact that it will guide practice after its completion, even if that guidance is premature?

It is my worry that, given the high prior probability that a trial in critical care medicine will be "negative," the most likely result is a negative trial, which will embolden those who wish to dismiss the probable benefits of early mobility and give them an excuse not to do it.

Diagnostic tests have risks.  A false negative test is one such risk.

Saturday, January 17, 2015

Clinical Trialists Should Use Economies of Scale to Maximize Profits of Large RCTs

The lever is a powerful tool
I am writing (very slowly) a review article about ionized calcium in the ICU: should it be measured, and should it be treated? There are several recent large observational studies that look at the association between calcium levels and outcomes of critical illness, but being observational, they do not offer guidance as to whether chasing calcium levels with calcium gluconate or chloride will improve outcomes, or whether hypo- or hypercalcemia is simply a marker of severity of illness (the latter is, of course, my bet).

Thinking about calcium levels and causation and repletion, one cannot help but think about all sorts of other levels we check in the ICU (potassium, magnesium, phosphate) and many other things we routinely do but about which we have no real inkling as to whether we're doing any patients any good. (Arterial lines are another example.) Are we just wasting our time with many of the things we do? This question becomes more urgent as evidence mounts that much of what we do (in the ICU and elsewhere) is useless, wasteful, or downright harmful. But who or what agency is going to fund a trial of potassium or calcium replacement in the ICU? It certainly seems unglamorous. Don't we have other disease-specific priorities that outrank such a trial in importance?

I then realized that a good businessman, wanting to maximize the "profit" from a large randomized controlled trial (and the dollars "invested" in it), would take advantage of economies of scale. For those who are not business savvy (I do not imply that I am), business costs can be roughly divided into fixed costs and variable costs. If you have a factory making widgets, you have certain costs, such as rent, advertising, and widget-making machines; these costs are "fixed," meaning that they do not vary whether you make 100 widgets or 10,000 widgets. Variable costs are the costs of materials, electricity, and human resources, which must be scaled up as you make more widgets. In general, the cost of making each widget goes down as the fixed costs are spread out over more widget units. Additionally, if you can leverage your infrastructure to make wadgets, a product similar to a widget, you likewise increase profits by lowering costs per unit.
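
Here is the widget arithmetic with invented numbers; the fixed and variable costs are placeholders, and the four-question example borrows the unglamorous repletion questions (calcium, potassium, magnesium, phosphate) from above:

```python
# The widget/wadget arithmetic, with invented numbers.
def cost_per_unit(fixed, variable_per_unit, n_units, n_products=1):
    """Fixed costs amortize over all units of all products sharing the factory
    (read: all randomized questions sharing one trial's infrastructure)."""
    return fixed / (n_units * n_products) + variable_per_unit

# One question answered in a 1,000-patient trial, versus four questions
# (say, calcium, potassium, magnesium, and phosphate repletion) answered at once:
print(cost_per_unit(fixed=5_000_000, variable_per_unit=500, n_units=1_000))
# -> 5500.0 dollars per patient-question
print(cost_per_unit(fixed=5_000_000, variable_per_unit=500, n_units=1_000,
                    n_products=4))
# -> 1750.0 dollars per patient-question
```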

Sunday, January 27, 2013

Therapeutic Agnosticism: Stochastic Dominance of the Null Hypothesis

Here are some more thoughts on the epistemology of medical science and practice that were stimulated by reading three articles this week relating to monitoring interventions: monitoring respiratory muscle function in the ICU (AJRCCM, January 1, 2013); monitoring intracranial pressure in traumatic brain injury (NEJM, December 27, 2012); and monitoring of gastric residual volume in the ICU (JAMA, January 16, 2013).

In my last post, about transfusion thresholds, I mused that overconfidence in our understanding of complex pathophysiological phenomena (did I say arrogance?) leads investigators and practitioners to overestimate their ability to discern the value and efficacy of a therapy in medicine. Take, for instance, the vascular biologist studying pulmonary hypertension who, rounding in the ICU, elects to give sildenafil to a patient with acute right heart failure, and who proffers a plethora of complex physiological explanations for this selection. Is there really any way for anyone to know the effects of sildenafil in this scenario?