
Friday, January 11, 2008

Jumping the Gun with Intensive Insulin Therapy (Leuven Protocol): How ICUs across the nation rushed to adopt a therapy which is probably not beneficial

In this week's NEJM is an anxiously awaited article about intensive insulin therapy in severely septic patients in the ICU: http://content.nejm.org/cgi/content/short/358/2/125
This business of intensive insulin therapy began with the 2001 publication in the NEJM of an article by Van den Berghe et al showing a remarkable reduction in mortality in surgical (mostly post-cardiac surgery) patients in a surgical ICU. Thereafter ensued a veritable rush to adopt this therapy, and ICUs around the country began developing and adopting protocols for "tight glucose control" in spite of concerns about the study and its generalizability to non-surgical patients who were not being fed concentrated intravenous dextrose solutions....

I vividly remember one of the ICU attendings at Johns Hopkins Hospital, Dr. Jimmy Sylvester, telling us on the morning after the study was published that "this is either the largest break-through in intensive care therapeutics ever, or these data are faked". In essence what he was saying was that the prior expectation of a result as dramatic as that demonstrated by Van den Berghe was very low (see also: http://jama.ama-assn.org/cgi/content/full/294/17/2203 ). That low prior probability should have reduced our confidence in the results, and made us more skeptical of the population studied, the dextrose solutions, and the applicability to non-surgical patients. Well then, why didn't it?

My colleague James M. O'Brien, Jr, MD, MSc and I have one possible explanation for the rush to adopt "intensive insulin therapy" which we have dubbed the "normalization heuristic." Physicians, for all of our training, remain quite simple-minded. We like simple, feel-good fixes. Normalizing lab values is one of those things. "Make it normal and all will be fine," goes the mantra. We like to make the potassium normal. We like to make the hematocrit normal. We love it when the magnesium increases after we order 4 grams. It's satisfying. And it feels like we're doing some measurable, that is, easily measurable good in the world. Normalizing blood sugars fits that paradigm and makes us feel like we are doing good. But are we?

We have learned the hard way over the years that many of the things we do to "normalize" some surface value cause an undercurrent of harm for patients. Think suppression of PVCs (the CAST trial: http://content.nejm.org/cgi/content/abstract/321/6/406 ) or transfusion thresholds (the TRICC study and others: http://content.nejm.org/cgi/content/abstract/340/6/409 ). Oftentimes, it seems, our efforts to "normalize" some value cause more harm than good. It is quite possible that this is also the case with intensive insulin, and that the "feel-good" appeal of making the blood sugars normal in the short term in acutely ill patients propelled us to early adoption of this probably useless and possibly harmful therapy.

(For an analogous contemporaneous story about biology's complexity and defiance of simple explanations and logic such as the normalization heuristic, see: http://www.nytimes.com/2008/01/11/science/11ants.html?scp=1&sq=aiding+trees+can+kill+them.)

The interesting thing regarding the "adoption" of Van den Berghe's "Leuven protocol" is that no ICU I have worked in really adopted that protocol. They softened it up, making the target blood sugar not 80-120, but rather 120-150 or some similar range. So what was adopted was "moderate insulin therapy" rather than intensive insulin therapy. Nobody has any idea whether such an approach is beneficial. It's certainly safer. But it has substantial costs in terms of nursing care that might be better spent on other interventions (think sedation interruption).

(I have been highly critical of Van den Berghe's medical insulin article, and my criticisms were published in the NEJM. I was delighted that she did not even address me/them in "the authors reply" - apparently I left her speechless: http://content.nejm.org/cgi/content/extract/354/19/2069.)

So this wonderful article in the current issue by Brunkhorst et al is music to my ears. Rather than hiding the high rate of severe hypoglycemia in supplementary material, Brunkhorst et al come right out and say that not only was the Leuven protocol NOT associated with reduced mortality, but also that it had a very high incidence of severe side effects and that their DSMB had the wherewithal to stop the study early for safety reasons. Bravo!

We await the results of several other ongoing studies of intensive insulin therapy before we nail shut the coffin on the Leuven protocol. Meanwhile, I hope that someone somewhere will design a protocol to test the "moderate insulin therapy" that we rushed to adopt after the first Van den Berghe article as a half-hearted hedge/compromise between our "normalization heuristic", our tempered enthusiasm for the Leuven protocol, our desire to "do something" for critically ill patients, and our fear of causing side effects that result directly from our interventions (omission bias: http://mdm.sagepub.com/cgi/content/abstract/26/6/575 ).

Thank you, Brunkhorst et al, for testing the Leuven protocol in an even-handed and scientifically unbiased manner and for reporting your results candidly.

Sunday, April 5, 2009

Another [the final?] nail in the coffin of intensive insulin therapy (Leuven Protocol) - and redoubled scrutiny of single center studies

In the March 26th edition of the NEJM, the NICE-SUGAR study investigators publish the results of yet another study of intensive insulin therapy in critically ill patients: http://content.nejm.org/cgi/content/abstract/360/13/1283 .

This article is of great interest to critical care practitioners because intensive insulin therapy (Leuven Protocol), or some diluted or half-hearted version of it, has become a de facto standard of care in ICUs across the nation and indeed worldwide; and because it is an incredibly well-designed and well-conducted study. My own interest derives also from my [prescient] letter to the editor of the NEJM after the second Van den Berghe study (http://content.nejm.org/cgi/content/extract/354/19/2069 ), from the criticisms of this therapy I levied on this blog after another follow-up study recently showed negative results (http://medicalevidence.blogspot.com/2008/01/jumping-gun-with-intensive-insulin.html ), and from a recent paper railing against the "normalization heuristic" (http://www.medical-hypotheses.com/article/S0306-9877(09)00033-4/abstract ). The results of this study also add to the growing evidence that intensive control of hyperglycemia in other settings may not be beneficial (see the ACCORD and ADVANCE studies).

The current study was designed to largely mirror the enrollment criteria and outcome definitions of the previous studies, had excellent follow-up, had well described and simple statistical analyses with ample power, and is well reported. Key differences between it and the original Van den Berghe study were the lack of high-calorie parenteral glucose infusions, and its multicenter design. This latter characteristic may be pivotal in understanding why the initially promising Leuven Protocol results have not panned out on subsequent study.

The results of this study can be summarized simply: this therapy appears to be of NO benefit and probably actually kills patients, in addition to markedly increasing the rate of very severe hypoglycemia (a 6.3% absolute increase, P<0.001). In contrast to Van den Berghe's second study in medical patients, there were no favorable trends toward reductions in ICU length of stay, time on the ventilator, or organ failures. In short, this therapy appears to be a complete flop.

So why the difference? Why did this therapy, which in 2001 appeared to have such promise that it enjoyed rapid and widespread [and premature] adoption, fail to withstand the basic test of science, namely, repeatability? I think that medical history will judge two factors to be responsible. Firstly, the massive dextrose infusions in the first study markedly jeopardized the external validity of the first (positive) Van den Berghe study - it's not that intensive insulin saves you from your illness, it saves you from the harmful caloric infusions used in the surgical patients in the first study.

Secondly, and this is related to the first, single center studies also compromise external validity. In a single center, local practice patterns may be uniform and idiosyncratic, so that the benefit of any therapy tested in such a center may also be idiosyncratic. Moreover, and I dare say, investigators at a single center may have more decisional latitude and control or influence over enrollment, ascertainment of outcomes, and clinical care of enrolled patients. The so-called "trial effect," whereby patients enrolled in a trial receive superior care and have superior outcomes, may be more likely in single center studies. Such effects are of increased concern in trials where total blinding/masking of treatment assignment is not possible. (Recall that in the Van den Berghe study, an endocrinologist was consulted for insulin adjustments; in the current trial, a computerized algorithm controlled the adjustments.) Moreover still, for single center studies, investigators and the institution itself may have more "riding on" the outcome of the study, and collective equipoise may not exist. As an "analogy of extremes," just for illustrative purposes: if you wanted to design a trial where you could subversively influence outcomes in a way that would not be apparent from the outside, would you design a single center study (at your own institution, where your cronies are) or a large multicenter, multinational study? Which design would allow you to have more influence?

I LOVE the authors' concluding statement that "a clinical trial targeting a perceived risk factor is a test of a complex strategy that may have profound effects beyond its effect on the risk factor." This resonates beautifully with our conceptualization of the "normalization heuristic" and harkens to Ben Franklin's sage old saw that "He is the best physician who knows the worthlessness of the most medicines." I think that we now have more than ample data to assure us that intensive insulin therapy (i.e., targeting a blood sugar of 80-108) is a worthless medicine, and should be largely if not wholly abandoned.

Addendum 4/7/09: Also note the scrutiny of the only other "positive" study (with mortality as the primary endpoint) in critical care in the last decade: Rivers et al; see: http://online.wsj.com/article/SB121867179036438865.html .

Tuesday, February 9, 2010

Post hoc non ergo propter hoc extended: A is associated with B therefore A causes B and removal of A removes B

From Annane et al, JAMA, January 27th, 2010 (see: http://jama.ama-assn.org/cgi/content/abstract/303/4/341 ):

"...patients whose septic shock is treated with hydrocortisone commonly have blood glucose levels higher than 180. These levels have clearly been associated with marked increase in the risk of dying...Thus, we hypothesized that normalization of blood glucose levels with intensive insulin treatment may improve the outcome of adults with septic shock who are treated with hydrocortisone."

The normalization heuristic is at work again.

Endocrine interventions as adjunctive treatments in critical care medicine have a sordid history. Here are some landmarks. Rewind 25 years, and as Angus has recently described (http://jama.ama-assn.org/cgi/content/extract/301/22/2388 ) we had the heroic administration of high dose corticosteroids (e.g., gram doses of methylprednisolone) for septic shock, a therapy that was later abandoned. In the 1990s, we had two concurrent trials of human growth hormone in critical illness, showing the largest statistically significant harm from a therapy in critical illness that I am aware of (an increase in mortality of ~20%; see http://content.nejm.org/cgi/content/abstract/341/11/785 ). Early in the new millennium, based on two studies that should by now be largely discredited by their successors, we had intensive insulin therapy for patients with hyperglycemia and low dose corticosteroid therapy for septic shock. It is fitting, then, and at least a little ironic, that this new decade should start with publication of a study combining these latter two therapies of dubious benefit: the aptly named COIITSS study.

I know this sounds overly pessimistic, but some of these therapies, these two in particular, just need to die, but are being kept alive by the hope, optimism, and proselytizing of those few individuals whose careers were made on them or continue to depend upon them. And I lament the fact that, as a result of the promotional efforts of these wayward souls, we have been distracted from the actual data. Allow me to summarize these briefly:

1.) The original Annane study of steroids in septic shock (see: http://jama.ama-assn.org/cgi/content/abstract/288/7/862 ) utilized an adjusted analysis of a subgroup of patients not identifiable at the outset of the trial (responders versus non-responders). The entire ITT (intention to treat) population had an ADJUSTED P-value of 0.09. I calculated an unadjusted P-value of 0.29 for the overall cohort. Since you cannot know at the outset who is a responder and who is not, the group of interest for a practitioner is the ITT population, and there was NO EFFECT in this population. Somehow, the enthusiasm for this therapy was so great that we lost sight of what I assume were the reasons the NEJM rejected this article - an adjusted analysis of a subgroup. Seriously! How did we lose sight of this? We were blinded by hope and excitement, and by the simplicity of the hypothesis - if it's low, make it high, and everything will be better. Then Sprung and the CORTICUS folks came along (see: http://content.nejm.org/cgi/content/abstract/358/2/111 ), and, as far as I'm concerned, blew the whole thing out of the water.

2.) I remember my attending raving about the Van den Berghe article (see: http://content.nejm.org/cgi/content/abstract/345/19/1359 ) as a first year fellow at Johns Hopkins in late 2001. He said "this is either the greatest therapy ever to come to critical care medicine, or these data are faked!" That got me interested. And I still distinctly remember circling something in the methods section, which was in small print in those days, on the left hand column of the left page back almost 9 years ago - that little detail about the dextrose infusions. This therapy appeared to work in post-cardiac surgery patients on dextrose infusions at a single center. I was always skeptical about it, and then the follow-up study came out, and lo and behold, NO EFFECT! But that study is still touted by the author as a positive one! Because again, like in Annane, if you remove those pesky patients who didn't stay in the MICU for 3 days (again, like Annane, not identifiable at the outset), you have a SUBGROUP analysis in which IIT (intensive insulin therapy - NOT intention to treat, ITT is inimical to IIT) works. Then you had NICE-SUGAR (see: http://content.nejm.org/cgi/content/abstract/360/13/1283 ) AND Brunkhorst et al (see: http://content.nejm.org/cgi/content/abstract/358/2/125 ) showing that IIT doesn't work. How much more data do we need? Why are we still doing this?
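For readers who want to check this sort of arithmetic themselves, here is a minimal sketch of how an unadjusted P-value for overall mortality (as in point 1 above) can be recomputed from a published 2x2 table, using a pooled two-proportion z-test. The counts below are hypothetical placeholders for illustration only, not the Annane trial's actual figures.

```python
from math import sqrt, erfc

def two_proportion_p(deaths_a, n_a, deaths_b, n_b):
    """Unadjusted two-sided P-value for a difference in two proportions
    (pooled z-test, normal approximation to the 2x2 table)."""
    p_a, p_b = deaths_a / n_a, deaths_b / n_b
    pooled = (deaths_a + deaths_b) / (n_a + n_b)          # pooled event rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return erfc(abs(z) / sqrt(2))                         # two-sided tail area

# Hypothetical counts (treatment deaths/n, control deaths/n) -- illustration only.
p = two_proportion_p(82, 151, 91, 149)
```

With the true published counts plugged in, a value anywhere near 0.29 makes the "no effect in the ITT population" point self-evident.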

Because old habits die hard and so do true believers. Thus it was perhaps inevitable that we would have COIITSS combine these two therapies into a single trial. Note that this trial does nothing to address whether hydrocortisone for septic shock is efficacious (it probably is NOT), but rather assumes that it is. I note also that it was started in 2006, just shortly before the second Van den Berghe study was published and well after the data from that study were known. Annane et al make no comment about whether those data impacted the conduct of their study, or whether participants were informed that a repeat of the trial upon which the COIITSS design was predicated had failed.

Annane did not use blinding for fludrocortisone in the current study, but this is minor. It is difficult to blind IIT, but usually what you do when you can't blind things adequately is you protocolize care. That was not obviously done in this trial; instead we are reassured that "everybody should have been following the Surviving Sepsis Campaign guidelines". (I'm paraphrasing.)

As astutely pointed out by Van den Berghe in the accompanying editorial, this trial was underpowered. It was just plain silly to assume (or play dumb) that a 3% ARR, which is a ~25% RRR (since the baseline was under 10%), would translate into a 12.5% ARR with a baseline mortality of near 50%. Indeed, I don't know why we even talk about RRRs anymore; they're a ruse to inflate small numbers and rouse our emotions. (Her other comments, about "separation" - which would be facilitated by having a very intensive treatment and a very lax control - are reminiscent of what folks were saying about ARMA low/high Vt - namely, that the trial was ungeneralizable because the "control" of 12 cc/kg was unrealistic. Then you get into the Eichacker and Natanson arguments about U-shaped curves [to which there may be some truth] and how too much is bad, not enough is bad, but somewhere in the middle is the "sweet spot". And this is key. Would that I could know the sweet spot for blood sugar - and coax patients to remain there.)
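The arithmetic behind that objection is simple enough to sketch. The inputs below are round illustrative numbers matching the text's approximations, not the trials' exact figures; the point is that carrying a *relative* effect over to a much sicker population inflates the expected *absolute* effect.

```python
# Illustrative numbers only: a 3% ARR against a baseline mortality of ~12%
# corresponds to a relative risk reduction of ~25%.
baseline_low = 0.12    # approximate control-group mortality in the earlier trial
arr_observed = 0.03    # absolute risk reduction observed there

rrr = arr_observed / baseline_low       # relative risk reduction = 0.25

# Assuming (as the power calculation implicitly did) that the relative
# effect carries over unchanged to a population with ~50% mortality:
baseline_high = 0.50
arr_projected = rrr * baseline_high     # 0.125, i.e. a 12.5% ARR
```

The assumption of a constant RRR across baselines is exactly the leap that makes the projected 12.5% ARR, and hence the power calculation, implausible.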

Because retrospective power calculations are uncouth, I elected to calculate the 95% confidence interval (CI) for delta (the difference between the two groups) in this trial. The point estimate for delta is -2.96% (negative delta means the therapy was WORSE than control!) with a 95% confidence interval of -11.6% to +5.65%. It is most likely between 11% worse and 5% better, and any good betting man would wager that it's worse than control! But in either case, this confidence interval is uncomfortably wide and contains values for harm and benefit which should be meaningful to us, so in essence the data do not help us decide what to do with this therapy.
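A minimal sketch of that calculation - a Wald (normal approximation) confidence interval for a difference in proportions - is below. The counts are hypothetical placeholders chosen only to show the method, not the trial's actual figures.

```python
from math import sqrt

def delta_ci(deaths_t, n_t, deaths_c, n_c, z=1.96):
    """Point estimate and Wald 95% CI for delta = control risk - treatment risk.
    Positive delta favors the treatment; negative delta means it did worse."""
    p_t, p_c = deaths_t / n_t, deaths_c / n_c
    delta = p_c - p_t
    # Unpooled standard error, appropriate for a confidence interval
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return delta, delta - z * se, delta + z * se

# Hypothetical counts (treatment deaths/n, control deaths/n) -- illustration only.
delta, lo, hi = delta_ci(117, 255, 109, 254)
```

An interval that straddles zero this widely is precisely why the trial data "do not help us decide": both clinically important harm and clinically important benefit remain on the table.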

(And look at table 2, the main results, look they are still shamelessly reporting adjusted P-values! Isn't that why we randomize? So we don't have to adjust?)

To bring this saga full circle, I note that, as we saw in NICE-SUGAR, Brunkhorst, and Van den Berghe, severe hypoglycemia (<40!) was far more common in the IIT group. And severe hypoglycemia is associated with death (in most studies, but curiously not in this one). So, consistent with the hypothesis which was the impetus for this study (A is associated with B, thus A causes B and removal of A removes B), one conclusion from all these data is that hypoglycemia causes death, and should be avoided through avoidance of IIT.

Wednesday, May 2, 2018

Hollow Hegemony: The Opportunity Costs of Overemphasizing Sepsis


Protocols are to make complex tasks simple, not simple tasks complex. - Scott K Aberegg

Yet here we find ourselves some 16 years after the inauguration of the Surviving Sepsis Campaign, and their influence continues to metastasize, even after the message has been hollowed out like a piece of fallen, old-growth timber.

The Surviving Sepsis Campaign was the brainchild of Eli Lilly, which, in the year after the ill-fated FDA approval of drotrecogin-alfa, worried that the drug would not sell well if clinicians did not have an increased awareness of sepsis. That aside, in those days there were legitimate questions surrounding the adoption and implementation of several new therapies such as EGDT, corticosteroids for septic shock, Xigris for those with APACHE scores over 25, intensive insulin therapy, etc.

Those questions are mostly answered. Sepsis is now, quite simply, a complex of systemic manifestations of infection, almost all of which will resolve with treatment of the infection and general supportive care. The concept of sepsis could vanish entirely, and nothing about the clinical care of the patient would change: an infection would be diagnosed, the cause/source identified and treated, and hemodynamics and laboratory dyscrasias supported meanwhile. There is nothing else to do (because lactic acidosis does not exist).

But because of the hegemony of the sepsis juggernaut (the spawn of the almighty dollar), we are now threatened with a mandate to treat patients carrying the sepsis label (oftentimes assigned by a hospital coder after the fact) with antibiotics and a fluid bolus within one hour of triage in the ED. Based on what evidence?

Weak recommendations, "Best Practice Statements," and some strong recommendations based on low- and moderate-quality evidence.  So if we whittle it down to just the moderate-quality evidence, what do we have?  Give antibiotics for infections, and give vasopressors if the MAP is less than 65.  But now we have to hurry up and do the whole kit and caboodle, boilerplate style, within 60 minutes?

Sepsis need not be treated any differently than a gastrointestinal hemorrhage, or for that matter, any other disease.  You make the diagnosis, determine and control the cause (source), give appropriate treatments, and support the physiology in the meantime, all while prioritizing the sickest patients.  But that counts for all diseases, not just sepsis, and there is only so much time in an hour.  When every little old lady with fever and a UTI suddenly rises atop the priorities of the physician, this creates an opportunity cost/loss for the poor bastard bleeding next door who doesn't have 2 large-bore IVs or a type and cross yet because grandma is being flogged with 2 liters of fluid, and in a hurry.  If only somebody had poured mega-bucks into increased recognition and swift treatment of GI bleeds....


Petition to retire the surviving sepsis campaign guidelines:

(Sign the Petition Here.)

Friends,

Concern regarding the Surviving Sepsis Campaign (SSC) guidelines dates back to their inception.  Guideline development was sponsored by Eli Lilly and Edwards Life Sciences as part of a commercial marketing campaign (1).  Throughout its history, the SSC has a track record of conflicts of interest, making strong recommendations based on weak evidence, and being poorly responsive to new evidence (2-6).

The original backbone of the guidelines was a single-center trial by Rivers et al defining a protocol for early goal-directed therapy (7).  Even after key elements of the Rivers protocol were disproven, the SSC continued to recommend them.  For example, the SSC continued to recommend the use of central venous pressure and mixed venous oxygen saturation after the emergence of evidence that they were nonbeneficial (including the ProCESS and ARISE trials).  These interventions eventually fell out of favor, despite the slow response of the SSC, which delayed knowledge translation. 

SSC has been sponsored by Eli Lilly, manufacturer of Activated Protein C.  The guidelines continued recommending Activated Protein C until it was pulled from international markets in 2011.  For example, the 2008 Guidelines recommended this, despite ongoing controversy and the emergence of neutral trials at that time (8,9).  Notably, 11 of 24 guideline authors had financial conflicts of interest with Eli Lilly (10).

The Infectious Disease Society of America (IDSA) refused to endorse the SSC because of a suboptimal rating system and industry sponsorship (1).  The IDSA has enormous experience in treating infection and creating guidelines.  Septic patients deserve a set of guidelines that meet the IDSA standards.


Guidelines should summarize evidence and provide recommendations to clinicians.  Unfortunately, the SSC doesn’t seem to trust clinicians to exercise judgement.  The guidelines infantilize clinicians by prescribing a rigid set of bundles which mandate specific interventions within fixed time frames (example above)(10).  These recommendations are mostly arbitrary and unsupported by evidence (11,12).  Nonetheless, they have been adopted by the Centers for Medicare & Medicaid Services as a core measure (SEP-1).  This pressures physicians to administer treatments despite their best medical judgment (e.g. fluid bolus for a patient with clinically obvious volume overload).

We have attempted to discuss these issues with the SSC in a variety of forums, ranging from personal communications to formal publications (13-15).  We have tried to illuminate deficiencies in the SSC bundles and the consequent SEP-1 core measures.  Our arguments have fallen on deaf ears. 

We have waited patiently for years in hopes that the guidelines would improve, but they have not.  The 2018 SSC update is actually worse than prior guidelines, requiring the initiation of antibiotics and 30 cc/kg fluid bolus within merely sixty minutes of emergency department triage (16).  These recommendations are arbitrary and dangerous.  They will likely cause hasty management decisions, inappropriate fluid administration, and indiscriminate use of broad-spectrum antibiotics.  We have been down this path before with other guidelines that required antibiotics for pneumonia within four hours, a recommendation that harmed patients and was eventually withdrawn (17).

It is increasingly clear that the SSC guidelines are an impediment to providing the best possible care to our septic patients.  The rigid framework mandated by SSC doesn’t help experienced clinicians provide tailored therapy to their patients.  Furthermore, the hegemony of these guidelines prevents other societies from developing better guidelines.

We are therefore petitioning for the retirement of the SSC guidelines.  In its place, we would call for the development of separate sepsis guidelines by the United States, Europe, ANZICS, and likely other locales as well.  There has been a monopoly on sepsis guidelines for too long, leading to stagnation and dogmatism.  We would hope that these new guidelines are written by collaborations of the appropriate professional societies, based on the highest evidentiary standards.  The existence of several competing sepsis guidelines could promote a diversity of opinions, regional adaptation, and flexible thinking about different approaches to sepsis. 

We are disseminating an international petition that will allow clinicians to express their displeasure and concern over these guidelines.  If you believe that our septic patients deserve more evidence-based guidelines, please stand with us.  

Sincerely,

Scott Aberegg MD MPH
Jennifer Beck-Esmay MD
Steven Carroll DO MEd
Joshua Farkas MD
Jon-Emile Kenny MD
Alex Koyfman MD
Michelle Lin MD
Brit Long MD
Manu Malbrain MD PhD
Paul Marik MD
Ken Milne MD
Justin Morgenstern MD
Segun Olusanya MD
Salim Rezaie MD
Philippe Rola MD
Manpreet Singh MD
Rory Speigel MD
Reuben Strayer MD
Anand Swaminathan MD
Adam Thomas MD
Lauren Westafer DO MPH
Scott Weingart MD

References
  1. Eichacker PQ, Natanson C, Danner RL.  Surviving Sepsis – Practice guidelines, marketing campaigns, and Eli Lilly.  New England Journal of Medicine 2006; 355: 1640-1642.
  2. Pepper DJ, Jaswal D, Sun J, Welsch J, Natanson C, Eichacker PQ.  Evidence underpinning the Centers for Medicare & Medicaid Services’ Severe Sepsis and Septic Shock Management Bundle (SEP-1): A systematic review.  Annals of Internal Medicine 2018; 168:  558-568. 
  3. Finfer S.  The Surviving Sepsis Campaign:  Robust evaluation and high-quality primary research is still needed.  Intensive Care Medicine  2010; 36:  187-189.
  4. Salluh JIF, Bozza PT, Bozza FA.  Surviving sepsis campaign:  A critical reappraisal.  Shock 2008; 30: 70-72. 
  5. Eichacker PQ, Natanson C, Danner RL.  Separating practice guidelines from pharmaceutical marketing.  Critical Care Medicine 2007; 35:  2877-2878. 
  6. Hicks P, Cooper DJ, Webb S, Myburgh J, Seppelt I, Peake S, Joyce C, Stephens D, Turner A, French C, Hart G, Jenkins I, Burrell A.  The Surviving Sepsis Campaign:  International guidelines for management of severe sepsis and septic shock: 2008.  An assessment by the Australian and New Zealand Intensive Care Society.  Anaesthesia and Intensive Care 2008; 36: 149-151.
  7. Rivers E et al.  Early goal-directed therapy in the treatment of severe sepsis and septic shock.  New England Journal of Medicine 2001; 345: 1368-1377.
  8. Wenzel RP, Edmond MB.  Septic shock – Evaluating another failed treatment.  New England Journal of Medicine 2012; 366:  2122-2124.  
  9. Savel RH, Munro CL.  Evidence-based backlash:  The tale of drotrecogin alfa.  American Journal of Critical Care  2012; 21: 81-83. 
  10. Dellinger RP, Levy MM, Carlet JM et al.  Surviving sepsis campaign:  International guidelines for management of severe sepsis and septic shock:  2008.  Intensive Care Medicine 2008; 34:  17-60. 
  11. Allison MG, Schenkel SM.  SEP-1:  A sepsis measure in need of resuscitation?  Annals of Emergency Medicine 2018; 71: 18-20.
  12. Barochia AV, Xizhong C, Eichacker PQ.  The Surviving Sepsis Campaign’s revised sepsis bundles.  Current Infectious Disease Reports 2013; 15:  385-393. 
  13. Marik PE, Malbrain MLNG.  The SEP-1 quality mandate may be harmful: How to drown a patient with 30 ml per kg fluid!  Anesthesiology and Intensive Therapy 2017; 49(5) 323-328.
  14. Faust JS, Weingart SD.  The past, present, and future of the centers for Medicare and Medicaid Services quality measure SEP-1:  The early management bundle for severe sepsis/septic shock.  Emergency Medicine Clinics of North America 2017; 35:  219-231.
  15. Marik PE.  Surviving sepsis:  going beyond the guidelines.  Annals of Intensive Care 2011; 1: 17.
  16. Levy MM, Evans LE, Rhodes A.  The surviving sepsis campaign bundle:  2018 update.  Intensive Care Medicine.  Electronic publication ahead of print, PMID 29675566.
  17. Kanwar M, Brar N, Khatib R, Fakih MG.  Misdiagnosis of community-acquired pneumonia and inappropriate utilization of antibiotics: side effects of the 4-h antibiotic administration rule.  Chest 2007; 131: 1865-1869.

Friday, May 31, 2013

Over Easy? Trials of Prone Positioning in ARDS

Published May 20 in the  NEJM to coincide with the ATS meeting is the (latest) Guerin et al study of Prone Positioning in ARDS.  The editorialist was impressed.  He thinks that we should start proning patients similar to those in the study.  Indeed, the study results are impressive:  a 16.8% absolute reduction in mortality between the study groups with a corresponding P-value of less than 0.001.  But before we switch our tastes from sunny side up to over easy (or in some cases, over hard - referred to as the "turn of death" in ICU vernacular) we should consider some general principles as well as about a decade of other studies of prone positioning in ARDS.

First, a general principle:  regression to the mean.  Few, if any, therapies in critical care (or in medicine in general) confer a mortality benefit this large.  I refer the reader (again) to our study of delta inflation, which tabulated over 30 critical care trials in the top 5 medical journals over 10 years and showed that few critical care trials show mortality deltas (absolute mortality differences) greater than 10%.   Almost all those that do are later refuted.  Indeed, it was our conclusion that searching for deltas greater than or equal to 10% is akin to a fool's errand, so unlikely is the probability of finding such a difference.  Jimmy T. Sylvester, my attending at JHH in late 2001, had already recognized this.  When the now infamous sentinel trial of intensive insulin therapy (IIT) was published, we discussed it at our ICU pre-rounds lecture and he said something like "Either these data are faked, or this is revolutionary."  We now know that there was no revolution (although many ICUs continue to practice as if there had been one).  He could have just as easily said that this is an anomaly that will regress to the mean, that there is inherent bias in this study, or that "trials stopped early for benefit...."

Thursday, March 20, 2014

Sepsis Bungles: The Lessons of Early Goal Directed Therapy

On March 18th, the NEJM published early online three original trials of therapies for the critically ill that will serve as fodder for several posts.  Here, I focus on the ProCESS trial of protocol-guided therapy for early septic shock.  This trial is in essence a multicenter version of the landmark 2001 trial of Early Goal Directed Therapy (EGDT) for severe sepsis by Rivers et al.  That trial showed a stunning 16% absolute reduction in mortality in sepsis, attributed to the use of a protocol based on physiological goals for hemodynamic management.  That absolute reduction in mortality is perhaps the largest for any therapy in critical care medicine.  If such a reduction were confirmed, it would make EGDT the single most important therapy in the field.  If such a reduction cannot be confirmed, there are several reasons why the Rivers results may have been misleading:

There were other concerns about the Rivers study and how it was later incorporated into practice, but I won't belabor them here.  The ProCESS trial randomized about 1350 patients among three groups: one simulating the original Rivers protocol, one following a modified Rivers protocol, and one representing "standard care," that is, care directed by the treating physician without a protocol.  The study had 80% power to demonstrate a mortality reduction of 6-7%.  Before you read further, please wager: will the trial show any statistically significant differences in outcome that favor EGDT or protocolized care?
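As a back-of-the-envelope check on what 80% power for a 6-7% mortality delta implies, here is the standard normal-approximation sample size calculation for comparing two proportions (a sketch only; the 30% control-arm mortality is my illustrative assumption, not a figure taken from the trial):

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_arm(p_control, delta, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for a two-proportion comparison."""
    p_tx = p_control - delta            # mortality in the treatment arm
    p_bar = (p_control + p_tx) / 2      # pooled proportion under the null
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided alpha
    z_b = NormalDist().inv_cdf(power)
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p_control * (1 - p_control) + p_tx * (1 - p_tx))) ** 2
    return ceil(num / delta ** 2)

# Assumed 30% control-arm mortality and a 7% absolute reduction:
print(n_per_arm(0.30, 0.07))  # roughly 620-630 patients per arm
```

Numbers of this magnitude per arm are why a three-arm trial of ~1350 patients is powered only for deltas in the 6-7% range - deltas that, per the delta-inflation argument above, therapies rarely deliver.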

Wednesday, February 10, 2016

A Focus on Fees: Why I Practice Evidence Based Medicine Like I Invest for Retirement

"He is the best physician who knows the worthlessness of the most medicines."  - Ben Franklin

This blog has been highly critical of evidence, taking every opportunity to strike at any vulnerability of a trial or research program.  That is because this is serious business.  Lives and limbs hang in the balance, pharmaceutical companies stand to gain billions from "successful" trials, investigators' careers and funding are on the line if chance findings don't pan out in subsequent investigations, sometimes well-meaning convictions blind investigators and others to the truth; in short, the landscape is fertile for bias, manipulation, and even fraud.  To top it off, many of the questions about how to practice or deal with a particular problem have scant or no evidence to bear upon them, and practitioners are left to guesswork, convention, or pathophysiological reasoning - and I'm not sure which among these is most threatening.  So I am often asked, how do you deal with the uncertainty that arises from fallible evidence or paucity of evidence when you practice?

I have ruminated about this question and how to summarize the logic of my minimalist practice style for some time but yesterday the answer dawned on me:  I practice medicine like I invest in stocks, with a strategy that comports with the data, and with precepts of rational decision making.

Investors make numerous well-described and wealth-destroying mistakes when they invest in stocks.  Experts such as John Bogle, Burton Malkiel, David Swensen and others have written influential books on the topic, utilizing data from studies in economics (financial and behavioral).  Key among the mistakes that investors make are trying to select high performers (such as mutual funds or hedge fund managers), chasing performance, and timing the market.  The data suggest that professional stock pickers fare little better than chance over the long run, that you cannot discern who will beat the average over the long run, and that the excess fees you are charged by high performers will negate any benefit they might otherwise have conferred to you.  The experts generally recommend that you stick with strategies that are proven beyond a reasonable doubt: a heavy concentration in stocks with their long track record of superior returns, diversification, and strict minimization of fees.  Fees are the only thing you can guarantee about your portfolio's returns.
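The arithmetic of fee drag is worth seeing once. A sketch with hypothetical numbers ($10,000 invested for 30 years at a 7% gross annual return, modeling the fee simply as a deduction from the annual return):

```python
def future_value(principal, annual_return, fee, years):
    """Compound growth with an annual fee subtracted from the return."""
    return principal * (1 + annual_return - fee) ** years

no_fee = future_value(10_000, 0.07, 0.00, 30)
with_fee = future_value(10_000, 0.07, 0.01, 30)
print(round(no_fee), round(with_fee))  # about 76,100 vs about 57,400
```

A "mere" 1% annual fee consumes roughly a quarter of the final balance over 30 years - the guaranteed loss the experts are warning about.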

Wednesday, April 8, 2009

The PSA Screening Quagmire - If Ignorance is Bliss then 'Tis Folly to be Wise?

The March 26th NEJM was a veritable treasure trove of interesting evidence, so I can't stop after praising NICE-SUGAR and railing on intensive insulin therapy. If 6000 patients (40,000 screened) seemed like a commendable and daunting study to conduct, consider that the PLCO Project Team randomized over 76,000 US men to screening versus control (http://content.nejm.org/cgi/reprint/360/13/1310.pdf) and the ERSPC Investigators randomized over 162,000 European men in a "real-time meta-analysis" of sorts (wherein multiple simultaneous studies were conducted with similar but different enrollment requirements and combined; see: http://content.nejm.org/cgi/reprint/360/13/1320.pdf.)   This is, as the editorialist points out, a "Herculean effort," and that is fitting and poignant - because ongoing PSA screening efforts in current clinical practice represent a Herculean effort to reduce the morbidity and mortality of this disease, and this reinforces the importance of the research question - are we wasting our time? Are we doing more harm than good?

The lay press was quick to start trumpeting the downfall of PSA screening with headlines such as "Prostate Test Found to Save Few Lives". But for all their might, both of these studies give me, a longtime critic of cancer screening efforts, a good bit of pause. (Pulmonologists may be prone to "sour grapes" as a result of the failures of screening for lung cancer.)

Before I summarize briefly the studies and point out some interesting aspects of each, allow me to indulge in a few asides. First, I direct you to this interesting article in Medical Decision Making, "Cure Me Even if it Kills Me". This wonderful study in judgment and decision making shows how difficult it is for patients to live with the knowledge that there is a cancer, however small, growing in them. They want it out. And they want it out even if they are demonstrably worse off with it cut out or x-rayed out or whatever. It turns out that patients have a value for "getting rid of it" that probably arises from the emotional costs of living knowing there's a cancer in you. I highly recommend that anyone interested in cancer screening or treatment read this article.

This article invokes in me an unforgettable patient from my residency whom we screened in compliance with VA mandates at the time. Sure enough, this patient with heart disease had a mildly elevated PSA, and sure enough, he had a cancer on biopsy. And we discussed treatments in concert with our Urology colleagues. While he had many options, this patient agonized and brooded and could not live with the thought of a cancer in him. He proceeded with radical prostatectomy, the most drastic of his options. And I will never forget that look of crestfallen resignation every time I saw him after that surgery, because he thereafter came to clinic in diapers, having been rendered incontinent and impotent by that surgery. He was more full of self-flagellating regret than any other patient I have seen in my career. This poor man and his experience certainly jaded me at a young age and made me highly attuned to the pitfalls of PSA screening.

Against this backdrop where cancer is the most feared diagnosis in medicine, we feel an urge towards action to screen and prevent, even when there is a marginal net benefit of cancer screening, and even when other greater opportunities for improving health exist. I need not go into the literature about [ir]rational risk appraisal other than to say that our overly-exuberant fear of cancer (relative to other concerns) almost certainly leads to unrealistic hopes for screening and prevention. Hence the great interest in and attention to these two studies.

In summary, the PLCO study showed no reduction in prostate-cancer-related mortality from DRE (digital rectal examination) and PSA screening. Absence of evidence is not evidence of absence, however, and a few points about this study deserve to be made:

~Because of high (and increasing) screening rates in the control group, this was essentially a study of the "dose" of screening. The dose in the control group was ~45% and that in the screening group was ~85%. So the question that the study asked was not really "does screening work" but rather "does doubling the dose of screening work". Had there been a favorable trend in this study, I would have been tempted to double the effect size of the screening to infer the true effect, reasoning that if increasing screening from 40% to 80% reduces prostate cancer mortality by x%, then increasing screening from 0% to 80% would reduce it by 2x%. Alas, this was not the case with this study, which was underpowered.
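The back-calculation I describe can be made explicit. It rests on a strong assumption - a linear dose-response between screening rate and mortality reduction - and the numbers below are hypothetical, purely for illustration:

```python
def inferred_full_effect(observed_rrr, rate_screened_arm, rate_control_arm):
    """Scale an observed relative risk reduction by the difference in screening
    'dose' between arms, assuming a linear dose-response (a strong assumption)."""
    return observed_rrr / (rate_screened_arm - rate_control_arm)

# Hypothetical: an observed 8% relative reduction with 85% vs 45% screening
# implies roughly a 20% reduction for full (100%) vs no (0%) screening.
print(inferred_full_effect(0.08, 0.85, 0.45))  # ~0.20
```

With a null observed effect, of course, the adjustment is moot - doubling zero is still zero - which is exactly the interpretive dead end the PLCO trial leaves us in.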

~I am very wary of studies that have cause-specific mortality as an endpoint. There's just too much room for adjudication bias, as the editorialist points out. Moreover, if you reduce prostate cancer mortality but overall mortality is unchanged, what do I, as a potential patient, care? Great, you saved me from prostate cancer and I died at about the same time I would have, but from an MI or a CVA instead? We have to be careful about whether our goals are good ones - the goal should not be to "fight cancer" but rather to "improve overall health". The latter, I admit, is a much less enticing and invigorating banner. We like to feel like we're fighting. (Admittedly, overall mortality appears to not differ in this study, but I'm at a loss as to what's really being reported in Table 4.) The DSMB for the ERSPC trial argue here that cancer-specific mortality is most appropriate for screening trials because of dilution by other causes of mortality, and because screening for a specific cancer can only be expected to reduce mortality for that cancer. From an efficacy standpoint, I agree, but from an effectiveness standpoint, this position causes me to squint and tilt my head askance.

~It is so very interesting that this study was stopped not for futility, nor for harm, nor for efficacy, but because it was deemed necessary for the data to be released because of the [potential] impact on public health. And what has been the impact of those data? Utter confusion. That increasing screening from 40% to 80% does not improve prostate specific mortality does not say to me that we should reduce screening to 0%. In fact I don't know what to do, nor what to make of these data. Especially in the context of the next study.

In the ERSPC trial, investigators found a 20% reduction in prostate cancer deaths with screening with PSA alone in Europe. The same caveats regarding adjudication of this outcome notwithstanding, there are some very curious aspects of this trial that merit attention:

~This trial was, as I stated above, a "real-time meta-analysis" with many slightly different studies combined for analysis. I don't know what this does to internal or external validity because this is such an unfamiliar approach to me, but I'll be pondering it for a while I'm sure.

~I am concerned that I don't fully understand the way that interim analyses were performed in this trial, what the early stopping rules were, and whether a one-sided or two-sided alpha was used. Reference 6 states that it was one-sided but the index article says two-sided. Someone will have to help me out with the O'Brien-Fleming alpha spending function and let me know if 1% spending at each analysis is par for the course.
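For readers puzzling over the same question, the Lan-DeMets O'Brien-Fleming-type spending function can be computed directly. A sketch (two-sided overall alpha of 0.05, with looks at 25%, 50%, 75%, and 100% of the information - my illustrative schedule, not necessarily the trial's):

```python
from math import sqrt
from statistics import NormalDist

def obf_alpha_spent(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t under the
    Lan-DeMets O'Brien-Fleming-type spending function."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / sqrt(t)))

for t in (0.25, 0.50, 0.75, 1.00):
    print(f"t={t:.2f}: cumulative alpha spent = {obf_alpha_spent(t):.5f}")
# O'Brien-Fleming spends almost no alpha early (far less than a flat 1% per
# look, which is closer to a Pocock-type rule); the full 0.05 is reached at t=1.
```

So a flat 1% per interim analysis is decidedly not an O'Brien-Fleming schedule; it is far more permissive at early looks.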

~As noted by the editorialist, we are not told what the "contamination rate" of screening in the control group is. If it is high, we might use my method described above to infer the actual impact of screening.

~Look at the survival curves that diverge and then appear to converge again at a low hazard rate. Is it any wonder that there is no impact on overall mortality?


So where does this all leave us? We have a population of physicians and patients that yearn for effective screening and believe in it, so much so that it is hard to conduct an uncontaminated study of screening. We have a US study that was stopped prematurely in order to inform public health, but which is inadequate to inform it. We have a European study which shows a benefit near the a priori expected benefit, but which has a bizarre design and is missing important data that we would like to consider before accepting the results. We have no hint of a benefit on overall mortality. We have lukewarm conclusions from both groups, and want desperately to know what the associated morbidities in each group are. We are spending vast amounts of resources and incurring an enormous emotional toll on men who live in fear after a positive PSA test, many of whom pay dearly ("a pound of flesh") to exorcise that fear. And we have a public over-reaction to the results of these studies which merely increases our quandary.

If ignorance is bliss, then truly 'tis folly to be wise. Perhaps this saying applies equally to individual patients, and the investigation of PSA screening in these large-scale trials. For my own part, this is one aspect of my health that I shall leave to fate and destiny, while I focus on more directly remediable aspects of preventive health, ones where the prevention is pleasurable (running and enjoying a Mediterranean diet) rather than painful (prostatectomy).

Monday, January 17, 2011

Like Two Peas in a Pod: Cis-atracurium for ARDS and the Existence of Extra Sensory Perception (ESP)

Even the lay public and popular press (see: http://www.nytimes.com/2011/01/11/science/11esp.html?_r=1&scp=1&sq=ESP&st=cse) caught on to the subversive battle between frequentist and Bayesian statistics when it was announced (ahead of print) that a prominent psychologist was to publish a report purporting to establish the presence of Extra Sensory Perception (ESP) in the Journal of Personality and Social Psychology (I don't think it's even published yet, but here's the link to the journal: http://www.apa.org/pubs/journals/psp). So we're back to my Orange Juice (OJ) analogy - if I published the results of a study showing that the enteral administration of OJ reduced severe sepsis mortality by a [marginally] statistically significant 20%, would you believe it? As Carl Sagan was fond of saying, "extraordinary claims require extraordinary evidence" - which to me means, among other things, an unbelievably small P-value produced by a study with scant evidence of bias.

And I remain utterly incredulous that the administration of a paralytic agent for 48 hours in ARDS (see Papazian et al: http://www.nejm.org/doi/full/10.1056/NEJMoa1005372#t=abstrac) is capable of reducing mortality. Indeed, FEW THERAPIES IN CRITICAL CARE MEDICINE REDUCE MORTALITY (see Figure 1 in our article on Delta Inflation: http://www.ncbi.nlm.nih.gov/pubmed/20429873). So what was the P-value of the Cox regression (read: ADJUSTED) analysis in the Papazian article? It was 0.04. This is hardly the kind of P-value that Carl Sagan would have accepted as Extraordinary Evidence.

The correspondence regarding this article in the December 23rd NEJM (see: http://www.nejm.org/doi/full/10.1056/NEJMc1011677) got me to thinking again about this article. It emphasized the striking sedation practices used in this trial: patients were sedated to a Ramsay score of 6 (no response to glabellar tap) prior to randomization - the highest score on the Ramsay scale. Then they received Cis-at or placebo. Thus the Cis-at group could not, for 48 hours, "fight the vent," while the placebo group could, thereby inducing practitioners to administer more sedation. Could it be that Cis-at simply saves you from oversedation, much as intensive insulin therapy (IIT) a la 2001 Leuven protocol saved you from the deleterious effects of massive dextrose infusion after cardiac surgery?

To explore this possibility further, one needs to refer to Table 9 in the supplementary appendix of the Papazian article (see: http://www.nejm.org/doi/suppl/10.1056/NEJMoa1005372/suppl_file/nejmoa1005372_appendix.pdf ), which tabulates the total sedative doses used in the Cis-at and placebo groups DURING THE FIRST SEVEN (7) DAYS OF THE STUDY. Now, why 7 days was chosen, when the KM curves separate at 14 days (as my former colleagues O'Brien and Prescott pointed out here: http://f1000.com/5240957 ), and when the study reported data on other outcomes at 28 and 90 days, remains a mystery to me. I have e-mailed the corresponding author to see if he can/will provide data on sedative doses further out. I will post any updates as further data become available. Suffice it to say that I'm not going to be satisfied unless sedative doses further out are equivalent.

Scrutiny of Table 9 in the SA leads to some other interesting discoveries, such as the massive doses of ketamine used in this study - a practice that does not exist in the United States, as well as strong trends toward increased midazolam use in the placebo group. And if you believe Wes Ely's and others' data on benzodiazepine use, and its association with delirium and mortality, one of your eyebrows might involuntarily rise. Especially when you consider that the TOTAL sedative dose administered between groups is an elusive sum, because equivalent doses of all the various sedatives are unknown and the total sedative dose calculation is insoluble.

Thursday, May 24, 2012

Fever, external cooling, biological precedent, and the epistemology of medical evidence

It is a rare occasion that one article allows me to review so many aspects of the epistemology of medical evidence, but alas Schortgen et al afforded me that opportunity in the May 15th issue of AJRCCM.

The issues raised by this article are so numerous that I shall make subsections for each one. The authors of this RCT sought to determine the effect of external cooling of febrile septic patients on vasopressor requirements and mortality. Their conclusion was that "fever control using external cooling was safe and decreased vasopressor requirements and early mortality in septic shock." Let's explore the article and the issues it raises and see if this conclusion seems justified and how this study fits into current ICU practice.

PRIOR PROBABILITY, BIOLOGICAL PLAUSIBILITY, and BIOLOGICAL PRECEDENTS

These are related but distinct issues that are best considered both before a study is planned, and before its report is read. A clinical trial is in essence a diagnostic test of a hypothesis, and like a diagnostic test, its influence on what we already know depends not only on the characteristics of the test (sensitivity and specificity in a diagnostic test; alpha and power in the case of a clinical trial) but also on the strength of our prior beliefs. To quote Sagan [again], "extraordinary claims require extraordinary evidence." I like analogies of extremes: no trial result is sufficient to convince the skeptical observer that orange juice reduces mortality in sepsis by 30%; and no evidence, however cogently presented, is sufficient to convince him that the sun will not rise tomorrow. So when we read the title of this or any other study, we should pause to ask: What is my prior belief that external cooling will reduce mortality in septic shock? That it will reduce vasopressor requirements?
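The diagnostic-test analogy can be made quantitative. Treating a "positive" (statistically significant) trial like a positive test, the post-trial probability that the hypothesis is true follows from Bayes' rule (a sketch; the priors below are illustrative guesses, not data):

```python
def post_trial_probability(prior, power=0.80, alpha=0.05):
    """Probability the hypothesis is true given a significant result, treating
    the trial as a diagnostic test: power plays the role of sensitivity,
    alpha the false positive rate."""
    true_pos = prior * power          #真 P(significant | hypothesis true) * P(true)
    false_pos = (1 - prior) * alpha   # P(significant | hypothesis false) * P(false)
    return true_pos / (true_pos + false_pos)

# A plausible hypothesis (prior 0.5) vs an extraordinary one (prior 0.01):
print(round(post_trial_probability(0.50), 2))  # about 0.94
print(round(post_trial_probability(0.01), 2))  # about 0.14
```

The same "positive" trial leaves an extraordinary claim (external cooling reduces mortality; orange juice reduces mortality) far short of proven, which is the whole point of asking about priors before reading the results.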

Thursday, September 27, 2012

True Believers: Faith and Reason in the Adoption of Evidence

In last week's NEJM, in an editorial response to an article demonstrating that physicians, in essence, probability-adjust (a la Expected Utility Theory) the likelihood that data are true based on the funding source of a study, Editor-in-Chief Jeffrey M. Drazen implored the journal's readership to "believe the data." Unfortunately, he did not answer the obvious question: "which data?" A perusal of the very issue in which his editorial appears, as well as this week's journal, considered in the context of more than a decade of related research, demonstrates just how ironic and ludicrous his invocation is.

This November marks the eleventh year since the publication, with great fanfare, of Van den Berghe's trial of intensive insulin therapy (IIT) in the NEJM.  That article was followed by what I have called a "premature rush to adopt the therapy" (I should have called it a stampede), creation of research agendas in multiple countries and institutions devoted to its study, amassing of reams of robust data failing to confirm the original results, and a reluctance to abandon the therapy that is rivaled in its tenacity only by the enthusiasm that drove its adoption.  In light of all the data from the last decade, I am convinced of only one thing - that it remains an open question whether control of hyperglycemia within ANY range is of benefit to patients.
Suffice it to say that the Van den Berghe data have not suffered from lack of believers - the Brunkhorst, NICE-SUGAR, and Glucontrol data have - and it would seem that in many cases what we have is not a lack of faith so much as a lack of reason when it comes to data.  The publication of an analysis of hypoglycemia using the NICE-SUGAR database in the September 20th NEJM, and a trial in this week's NEJM involving pediatric cardiac surgery patients by Agus et al, give researchers and clinicians yet another opportunity to apply reason and reconsider their belief in IIT and, for that matter, the treatment of hyperglycemia in general.

Friday, May 1, 2015

Is There a Baby in That Bathwater? Status Quo Bias in Evidence Appraisal in Critical Care

"But we are not here concerned with hopes and fears, only the truth so far as our reason allows us to discover it."  -  Charles Darwin, The Descent of Man

Status quo bias is a cognitive decision making bias that leads to decision makers' preference for the choice represented by the current status quo, even when the status quo is arbitrary or irrelevant.  Decision makers tend to perceive a change from the status quo as a loss and therefore their decisions are biased toward the status quo.  This can lead to preference reversals when the status quo reference frame is changed.  The status quo can be debiased using a reversal test, i.e., manipulating the status quo either experimentally or via thought experiment to consider a change in the opposite direction.  If reluctance to change from the status quo exists in both directions, status quo bias is likely to exist.

My collaborators Peter Terry, Hal Arkes and I reported in a study published in 2006 that physicians were far more likely to abandon a therapy that was status quo or standard therapy based on new evidence of harm than they were to adopt an identical therapy based on the same evidence of benefit from a fictitious RCT (randomized controlled trial) presented in the vignette.  These results suggested that there was an asymmetric status quo bias - physicians showed a strong preference for the status quo in the adoption of new therapies, but a strong preference for abandoning the status quo when a standard of care was shown to be harmful.  Two characteristics of the vignettes used in this intersubject study deserve attention.  First, the vignettes described a standard or status quo therapy that had no support from RCTs prior to the fictitious one described in the vignette.  Second, this study was driven in part by what I perceived at the time was a curious lack of adoption of drotrecogin-alfa (Xigris), with its then purported mortality benefit and associated bleeding risk.  Thus, our vignettes had very significant trade-offs in terms of side effects in both the adopt and abandon reference frames.  Our results seemed to explain s/low uptake of Xigris, and were also consistent with the relatively rapid abandonment of hormone replacement therapy (HRT) after publication of the WHI, the first RCT of HRT.

Wednesday, August 5, 2009

Defining sample size for an a priori unidentifiable population: Tricks of the Tricksters

During a recent review of critical care literature for a paper on trial design, a few trials (and groups) were noted to have pulled a fast one and apparently slipped it by the witting or unwitting reviewers and editors. This has arisen in the case of two therapies which have in common a targeted population in which efficacy is expected but which cannot be identified at the outset. What's more, both of the therapies are thought to require early initiation for maximal efficacy, at a time when the specific target population cannot be identified. These two therapies are intensive insulin therapy (IIT) and corticosteroids for septic shock (CSS). In the case of IIT, the authors (Greet Van den Berghe et al) believe that IIT will be most effective in a population that remains in the ICU for at least some specified time, say 3 or 5 days. That is, "the therapy needs time to work." The problem is that there is no way to tell how long a person will remain in the ICU in advance. The same problem crops up for CSS because the authors (Annane et al) wish to target non-responders to ACTH, but they cannot identify them at the outset; and they also believe that "early" administration is essential for efficacy. The solution used by both of these groups for this problem raises some interesting and troubling questions about the design of these trials and other trials like them in the future.

An "intention-to-treat" population must be identified at the trial outset. You need to have some a priori identifiable population that you target, and you must analyze that population. If you don't do that, you can have selective dropouts or crossovers that undermine your randomization, and with it one of the basic guarantors of freedom from bias in your trial. Suppose that you had a therapy that you thought would reduce mortality, but only in patients that live at least 3 days, based on the reasoning that if you died prior to day three, you were too sick to be saved by anything. And suppose that you thought also that for your therapy to work, it had to be administered early. Suppose further that you enroll 1000 patients but 30% of them (300) die prior to day three. Would it be fair to exclude these 300 and analyze the data only for the 700 patients who lived past three days (some of whom die later than three days)? Even if you think it is allowable to do so, does the power of your trial derive from 700 patients or 1000? What if your therapy leads to excess deaths in the first three days? Even if you are correct that your therapy improves late(r) mortality, what if there are other side effects that are constant with respect to time? Do we analyze only the surviving subgroup when we analyze these side effects, or do we analyze the entire population, the "intention-to-treat" population?

In essence what you are saying when you design such a trial is that you think that the early deaths will "dilute out" the effect of your therapy, much as people who drop out of a trial or do not take their assigned antihypertensive pills dilute out an effect in a trial. But in these trials, you would account for drop-out rates and non-compliance by raising your sample size. Which is exactly what you should do if you think that early deaths, ACTH responders, or early-departures from the ICU will dilute out your effect. You raise your sample size.
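This accounting can be made concrete. If a fraction f of enrollees cannot benefit (they die early, respond to ACTH, or leave the ICU quickly), the intention-to-treat effect shrinks to (1 - f) times the subgroup effect; since required sample size scales with 1 over the square of the effect, the trial must be inflated by 1/(1 - f)^2. A sketch using the hypothetical 30% early-death figure from the example above:

```python
from math import ceil

def inflated_n(n_for_full_effect, fraction_unaffected):
    """Sample size needed when a fraction of enrollees cannot benefit:
    the ITT delta shrinks by (1 - f), and n scales with 1/delta**2."""
    return ceil(n_for_full_effect / (1 - fraction_unaffected) ** 2)

# If 1000 patients would suffice when all can benefit, then with 30% of
# enrollees dying before day three the trial needs roughly double that:
print(inflated_n(1000, 0.30))  # 2041
```

Powering the trial for the subgroup effect while analyzing (or reporting) something else is precisely the having-cake-and-eating-it maneuver described next.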

But what I have discovered in the case of the IIT trials is that the authors wish to have their cake and eat it too. In these trials, they power the trial as if the effect they seek in the sub-population will exist in the intention-to-treat population (e.g., http://content.nejm.org/cgi/content/abstract/354/5/449 ; inadequate information is provided in the 2001 study.) In the case of CSS (http://jama.ama-assn.org/cgi/content/abstract/288/7/862?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=annane+septic+shock&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT ), I cannot even justify the power calculations that are provided in the manuscript, but another concerning problem occurs. First, note that in Table 4 ADJUSTED odds ratios are reported, so these are not raw data. Overall, there appears to be a trend toward benefit in terms of an ADJUSTED odds ratio, with an associated P-value of 0.09. But look at the responders versus non-responders. While (AFTER ADJUSTMENT) there is a statistically significant benefit in non-responders (10% reduction in mortality), there is a trend towards HARM in the responders (10% increase in mortality)! [I will not even delve into the issue of presenting risk as odds when the event rate is high, as it is here, and how it inflates the apparent relative benefit.] This is just the issue we are concerned about when we analyze what are basically subgroups, even if they are prospectively defined subgroups. A subgroup is NOT an intention-to-treat population, and if we focus on the subgroup, we risk ignoring harmful effects in the other patients in the trial, we understate the real number needed to treat, and we run the risk of ending up with an underpowered trial because we have ignored the fact that patients who are enrolled who don't a posteriori fit our target population are essentially drop-outs and should have been accounted for in sample size calculations.
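On the odds-versus-risk point: when event rates are high, an odds ratio looks much more impressive than the corresponding risk ratio. A sketch (the 65% vs 55% mortality figures are round numbers chosen for illustration, not the trial's exact rates):

```python
def risk_ratio(p_tx, p_ctrl):
    """Ratio of event probabilities."""
    return p_tx / p_ctrl

def odds_ratio(p_tx, p_ctrl):
    """Ratio of odds, p/(1-p); diverges from the risk ratio as events get common."""
    return (p_tx / (1 - p_tx)) / (p_ctrl / (1 - p_ctrl))

# Mortality 55% with treatment vs 65% with control:
print(round(risk_ratio(0.55, 0.65), 2))   # 0.85: a 15% relative risk reduction
print(round(odds_ratio(0.55, 0.65), 2))   # 0.66: looks like a 34% "reduction"
```

With rare events the two measures nearly coincide; at mortality rates like these, reporting the odds ratio roughly doubles the apparent relative benefit.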

This is very similar to what happened in an early trial of a biological agent for sepsis (http://content.nejm.org/cgi/content/abstract/324/7/429 ). The agent, HA-1A human monoclonal antibody against endotoxin, was effective in the subgroup of patients with gram negative infections, which of course could not be prospectively identified. It was not effective in the overall population. It was never approved and never entered into clinical use, because, like the investigators, clinicians will have no way of knowing a priori which patients have gram negative infections and which ones will not, so their experience with the clinical use of the agent is more properly represented by the investigation's result in the overall population.

[I am reminded here of the 2004 Rumbak study in Critical Care Medicine in which a prediction was made as to who would require 14 or more days of mechanical ventilation as a requirement for entry into a study which randomized patients to tracheostomy or conventional care on day 2. In this study, an investigator made the prediction of length of mechanical ventilation, based on unspecified criteria, which was a major shortcoming of the study in spite of the fact that the investigator was correct in about 80% of cases. See: http://journals.lww.com/ccmjournal/pages/articleviewer.aspx?year=2004&issue=08000&article=00009&type=abstract ]

I propose several solutions to this problem. Firstly, studies should be powered for the expected effect in the overall population, and this effect should account for dilution caused by enrollment of patients who a posteriori are not the target population (e.g., ACTH responders or early departures from the ICU.) Secondly, only overall results from the intention-to-treat population should be presented and heeded by clinicians. And thirdly, efforts to better identify the target population a priori should be undertaken. Surely Van den Berghe and colleagues by now have sufficient data to predict who will remain in the ICU for more than 3-5 days. And surely those studying CSS could require a response or non-response to a rapid ACTH test as a requirement for enrollment or exclusion.