
Thursday, April 25, 2019

The EOLIA ECMO Bayesian Reanalysis in JAMA

A Hantavirus patient on ECMO, circa 2000
Spoiler alert:  I'm a Bayesian decision maker (although maybe not a Bayesian trialist) and I "believe" in ECMO as documented here.

My letter to the editor of JAMA was published today (and yeah I know, I write too many letters, but hey, I read a lot and regular peer review often doesn't cut it) and even when you come at them like a spider monkey, the authors of the original article still get the last word (and they deserve it - they have done far more work than the post-publication peer review hecklers with their quibbles and their niggling letters.)

But to set something straight, I will need some more words to elucidate some points about the study's interpretation.  The authors' response to my letter has five points.
  1. I (not they) committed confirmation bias, because I postulated harm from ECMO.  First, I do not have a personal prior for harm from ECMO; I actually think it is probably beneficial in properly selected patients, as is well documented in the blog post from 2011 describing my history of experience with it in hantavirus, and in a book chapter I wrote in Cardiopulmonary Bypass: Principles and Practice circa 2006.  There is irony here - I "believe in" ECMO, I just don't think their Bayesian reanalysis supports my (or anybody's) beliefs in a rational way!  The point is that it was a post hoc, unregistered Bayesian analysis after a pre-registered frequentist study which was "negative" (for all that's worth and not worth), and the authors clearly believe in the efficacy of ECMO, as do I.  In finding shortcomings in their analysis, I seek to disconfirm, or at least challenge, not only their beliefs but my own.  And I think that if the EOLIA trial had been positive, we would not be publishing Bayesian reanalyses showing how the frequentist trial may be a type I error.  We know from long experience that if EOLIA had been "positive," success would have been declared for ECMO, as it has been with prone positioning for ARDS.  (I prone patients too.)  The trend is to confirm rather than to disconfirm, but good science relies more on the latter.
  2. That a RR of 1.0 for ECMO is a "strongly skeptical" prior.  It may seem strong from a true believer standpoint, but not from a true nonbeliever standpoint.  Those are the true skeptics (I know some, but I'll not mention names - I'm not one of them) who think that ECMO is really harmful on the net, like intensive insulin therapy (IIT) probably is.  Regardless of all the preceding trials, if you ask the NICE-SUGAR investigators, they are likely to maintain that IIT is harmful.  Importantly, the authors skirt the issue of the emphasis they place on the only longstanding and widely regarded as positive ARDS trial (of low tidal volume).  There are three decades of trials in ARDS patients, scores of them, enrolling tens of thousands of patients, that show no effect of the various therapies.  Why would we give primacy to the one trial which was positive, and equate ECMO to low tidal volume?  Why not equate it to high PEEP, or corticosteroids for ARDS?  A truly skeptical prior would have been centered on an aggregate point estimate and associated distribution of 30 years of all trials in ARDS of all therapies (the vast majority of them "negative").  The sheer magnitude of their numbers would narrow the width of the prior distribution with RR centered on 1.0 (the "severely skeptical" one), and it would pull the posterior more towards zero benefit, a null result.  Indeed, such a narrow prior distribution may have shown that low tidal volume is an outlier and likely to be a false positive (I won't go any farther down that perilous path).  The point is, even if you think a RR of 1.0 is severely skeptical, the width of the distribution counts for a lot too, and the uninitiated are likely to miss that important point.
  3. Priors are not used to "boost" the effect of ECMO.  (My original letter called it a Bayesian boost, borrowing from Mayo, but the adjective was edited out.) Maybe not always, but that was the effect in this case, and the respondents did not cite any examples of a positive frequentist result that was reanalyzed with Bayesian methods to "dampen" the observed effect.  It seems to only go one way, and that's why I alluded to confirmation bias.  The "data-driven priors" they published were tilted towards a positive result, as described above.
  4. Evidence and beliefs.  But as Russell said "The degree to which beliefs are based on evidence is very much less than believers suppose."  I support Russell's quip with the aforementioned.
  5. Judgment is subjective, etc.  I would welcome a poll, in the spirit of crowdsourcing, as we did here to better understand what the community thinks about ECMO (my guess is it's split rather evenly, with a trend, perhaps strong, for the efficacy of ECMO).  The authors' analysis is laudable, but it is not based on information not already available to the crowd; rather, it transforms that information in ways that may not be transparent to the crowd and may magnify it in a biased fashion if people unfamiliar with Bayesian methods do not scrutinize the chosen prior distributions.
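The prior-width point in (2) above can be made concrete with a minimal normal-normal Bayesian update on the log relative risk.  The EOLIA-like effect here (RR 0.76, 95% CI 0.55-1.04) is an approximation from memory for illustration only, not a number taken from the published reanalysis; the point is simply that holding the prior mean at RR = 1.0 while narrowing the prior distribution pulls the posterior toward the null.

```python
import math

def normal_posterior(data_mean, data_se, prior_mean, prior_sd):
    """Conjugate normal-normal update: a precision-weighted average."""
    w_data = 1.0 / data_se ** 2    # precision of the likelihood
    w_prior = 1.0 / prior_sd ** 2  # precision of the prior
    post_mean = (w_data * data_mean + w_prior * prior_mean) / (w_data + w_prior)
    post_sd = math.sqrt(1.0 / (w_data + w_prior))
    return post_mean, post_sd

# Approximate EOLIA-like effect on the log-RR scale (assumed numbers):
# RR 0.76 with 95% CI 0.55 to 1.04
data_mean = math.log(0.76)
data_se = (math.log(1.04) - math.log(0.55)) / (2 * 1.96)

# Two priors, both centered on RR = 1.0 (log RR = 0), differing only in width
wide_post, _ = normal_posterior(data_mean, data_se, 0.0, 0.5)    # diffuse skeptic
narrow_post, _ = normal_posterior(data_mean, data_se, 0.0, 0.1)  # severe skeptic

print(f"posterior RR, wide prior:   {math.exp(wide_post):.2f}")    # 0.78
print(f"posterior RR, narrow prior: {math.exp(narrow_post):.2f}")  # 0.93
```

With the same prior mean, the narrow prior drags the posterior RR from 0.78 back to 0.93 - which is why the width of the prior distribution "counts for a lot too."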

Wednesday, May 2, 2018

Hollow Hegemony: The Opportunity Costs of Overemphasizing Sepsis


Protocols are to make complex tasks simple, not simple tasks complex. - Scott K Aberegg

Yet here we find ourselves some 16 years after the inauguration of the Surviving Sepsis Campaign, and their influence continues to metastasize, even after the message has been hollowed out like a piece of fallen, old-growth timber.

Surviving Sepsis was the brainchild of Eli Lilly, which, in the year after the ill-fated FDA approval of drotrecogin-alfa, worried that the drug would not sell well if clinicians did not have an increased awareness of sepsis. That aside, in those days, there were legitimate questions surrounding the adoption and implementation of several new therapies such as EGDT, corticosteroids for septic shock, Xigris for those with APACHE scores over 25, intensive insulin therapy, etc.

Those questions are mostly answered. Sepsis is now, quite simply, a complex of systemic manifestations of infection almost all of which will resolve with treatment of the infection and general supportive care. The concept of sepsis could vanish entirely, and nothing about the clinical care of the patient would change: an infection would be diagnosed, the cause/source identified and treated, and hemodynamics and laboratory dyscrasias supported meanwhile. There is nothing else to do (because lactic acidosis does not exist.)

But because of the hegemony of the sepsis juggernaut (the spawn of the almighty dollar), we are now threatened with a mandate to treat patients carrying the sepsis label (oftentimes assigned by a hospital coder after the fact) with antibiotics and a fluid bolus within one hour of triage in the ED. Based on what evidence?

Weak recommendations, "Best Practice Statements," and some strong recommendations based on low- and moderate-quality evidence.  So if we whittle it down to just moderate-quality evidence, what do we have?  Give antibiotics for infections, and give vasopressors if MAP is less than 65.  But now we have to hurry up and do the whole kit and caboodle, boilerplate style, within 60 minutes?

Sepsis need not be treated any differently than a gastrointestinal hemorrhage, or for that matter, any other disease.  You make the diagnosis, determine and control the cause (source), give appropriate treatments, and support the physiology in the meantime, all while prioritizing the sickest patients.  But that counts for all diseases, not just sepsis, and there is only so much time in an hour.  When every little old lady with fever and a UTI suddenly rises atop the priorities of the physician, this creates an opportunity cost/loss for the poor bastard bleeding next door who doesn't have 2 large-bore IVs or a type and cross yet because grandma is being flogged with 2 liters of fluid, and in a hurry.  If only somebody had poured mega-bucks into increased recognition and swift treatment of GI bleeds....


Petition to retire the surviving sepsis campaign guidelines:

(Sign the Petition Here.)

Friends,

Concern regarding the Surviving Sepsis Campaign (SSC) guidelines dates back to their inception.  Guideline development was sponsored by Eli Lilly and Edwards Life Sciences as part of a commercial marketing campaign (1).  Throughout its history, the SSC has had a track record of conflicts of interest, of making strong recommendations based on weak evidence, and of responding poorly to new evidence (2-6).

The original backbone of the guidelines was a single-center trial by Rivers defining a protocol for early goal-directed therapy (7).  Even after key elements of the Rivers protocol were disproven, the SSC continued to recommend them.  For example, SSC continued to recommend the use of central venous pressure and mixed venous oxygen saturation after the emergence of evidence that they were nonbeneficial (including the ProCESS and ARISE trials).  These interventions eventually fell out of favor, despite the slow response of SSC that delayed knowledge translation.

SSC has been sponsored by Eli Lilly, manufacturer of Activated Protein C.  The guidelines continued recommending Activated Protein C until it was pulled from international markets in 2011.  For example, the 2008 Guidelines recommended this, despite ongoing controversy and the emergence of neutral trials at that time (8,9).  Notably, 11 of 24 guideline authors had financial conflicts of interest with Eli Lilly (10).

The Infectious Disease Society of America (IDSA) refused to endorse the SSC because of a suboptimal rating system and industry sponsorship (1).  The IDSA has enormous experience in treating infection and creating guidelines.  Septic patients deserve a set of guidelines that meet the IDSA standards.


Guidelines should summarize evidence and provide recommendations to clinicians.  Unfortunately, the SSC doesn’t seem to trust clinicians to exercise judgement.  The guidelines infantilize clinicians by prescribing a rigid set of bundles which mandate specific interventions within fixed time frames (example above)(10).  These recommendations are mostly arbitrary and unsupported by evidence (11,12).  Nonetheless, they have been adopted by the Centers for Medicare & Medicaid Services as a core measure (SEP-1).  This pressures physicians to administer treatments despite their best medical judgment (e.g. fluid bolus for a patient with clinically obvious volume overload).

We have attempted to discuss these issues with the SSC in a variety of forums, ranging from personal communications to formal publications (13-15).  We have tried to illuminate deficiencies in the SSC bundles and the consequent SEP-1 core measures.  Our arguments have fallen on deaf ears. 

We have waited patiently for years in hopes that the guidelines would improve, but they have not.  The 2018 SSC update is actually worse than prior guidelines, requiring the initiation of antibiotics and 30 cc/kg fluid bolus within merely sixty minutes of emergency department triage (16).  These recommendations are arbitrary and dangerous.  They will likely cause hasty management decisions, inappropriate fluid administration, and indiscriminate use of broad-spectrum antibiotics.  We have been down this path before with other guidelines that required antibiotics for pneumonia within four hours, a recommendation that harmed patients and was eventually withdrawn (17).

It is increasingly clear that the SSC guidelines are an impediment to providing the best possible care to our septic patients.  The rigid framework mandated by SSC doesn’t help experienced clinicians provide tailored therapy to their patients.  Furthermore, the hegemony of these guidelines prevents other societies from developing better guidelines.

We are therefore petitioning for the retirement of the SSC guidelines.  In its place, we would call for the development of separate sepsis guidelines by the United States, Europe, ANZICS, and likely other locales as well.  There has been a monopoly on sepsis guidelines for too long, leading to stagnation and dogmatism.  We would hope that these new guidelines are written by collaborations of the appropriate professional societies, based on the highest evidentiary standards.  The existence of several competing sepsis guidelines could promote a diversity of opinions, regional adaptation, and flexible thinking about different approaches to sepsis. 

We are disseminating an international petition that will allow clinicians to express their displeasure and concern over these guidelines.  If you believe that our septic patients deserve more evidence-based guidelines, please stand with us.  

Sincerely,

Scott Aberegg MD MPH
Jennifer Beck-Esmay MD
Steven Carroll DO MEd
Joshua Farkas MD
Jon-Emile Kenny MD
Alex Koyfman MD
Michelle Lin MD
Brit Long MD
Manu Malbrain MD PhD
Paul Marik MD
Ken Milne MD
Justin Morgenstern MD
Segun Olusanya MD
Salim Rezaie MD
Philippe Rola MD
Manpreet Singh MD
Rory Speigel MD
Reuben Strayer MD
Anand Swaminathan MD
Adam Thomas MD
Lauren Westafer DO MPH
Scott Weingart MD

References
  1. Eichacker PQ, Natanson C, Danner RL.  Surviving Sepsis – Practice guidelines, marketing campaigns, and Eli Lilly.  New England Journal of Medicine  2006; 355: 1640-1642.
  2. Pepper DJ, Jaswal D, Sun J, Welsch J, Natanson C, Eichacker PQ.  Evidence underpinning the Centers for Medicare & Medicaid Services’ Severe Sepsis and Septic Shock Management Bundle (SEP-1): A systematic review.  Annals of Internal Medicine 2018; 168:  558-568. 
  3. Finfer S.  The Surviving Sepsis Campaign:  Robust evaluation and high-quality primary research is still needed.  Intensive Care Medicine  2010; 36:  187-189.
  4. Salluh JIF, Bozza PT, Bozza FA.  Surviving sepsis campaign:  A critical reappraisal.  Shock 2008; 30: 70-72. 
  5. Eichacker PQ, Natanson C, Danner RL.  Separating practice guidelines from pharmaceutical marketing.  Critical Care Medicine 2007; 35:  2877-2878. 
  6. Hicks P, Cooper DJ, Webb S, Myburgh J, Seppelt I, Peake S, Joyce C, Stephens D, Turner A, French C, Hart G, Jenkins I, Burrell A.  The Surviving Sepsis Campaign:  International guidelines for management of severe sepsis and septic shock: 2008.  An assessment by the Australian and New Zealand Intensive Care Society.  Anaesthesia and Intensive Care 2008; 36: 149-151.
  7. Rivers ME et al.  Early goal-directed therapy in the treatment of severe sepsis and septic shock.  New England Journal of Medicine 2001; 345: 1368-1377.
  8. Wenzel RP, Edmond MB.  Septic shock – Evaluating another failed treatment.  New England Journal of Medicine 2012; 366:  2122-2124.  
  9. Savel RH, Munro CL.  Evidence-based backlash:  The tale of drotrecogin alfa.  American Journal of Critical Care  2012; 21: 81-83. 
  10. Dellinger RP, Levy MM, Carlet JM et al.  Surviving sepsis campaign:  International guidelines for management of severe sepsis and septic shock:  2008.  Intensive Care Medicine 2008; 34:  17-60. 
  11. Allison MG, Schenkel SM.  SEP-1:  A sepsis measure in need of resuscitation?  Annals of Emergency Medicine 2018; 71: 18-20.
  12. Barochia AV, Xizhong C, Eichacker PQ.  The Surviving Sepsis Campaign’s revised sepsis bundles.  Current Infectious Disease Reports 2013; 15:  385-393. 
  13. Marik PE, Malbrain MLNG.  The SEP-1 quality mandate may be harmful: How to drown a patient with 30 ml per kg fluid!  Anesthesiology and Intensive Therapy 2017; 49(5): 323-328.
  14. Faust JS, Weingart SD.  The past, present, and future of the centers for Medicare and Medicaid Services quality measure SEP-1:  The early management bundle for severe sepsis/septic shock.  Emergency Medicine Clinics of North America 2017; 35:  219-231.
  15. Marik PE.  Surviving sepsis:  going beyond the guidelines.  Annals of Intensive Care 2011; 1: 17.
  16. Levy MM, Evans LE, Rhodes A.  The surviving sepsis campaign bundle:  2018 update.  Intensive Care Medicine.  Electronic publication ahead of print, PMID 29675566.
  17. Kanwar M, Brar N, Khatib R, Fakih MG.  Misdiagnosis of community-acquired pneumonia and inappropriate utilization of antibiotics: side effects of the 4-h antibiotic administration rule.  Chest 2007; 131: 1865-1869.

Wednesday, February 10, 2016

A Focus on Fees: Why I Practice Evidence Based Medicine Like I Invest for Retirement

"He is the best physician who knows the worthlessness of the most medicines."  - Ben Franklin

This blog has been highly critical of evidence, taking every opportunity to strike at any vulnerability of a trial or research program.  That is because this is serious business.  Lives and limbs hang in the balance, pharmaceutical companies stand to gain billions from "successful" trials, investigators' careers and funding are on the line if chance findings don't pan out in subsequent investigations, sometimes well-meaning convictions blind investigators and others to the truth; in short, the landscape is fertile for bias, manipulation, and even fraud.  To top it off, many of the questions about how to practice or deal with a particular problem have scant or no evidence to bear upon them, and practitioners are left to guesswork, convention, or pathophysiological reasoning - and I'm not sure which among these is most threatening.  So I am often asked, how do you deal with the uncertainty that arises from fallible evidence or paucity of evidence when you practice?

I have ruminated about this question and how to summarize the logic of my minimalist practice style for some time but yesterday the answer dawned on me:  I practice medicine like I invest in stocks, with a strategy that comports with the data, and with precepts of rational decision making.

Investors make numerous well-described and wealth-destroying mistakes when they invest in stocks.  Experts such as John Bogle, Burton Malkiel, David Swensen and others have written influential books on the topic, utilizing data from studies in economics (financial and behavioral).  Key among the mistakes that investors make are trying to select high performers (such as mutual funds or hedge fund managers), chasing performance, and timing the market.  The data suggest that professional stock pickers fare little better than chance over the long run, that you cannot discern who will beat the average over the long run, and that the excess fees you are charged by high performers will negate any benefit they might otherwise have conferred to you.  The experts generally recommend that you stick with strategies that are proven beyond a reasonable doubt: a heavy concentration in stocks with their long track record of superior returns, diversification, and strict minimization of fees.  Fees are the only thing you can guarantee about your portfolio's returns.

Friday, May 1, 2015

Is There a Baby in That Bathwater? Status Quo Bias in Evidence Appraisal in Critical Care

"But we are not here concerned with hopes and fears, only the truth so far as our reason allows us to discover it."  -  Charles Darwin, The Descent of Man

Status quo bias is a cognitive decision making bias that leads to decision makers' preference for the choice represented by the current status quo, even when the status quo is arbitrary or irrelevant.  Decision makers tend to perceive a change from the status quo as a loss and therefore their decisions are biased toward the status quo.  This can lead to preference reversals when the status quo reference frame is changed.  The status quo can be debiased using a reversal test, i.e., manipulating the status quo either experimentally or via thought experiment to consider a change in the opposite direction.  If reluctance to change from the status quo exists in both directions, status quo bias is likely to exist.

My collaborators Peter Terry, Hal Arkes and I reported in a study published in 2006 that physicians were far more likely to abandon a therapy that was status quo or standard therapy based on new evidence of harm than they were to adopt an identical therapy based on the same evidence of benefit from a fictitious RCT (randomized controlled trial) presented in the vignette.  These results suggested that there was an asymmetric status quo bias - physicians showed a strong preference for the status quo in the adoption of new therapies, but a strong preference for abandoning the status quo when a standard of care was shown to be harmful.  Two characteristics of the vignettes used in this intersubject study deserve attention.  First, the vignettes described a standard or status quo therapy that had no support from RCTs prior to the fictitious one described in the vignette.  Second, this study was driven in part by what I perceived at the time was a curious lack of adoption of drotrecogin-alfa (Xigris), with its then purported mortality benefit and associated bleeding risk.  Thus, our vignettes had very significant trade-offs in terms of side effects in both the adopt and abandon reference frames.  Our results seemed to explain the slow uptake of Xigris, and were also consistent with the relatively rapid abandonment of hormone replacement therapy (HRT) after publication of the WHI, the first RCT of HRT.

Thursday, March 20, 2014

Sepsis Bungles: The Lessons of Early Goal Directed Therapy

On March 18th, the NEJM published early online three original trials of therapies for the critically ill that will serve as fodder for several posts.  Here, I focus on the ProCESS trial of protocol-guided therapy for early septic shock.  This trial is in essence a multicenter version of the landmark 2001 trial of Early Goal Directed Therapy (EGDT) for severe sepsis by Rivers et al.  That trial showed a stunning 16% absolute reduction in mortality in sepsis attributed to the use of a protocol based on physiological goals for hemodynamic management.  That absolute reduction in mortality is perhaps the largest for any therapy in critical care medicine.  If such a reduction were confirmed, it would make EGDT the single most important therapy in the field.  If such a reduction cannot be confirmed, there are several reasons why the Rivers results may have been misleading:

There were other concerns about the Rivers study and how it was later incorporated into practice, but I won't belabor them here.  The ProCESS trial randomized about 1350 patients among three groups: one simulating the original Rivers protocol, one a modified Rivers protocol, and one representing "standard care," that is, care directed by the treating physician without a protocol.  The study had 80% power to demonstrate a mortality reduction of 6-7%.  Before you read further, please wager: will the trial show any statistically significant differences in outcome that favor EGDT or protocolized care?
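For readers who want to see where power claims like that come from, here is the standard two-proportion sample-size sketch.  The inputs (roughly 30% control mortality and a 7% absolute reduction, two-sided alpha 0.05, 80% power) are assumptions for illustration, not the trial's actual design parameters:

```python
import math

def n_per_arm(p1, p2, alpha_z=1.96, power_z=0.8416):
    """Approximate patients per arm to detect p1 vs p2 (normal approximation).

    alpha_z: z for two-sided alpha = 0.05; power_z: z for 80% power.
    """
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((alpha_z + power_z) ** 2 * variance / (p1 - p2) ** 2)

# Assumed: ~30% control mortality, 7% absolute reduction
n = n_per_arm(0.30, 0.23)
print(n, "patients per arm, i.e.,", 2 * n, "for a simple two-arm comparison")
```

The answer lands in the low 600s per arm, which gives a feel for why mortality trials chasing single-digit absolute deltas need enrollments in the four figures.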

Wednesday, November 20, 2013

Chill Out: Homeopathic Hypothermia after Cardiac Arrest

In the Feb 21, 2002 NEJM, two trials of what came to be known as therapeutic hypothermia (or HACA - Hypothermia after Cardiac Arrest) were simultaneously published:  one by the HACA study group and another by Bernard et al.  During the past decade, I can think of only one other therapy which has caused such a paradigm shift in care in the ICU:  Intensive Insulin Therapy (ill-fated as it were).  Indeed, even though the 2002 studies specifically limited enrollment to out of hospital (OOH) cardiac arrest with either Ventricular Tachycardia (VT) or Ventricular Fibrillation (VF), the indications have been expanded at many institutions to include all patients with coma after cardiac arrest regardless of location or rhythm (or any other original exclusion criterion), so great has been the enthusiasm for this therapy, and so zealous its proponents.

Readers of this blog may know that I harbor measured skepticism for HACA even though I recognize that it may be beneficial.  From a pragmatic perspective, it makes sense to use it, since the outcome of hypoxic-ischemic encephalopathy (HIE) and ABI (Anoxic Brain Injury) is so dismal.  But what did the original two studies actually show?
  • The HACA group multicenter trial randomized 273 patients to hypothermia versus control and found that the hypothermia group had higher rates of "favorable neurological outcome" (a cerebral performance category of 1 or 2 - the primary endpoint) with RR of 1.40 and 95% CI 1.08-1.81; moreover, mortality was lower in the hypothermia group, with RR 0.74 and 95% CI 0.58-0.95
  • The Bernard et al study randomized 77 patients to hypothermia versus control and found that survival (the primary outcome) was 49% and 26% in the hypothermia and control groups, respectively, with P=0.046
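For those who want to see where a relative risk and its confidence interval come from, here is a minimal sketch using event counts like those in the HACA trial (75/136 favorable outcomes with hypothermia vs 54/137 with control - my recollection of the published counts, so treat them as assumptions):

```python
import math

def relative_risk(a, n1, b, n2, z=1.96):
    """Relative risk with a 95% CI computed via the SE of log(RR)."""
    rr = (a / n1) / (b / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)  # SE of log(RR)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

rr, lo, hi = relative_risk(75, 136, 54, 137)
print(f"RR {rr:.2f}, 95% CI {lo:.2f}-{hi:.2f}")  # RR 1.40, 95% CI 1.08-1.81
```

Those assumed counts reproduce the reported RR of 1.40 with 95% CI 1.08-1.81, and a lower confidence bound above 1.0 is the mirror image of a "significant" P-value.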

Sunday, November 3, 2013

The Intensivist Giveth Then the Intensivist Taketh Away: Esmolol in Septic Patients Receiving High Dose Norepinephrine

Two studies in the October 23/30 issue of JAMA serve as fodder for reflection on the history and direction of critical care research and the hypotheses that drive it.   Morelli et al report the results of a study of esmolol in septic shock.  To quickly summarize, this was a single-center dose-ranging study, the primary aim of which was to determine if esmolol could be titrated to a heart rate goal (primary outcome), presumably with the later goal of performing a phase 3 clinical trial to see if esmolol, titrated in such a fashion, could favorably influence clinical outcomes of interest.  154 patients with septic shock on high dose norepinephrine with a heart rate greater than 95 were enrolled, and heart rate was indeed lower in the esmolol group (P less than 0.001).  Perhaps surprisingly, hemodynamic parameters, lactate clearance, and pressor and fluid requirements were (statistically significantly) improved in the esmolol group.  Most surprising (and probably the reason why we find this published in JAMA rather than Critical Care Medicine - consider that outlier results such as this may get disproportionate attention), mortality in the esmolol group was 50% compared to 80% in the control group (P less than 0.001).  The usual caveats apply here:  a small study, a single center, lack of blinding.  And regular readers will guess that I won't swallow the mortality difference.  I'm a Bayesian (click here for a nice easy-to-use Bayesian calculator); there's no biological precedent for such a finding and it's too big a bite for me to swallow. So I will go on the record here as stating that I'm betting against similar results in a larger trial.

I'm more interested in how we formulate the hypothesis that esmolol will provide benefit in septic shock.  I was a second year medical student in 1995 when Gattinoni et al published the results of a trial of "goal-oriented hemodynamic therapy" in critically ill patients in the NEJM.  I realize that critical care research as we now recognize it was in its adolescence then, as a quick look at the methods section of that article demonstrates.  I also recognize that they enrolled a heterogeneous patient population.  But it is worth reviewing the wording of the introduction to their article:

Recently, increasing attention has been directed to the hemodynamic treatment of critically ill patients, because it has been observed in several studies that patients who survived had values for the cardiac index and oxygen delivery that were higher than those of patients who died and, more important, higher than standard physiologic values.1-3 Cardiac-index values greater than 4.5 liters per minute per square meter of body-surface area and oxygen-delivery values greater than 650 ml per minute per square meter — derived empirically on the basis of the median values for patients who previously survived critical surgical illness — are commonly referred to as supranormal hemodynamic values.4

Friday, May 31, 2013

Over Easy? Trials of Prone Positioning in ARDS

Published May 20 in the NEJM to coincide with the ATS meeting is the (latest) Guerin et al study of Prone Positioning in ARDS.  The editorialist was impressed.  He thinks that we should start proning patients similar to those in the study.  Indeed, the study results are impressive:  a 16.8% absolute reduction in mortality between the study groups with a corresponding P-value of less than 0.001.  But before we switch our tastes from sunny side up to over easy (or in some cases, over hard - referred to as the "turn of death" in ICU vernacular) we should consider some general principles as well as about a decade of other studies of prone positioning in ARDS.

First, a general principle:  regression to the mean.  Few, if any, therapies in critical care (or in medicine in general) confer a mortality benefit this large.  I refer the reader (again) to our study of delta inflation, which tabulated over 30 critical care trials in the top 5 medical journals over 10 years and showed that few critical care trials show mortality deltas (absolute mortality differences) greater than 10%.   Almost all those that do are later refuted.  Indeed, it was our conclusion that searching for deltas greater than or equal to 10% is akin to a fool's errand, so unlikely is the probability of finding such a difference.  Jimmy T. Sylvester, my attending at JHH in late 2001, had already recognized this.  When the now infamous sentinel trial of intensive insulin therapy (IIT) was published, we discussed it at our ICU pre-rounds lecture and he said something like "Either these data are faked, or this is revolutionary."  We now know that there was no revolution (although many ICUs continue to practice as if there had been one).  He could have just as easily said that this is an anomaly that will regress to the mean, that there is inherent bias in this study, or that "trials stopped early for benefit...."

Thursday, September 27, 2012

True Believers: Faith and Reason in the Adoption of Evidence

In last week's NEJM, in an editorial response to an article demonstrating that physicians, in essence, probability adjust (a la Expected Utility Theory) the likelihood that data are true based on the funding source of a study, Editor-in-Chief Jeffrey M. Drazen implored the journal's readership to "believe the data." Unfortunately, he did not answer the obvious question: "which data?" A perusal of the very issue in which his editorial appears, as well as this week's journal, considered in the context of more than a decade of related research, demonstrates just how ironic and ludicrous his invocation is.

This November marks the eleventh year since the publication, with great fanfare, of Van den Berghe's trial of intensive insulin therapy (IIT) in the NEJM.  That article was followed by what I have called a "premature rush to adopt the therapy" (I should have called it a stampede), creation of research agendas in multiple countries and institutions devoted to its study, amassing of reams of robust data failing to confirm the original results, and a reluctance to abandon the therapy that is rivaled in its tenacity only by the enthusiasm that drove its adoption.  In light of all the data from the last decade, I am convinced of only one thing - that it remains an open question whether control of hyperglycemia within ANY range is of benefit to patients.
Suffice it to say that the Van den Berghe data have not suffered from lack of believers - the Brunkhorst, NICE-SUGAR, and Glucontrol data have - and it would seem that in many cases what we have is not a lack of faith so much as a lack of reason when it comes to data.  The publication of an analysis of hypoglycemia using the NICE-SUGAR database in the September 20th NEJM, and a trial in this week's NEJM involving pediatric cardiac surgery patients by Agus et al, give researchers and clinicians yet another opportunity to apply reason and reconsider their belief in IIT and, for that matter, the treatment of hyperglycemia in general.

Thursday, May 24, 2012

Fever, external cooling, biological precedent, and the epistemology of medical evidence

It is rare occasion that one article allows me to review so many aspects of the epistemology of medical evidence, but alas Schortgen et al afforded me that opportunity in the May 15th issue of AJRCCM.

The issues raised by this article are so numerous that I shall make subsections for each one. The authors of this RCT sought to determine the effect of external cooling of febrile septic patients on vasopressor requirements and mortality. Their conclusion was that "fever control using external cooling was safe and decreased vasopressor requirements and early mortality in septic shock." Let's explore the article and the issues it raises and see if this conclusion seems justified and how this study fits into current ICU practice.

PRIOR PROBABILITY, BIOLOGICAL PLAUSIBILITY, and BIOLOGICAL PRECEDENTS

These are related but distinct issues that are best considered both before a study is planned, and before its report is read. A clinical trial is in essence a diagnostic test of a hypothesis, and like a diagnostic test, its influence on what we already know depends not only on the characteristics of the test (sensitivity and specificity in a diagnostic test; alpha and power in the case of a clinical trial) but also on the strength of our prior beliefs. To quote Sagan [again], "extraordinary claims require extraordinary evidence." I like analogies of extremes: no trial result is sufficient to convince the skeptical observer that orange juice reduces mortality in sepsis by 30%; and no evidence, however cogently presented, is sufficient to convince him that the sun will not rise tomorrow. So when we read the title of this or any other study, we should pause to ask: What is my prior belief that external cooling will reduce mortality in septic shock? That it will reduce vasopressor requirements?
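To make the diagnostic-test analogy concrete, here is a minimal sketch of how alpha, power, and the prior combine, exactly as sensitivity and specificity combine with prevalence to yield a predictive value. The priors below are my own illustrative numbers, not anything from the Schortgen article:

```python
# Sketch: a "positive" trial treated like a positive diagnostic test.
# alpha plays the role of (1 - specificity), power plays sensitivity,
# and the prior plays prevalence. All numbers here are illustrative.

def posterior_given_positive(prior, alpha=0.05, power=0.80):
    """P(effect is real | trial is 'positive'), by Bayes' rule."""
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return true_pos / (true_pos + false_pos)

# A plausible hypothesis (prior 50%) vs. an extraordinary one (prior 1%),
# e.g. orange juice reducing sepsis mortality:
print(round(posterior_given_positive(0.50), 2))  # 0.94
print(round(posterior_given_positive(0.01), 2))  # 0.14
```

The same "significant" result is ~94% believable for a plausible hypothesis but only ~14% believable for an extraordinary one - which is why the title of a trial report should trigger a prior-probability check before the abstract is even read.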

Monday, January 17, 2011

Like Two Peas in a Pod: Cis-atracurium for ARDS and the Existence of Extra Sensory Perception (ESP)

Even the lay public and popular press (see: http://www.nytimes.com/2011/01/11/science/11esp.html?_r=1&scp=1&sq=ESP&st=cse) caught on to the subversive battle between frequentist and Bayesian statistics when it was announced (ahead of print) that a prominent psychologist was to publish a report purporting to establish the presence of Extra Sensory Perception (ESP) in the Journal of Personal and Social Psychology (I don't think it's even published yet, but here's the link to the journal: http://www.apa.org/pubs/journals/psp). So we're back to my Orange Juice (OJ) analogy - if I published the results of a study showing that the enteral administration of OJ reduced severe sepsis mortality by a [marginally] statistically significant 20%, would you believe it? As Carl Sagan was fond of saying, "extraordinary claims require extraordinary evidence" - which to me means, among other things, an unbelievably small P-value produced by a study with scant evidence of bias.

And I remain utterly incredulous that the administration of a paralytic agent for 48 hours in ARDS (see Papazian et al: http://www.nejm.org/doi/full/10.1056/NEJMoa1005372#t=abstrac) is capable of reducing mortality. Indeed, FEW THERAPIES IN CRITICAL CARE MEDICINE REDUCE MORTALITY (see Figure 1 in our article on Delta Inflation: http://www.ncbi.nlm.nih.gov/pubmed/20429873). So what was the P-value of the Cox regression (read: ADJUSTED) analysis in the Papazian article? It was 0.04. This is hardly the kind of P-value that Carl Sagan would have accepted as Extraordinary Evidence.
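One way to quantify just how un-extraordinary p = 0.04 is: the Sellke-Bayarri-Berger bound gives the most generous possible Bayes factor against the null for a given P-value. This calculation is my own illustration, not anything from the Papazian article:

```python
import math

def min_bayes_factor(p):
    """Sellke-Bayarri-Berger lower bound -e*p*ln(p) on the Bayes factor
    for the null vs. the best-supported alternative; valid for p < 1/e."""
    assert 0 < p < 1 / math.e
    return -math.e * p * math.log(p)

# p = 0.04 can shift the odds against the null by at most ~3:1,
# even under the alternative hypothesis most favorable to the data:
print(round(min_bayes_factor(0.04), 2))  # 0.35
```

A best-case threefold odds shift is feeble medicine for an incredulous prior; extraordinary evidence it is not.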

The correspondence regarding this article in the December 23rd NEJM (see: http://www.nejm.org/doi/full/10.1056/NEJMc1011677) got me to thinking again about this article. It emphasized the striking sedation practices used in this trial: patients were sedated to a Ramsay score of 6 (no response to glabellar tap) prior to randomization - the highest score on the Ramsay scale. Then they received Cis-at or placebo. Thus the Cis-at group could not, for 48 hours, "fight the vent," while the placebo group could, thereby inducing practitioners to administer more sedation. Could it be that Cis-at simply saves you from oversedation, much as intensive insulin therapy (IIT) a la 2001 Leuven protocol saved you from the deleterious effects of massive dextrose infusion after cardiac surgery?

To explore this possibility further, one needs to refer to Table 9 in the supplementary appendix of the Papazian article (see: http://www.nejm.org/doi/suppl/10.1056/NEJMoa1005372/suppl_file/nejmoa1005372_appendix.pdf ) which tabulates the total sedative doses used in the Cis-at and placebo groups DURING THE FIRST SEVEN (7) DAYS OF THE STUDY. Now, why 7 days was chosen - when the KM curves separate at 14 days (as my former colleagues O'Brien and Prescott pointed out here: http://f1000.com/5240957 ), and when the study reported data on other outcomes at 28 and 90 days - remains a mystery to me. I have e-mailed the corresponding author to see if he can/will provide data on sedative doses further out. I will post any updates as further data become available. Suffice it to say that I'm not going to be satisfied unless sedative doses further out are equivalent.

Scrutiny of Table 9 in the SA leads to some other interesting discoveries, such as the massive doses of ketamine used in this study - a practice that does not exist in the United States, as well as strong trends toward increased midazolam use in the placebo group. And if you believe Wes Ely's and others' data on benzodiazepine use, and its association with delirium and mortality, one of your eyebrows might involuntarily rise. Especially when you consider that the TOTAL sedative dose administered between groups is an elusive sum, because equivalent doses of all the various sedatives are unknown and the total sedative dose calculation is insoluble.

Tuesday, February 9, 2010

Post hoc non ergo propter hoc extended: A is associated with B therefore A causes B and removal of A removes B

From Annane et al, JAMA, January 27th, 2010 (see: http://jama.ama-assn.org/cgi/content/abstract/303/4/341 ):

"...patients whose septic shock is treated with hydrocortisone commonly have blood glucose levels higher than 180. These levels have clearly been associated with marked increase in the risk of dying...Thus, we hypothesized that normalization of blood glucose levels with intensive insulin treatment may improve the outcome of adults with septic shock who are treated with hydrocortisone."

The normalization heuristic is at work again.

Endocrine interventions as adjunctive treatments in critical care medicine have a sordid history. Here are some landmarks. Rewind 25 years, and as Angus has recently described (http://jama.ama-assn.org/cgi/content/extract/301/22/2388 ) we had the heroic administration of high dose corticosteroids (e.g. gram doses of methylprednisolone) for septic shock, which therapy was later abandoned. In the 1990s, we had two concurrent trials of human growth hormone in critical illness, showing the largest statistically significant harms (increased mortality of ~20%) from a therapy in critical illness that I'm aware of (see http://content.nejm.org/cgi/content/abstract/341/11/785 ). Early in the new millennium, based on two studies that should by now be largely discredited by their successors, we had intensive insulin therapy for patients with hyperglycemia and low dose corticosteroid therapy for septic shock. It is fitting then, and at least a little ironic, that this new decade should start with publication of a study combining these latter two therapies of dubious benefit: The aptly named COIITSS study.

I know this sounds overly pessimistic, but some of these therapies, these two in particular, just need to die, but are being kept alive by the hope, optimism, and proselytizing of those few individuals whose careers were made on them or continue to depend upon them. And I lament the fact that, as a result of the promotional efforts of these wayward souls, we have been distracted from the actual data. Allow me to summarize these briefly:

1.) The original Annane study of steroids in septic shock (see: http://jama.ama-assn.org/cgi/content/abstract/288/7/862 ) utilized an adjusted analysis of a subgroup of patients not identifiable at the outset of the trial (responders versus non-responders). The entire ITT (intention to treat) population had an ADJUSTED P-value of 0.09. I calculated an unadjusted P-value of 0.29 for the overall cohort. Since you cannot know at the outset who's a responder and who's not, for a practitioner, the group of interest is the ITT population, and there was NO EFFECT in this population. Somehow, the enthusiasm for this therapy was so great that we lost sight of the reasons that I assume the NEJM rejected this article - an adjusted analysis of a subgroup. Seriously! How did we lose sight of this? Blinded by hope and excitement, and the simplicity of the hypothesis - if it's low, make it high, and everything will be better. Then Sprung and the CORTICUS folks came along (see: http://content.nejm.org/cgi/content/abstract/358/2/111 ), and, as far as I'm concerned, blew the whole thing out of the water.

2.) I remember my attending raving about the Van den Berghe article (see: http://content.nejm.org/cgi/content/abstract/345/19/1359 ) as a first year fellow at Johns Hopkins in late 2001. He said "this is either the greatest therapy ever to come to critical care medicine, or these data are faked!" That got me interested. And I still distinctly remember circling something in the methods section, which was in small print in those days, on the left hand column of the left page back almost 9 years ago - that little detail about the dextrose infusions. This therapy appeared to work in post-cardiac surgery patients on dextrose infusions at a single center. I was always skeptical about it, and then the follow-up study came out, and lo and behold, NO EFFECT! But that study is still touted by the author as a positive one! Because again, like in Annane, if you remove those pesky patients who didn't stay in the MICU for 3 days (again, like Annane, not identifiable at the outset), you have a SUBGROUP analysis in which IIT (intensive insulin therapy - NOT intention to treat, ITT is inimical to IIT) works. Then you had NICE-SUGAR (see: http://content.nejm.org/cgi/content/abstract/360/13/1283 ) AND Brunkhorst et al (see: http://content.nejm.org/cgi/content/abstract/358/2/125 ) showing that IIT doesn't work. How much more data do we need? Why are we still doing this?

Because old habits die hard and so do true believers. Thus it was perhaps inevitable that we would have COIITSS combine these two therapies into a single trial. Note that this trial does nothing to address whether hydrocortisone for septic shock is efficacious (it probably is NOT), but rather assumes that it is. I note also that it was started in 2006, just shortly before the second Van den Berghe study was published and well after the data from that study were known. Annane et al make no comments about whether those data impacted the conduct of their study, and whether participants were informed that a repeat of the trial upon which the Annane trial was predicated had failed.

Annane did not use blinding for fludrocortisone in the current study, but this is minor. It is difficult to blind IIT, but usually what you do when you can't blind things adequately is you protocolize care. That was not obviously done in this trial; instead we are reassured that "everybody should have been following the Surviving Sepsis Campaign guidelines". (I'm paraphrasing.)

As astutely pointed out by Van den Berghe in the accompanying editorial, this trial was underpowered. It was just plain silly to assume (or play dumb) that a 3% ARR which is a ~25% RRR (since the baseline was under 10%) would translate into a 12.5% ARR with a baseline mortality of near 50%. Indeed, I don't know why we even talk about RRRs anymore, they're a ruse to inflate small numbers and rouse our emotions. (Her other comments, about "separation", which would be facilitated by having a very very intensive treatment and a very very lax control is reminiscent of what folks were saying about ARMA low/high Vt - namely that the trial was ungeneralizable because the "control" 12 cc/kg was unrealistic. Then you get into the Eichacker and Natanson arguments about U-shaped curves [to which there may be some truth] and how too much is bad, not enough is bad, but somewhere in the middle is the "sweet spot". And this is key. Would that I could know the sweet spot for blood sugar - and coax patients to remain there.)
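To see just how heroic the assumption was, a standard normal-approximation sample-size calculation shows the gulf between powering for a 12.5% ARR and powering for a plausible 3% ARR at ~50% baseline mortality. This is a rough sketch; the z-values and baseline rates are my illustrative assumptions, not the trial's actual power calculation:

```python
import math

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Normal-approximation sample size per arm for comparing two
    proportions (two-sided alpha). A rough illustrative sketch."""
    z_a, z_b = 1.959964, 0.841621        # z for alpha/2 and for power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# The assumed 12.5% ARR at ~50% baseline mortality vs. a realistic 3% ARR:
print(n_per_group(0.50, 0.375))  # roughly 250 per arm
print(n_per_group(0.50, 0.47))   # roughly 4400 per arm
```

Powering for the fantasy effect makes the trial an order of magnitude too small for the plausible one - which is the editorialist's underpowering complaint in numbers.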

Because retrospective power calculations are uncouth, I elected to calculate the 95% confidence interval (CI) for delta (the difference between the two groups) in this trial. The point estimate for delta is -2.96% (negative delta means the therapy was WORSE than control!) with a 95% confidence interval of -11.6% to +5.65%. It is most likely between 11% worse and 5% better, and any good betting man would wager that it's worse than control! But in either case, this confidence interval is uncomfortably wide and contains values for harm and benefit which should be meaningful to us, so in essence the data do not help us decide what to do with this therapy.
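For readers who want to check such intervals themselves, here is a simple Wald-interval sketch. The counts below are hypothetical, chosen only to produce an interval of similar character to the one reported; they are NOT the trial's raw data:

```python
import math

def risk_diff_ci(deaths_ctl, n_ctl, deaths_tx, n_tx, z=1.959964):
    """Wald 95% CI for delta = control risk - treatment risk
    (negative delta means the therapy did WORSE than control)."""
    p_c, p_t = deaths_ctl / n_ctl, deaths_tx / n_tx
    delta = p_c - p_t
    se = math.sqrt(p_c * (1 - p_c) / n_ctl + p_t * (1 - p_t) / n_tx)
    return delta, delta - z * se, delta + z * se

# Hypothetical counts: 100/250 deaths in control vs 110/250 with therapy.
d, lo, hi = risk_diff_ci(100, 250, 110, 250)
print(f"delta = {d:+.1%}, 95% CI {lo:+.1%} to {hi:+.1%}")
# delta = -4.0%, 95% CI -12.6% to +4.6%
```

An interval this wide contains clinically meaningful harm and benefit alike, which is precisely why such a trial cannot settle the question either way.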

(And look at table 2, the main results, look they are still shamelessly reporting adjusted P-values! Isn't that why we randomize? So we don't have to adjust?)

To bring this saga full circle, I note that, as we saw in NICE-SUGAR, Brunkhorst, and Van den Berghe, severe hypoglycemia (<40!) was far more common in the IIT group.  And severe hypoglycemia is associated with death (in most studies, but curiously not in this one).  So, consistent with the hypothesis which was the impetus for this study (A is associated with B, thus A causes B and removal of A removes B), one conclusion from all these data is that hypoglycemia causes death, and should be avoided through avoidance of IIT.

Wednesday, August 5, 2009

Defining sample size for an a priori unidentifiable population: Tricks of the Tricksters

During a recent review of critical care literature for a paper on trial design, a few trials (and groups) were noted to have pulled a fast one and apparently slipped it by the witting or unwitting reviewers and editors. This has arisen in the case of two therapies which have in common a targeted population in which efficacy is expected which population cannot be identified at the outset. What's more, both of the therapies are thought to be mandatory to begin early for maximal efficacy, at a time when the specific target population cannot be identified. These two therapies are intensive insulin therapy (IIT) and corticosteroids for septic shock (CSS). In the case of IIT, the authors (Greet Van den Berghe et al) believe that IIT will be most effective in a population that remains in the ICU for at least some specified time, say 3 or 5 days. That is, "the therapy needs time to work." The problem is that there is no way to tell how long a person will remain in the ICU in advance. The same problem crops up for CSS because the authors (Annane, et al) wish to target non-responders to ACTH, but they cannot identify them at the outset; and they also believe that "early" administration is essential for efficacy. The solution used by both of these groups for this problem raises some interesting and troubling questions about the design of these trials and other trials like them in the future.

An "intention-to-treat" population must be identified at the trial outset. You need to have some a priori identifiable population that you target and you must analyze that population. If you don't do that, you can have selective dropout or crossovers that undermine your randomization and with it one of the basic guarantors of freedom from bias in your trial. Suppose that you had a therapy that you thought would reduce mortality, but only in patients that live at least 3 days, based on the reasoning that if you died prior to day three, you were too sick to be saved by anything. And suppose that you thought also that for your therapy to work, it had to be administered early. Suppose further that you enroll 1000 patients but 30% of them (300) die prior to day three. Would it be fair to exclude these 300 and analyze the data only for the 700 patients who lived past three days (some of whom die later than three days)? Even if you think it is allowable to do so, does the power of your trial derive from 700 patients or 1000? What if your therapy leads to excess deaths in the first three days? Even if you are correct that your therapy improves late(r) mortality, what if there are other side effects that are constant with respect to time? Do we analyze only the surviving subgroup when we analyze these side effects, or do we analyze the entire population, the "intention-to-treat" population?

In essence what you are saying when you design such a trial is that you think that the early deaths will "dilute out" the effect of your therapy, much as people who drop out of a trial or do not take their assigned antihypertensive pills dilute out an effect in a trial. But in these trials, you would account for drop-out rates and non-compliance by raising your sample size. Which is exactly what you should do if you think that early deaths, ACTH responders, or early-departures from the ICU will dilute out your effect. You raise your sample size.
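Under a simple linear dilution model (my own sketch, not any trialist's published method), if only a fraction f of enrollees can possibly respond, the intention-to-treat effect shrinks to roughly f times delta; since power scales with the square of the effect size, the required sample size grows as 1/f²:

```python
import math

def inflated_n(n_undiluted, fraction_eligible):
    """If only a fraction f of enrollees can respond, the ITT effect is
    diluted to f * delta, so required n scales roughly as 1/f**2.
    A rough sketch under a simple linear dilution model."""
    return math.ceil(n_undiluted / fraction_eligible ** 2)

# If ~70% of enrollees stay in the ICU past day 3, a trial powered for
# the undiluted effect needs roughly twice the patients:
print(inflated_n(1000, 0.70))  # 2041
```

That is the honest price of believing in a late-acting therapy: pay for it in sample size up front, rather than carving out the subgroup after the fact.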

But what I have discovered in the case of the IIT trials is that the authors wish to have their cake and eat it too. In these trials, they power the trial as if the effect they seek in the sub-population will exist in the intention-to-treat population (e.g., http://content.nejm.org/cgi/content/abstract/354/5/449 ; inadequate information is provided in the 2001 study.) In the case of CSS (http://jama.ama-assn.org/cgi/content/abstract/288/7/862?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&fulltext=annane+septic+shock&searchid=1&FIRSTINDEX=0&resourcetype=HWCIT ), I cannot even justify the power calculations that are provided in the manuscript, but another concerning problem occurs. First, note that in Table 4 ADJUSTED odds ratios are reported so these are not raw data. Overall there appears to be a trend toward benefit in the overall group in terms of an ADJUSTED odds ratio with an associated P-value of 0.09. But look at the responders versus non-responders. While (AFTER ADJUSTMENT) there is a statistically significant benefit in non-responders (10% reduction in mortality), there is a trend towards HARM in the responders (10% increase in mortality)! [I will not even delve into the issue of presenting risk as odds when the event rate is high as it is here, and how it inflates the apparent relative benefit.] This is just the issue we are concerned about when we analyze what are basically subgroups, even if they are prospectively defined subgroups. A subgroup is NOT an intention-to-treat population, and if we focus on the subgroup, we risk ignoring harmful effects in the other patients in the trial, we deflate the apparent number needed to treat, and we run the risk of ending up with an underpowered trial because we have ignored the fact that patients who are enrolled who don't a posteriori fit our target population are essentially drop-outs and should have been accounted for in sample size calculations.
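On the odds-versus-risk point, a two-line sketch with illustrative mortality rates (in the same high range as septic shock trials, but not the trial's exact figures) shows how an odds ratio flatters a therapy when events are common:

```python
def odds(p):
    return p / (1 - p)

# Illustrative high event rates, as in septic shock mortality:
p_control, p_treated = 0.63, 0.53

rr = p_treated / p_control               # risk ratio
or_ = odds(p_treated) / odds(p_control)  # odds ratio
print(round(rr, 2), round(or_, 2))       # 0.84 vs 0.66
```

The same 10-point absolute difference reads as a 16% relative risk reduction but a 34% odds reduction - roughly twice as impressive, for free, simply by reporting odds.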

This is very similar to what happened in an early trial of a biological agent for sepsis (http://content.nejm.org/cgi/content/abstract/324/7/429 ). The agent, HA-1A human monoclonal antibody against endotoxin, was effective in the subgroup of patients with gram negative infections, which of course could not be prospectively identified. It was not effective in the overall population. It was never approved and never entered into clinical use, because, like the investigators, clinicians will have no way of knowing a priori which patients have gram negative infections and which ones will not, so their experience with the clinical use of the agent is more properly represented by the investigation's result in the overall population.

[I am reminded here of the 2004 Rumbak study in Critical Care Medicine in which a prediction was made as to who would require 14 or more days of mechanical ventilation as a requirement for entry into a study which randomized patients to tracheostomy or conventional care on day 2. In this study, an investigator made the prediction of length of mechanical ventilation, based on unspecified criteria, which was a major shortcoming of the study in spite of the fact that the investigator was correct in about 80% of cases. See: http://journals.lww.com/ccmjournal/pages/articleviewer.aspx?year=2004&issue=08000&article=00009&type=abstract ]

I propose several solutions to this problem. Firstly, studies should be powered for the expected effect in the overall population, and this effect should account for dilution caused by enrollment of patients who a posteriori are not the target population (e.g., ACTH responders or early-departures from the ICU.) Secondly, only overall results from the intention-to-treat population should be presented and heeded by clinicians. And thirdly, efforts to better identify the target population a priori should be undertaken. Surely Van den Berghe's group by now has sufficient data to predict who will remain in the ICU for more than 3-5 days. And surely those studying CSS could require a response or non-response to a rapid ACTH test as a requirement for enrollment or exclusion.

Wednesday, April 8, 2009

The PSA Screening Quagmire - If Ignorance is Bliss then 'Tis Folly to be Wise?

The March 26th NEJM was a veritable treasure trove of interesting evidence so I can't stop after praising NICE-SUGAR and railing on intensive insulin therapy. If 6000 patients (40,000 screened) seemed like a commendable and daunting study to conduct, consider that the PLCO Project Team randomized over 76,000 US men to screening versus control (http://content.nejm.org/cgi/reprint/360/13/1310.pdf) and the ERSPC Investigators randomized over 162,000 European men in a "real-time meta-analysis" of sorts (wherein multiple simultaneous studies were conducted with similar but different enrollment requirements and combined; see: http://content.nejm.org/cgi/reprint/360/13/1320.pdf.) This is, as the editorialist points out, a "Herculean effort," and that is fitting and poignant - because ongoing PSA screening efforts in current clinical practice represent a Herculean effort to reduce the morbidity and mortality of this disease, and this reinforces the importance of the research question - are we wasting our time? Are we doing more harm than good?

The lay press was quick to start trumpeting the downfall of PSA screening with headlines such as "Prostate Test Found to Save Few Lives". But for all their might, both of these studies give me, a longtime critic of cancer screening efforts, a good bit of pause. (Pulmonologists may be prone to "sour grapes" as a result of the failures of screening for lung cancer.)

Before I summarize briefly the studies and point out some interesting aspects of each, allow me to indulge in a few asides. First, I direct you to this interesting article in Medical Decision Making, "Cure Me Even if it Kills Me". This wonderful study in judgment and decision making shows how difficult it is for patients to live with the knowledge that there is a cancer, however small, growing in them. They want it out. And they want it out even if they are demonstrably worse off with it cut out or x-rayed out or whatever. It turns out that patients have a value for "getting rid of it" that probably arises from the emotional costs of living knowing there's a cancer in you. I highly recommend that anyone interested in cancer screening or treatment read this article.

This article invokes in me an unforgettable patient from my residency whom we screened in compliance with VA mandates at the time. Sure enough, this patient with heart disease had a mildly elevated PSA, and sure enough he had a cancer on biopsy. And we discussed treatments in concert with our Urology colleagues. While he had many options, this patient agonized and brooded and could not live with the thought of a cancer in him. He proceeded with radical prostatectomy, the most drastic of his options. And I will never forget that look of crestfallen resignation every time I saw him after that surgery, because he thereafter came to clinic in diapers, having been rendered incontinent and impotent by that surgery. He was more full of self-flagellating regret than any other patient I have seen in my career. This poor man and his experience certainly jaded me at a young age and made me highly attuned to the pitfalls of PSA screening.

Against this backdrop where cancer is the most feared diagnosis in medicine, we feel an urge towards action to screen and prevent, even when there is a marginal net benefit of cancer screening, and even when other greater opportunities for improving health exist. I need not go into the literature about [ir]rational risk appraisal other than to say that our overly-exuberant fear of cancer (relative to other concerns) almost certainly leads to unrealistic hopes for screening and prevention. Hence the great interest in and attention to these two studies.

In summary, the PLCO study showed no reduction in prostate-cancer-related mortality from DRE (digital rectal examination) and PSA screening. Absence of evidence is not evidence, however, and a few points about this study deserve to be made:

~Because of high (and increasing) screening rates in the control group, this was essentially a study of the "dose" of screening. The dose in the control group was ~45% and that in the screening group was ~85%. So the question that the study asked was not really "does screening work" but rather "does doubling the dose of screening work". Had there been a favorable trend in this study, I would have been tempted to double the effect size of the screening to infer the true effect, reasoning that if increasing screening from 40% to 80% reduces prostate cancer mortality by x%, then increasing screening from 0% to 80% would reduce it by 2x%. Alas, this was not the case with this study, which was underpowered.

~I am very wary of studies that have cause-specific mortality as an endpoint. There's just too much room for adjudication bias, as the editorialist points out. Moreover, if you reduce prostate cancer mortality but overall mortality is unchanged, what do I, as a potential patient, care? Great, you saved me from prostate cancer and I died at about the same time I would have, but from an MI or a CVA instead? We have to be careful about whether our goals are good ones - the goal should not be to "fight cancer" but rather to "improve overall health". The latter, I admit, is a much less enticing and invigorating banner. We like to feel like we're fighting. (Admittedly, overall mortality appears to not differ in this study, but I'm at a loss as to what's really being reported in Table 4.) The DSMB for the ERSPC trial argue here that cancer-specific mortality is most appropriate for screening trials because of dilution by other causes of mortality, and because screening for a specific cancer can only be expected to reduce mortality for that cancer. From an efficacy standpoint, I agree, but from an effectiveness standpoint, this position causes me to squint and tilt my head askance.

~It is so very interesting that this study was stopped not for futility, nor for harm, nor for efficacy, but because it was deemed necessary for the data to be released because of the [potential] impact on public health. And what has been the impact of those data? Utter confusion. That increasing screening from 40% to 80% does not improve prostate specific mortality does not say to me that we should reduce screening to 0%. In fact I don't know what to do, nor what to make of these data. Especially in the context of the next study.
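My "double the effect size" reasoning above amounts to a first-order contamination adjustment, assuming the benefit of screening is linear in uptake - a strong assumption, and the observed effect below is hypothetical:

```python
def adjusted_rrr(observed_rrr, uptake_screen_arm, uptake_control_arm):
    """Rescale an observed relative risk reduction for control-group
    contamination, assuming benefit is linear in screening uptake.
    A rough first-order sketch, not a substitute for proper modeling."""
    return observed_rrr / (uptake_screen_arm - uptake_control_arm)

# PLCO-like uptake (~85% vs ~45%): a hypothetical observed 4% RRR would
# imply a ~10% true RRR of screening versus no screening at all:
print(round(adjusted_rrr(0.04, 0.85, 0.45), 2))
```

With only a 40-point difference in uptake between arms, any true effect is diluted by more than half - which is why a null result in a heavily contaminated trial is so hard to interpret.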

In the ERSPC trial, investigators found a 20% reduction in prostate cancer deaths with screening with PSA alone in Europe. The same caveats regarding adjudication of this outcome notwithstanding, there are some very curious aspects of this trial that merit attention:

~This trial was, as I stated above, a "real-time meta-analysis" with many slightly different studies combined for analysis. I don't know what this does to internal or external validity because this is such an unfamiliar approach to me, but I'll be pondering it for a while I'm sure.

~I am concerned that I don't fully understand the way that interim analyses were performed in this trial, what the early stopping rules were, and whether a one-sided or two-sided alpha was used. Reference 6 states that it was one-sided but the index article says two-sided. Someone will have to help me out with the O'Brien-Fleming alpha spending function and let me know if 1% spending at each analysis is par for the course.

~As noted by the editorialist, we are not told what the "contamination rate" of screening in the control group is. If it is high, we might use my method described above to infer the actual impact of screening.

~Look at the survival curves that diverge and then appear to converge again at a low hazard rate. Is it any wonder that there is no impact on overall mortality?


So where does this all leave us? We have a population of physicians and patients that yearn for effective screening and believe in it, so much so that it is hard to conduct an uncontaminated study of screening. We have a US study that is stopped prematurely in order to inform public health, but which is inadequate to inform it. We have a European study which shows a benefit near the a priori expected benefit, but which has a bizarre design and is missing important data that we would like to consider before accepting the results. We have no hint of a benefit on overall mortality. We have lukewarm conclusions from both groups, and want desperately to know what the associated morbidities in each group are. We are spending vast amounts of resources and incurring an enormous emotional toll on men who live in fear after a positive PSA test, many of whom pay dearly ("a pound of flesh") to exorcise that fear. And we have a public over-reaction to the results of these studies which merely increase our quandary.

If ignorance is bliss, then truly 'tis folly to be wise. Perhaps this saying applies equally to individual patients, and the investigation of PSA screening in these large-scale trials. For my own part, this is one aspect of my health that I shall leave to fate and destiny, while I focus on more directly remediable aspects of preventive health, ones where the prevention is pleasurable (running and enjoying a Mediterranean diet) rather than painful (prostatectomy).