
Sunday, August 27, 2017

Just Do As I (Vaguely) Say: The Folly of Clinical Practice Guidelines

If you didn't care to know anything about finance, and you hired a financial adviser (paid hourly, not through commissions, of course), you would be happy to have him simply tell you to invest all of your assets in a Vanguard life cycle fund.  But you might then be surprised to learn that a different adviser told one of your contemporaries that this approach was oversimplified and that a portfolio should include several classes of assets not found in the life cycle funds, such as gold or commodities.  In light of the discrepancy, you might conclude that to make the best economic choices for yourself, you need to understand finance and the data upon which the advisers base their recommendations.

Making medical decisions optimally is akin to making economic decisions, and is founded on a simple framework: Expected Utility Theory (EUT).  To determine whether to pursue one course of action versus another, we sum the benefits of a course, each multiplied by its probability of accruing (that sum is the positive utility of the course of action), and subtract the sum of its costs, each multiplied by its probability of accruing (the negative utility).  If the net utility is positive, we pursue the course of action; if several options are available, we pursue the one with the highest net utility.  Ideally, anybody helping you navigate such a decision framework would tell you the numbers so you could do the calculus.  Using the finance analogy again, if the adviser told you "Stocks have positive returns.  So do bonds.  Stocks are riskier than bonds" - without any quantification - you might conclude that a portfolio full of bonds is the best course of action - and usually it is not.
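
To make the arithmetic concrete, here is a minimal sketch in Python of the calculation described above.  All values and probabilities are hypothetical, invented only to illustrate the framework; they are not estimates for any real clinical or financial decision.

    # Expected utility of a course of action: benefits weighted by their
    # probabilities of accruing, minus costs weighted by theirs.
    # All numbers below are hypothetical.
    def expected_utility(benefits, costs):
        """benefits and costs are lists of (value, probability) pairs."""
        positive = sum(value * prob for value, prob in benefits)
        negative = sum(value * prob for value, prob in costs)
        return positive - negative

    # A course with a benefit worth 10 accruing with 60% probability and
    # a cost worth 4 accruing with 30% probability:
    net = expected_utility(benefits=[(10, 0.6)], costs=[(4, 0.3)])
    print(net)  # 6.0 - 1.2 = 4.8 -> positive, so pursue it (or compare
                # it against the net utility of the alternatives)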

I regret to report that that is exactly what clinical practice guideline writers do:  provide summary information without any numerical data to support it, leaving the practitioner with two choices:

  1. Just do as the guideline writer says
  2. Go figure it out for herself with a primary data search

Thursday, April 6, 2017

Why Most True Research Findings Are Useless

In his provocative essay in PLOS Medicine over a decade ago, Ioannidis argued that most published research findings are false, owing to a variety of errors: p-hacking, data dredging, fraud, selective publication, researcher degrees of freedom, and many more.  In my permutation of his essay, I will go a step further and suggest that even if we limit our scrutiny to tentatively true research findings (scientific truth being inherently tentative), most research findings are useless.

My choice of the word "useless" may seem provocative, and even untenable, but it is intended to have an exquisitely specific meaning:  I mean useless in an economic sense of "having zero or negligible net utility", in the tradition of Expected Utility Theory [EUT], for individual decision making.  This does not mean that true findings are useless for the incremental accrual of scientific knowledge and understanding.  True research findings may be very valuable from the perspective of scientific progress, but still useless for individual decision making, whether it is the individual trying to determine what to eat to promote a long healthy life, or the physician trying to decide what to do for a patient in the ICU with delirium.  When evaluating a research finding that is thought to be true, and may at first blush seem important and useful, it is necessary to make a distinction between scientific utility and decisional utility.  Here I will argue that while many "true" research findings may have scientific utility, they have little decisional utility, and thus are "useless".

Wednesday, June 8, 2016

Once Bitten, Twice Try: Failed Trials of Extubation

“When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.”
– Clarke’s First Law

It is only fair to follow up my provocative post about a “trial of extubation” by chronicling a case or two that didn’t go as I had hoped.  Reader comments on the prior post described very low re-intubation rates.  As I alluded to in that post, decisions regarding extubation represent the classic trade-off between sensitivity and specificity.  If your test for “can breathe spontaneously” has high specificity, you will almost never re-intubate a patient.  But unless the criteria used have correspondingly high sensitivity, patients who can breathe spontaneously will be left on the vent for an extra day or two.  Which you (and your patients) favor, high sensitivity or high specificity (assuming you can’t have both), depends upon the values you ascribe to the various outcomes.  Though these are many, it really comes down to this:  what do you think is worse (or more fearsome), prolonged mechanical ventilation or reintubation?
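
To make the trade-off explicit, here is a minimal sketch in Python framing the extubation decision in expected utility terms.  Every probability and utility below is a hypothetical placeholder, not an estimate from any trial or from my practice.

    # Hypothetical utilities on a 0-1 scale (all numbers invented):
    U_EXTUBATED_OK   = 1.0   # extubated and stays off the vent
    U_REINTUBATED    = 0.6   # extubated but requires reintubation
    U_EXTRA_VENT_DAY = 0.8   # left on the vent an extra day or two

    def eu_extubate(p_success):
        """Expected utility of a trial of extubation."""
        return p_success * U_EXTUBATED_OK + (1 - p_success) * U_REINTUBATED

    def eu_wait():
        """Expected utility of leaving the patient on the vent."""
        return U_EXTRA_VENT_DAY

    # With these utilities, extubation wins whenever the probability of
    # success exceeds (0.8 - 0.6) / (1.0 - 0.6) = 0.5.  A clinician who
    # fears reintubation more (lower U_REINTUBATED) raises that threshold.
    for p in (0.4, 0.5, 0.6, 0.8):
        print(p, eu_extubate(p) > eu_wait())

The point is not the particular numbers, but that the threshold for attempting extubation falls directly out of how fearsome you judge reintubation to be relative to prolonged ventilation.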

What we fear today may not seem so fearsome in the future.  Surgeons classically struggled with the sensitivity and specificity trade-off in the decision to operate for suspected appendicitis.  “If you never have a negative laparotomy, you’re not operating enough” was the heuristic.  But this was based on the notion that failure to operate on a true appendicitis would lead to serious untoward outcomes.  More recent data suggest that this may not be so, and that many of those inflamed appendices could have been treated with antibiotics in lieu of surgery.  This is what I’m suggesting with reintubation:  I don’t think the Epstein odds ratio (~4) for mortality with reintubation, from 1996, applies today, at least not in my practice.

Thursday, July 19, 2007

The WAVE trial: The Canadians set the standard once again

Today's NEJM contains the report of an exemplary trial (the WAVE trial) comparing aspirin alone with the combination of aspirin and warfarin for the prevention of cardiovascular events in patients with peripheral vascular disease (http://content.nejm.org/cgi/reprint/357/3/217.pdf). Though this was a "negative" trial, in that there was no statistically significant difference in outcomes between the two treatment groups, I am struck by several features of its design that are worth mentioning.

Although the trial was the beneficiary of pharmaceutical funding, the authors state:

"None of the corporate sponsors had any role in the design or conduct of the trial, analysis of the data, or preparation of the manuscript".

Ideally, this would be true of all clinical trials, but right now it's a precocious idea.

One way to remove any potential or perceived conflicts of interest might be to mandate that no phase 3 study be designed, conducted, or analyzed by its sponsor. Rather, phase 3 trials could be funded by a sponsor, but are mandated to be designed, conducted, analyzed, and reported by an independent agency consisting of clinical trials experts, biostatisticians, etc. Such an agency might also receive infrastructural support from governmental agencies. It would have to be large enough to handle the volume of clinical trials, and large enough that a sponsor would not be able to know to what ad hoc design committee the trial would be assigned, thereby preventing unscrupulous sponsors from "stacking the deck" in favor of the agent in which they have an interest.

The authors of the current article also clearly define and describe inclusion and exclusion criteria for the trial, and these are not overly restrictive, increasing the generalizability of the results. Moreover, the rationale for the parsimonious inclusion and exclusion criteria is intuitively obvious, unlike some trials where the reader is left to guess why the authors excluded a particular subgroup. Was it because it was thought that the agent would not work in that group? Because increased risk was expected in that group? Because study was too difficult (ethically or logistically) in that group (e.g., pregnancy)? Inadequate justification of inclusion and exclusion criteria makes it difficult for practitioners to determine how to incorporate the findings into clinical practice.

For example, were pregnant patients excluded from trials of therapeutic hypothermia after cardiac arrest (http://content.nejm.org/cgi/reprint/346/8/549.pdf) for ethical reasons, because of an increased risk to the mother or fetus, because small numbers of pregnant patients were expected, because the IRB frowns upon their inclusion, or for some other reason? Without knowing this, it is difficult to know what to do with a pregnant woman who is comatose following cardiac arrest. Obviously, their lack of inclusion in the trial does not mean that this therapy is not efficacious for them (absence of evidence is not evidence of absence). If I knew that they were excluded because of a biologically plausible concern for harm to the fetus (and I can think of at least one) rather than because of IRB concerns, I would be better prepared to make a decision about this therapy when faced with a pregnant patient after cardiac arrest. Improving the reporting and justification of inclusion and exclusion criteria should be part of efforts to improve the quality of reporting of clinical trials.

Interestingly, the authors also present an analysis of the composite endpoints (coprimary endpoints 1 and 2) that excludes fatal bleeding and hemorrhagic stroke. When these adverse events are excluded from the composite endpoints, there is a trend favoring combination therapy (p values of 0.11 and 0.09, respectively). Composite endpoints are useful because they allow a trial of a given number of patients to have greater statistical power, and it is rational to include side effects in them, since side effects reduce the net value of the therapy. However, an economist or a person versed in expected utility theory (EUT) would say that it is not fair to combine these endpoints without first weighting them based on their relative (positive or negative) value. Not weighting them implies that an episode of severe bleeding in this trial is as bad (has the same negative value or utility) as a death - a contention that I for one would not support. I would much rather bleed than die, or have a heart attack for that matter. Bleeding can usually be readily and effectively treated.
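
To illustrate the weighting the economist would ask for, here is a minimal sketch in Python. The event rates and utility weights are hypothetical, invented for the example; they are not WAVE trial data.

    # Relative disutility of each component event, anchored to death = 1.0.
    # These weights are hypothetical.
    WEIGHTS = {"death": 1.0, "mi": 0.7, "severe_bleeding": 0.3}

    def weighted_composite(event_rates, weights=WEIGHTS):
        """Per-patient expected disutility rather than a raw event count."""
        return sum(weights[event] * rate for event, rate in event_rates.items())

    # Two hypothetical treatment arms (per-patient event rates):
    arm_a = {"death": 0.04, "mi": 0.06, "severe_bleeding": 0.01}
    arm_b = {"death": 0.03, "mi": 0.05, "severe_bleeding": 0.04}

    for name, arm in (("A", arm_a), ("B", arm_b)):
        print(name, round(sum(arm.values()), 3), round(weighted_composite(arm), 3))
    # Unweighted: A = 0.11, B = 0.12 -> B looks worse.
    # Weighted:   A = 0.085, B = 0.077 -> B looks better, because its
    # excess events are bleeds rather than deaths.

Note how the two composites rank the arms differently: counting every event equally punishes the arm whose excess events are the least fearsome ones.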

In the future, it may be worthwhile to think more about composite endpoints if we are really interested in the net value/utility of a therapy. While it is often difficult to assign relative values to different outcomes, methods (such as standard gambles) exist, and such assignment may be useful in determining the true net value (to society or to a patient) of a new therapy.
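
For readers unfamiliar with the standard gamble: the patient chooses between living with the health state in question for certain and a gamble offering full health with probability p and death with probability 1 - p; the probability p* at which the patient is indifferent is taken as the utility of that state. A minimal sketch in Python, with a hypothetical indifference point:

    # Standard gamble on a scale where death = 0 and full health = 1.
    # EU(gamble) = p * 1.0 + (1 - p) * 0.0 = p, so at indifference the
    # state's utility equals p*. The value below is hypothetical.
    def gamble_eu(p):
        return p * 1.0 + (1 - p) * 0.0

    p_star = 0.9  # hypothetical: indifferent at a 90% chance of full health
    print(gamble_eu(p_star))  # 0.9 -> the state is valued far above death

A state valued at 0.9 carries a disutility of 0.1 on this scale, one tenth that of death, which makes plain how unreasonable it is to count a severe bleed and a death equally in a composite endpoint.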