Assessing the ‘true’ effect of active antidepressant therapy v. placebo

I’ve just read this article in The British Journal of Psychiatry (2011) 199: 501-507. It proposes an explanation for why so many reviews fail to find a clinically relevant difference between antidepressants and placebo: the way the data are analyzed. The vast majority of studies use a statistical model called ANCOVA (analysis of covariance), which assumes a linear relationship between the outcome and a covariate. Let’s say on one axis we put a score for how improved a patient’s depression is. On the other axis we might put how severe the person’s depression was to begin with. That would allow us to evaluate whether those two factors vary together: they are “covariant”.

However, the problem highlighted in this paper is that depression and response to antidepressants may not be well represented by a smooth line. People may not exist along a continuous curve of depression, but rather fall into distinct groupings, and drawing a single line through those groups undermines the validity of the ANCOVA analysis. Instead, we might analyze the data by category: people who respond to the drug, people who don’t respond, and people who respond a little. Then, in each category, we measure how many people have serious depression, major depression, and minor depression, as measured on a depressive symptoms scale. The authors did this using something called a “mixture model”, which I was unfamiliar with.

The authors point to another paper I don’t have access to that suggests that the clinically significant effects are buried in the very minor improvements seen in some categories of patients. I suppose the best analogy I can come up with is evaluating a special fertilizer used to enhance the growth of garden crops. We apply that fertilizer to our garden but also the lawn, the patio and the driveway. Let’s say a total of 600 square meters are fertilized, of which the garden is only 100 square meters. We produce 30 kg of crops versus 15 kg the previous year, unfertilized. You could report this as generating a 15 kg increase per 100 sq m (Effectiveness Index = 0.15 kg/m2), or as a 15 kg increase per 600 sq m (Effectiveness Index = 0.025 kg/m2). If you are trying to figure out whether this fertilizer is cost effective, the difference between analyzing only the garden and the entire fertilized area is significant.
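For anyone who wants to check my arithmetic, here is a minimal sketch of the fertilizer analogy in Python. The numbers are the made-up ones from the paragraph above, and the function name `effectiveness_index` is just my own label for "gain divided by whichever area you pick as the denominator".

```python
# Figures from the fertilizer analogy (illustrative, not real data).
yield_gain_kg = 30 - 15      # extra crop versus the unfertilized year
garden_area_m2 = 100         # the only area that can actually respond
treated_area_m2 = 600        # garden + lawn + patio + driveway


def effectiveness_index(gain_kg, area_m2):
    """Yield gain per square meter of the area used as the denominator."""
    return gain_kg / area_m2


garden_only = effectiveness_index(yield_gain_kg, garden_area_m2)
whole_area = effectiveness_index(yield_gain_kg, treated_area_m2)

print(f"garden only: {garden_only:.3f} kg/m2")
print(f"whole area:  {whole_area:.3f} kg/m2")
print(f"dilution:    {garden_only / whole_area:.0f}x")
```

The same 15 kg gain looks six times smaller once the areas that could never respond are included in the denominator.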

Likewise, depression is often being over-treated as a matter of course. That sounds deplorable, but antidepressant therapy is a very blunt instrument and it is sometimes necessary to treat more people than truly respond. NNT is the “number needed to treat”, and we use it as a way of measuring how many patients have to be treated to get one additional good outcome beyond what placebo would have produced. The authors of this paper produce values of 6 for response and 8 for remission. That means you need to treat 6 people in order to get one additional clinically significant response, and 8 people to get one additional remission of depression. Not inspiring, I know, but actually much better than most previous studies. One out of 8 people on antidepressants is getting enough of a benefit out of the drugs to live a more-or-less normal life. One in 6 is at least getting measurable relief. The other 5, the ones who are suffering side effects with no significant relief from the drug itself, are the ones that concern me most on this topic.
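NNT is conventionally computed as the reciprocal of the absolute difference in outcome rates between drug and placebo, rounded up to a whole patient. Here is a small sketch of that calculation. The response and remission rates below are hypothetical numbers I picked so that the results land on 6 and 8; they are not taken from the paper.

```python
import math


def number_needed_to_treat(rate_drug, rate_placebo):
    """NNT = 1 / (absolute difference in outcome rates), rounded up."""
    absolute_benefit = rate_drug - rate_placebo
    if absolute_benefit <= 0:
        raise ValueError("drug must outperform placebo for a finite NNT")
    return math.ceil(1 / absolute_benefit)


# Hypothetical rates, chosen only for illustration.
nnt_response = number_needed_to_treat(rate_drug=0.50, rate_placebo=0.33)
nnt_remission = number_needed_to_treat(rate_drug=0.35, rate_placebo=0.22)

print("NNT for response: ", nnt_response)
print("NNT for remission:", nnt_remission)
```

Note how sensitive the figure is: a placebo response rate only a few points higher would push the NNT up quickly, which is part of why the choice of analysis matters so much.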

Final point, because I didn’t really cover it.  The authors found in the data a “bimodal response profile” in most cases (there were a few exceptions).  That means if we separate the patient set into two groups, they are much more distinct in terms of response to drug.

On the left here you should see Figure 1 from the paper.  The total population looks relatively flat, but when subdivided, two populations clearly emerge (bimodal distribution).
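To get a feel for how a mixture model can pull two populations out of a pooled sample, I put together a toy sketch. Big caveats: the data are simulated (a made-up "non-responder" group with a small drop in depression score and a made-up "responder" group with a large drop), and the fit is a plain two-component Gaussian mixture estimated by expectation-maximization, which may well differ from the model the authors used. All names and numbers are my own assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def fit_two_gaussians(data, iterations=200):
    """Fit a two-component 1-D Gaussian mixture by EM."""
    # Crude starting guess: one component at each extreme of the data.
    means = [min(data), max(data)]
    spread = statistics.pstdev(data)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each point belongs to component 0.
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp0.append(p0 / (p0 + p1))
        # M-step: re-estimate weight, mean and sd of each component.
        for k in (0, 1):
            r = resp0 if k == 0 else [1 - v for v in resp0]
            total = sum(r)
            weights[k] = total / len(data)
            means[k] = sum(ri * x for ri, x in zip(r, data)) / total
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, data)) / total
            sds[k] = max(math.sqrt(var), 1e-6)  # guard against collapse
    return weights, means, sds


# Simulated change in depression score (negative = improvement).
random.seed(0)
responders = [random.gauss(-20, 4) for _ in range(200)]
non_responders = [random.gauss(-5, 4) for _ in range(300)]
changes = responders + non_responders

weights, means, sds = fit_two_gaussians(changes)
print("pooled mean change:", round(statistics.mean(changes), 1))
for k in (0, 1):
    print(f"component {k}: weight={weights[k]:.2f}, "
          f"mean={means[k]:.1f}, sd={sds[k]:.1f}")
```

The pooled mean sits somewhere between the two groups and describes neither of them well, which is the point the paper is making about analyzing everyone as one population.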

I’m very interested in this because it was my initial assumption/bias that this would be the case.  I assume that we’re broadcasting this drug to everyone during clinical trials, regardless of whether we know they will respond, and I would fully expect that a drug that changes serotonin biology (SSRIs) would only be effective in patients with an underlying serotonin problem.  I’m always very careful when I find data to support my preconceptions, so I’m trying to scrutinize this analysis as carefully as possible to avoid getting carried away with confirmation bias.

A big part of my own research has been into how to randomize cancer patients on a responder/non-responder profile of “biomarkers”, so this type of bimodal distribution is very significant to me.


6 comments on “Assessing the ‘true’ effect of active antidepressant therapy v. placebo”

  1. I believe mixture models are very similar to cluster analysis. It looks at the population and forms them into sub-populations. It is then up to the researcher to determine what those sub-populations are. So for your example, cluster analysis should create 4 distinct groups, but with grass and the garden close to each other (they are both plants) and the patio and driveway close to each other (not-plants, or some other category) and those 2 groups would be further away from each other.

    I am not as sure if this would work if there were gradients in patients when it comes to level of depression and their response, but you did mention bimodal response.

    Now, onto an anecdotal note from someone who has taken antidepressants and has ADHD: my talks with my psychiatrist suggest there is a large issue with comorbidity in the studies. ADHD seems to share comorbidity with MANY other problems like anxiety, depression, etc. Treating my ADHD with stimulants as opposed to SSRIs for my depression, I have reduced my depression, ADHD, and anxiety symptoms.

  2. Doesn’t the research you cited earlier by John Ionniadis, or whatever his name is, specifically mention that reanalyzing the data to look for novel and flexible outcomes is a cause of increased type I error?
    If so, wouldn’t that mean that you are opening yourself up to a type I error if your experiment finds no results, but you then reanalyze the data with novel end points, based on dividing the population into several groups?

    “Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true. Flexibility increases the potential for transforming what would be “negative” results into “positive” results, i.e., bias.”

    “True findings may be more common when outcomes are unequivocal and universally agreed (e.g., death) rather than when multifarious outcomes are devised (e.g., scales for schizophrenia outcomes)”

    Seems like you are doing exactly what the research you have cited has warned against. Because not only are you reanalyzing the data with novel end points or outcomes, you are doing so with a scale for depression (MADRS).

    • I’m being absolutely genuine here: THANK YOU for the criticism! My goal on the Internet is to promote critical examination, including my own work, and you’ve asked one of those really tough questions. I try to remain ever mindful of my own shortcomings and biases.

      In this case, I’m suspicious of this paper for the reasons you mention, and also because they’re revisiting data already analyzed. However, John Ioannidis would have framed this paper as a confirmatory analysis. That is, it has a high proportion of true to false hypotheses to be tested. I believe he calls this measure “R”.

      The prior probability of this hypothesis being true is fairly high (if only because a number of independent studies have tended to converge on a single value), so I’m inclined to give it a fair consideration.

      It’s not the outcome of the study that really interested me, though, it’s the idea that subgroup analysis is needed to properly measure effect size. I think that idea is appealing, in that it lines up with my experience in cancer research.

      A chemo drug used to treat 1000 people with breast cancer will only actually work in a small fraction of those patients, say 150. When we express the results over the total population, the effect size seems very small when, in fact, the chemo is very effective in a sub-population of patients with a particular genetic complement or immune profile.

      Thanks again for keeping me honest!

  3. I’ve just realised that that should read ‘SNRI’ at the top – with a capital ‘I’ too. I just can’t have anything to do with c0nc0rdance without accidentally learning something new. I’m not sure there’s much room left in my walnut sized brain. I’ll have to start trying to forget some things before I come here next time.

  4. Interesting point about the bimodal analysis. SSRi’s and SSNi’s can have some significant, early, deleterious effects. I believe that many patients only show improvement after having been on them for a month or so, and a significant proportion simply stop taking them due to increased depression, dizziness, faintness, nausea etc. I know you’ve talked about this before. My personal (and only partially informed) feeling is that many people currently being prescribed anti-depressants should rather be on antianxiolytics (or should that just be ‘anxiolytics’?) such as the Valium based drug Ativan. Overproduction of stress hormones such as cortisol that isn’t connected to Cushing’s disease (which I believe implies having a brain tumour of some sort) is commonplace but often under-treated or mistreated due to the fact that Valium based therapies are addictive. Good luck with your cancer work c0nc0rdance. I hope you can explain further about the significance of bimodal analysis if you haven’t already.
