I’ve just read this article in The British Journal of Psychiatry (2011) 199: 501-507. It proposes an explanation for why so many reviews fail to find a clinically relevant difference between antidepressants and placebo: the way the data are analyzed. The vast majority of studies use a statistical model called ANCOVA (analysis of covariance), which compares the drug and placebo groups while adjusting for another variable, and which assumes that variable has a straight-line relationship with the outcome. Let’s say on one axis we put a score for how improved a patient’s depression is. On the other axis we might put how severe the person’s depression was to begin with. That would allow us to evaluate whether those two factors move together: they are “covariant”.
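To make the ANCOVA idea a bit more concrete, here is a minimal sketch in Python. It is not the analysis from the paper or from any particular trial: the scores are made up and noise-free, and the function name `ancova_adjusted_effect` is my own. It fits one common straight-line slope of improvement against baseline severity in both arms (that is the linearity assumption) and then compares drug and placebo as if both arms had started at the same baseline.

```python
from statistics import mean

def ancova_adjusted_effect(baseline_drug, change_drug, baseline_placebo, change_placebo):
    """Return (common slope, drug-minus-placebo improvement adjusted for baseline).

    One straight-line slope of improvement on baseline is pooled across both
    arms, then the raw difference in mean improvement is corrected for any
    difference in mean baseline severity between the arms.
    """
    sxy = sxx = 0.0
    for xs, ys in ((baseline_drug, change_drug), (baseline_placebo, change_placebo)):
        x_bar, y_bar = mean(xs), mean(ys)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        sxx += sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    raw_diff = mean(change_drug) - mean(change_placebo)
    baseline_diff = mean(baseline_drug) - mean(baseline_placebo)
    return slope, raw_diff - slope * baseline_diff

# Made-up, noise-free scores: improvement = 2 + 0.5 * baseline (+ 3 if on drug)
baseline_drug = [20, 24, 28, 32]
change_drug = [2 + 0.5 * b + 3 for b in baseline_drug]
baseline_placebo = [18, 22, 26, 30]
change_placebo = [2 + 0.5 * b for b in baseline_placebo]

slope, effect = ancova_adjusted_effect(
    baseline_drug, change_drug, baseline_placebo, change_placebo
)
print(f"common slope: {slope}, adjusted drug effect: {effect}")
```

The raw difference in mean improvement here is 4 points, but the drug arm started out 2 points more severe, so the adjusted effect comes back as the 3 points I built into the fake data. The whole adjustment rests on that single straight line being a fair description of everyone.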
However, the problem highlighted in this paper is that depression and response to antidepressants may not be well represented by a smooth line. People may not exist along a continuous curve of depression, but rather fall into distinct groupings, and drawing a single line through those groups undermines the validity of the ANCOVA. Instead, we might analyze the data by category: people who respond to the drug, people who don’t respond, and people who respond a little. Then in each category we measure how many people have serious depression, major depression, and minor depression, as measured on a depressive symptoms scale. The authors did this using something called a “mixture model”, which I was unfamiliar with.
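Since the mixture model was new to me, here is a rough sketch of the general idea in Python. I should stress that this is not the authors' model: it is a generic two-group Gaussian mixture fitted with the textbook EM algorithm, the improvement scores are simulated, and the group sizes and means (300 people improving about 4 points, 200 improving about 14) are numbers I invented for illustration.

```python
import math
import random
from statistics import mean, pstdev

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def fit_two_groups(scores, iterations=100):
    """Fit a two-component Gaussian mixture to one-dimensional scores with EM."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    # Crude starting guess: lower half vs upper half of the sorted scores
    mu = [mean(ordered[:half]), mean(ordered[half:])]
    sigma = [pstdev(ordered)] * 2
    weight = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each person belongs to the higher group
        resp = []
        for x in ordered:
            low = weight[0] * normal_pdf(x, mu[0], sigma[0])
            high = weight[1] * normal_pdf(x, mu[1], sigma[1])
            resp.append(high / (low + high))
        # M-step: re-estimate each group's size, mean and spread
        n_high = sum(resp)
        n_low = n - n_high
        mu = [
            sum((1 - r) * x for r, x in zip(resp, ordered)) / n_low,
            sum(r * x for r, x in zip(resp, ordered)) / n_high,
        ]
        sigma = [
            math.sqrt(sum((1 - r) * (x - mu[0]) ** 2 for r, x in zip(resp, ordered)) / n_low),
            math.sqrt(sum(r * (x - mu[1]) ** 2 for r, x in zip(resp, ordered)) / n_high),
        ]
        weight = [n_low / n, n_high / n]
    return weight, mu, sigma

# Simulated improvement scores (invented numbers, not trial data)
rng = random.Random(0)
scores = [rng.gauss(4, 2) for _ in range(300)] + [rng.gauss(14, 2) for _ in range(200)]

weight, mu, sigma = fit_two_groups(scores)
print("group shares:", [round(w, 2) for w in weight])
print("group means: ", [round(m, 1) for m in mu])
```

The point of the exercise is that the model is never told who belongs to which group. It recovers two clusters from the shape of the scores alone, which a single straight line through everyone cannot do.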
The authors point to another paper I don’t have access to that suggests that the clinically significant effects are buried in the very minor improvements seen in some categories of patients. I suppose the best analogy I can come up with is evaluating a special fertilizer used to enhance the growth of garden crops. We apply that fertilizer to our garden but also the lawn, the patio and the driveway. Let’s say a total of 600 square meters are fertilized, where the garden is only 100 square meters. We produce 30 kg of crops versus 15 kg the previous year, unfertilized. You could report this as generating a 15 kg increase per 100 sq m (Effectiveness Index = 0.15 kg/m²), or as a 15 kg increase per 600 sq m (Effectiveness Index = 0.025 kg/m²). If you are trying to figure out whether this fertilizer is cost effective, the sixfold difference between analyzing only the garden and analyzing the entire fertilized area is significant.
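The fertilizer arithmetic is simple enough to check in a few lines of Python. The variable names are mine, and the “Effectiveness Index” is just the label from the analogy, not a real agronomic measure.

```python
extra_yield_kg = 30 - 15           # this year's crop minus last year's unfertilized crop
garden_area_m2 = 100               # the only area that can actually grow crops
total_fertilized_area_m2 = 600     # garden plus lawn, patio and driveway

index_garden_only = extra_yield_kg / garden_area_m2
index_whole_area = extra_yield_kg / total_fertilized_area_m2
dilution = index_garden_only / index_whole_area

print(f"garden only: {index_garden_only:.3f} kg/m2")
print(f"whole area:  {index_whole_area:.3f} kg/m2")
print(f"dilution factor: {dilution:.0f}x")
```

Same fertilizer, same crops, but the apparent effect shrinks by the ratio of the two areas as soon as the places that could never respond are averaged in.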
Likewise, depression is often over-treated as a matter of course. That sounds deplorable, but antidepressant treatment is a very blunt instrument and it is sometimes necessary to treat more people than truly respond. NNT is the “number needed to treat”, and we use it as a way of measuring how many people have to be treated to get one truly effective response that would not have happened on placebo. The authors of this paper produce values of 6 for response and 8 for remission. That means you need to treat 6 people in order to get one clinically significant response, and 8 people have to be treated to get one remission of depression. Not inspiring, I know, but actually much better than most previous studies. One out of 8 people on antidepressants is getting enough of a benefit out of the drugs to live a more-or-less normal life. One in 6 is at least getting measurable relief. The other 5, the ones who are suffering side effects with no significant relief, are the ones that concern me most on this topic.
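For anyone who wants to see where an NNT comes from, it is the reciprocal of the absolute difference in response rate between drug and placebo, conventionally rounded up to a whole person. Here is a minimal sketch. The response and remission rates below are hypothetical: I picked them only because they reproduce NNTs of 6 and 8, and they are not the rates reported in the paper.

```python
import math
from fractions import Fraction

def number_needed_to_treat(rate_drug, rate_placebo):
    """NNT = 1 / (rate on drug - rate on placebo), rounded up to a whole person."""
    absolute_benefit = rate_drug - rate_placebo
    if absolute_benefit <= 0:
        raise ValueError("the drug must outperform placebo for an NNT to make sense")
    return math.ceil(1 / absolute_benefit)

# Hypothetical rates, chosen only to reproduce the NNTs quoted in the post
nnt_response = number_needed_to_treat(Fraction(1, 2), Fraction(1, 3))
nnt_remission = number_needed_to_treat(Fraction(3, 8), Fraction(1, 4))

print("NNT for response: ", nnt_response)
print("NNT for remission:", nnt_remission)
```

I used exact fractions rather than decimals so the rounding step cannot be thrown off by floating-point noise. Note that quite different pairs of rates can give the same NNT, which is why it is worth reading alongside the raw response rates.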
Final point, because I didn’t really cover it. The authors found in the data a “bimodal response profile” in most cases (there were a few exceptions). That means the responses cluster around two separate peaks rather than one, so if we separate the patient set into two groups, the groups are much more distinct in terms of response to the drug.
On the left here you should see Figure 1 from the paper. The total population looks relatively flat, but when subdivided, two populations clearly emerge (bimodal distribution).
I’m very interested in this because it was my initial assumption/bias that this would be the case. I assume that we’re broadcasting this drug to everyone during clinical trials, regardless of whether we know they will respond, and I would fully expect that a drug that changes serotonin biology (SSRIs) would only be effective in patients with an underlying serotonin problem. I’m always very careful when I find data to support my preconceptions, so I’m trying to scrutinize this analysis as carefully as possible to avoid getting carried away with confirmation bias.
A big part of my own research has been into how to randomize cancer patients on a responder/non-responder profile of “biomarkers”, so this type of bimodal distribution is very significant to me.