Introduction

Major depressive disorder (MDD) is a highly prevalent and debilitating disorder within the community. It affects 16 million adults in the USA each year; fewer than 50 % achieve remission with their initial antidepressant medication, and around a third need to try multiple options before finding one that works for them [1–3]. Trialling multiple treatment options can be a particularly difficult process for people experiencing depression, and there are currently no tools available to clinicians to predict ahead of time who is likely to respond to which kind of treatment. The International Study to Predict Optimized Treatment in Depression (iSPOT-D) is the first international study to take an integrative neuroscience approach to the discovery of new biomarkers to predict treatment outcome in depression, looking across the disciplines of structural and functional MRI, EEG, cognitive performance, and genomics, as well as clinical features and lifestyle factors.

iSPOT-D is a randomized clinical trial conducted at 22 sites across 5 countries (USA, UK, the Netherlands, Australia, and New Zealand). All assessments were undertaken at all sites using identical methodology, with the exception of structural and functional MRI, which were collected in a subset of patients at two sites (in the USA and Australia). Patients were recruited in two phases: an initial sample of n = 1008 patients used to develop biomarkers and a second independent sample of patients to test the replicability of these biomarkers. Assessments were taken at a medication-free baseline session and again after 8 weeks of treatment with one of three commonly used antidepressants, escitalopram (SSRI), sertraline (SSRI), or venlafaxine-XR (SNRI), allocated in a randomized 1:1:1 ratio and administered by the patient's own clinician on a treatment-as-usual basis. Depressive symptoms were assessed using the clinician-rated HDRS17 and self-rated QIDS-SR16, with a focus on the week 8 endpoints of symptom remission (scores at or below 7 and 5, respectively) and symptom response (reduction of 50 % or greater). The first sample of n = 1008 had clinician-rated remission and response rates of 45 and 62 %, respectively, and self-rated remission and response rates of 38 and 53 %, respectively. A group of n = 336 healthy controls was also assessed using the same methodology, without taking medication. Further details of the study protocol and sample demographics are provided in Williams et al. and Saveanu et al. [4, 5].
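
The week 8 endpoints described above amount to simple threshold rules. The following Python sketch illustrates them; it is not the study's analysis code. The function and argument names are hypothetical, and the response rule assumes that the 50 % reduction is measured relative to the baseline score.

```python
# Illustrative only: names are hypothetical; cutoffs are those quoted in the text.
REMISSION_CUTOFF = {"HDRS17": 7, "QIDS-SR16": 5}


def is_remitter(scale: str, week8_score: float) -> bool:
    """Remission: week 8 score at or below the cutoff for the given scale."""
    return week8_score <= REMISSION_CUTOFF[scale]


def is_responder(baseline_score: float, week8_score: float) -> bool:
    """Response: reduction of 50 % or greater, assumed relative to baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    reduction = (baseline_score - week8_score) / baseline_score
    return reduction >= 0.5
```

Under these rules a patient can be a responder without being a remitter, for example an HDRS17 score falling from 24 to 12, which is why the two endpoints are reported separately throughout.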

In the past year, the first biomarkers from within each of these disciplines have been established in the initial sample of n = 1008 patients. The focus in developing these biomarkers has been twofold: firstly, establishing the relationships between these measures and symptom reduction, and secondly, considering the real-world utility of these biomarkers in a clinical setting by evaluating the accuracy with which these measures can predict the likely outcome of treatment ahead of time.

Structural Imaging Biomarkers

Diffusion Tensor Imaging

Diffusion tensor imaging (DTI) provides a method of measuring the structural connectivity of white matter tracts, using the measure of fractional anisotropy (FA). FA quantifies how strongly the diffusion of water molecules is constrained to a single direction, which is used to identify and characterize the white matter along these tracts, thereby reflecting their connective capacity and the speed with which information can travel along them. The initial focus of the iSPOT-D study has been on the connectivity of circuits related to the subgenual anterior cingulate cortex and the amygdala-hippocampal complex, as these circuits hold a dominant place in neurobiological models of both MDD and the action of antidepressants [6–8]. Recent evidence from DTI methods has shown that the density of these tracts can distinguish MDD patients from healthy controls [9].

Using backward logistic regression in a sample of 74 patients and pooling all three medication groups together, the connectivity of two tracts in this region was found to be related to symptom remission: the cingulate part of the cingulum bundle (connecting the cingulate gyrus and the hippocampus) and the stria terminalis (the axonal projection from the amygdala). The white matter connectivity of these tracts predicted treatment remission and non-remission with 63 % accuracy (56 % sensitivity and 68 % specificity), and adding age into this model further increased accuracy to 74 % (74 % sensitivity and 75 % specificity). Remission of symptoms was related to greater connectivity in the cingulate gyrus (which was also closer to a healthy control level) and less connectivity in the stria terminalis. Being younger was also related to a greater chance of remission [10]. This is consistent with several previous studies linking better treatment response in MDD with greater rostral anterior cingulate activity [11] and greater subgenual cingulate activity and less ventrolateral prefrontal cortical activity during emotion processing [12].

There was also some evidence for a predictive role of axonal projections from the hippocampus (the fornix). This structure significantly predicted a response to treatment (50 % reduction in symptoms), but not full symptom remission.
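
The accuracy, sensitivity, and specificity figures reported throughout this review are related through the counts of a two-by-two classification table. The Python sketch below illustrates that relationship. The counts in the example are invented for illustration and are not iSPOT-D data, and treating remission as the "positive" class is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from a two-by-two table (remission assumed to be the positive class)."""
    tp: int  # remitters predicted to remit
    fn: int  # remitters predicted not to remit
    tn: int  # non-remitters predicted not to remit
    fp: int  # non-remitters predicted to remit

    @property
    def sensitivity(self) -> float:
        # Proportion of actual remitters correctly identified.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of actual non-remitters correctly identified.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Proportion of all patients correctly classified.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Invented counts, for illustration only.
example = Confusion(tp=30, fn=10, tn=45, fp=15)
```

Because accuracy is the average of sensitivity and specificity weighted by the proportions of remitters and non-remitters in the sample, it always lies between the two, as in the figures quoted above.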

The next step will be to combine these structural connectivity predictors with functional measures of connectivity. This will enhance our understanding of how these white matter tracts contribute to the operation of dynamic functional networks and relate to symptomatology and the action of drugs in shifting to remission.

Subgroups with Very High Accuracy Predictions

In a second analysis, structural imaging volumetric and DTI measures were combined to explore whether a smaller subset of remitters or non-remitters could be identified, for whom very high certainty predictions of 80 % accuracy or greater could be made.

Quantified receiver operating characteristics (QROC), a classification technique well suited for identifying discrete subgroups, was used for this purpose [13]. Decision trees were created based upon the measures and cutoff thresholds that provided the best predictive ROC curves at each level, thereby revealing specific subgroups within the dataset with particularly low and high remission rates. The one drawback to this method is that there is a risk of overfitting the data, and since cross-validation is difficult to perform with this method, a replication step in a second group of patients was also included as part of this analysis.

Pooling the medication groups together, QROC decision trees were established in an initial training group of n = 74 patients, and a new group of n = 83 patients were then run through these trees. From this, two combinations of decision rules were identified and subsequently replicated in the second group that predicted non-remission with greater than 80 % accuracy [14]. No decision rules were identified that predicted remission with this accuracy.

In the first model, a combination of lower middle frontal volume (below 14.8 mL) and greater right angular gyrus volume (above 6.3 mL) identified a group in which 85 % were non-remitters. This group represented 55 % of all non-remitters. In the second model, a combination of connective tracts with low connectivity in the left cingulum bundle, the right superior fronto-occipital fasciculus, and the right superior longitudinal fasciculus (fractional anisotropy values less than 0.63, 0.54, and 0.50, respectively) identified non-remitters with 84 % accuracy and accounted for 15 % of all non-remitters.

Combining the two decision trees gave the best predictive result. When the groups identified by the two models are combined, 61 % of non-remitters are accounted for, and identified with 84 % accuracy (i.e., 84 % of those in the group are non-remitters). Clinically, this group is highly unlikely to remit to any of the three antidepressant medications tested, and having this knowledge ahead of time may enable clinicians to rule out these treatments earlier on and proceed with alternative treatment options instead.
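
The two sets of decision rules can be read as a simple rule-based flag. The Python sketch below encodes the thresholds quoted above for illustration only; it is not the published QROC implementation and is not a clinical tool. The field names are hypothetical, and combining the two models is assumed to mean taking the union of the two flagged groups.

```python
from dataclasses import dataclass


@dataclass
class StructuralMeasures:
    """Hypothetical container; volumes in mL, FA values are dimensionless."""
    middle_frontal_volume_ml: float
    right_angular_gyrus_volume_ml: float
    left_cingulum_fa: float
    right_sup_fronto_occipital_fa: float
    right_sup_longitudinal_fa: float


def volumetric_rule(m: StructuralMeasures) -> bool:
    """First model: lower middle frontal and greater right angular gyrus volume."""
    return (m.middle_frontal_volume_ml < 14.8
            and m.right_angular_gyrus_volume_ml > 6.3)


def connectivity_rule(m: StructuralMeasures) -> bool:
    """Second model: low FA in all three tracts."""
    return (m.left_cingulum_fa < 0.63
            and m.right_sup_fronto_occipital_fa < 0.54
            and m.right_sup_longitudinal_fa < 0.50)


def flagged_as_likely_non_remitter(m: StructuralMeasures) -> bool:
    """Combined group: a patient matching either model (assumed union)."""
    return volumetric_rule(m) or connectivity_rule(m)
```

A patient who matches neither rule is simply not covered by these models; the rules make no prediction of remission for such patients.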

At the neural level, these models also implicate less medial frontal volume and angular gyrus volume and less long-range connectivity, being related to a much poorer likelihood of achieving remission. Functionally, the medial frontal region is widely known to play a role in emotion and executive function, while the role of the angular gyrus in depression is less known, and possibly relates to its role in the default mode network and cognition. Integration of these results with data measures from the other disciplines may be able to better elucidate these mechanisms.

Functional Imaging Biomarkers

Emotion Processing

In exploring functional imaging biomarkers of treatment outcome, the initial focus has been on the limbic system and emotional brain networks, and particularly, the role of the amygdala within these networks, which has previously been associated with both MDD and the treatment of MDD [15–17].

Emotional brain networks were activated by viewing images of facial expressions, a demonstrated method of engaging these networks [18, 19]. In addition, the automatic, nonconscious aspects of these networks can be teased apart from more conscious and elaborate processing by presenting these images at a very fast speed that is below conscious perception (subliminal presentation of 10 ms duration, followed by a masking neutral face of 150 ms duration), as well as at a standard conscious perception speed (supraliminal presentation of 500 ms duration [20, 21]). Facial expressions of happiness, fear, anger, and sadness were compared to a neutral expression baseline.

In a sample of n = 80 patients, amygdala activation to the subliminal perception of happiness, fear, and anger was found to predict symptom response to treatment (50 % reduction in symptoms), but not symptom remission. When combined with age, response and non-response were predicted with an accuracy of 75 % (77 % sensitivity and 72 % specificity), remaining at 75 % after cross-validation [22].

Compared to non-responders, responders showed less bilateral activation of the amygdala during nonconscious processing of happiness and less left amygdala activation during nonconscious processing of threat-related fear and anger expressions. This greater engagement of the amygdala for non-responders was at the same level as that for a matched sample of 34 healthy controls, whereas responders showed reduced amygdala activation relative to both healthy controls and non-responders, which then moved towards normalizing by week 8.

For the SSRIs (escitalopram and sertraline), amygdala activation during nonconscious processing of sadness followed a similar pattern to the other emotions, showing a trend towards less activation in responders than in non-responders and healthy controls. However, a different pattern was observed for the SNRI venlafaxine-XR, with responders and healthy controls being the same, and non-responders showing significantly greater activation than both groups. The combination of amygdala activity to subliminal sadness and age predicted response and non-response for venlafaxine-XR with 81 % accuracy (87 % sensitivity and 73 % specificity) and with a cross-validated accuracy of 77 %.

This suggests that a general loss of reactivity to emotional signals may be related to being predisposed to respond to antidepressant medication, whereas a specific mood-congruent effect of being hyper-responsive to signals of sadness indicates the involvement of a more specific network that responds to treatment with an SNRI.

Amygdala activation during the conscious processing of facial emotion did not predict either symptom response or remission. This is in line with previous work finding that the amygdala is more likely to be activated by emotional stimuli when fewer task-related demands are present [21].

Cognitive Processing

In the same 80 patients and 34 healthy controls, pretreatment activation during three cognitive tasks was evaluated for an association with treatment outcome—a two-tone oddball task reflecting processing of stimulus significance, a go/no-go task reflecting inhibition of automatic responses and a one-back task reflecting updating of working memory [23]. These tasks were selected to capture basic aspects of cognitive processing that are used across many different types of cognitive tasks and in everyday life. Nineteen regions of interest from two core neural circuits in cognitive functioning were selected a priori as the regions that would be focused on—the fronto-parietal attention network and the cingulo-opercular network.

Only the engagement of the dorsolateral prefrontal cortex (DLPFC; part of the fronto-parietal attention network) during the inhibition of automatic responses was found to be predictive of treatment outcome. Non-remitters showed less activity in this region than both remitters and healthy controls, with remitters and healthy controls showing no difference from each other. This treatment outcome difference was predicted with 49 % sensitivity and 65 % specificity, reflecting the predictive strength being in the ability to identify the non-remitters.

Additionally, for the two SSRIs, less inferior parietal activity (also part of the fronto-parietal attention network) during inhibition of responses in the go/no-go task was again found for non-remitters compared to remitters, with no difference between remitters and healthy controls. This treatment outcome difference was predicted with 45 % sensitivity and 69 % specificity, again reflecting the predictive strength being in the prediction of non-remission.

In the full n = 1008 sample, a separate analysis found that overall poor cognitive functioning is also associated with poorer treatment outcome [24]. It may be that the engagement of the DLPFC and inferior parietal lobule, and more generally the fronto-parietal attention network that they are part of, reflects an ability in remitters to maintain cognitive function against the influences of MDD, and potentially a generally maintained ability to retain cognitive resources and thus more readily respond to treatment. This is also supported by a previous report of a reduction in depressive symptoms being predicted by greater right lateral and medial prefrontal cortical activation during inhibition of responses [25]. The other two tasks, engaging working memory and the processing of stimulus significance, did not show any relationship to treatment outcome.

Electroencephalography Biomarkers

Frontal Theta

The theta band of the electroencephalography (EEG) power spectrum is a measure of activity oscillating at a frequency of 4–8 cycles per second, most commonly associated with drowsiness and decreased attention and cognitive control. Increased frontal theta has been previously associated with non-response to antidepressants [26–28], although there have also been some conflicting reports [29–31].

In a sample of n = 667 per protocol completers from the larger n = 1008 sample, theta EEG activity was compared between treatment responders and non-responders, for each of the three antidepressants. A difference in theta activity was found only for the SNRI venlafaxine-XR, with responders being found to have less theta band activity than non-responders. Treatment response and remission were not related to theta activity for the two SSRIs or when all medication arms were pooled together [32].

eLORETA current source density analysis was subsequently used to localize the region of the cortex from which this difference for the venlafaxine-XR treatment arm originated. The greater theta activity in non-responders compared to responders was found to be mainly localized to the medial frontal gyrus, the rostral anterior cingulate gyrus, and the paracentral lobule. However, when theta activity was extracted separately for activity localized to the rostral anterior cingulate cortex (rACC) and for theta of a more widespread cortical origin, both types of theta were correlated with symptom improvement (in a sample of n = 667 per protocol completers across all treatment arms). Due to the nature of this analysis, accuracy statistics on the potential predictive value of this measure could not be calculated.

These results suggest that the relationship between frontal theta activity and treatment response in MDD may be specific to particular types of antidepressants, with a relationship being found here for the SNRI but not the two SSRI treatments. This also aligns with the mixed results reported in the literature. This measure may hold promise for the prediction of SNRI treatment outcome, with future analyses designed to evaluate the predictive accuracy of theta band activity.

Lateralized Activity at Rest

Activity within the alpha frequency band of the EEG spectrum (8 to 12 cycles per second) is commonly used as a measure of the brain at rest, with less alpha band activity in a given region indicating that the region is being more actively engaged at that time. The laterality of the resting brain—whether activity is greater on the left or the right side of the brain—has long been held to differ between MDD and healthy controls, with MDD patients being more right lateralized in frontal regions. This difference has been associated with differing engagement of the approach system [33], although there have also been some conflicting findings [34, 35]. Right-lateralized occipital activity has also been associated with non-response to treatment [36]; however, this has not been investigated extensively or for other brain regions.

In the same sample of n = 667 per protocol completers as for the EEG theta analyses, pretreatment frontal laterality was evaluated during resting eyes open and eyes closed conditions, for a relationship to the outcomes of response and remission. For female MDD patients, for the SSRIs only, non-remitters and non-responders showed greater right-lateralized activity than remitters and responders. This predicted remission and non-remission with a sensitivity of 60 % and a specificity of 64 %, and response and non-response with a sensitivity of 74 %, but a specificity of 48 %. No differences in frontal laterality were found between MDD patients and healthy controls [37].

These results bring new light to the long-held findings of greater right laterality differentiating MDD patients from healthy controls, in extending this to a biomarker of treatment outcome prediction, with greater right laterality predicting non-remission to treatment. The limitation of this finding to SSRI remission in females may also shed light on the conflicted findings in the literature, particularly given that no difference between MDD and healthy controls was observed in the current study.

Cognitive Performance Biomarkers

Cognitive performance was assessed using an established and validated cognitive and emotional assessment battery that was completed by all participants on a touchscreen computer [38–40]. Assessment details are provided in Table 1.

Table 1 Cognitive and emotion tasks

Exploration Using Pattern Classification

In an initial exploration of the relationship between pretreatment cognitive performance and treatment outcome, pattern classification techniques were used to derive data-driven combinations of cognitive measures most related to treatment outcome. This was run separately for those patients who showed a widespread cognitive impairment (25 % of the sample) and those who did not.

Predictive classifications of sufficient significance were considered to be those with 60 % or greater accuracy. Classifications above this level of accuracy were found for escitalopram and sertraline, and only for the more cognitively impaired subgroup. Symptom remission in this group was predicted with 72 % accuracy (79 % sensitivity and 69 % specificity) for escitalopram and 64 % accuracy (66 % sensitivity and 63 % specificity) for sertraline [24].

Predictive classifications for venlafaxine-XR, and for those without a general cognitive impairment, did not reach this level of accuracy. These initial results from pattern classification demonstrated the predictive power of cognitive performance for treatment outcome, for at least some drugs and subgroups of patients.

Cognitive Subgroups

Having established that cognitive performance could be predictive of treatment outcome, this was then extended by considering whether the development of more tailored predictive models could further improve accuracy and expand the patient sample that they apply to [41].

This second approach established predictive models for several different cognitive subgroups, using cross-validated logistic regression analysis. These patient subgroups were defined according to individual cognitive tasks as well as summary metrics of general and emotional cognition, and also some non-cognitive demographic and symptom severity measures. By design, each of these subgroups accounted for approximately half of the patient sample. Several different subsets of cognitive measure predictors were also created, according to both theoretical and data-driven metrics.

Using this method, separate predictive models were established for each treatment arm that exceeded 60 % sensitivity and specificity. These models applied to a different subgroup of patients for each treatment arm, with no single model applying to the full set of MDD patients for any of the treatment arms.

For escitalopram, in a subgroup of patients showing relatively poor emotional cognition (below the MDD patient group median), symptom remission was predicted with 72 % sensitivity and 67 % specificity. In this group, better attention skills and faster speed of responding predicted greater likelihood of symptom remission. For sertraline, several measures of general cognition predicted treatment outcome with 69 % sensitivity and 70 % specificity, but only for patients older than 30 years of age. In this group, better performance across tasks of working memory, attention flexibility, and executive function was predictive of symptom remission. For venlafaxine-XR, baseline symptom severity was predictive of treatment outcome (those less severe at baseline being more likely to achieve remission), but only for those with good general (non-emotion) cognitive capacity (above the MDD group median). Remission was predicted with 67 % sensitivity and 62 % specificity for this group.

These results suggest that these cognitive tests may be tapping into neural circuits that are specific to the mechanism of action for each drug. Given the high feasibility and low cost of administering this kind of test, these cognitive biomarkers may provide good candidates for tests that could be utilized in a real-world clinical setting.

Very High Accuracy Subgroups

Extending from these results, a second aim was to determine whether there are even smaller subgroups for whom either remission or non-remission can be predicted with a high enough accuracy that the prediction would be of clinical benefit despite applying only to a small percentage of patients. Positive and negative predictive values of 80 % or greater were set as a clinically beneficial threshold for this purpose [41].

Prediction models were established in a similar manner to the previous analyses, using a logistic regression approach, and several sets of cognitive predictor measures established using data-driven and theoretical methods. For positive predictive values (PPV), a model exceeding a level of 80 % was found only for sertraline. This model could be applied to 14 % of the patient sample. No models exceeding this PPV level were found for escitalopram or venlafaxine-XR.

For negative predictive values (NPV), models exceeding a level of 80 % were found for all three drugs. These models were able to be applied to 24 % of patients for escitalopram, 54 % of patients for sertraline, and 27 % of patients for venlafaxine-XR.
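
Positive and negative predictive values differ from sensitivity and specificity in that they are conditioned on the prediction rather than on the true outcome. The Python sketch below shows how they can be computed from paired predictions and outcomes. The function name, the constant name, and the toy data are hypothetical and do not come from the iSPOT-D analyses; only the 80 % threshold is taken from the text.

```python
from typing import Iterable, Tuple

CLINICAL_THRESHOLD = 0.80  # threshold described in the text


def predictive_values(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (PPV, NPV) from (predicted_remission, actual_remission) pairs."""
    tp = fp = tn = fn = 0
    for predicted, actual in pairs:
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and not actual:
            tn += 1
        else:
            fn += 1
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative prediction")
    ppv = tp / (tp + fp)  # of those predicted to remit, fraction who did
    npv = tn / (tn + fn)  # of those predicted not to remit, fraction who did not
    return ppv, npv


# Invented toy data: 10 predicted remitters (9 correct), 20 predicted non-remitters (17 correct).
toy = ([(True, True)] * 9 + [(True, False)] * 1
       + [(False, False)] * 17 + [(False, True)] * 3)
ppv, npv = predictive_values(toy)
```

In this toy example both values exceed the 80 % threshold; in practice a model may clear the threshold for only one of the two, as was the case for escitalopram and venlafaxine-XR above.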

Genomic Biomarkers

This first investigation from iSPOT-D into the predictive potential of genomic SNPs focused on the transportation of antidepressants across the blood-brain barrier. Among the proteins involved in this transportation is P-glycoprotein, which is encoded by the ABCB1 gene. Genetic variation in this transporter has previously been found to be associated with antidepressant treatment outcome and side effects, but results have been mixed and come from several small studies across differing patient populations. The current analysis is the first investigation of this relationship in a large-scale prospective clinical trial.

Nine SNPs in or near the ABCB1 location were investigated for their association with both treatment remission and side effects, in a sample of n = 576 per protocol patients who completed the week 8 follow-up assessment and also had genomic information available. Logistic regression analyses were used, with each SNP coded 0, 1, or 2 for each person, to reflect the number of minor, or rarer, alleles present (i.e., 2 major alleles, 1 minor and 1 major allele, or 2 minor alleles, respectively).
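
This coding scheme counts the minor alleles in each person's genotype. The Python sketch below illustrates it; the function name and the two-letter genotype string format are assumptions made for illustration and do not describe the study's actual data format.

```python
def minor_allele_count(genotype: str, minor_allele: str) -> int:
    """Additive coding: number of minor alleles (0, 1, or 2) in a two-allele genotype.

    `genotype` is assumed to be a two-letter string such as "GT".
    """
    genotype = genotype.upper()
    if len(genotype) != 2:
        raise ValueError("expected a two-allele genotype such as 'GT'")
    return genotype.count(minor_allele.upper())
```

For a SNP with major allele G and minor allele T, the genotypes GG, GT, and TT would be coded 0, 1, and 2, so the regression coefficient describes the change in the log-odds of the outcome per additional minor allele.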

Of the nine SNPs analyzed, only one (rs10245483) was associated with symptom remission. The major G allele was associated with greater likelihood of remission than non-remission with SSRIs escitalopram and sertraline, while the minor T allele was associated with greater likelihood of remission than non-remission with SNRI venlafaxine-XR, as well as fewer side effects [42]. When considering just the homozygous GG and TT SNP variants (51 % of the sample), remission to the SSRIs was predicted with 70 % sensitivity and 52 % specificity, and remission to the SNRI predicted with 61 % sensitivity and 65 % specificity.

These effects were then further explored in combination with the cognition groups of impaired and intact cognition that had been used for the initial cognition biomarker analyses. The results suggest that there may be some overlap in the groups that are being picked up in these two separate analyses. The association between the major G allele and greater likelihood of remission for escitalopram and sertraline was more apparent in the cognitively intact group, whereas the association between the minor T allele and remission for venlafaxine-XR was more apparent for the cognitively impaired group.

The next step in these analyses will be to look at other genes that have replicated associations with depression and antidepressant action, as well as a wider range of exploratory analyses across other gene variants.

Clinical Biomarkers

The evidence for the utility of clinical features of depression in predicting the likelihood of responding to antidepressant medications, or specifically to either SSRIs or SNRIs, is mixed. Particular clinical subtypes are commonly held to be less likely to respond to antidepressants, such as melancholic, atypical, and anxious depression subtypes. However, there are as many investigations that do not find these relationships as those that do.

These common clinical subtypes were investigated in the iSPOT-D patient sample, and there was found to be no relationship between the presence of any of these subtypes and treatment outcome [43]. A high proportion of 75 % of the sample was found to meet criteria for at least one clinical subtype, of which 52 % met criteria for more than one subtype. This overlap between subtypes is not often considered and may explain the lack of a clear relationship to treatment outcome in the literature. The specific combinations of multiple subtypes also did not relate to treatment outcome nor did the number of subtypes that a patient met criteria for.

However, even though meeting criteria for specific subtypes was not predictive of treatment outcome, the severity of anxiety and depressive symptoms has been found to have predictive utility. Self-reported anxiety symptoms were found to be related to symptom remission, even after controlling for depressive symptom severity, comorbid diagnoses, and side effects [5]. Greater severity of depressive symptoms at baseline was found to be predictive of a lower likelihood of remitting, but only for the SNRI venlafaxine-XR, and only in those with better cognitive performance [41]. Symptom severity was not prognostic when poor cognition was also present.

Next Steps: Integration and Replication

Initial biomarkers for antidepressant treatment outcome have been identified over the past year from each of the discipline areas in the iSPOT-D trial—structural and functional imaging, EEG, cognition, genomics, and clinical features—all from the same sample of patients (see Table 2). Taking an integrative neuroscience approach to the discovery of these biomarkers, the next step will be to bring these separate pieces of information together.

Table 2 Predictors of treatment outcome

There are already indications that some of these biomarkers from across the different discipline areas may be tapping the same fundamental drivers. For example, markers relating to less attention and cognitive control being predictive of non-remission are present across several domains, including greater presence of slow frontal theta band EEG rhythms, less task-related engagement of dorsolateral prefrontal cortex during behavioral inhibitory control from functional imaging, smaller volume of the medial frontal region and less long-range connectivity, and poorer cognitive performance.

There are also several predictors that have arisen that are specific either to the SSRIs or to the SNRI. Symptom improvement to the SSRIs has been predicted by both the rs10245483 SNP major allele and better performance on the cognitive assessment, and in females, by less right-lateralized resting brain activity. In contrast, symptom improvement to the SNRI has been uniquely predicted by the rs10245483 SNP minor allele, less amygdala engagement to sad facial expressions, and less frontal EEG theta activity.

These different biomarkers may represent different pieces of the same puzzle, each providing greater insight by coming from a different perspective. Alternatively, they may capture discrete subgroups that respond to the drugs for different reasons, in which case combining these measures into a single predictive model may improve the accuracy of the predictions, or make them applicable to a wider group of patients. Having these measures in the same people will allow these possibilities to be tested empirically.

Replication of any scientific finding is of central importance and is a key aspect of the iSPOT-D trial. A second sample of MDD patients has been collected as part of the study design, for the purpose of testing the replicability of biomarkers that have been established from the first sample. Analysis of the second imaging sample has begun, and analysis of the larger non-imaging sample will begin later this year.

The key driver of the iSPOT-D study is to develop simple, accessible tests that can be used as an adjunct in selecting antidepressant treatments and that are feasible for use in a typical real-world clinical setting. In patient care, clinicians are increasingly likely to take a stepwise approach and to provide a continuum of care—starting with the simplest and most cost-effective solution, and increasingly moving upwards as is necessary. It may be that the most clinically feasible integration of treatment prediction biomarkers will fall within this model, with patients first taking an easily accessible and low-cost cognitive predictive test; then genomics being considered for those that the cognitive predictors do not apply to and then other EEG and imaging components being considered after that, dependent on the accessibility and cost-effectiveness of each.

To date, the first biomarkers across several domains of speciality have been developed in the initial study sample, with 14 main predictive biomarkers of treatment outcome in total. With the possible integration of these biomarkers, and replication in a new patient sample, these hold promise to provide the first predictive biomarkers of treatment outcome in psychiatry that are of sufficient clinical value and feasibility to be integrated into clinical practice.