
Are Therapists Uniformly Effective Across


Patient Outcome Domains? A Study on
Therapist Effectiveness in Two Different
Treatme....

Article in Journal of Counseling Psychology · April 2016


DOI: 10.1037/cou0000151



Journal of Counseling Psychology
Are Therapists Uniformly Effective Across Patient Outcome
Domains? A Study on Therapist Effectiveness in Two
Different Treatment Contexts
Helene A. Nissen-Lie, Simon B. Goldberg, William T. Hoyt, Fredrik Falkenström, Rolf Holmqvist,
Stevan Lars Nielsen, and Bruce E. Wampold
Online First Publication, April 28, 2016. http://dx.doi.org/10.1037/cou0000151

CITATION
Nissen-Lie, H. A., Goldberg, S. B., Hoyt, W. T., Falkenström, F., Holmqvist, R., Nielsen, S. L., &
Wampold, B. E. (2016, April 28). Are Therapists Uniformly Effective Across Patient Outcome
Domains? A Study on Therapist Effectiveness in Two Different Treatment Contexts. Journal of
Counseling Psychology. Advance online publication. http://dx.doi.org/10.1037/cou0000151
Journal of Counseling Psychology © 2016 American Psychological Association
2016, Vol. 63, No. 3, 000 0022-0167/16/$12.00 http://dx.doi.org/10.1037/cou0000151

Are Therapists Uniformly Effective Across Patient Outcome Domains? A


Study on Therapist Effectiveness in Two Different Treatment Contexts

Helene A. Nissen-Lie, University of Oslo
Simon B. Goldberg and William T. Hoyt, University of Wisconsin-Madison
Fredrik Falkenström and Rolf Holmqvist, Linköping University
Stevan Lars Nielsen, Brigham Young University
Bruce E. Wampold, University of Wisconsin-Madison and Modum Bad Psychiatric Center, Vikersund, Norway

This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

As established in several studies, therapists differ in effectiveness. A vital research task now is to
understand what characterizes more or less effective therapists, and investigate whether this differential
effectiveness systematically depends on client factors, such as the type of mental health problem. The
purpose of the current study was to examine whether therapists are universally effective across patient
outcome domains reflecting different areas of mental health functioning. Data were obtained from 2 sites:
the Research Consortium of Counseling and Psychological Services in Higher Education (N = 5,828) in
the United States and from primary and secondary care units (N = 616) in Sweden. Outcome domains
were assessed via the Outcome Questionnaire-45 (Lambert et al., 2004) and the CORE-OM (Evans et al.,
2002). Multilevel models with observations nested within patients were used to derive a reliable estimate
for each patient’s change (which we call a multilevel growth d) based on all reported assessment points.
Next, 2 multilevel confirmatory factor analytic models were fit in which these effect sizes (multilevel ds)
for the 3 subscales of the OQ-45 (Study 1) and 6 subscales of CORE-OM (Study 2) were indicators of
1 common latent factor at the therapist level. In both data sets, such a model, reflecting a global therapist
effectiveness factor, yielded large factor loadings and excellent model fit. Results suggest that therapists
effective (or ineffective) within one outcome domain are also effective within another outcome domain.
Tentatively, therapist effectiveness can thus be conceived of as a global construct.

Keywords: therapist effects, therapist uniformity, multilevel factor analysis

Helene A. Nissen-Lie, Department of Psychology, University of Oslo; Simon B. Goldberg and William T. Hoyt, Department of Counseling Psychology, University of Wisconsin-Madison; Fredrik Falkenström and Rolf Holmqvist, Department of Behavioural Sciences and Learning, Linköping University; Stevan Lars Nielsen, Department of Psychology, Brigham Young University; Bruce E. Wampold, Department of Counseling Psychology, University of Wisconsin-Madison and Modum Bad Psychiatric Center, Vikersund, Norway.

Correspondence concerning this article should be addressed to Helene A. Nissen-Lie, Department of Psychology, University of Oslo, P.O. Box 1094 Blindern, 0317 Oslo, Norway. E-mail: h.a.nissen-lie@psykologi.uio.no

When patients seek help from a therapist, they typically hope that the therapist has the ability to help them with the particular problem or difficulty with which they struggle. Unless they know about the therapist’s particular area of expertise, they must trust that the therapist is able to address a range of psychological problems. Likewise, when training therapists, most universities and training institutions would expect their trainees to learn to work with a range of mental health concerns when they enter the profession (Norcross & Beutler, 2000). Usually, the aim is to foster a clinically versatile and therapeutically flexible clinician who can treat a range of patients. Similarly, a mental health clinic would likely want to employ clinicians able to treat most concerns in the patient population they serve (see Benton, Robertson, Tseng, Newton, & Benton, 2003).

Does reality meet the expectation that therapists are skilled across problem domains? In other words, is therapist effectiveness a global factor, or does it depend on the type of patient difficulty? We will explore this question with client outcome data from the United States and Sweden in two different investigations using new methods to study therapist uniformity.

Previous research has persuasively demonstrated that therapists do differ in effectiveness (Baldwin & Imel, 2013; Crits-Christoph & Mintz, 1991; Kim, Wampold, & Bolt, 2006; Kraus, Castonguay, Boswell, Nordberg, & Hayes, 2011; Lutz, Leon, Martinovich, Lyons, & Stiles, 2007; Nissen-Lie, Monsen, Ulleberg, & Rønnestad, 2013; Okiishi, Lambert, Nielsen, & Ogles, 2003; Wampold & Brown, 2005). In a meta-analysis of therapist effects based on k = 45 studies, Baldwin and Imel (2013) reported that 5% of the variability in outcomes was explained by differences among therapists. While 5% may seem modest, it should be understood in context. For one, the effect of receiving psychotherapy versus not is generally estimated to be at or below Cohen’s d = 0.80
(Wampold & Imel, 2015), which translates into explaining about 14% of the variation in outcome (Baldwin & Imel, 2013). In addition, the proportion of variance explained by the therapist is roughly equivalent in magnitude to the variability in outcomes attributable to other key therapy ingredients, most notably the therapeutic alliance (Horvath, Del Re, Flückiger, & Symonds, 2011). Further, even small differences can yield substantial real-world effects. Saxon and Barkham (2012), for example, found that of 119 therapists in practice treating almost 2,000 patients, 19 therapists had outcomes that were considered “below average.” If the patients treated by these therapists had been seen by one of the other 100 therapists instead, an additional 265 patients of the total patient sample would likely have recovered (Laska, Gurman, & Wampold, 2014). For these 265 patients, the effect of the therapist was crucial.

Having established that therapists differ in effectiveness, the next step is to examine whether this difference varies systematically with certain client factors (e.g., gender; Owen, Wong, & Rodolfa, 2009), and how to explain therapist effects in terms of characteristics of effective and less effective psychotherapists (e.g., Anderson, Ogles, Pattersen, Lambert, & Vermeersch, 2009; Nissen-Lie et al., 2013). Ultimately, our understanding of this question will have important implications for the training and supervision of therapists, as well as for how mental health care is organized, as mentioned above.

In this study, we examine whether the conjecture that therapists are effective across patient outcome domains, as measured by two frequently used outcome questionnaires, is justified. In some ways this question parallels a debate within psychotherapy research generally. In psychotherapy, considerable debate has occurred regarding whether psychotherapy works due to the specific ingredients present in a given treatment (i.e., treatment methods, specific interventions; Chambless & Hollon, 1998) or due to factors that are common across therapies (i.e., therapeutic alliance; Wampold & Imel, 2015). Similarly, therapists may be effective due to specific skills they possess for treating a specific disorder (which would lead to differentiation in effectiveness across domains) or they may be effective due to more global therapist characteristics (e.g., ability to form alliance with a variety of patients, therapist interpersonal skills; Schottke, Flückiger, Goldberg, Eversmann, & Lange, 2015), which would lead to uniformity in outcomes across domains. It should be noted that, despite the fact that common factors have been linked to outcome to a greater extent than specific ingredients (see Wampold & Imel, 2015), it is possible, even likely, that psychotherapy works through a successful interplay between the two sets of factors. Applied here, it is plausible that therapist effectiveness can be both global and specific.

Previous research on whether therapist effectiveness depends systematically on client factors is scarce, but there are studies covering client problem type and problem severity, gender, and ethnicity. We will provide a short overview of these studies.

Some investigations suggest that more effective therapists are effective across a range of patients. For example, Wampold and Brown (2005) demonstrated consistency across time by predicting therapists’ patient outcomes in a later time frame from patient outcomes at an earlier time. Further, therapists who were successful in treating adults were also successful in treating children, and those who were effective with moderately distressed patients were also effective with more severely distressed patients. In other words, therapist effectiveness was consistent across patient age and symptom severity (Wampold & Brown, 2005). In support of this notion, a recently published study by Green, Barkham, Kellett, and Saxon (2014) reported a remarkably high correlation (viz., r = .96) between therapist rankings in treating depressive symptoms and treating anxiety symptoms in a large sample of patients (N = 1,122) seen by 21 practitioners, indicating that therapists who are more effective at treating depression are also more effective at treating anxiety. This correlation may be less noteworthy than it appears, since treating depression and treating anxiety probably do not represent very different clinical challenges, as both involve internalizing symptoms, whereas a mixture of externalizing and internalizing symptoms might present more variation and thus potentially require more specific skills. However, proponents of a medical model of mental health and treatment, in particular, might argue that there are important differences in treating anxiety versus depression, since these are viewed as two different types of illness (e.g., in CBT one would suggest exposure for anxiety and behavioral activation or changing maladaptive schemas for depression, which potentially require quite different skills).

There is contrasting evidence that some therapists are more skilled in treating some types of problems than other problems. In a large sample of 6,960 patients treated by 696 therapists, Kraus et al. (2011) rank ordered therapists by their effectiveness in each of 12 patient problem domains (viz., sexual functioning, work functioning, violence, social functioning, panic/anxiety, substance abuse, psychosis, quality of life, sleep, suicidality, depression, and mania) assessed by the Treatment Outcome Package (TOP; Kraus, Seligman, & Jordan, 2005). Within each problem domain, reliable change scores were calculated for each patient and then therapists were ranked from the one who most consistently created change in the domain to the one who least consistently created change. To assess the uniformity of therapist effectiveness, Kraus et al. correlated these rank orderings for the various domains. The correlations across outcome domains were relatively modest (.03 to .37), even if they were mostly statistically significant. Based on this, the authors concluded that the therapists were not globally competent. They noted: “Therapists skilled in one domain may be harmful in another” (Kraus et al., 2011, p. 273). In support of this finding of therapist-specific competency, Nordberg et al. (2010) found that therapists’ specific skill at treating uncomplicated depressive symptoms was compromised when depressive patients also presented with comorbid substance abuse, which indicated that these patient differences represented different clinical challenges.

There is also evidence that there are therapist effects for client gender (e.g., Owen et al., 2009), suggesting that a specific psychotherapist gender competence may exist. Examining outcomes of 31 psychotherapists treating 93 male and 229 female clients, Owen et al. found that while the gender of the client did not relate to outcome (women and men had similar outcomes), some therapists were better at treating men, whereas others were better at treating women, while the remaining therapists did equally well or equally poorly with male and female clients. Even though differential gender competence did account for a meaningful proportion of the within-therapist variance in client outcomes in this study, it was not large and data did not allow for an analysis of why some therapists might have been more helpful for female clients compared to male
clients or vice versa. Therapists’ own gender did not account for this difference.

In addition to this finding, a number of studies seem to find evidence of a specific cultural competence of psychotherapists (Hayes, Owen, & Bieschke, 2015; Imel et al., 2011). These studies, examining disparities in therapists’ effectiveness in treating racial/ethnic minority (REM) patients compared to treating majority (White) patients, have found evidence that some therapists produce better outcomes with REM patients than other therapists, suggesting that therapeutic competence in treating majority clients and the more specific cultural competence needed when treating patients from cultural minorities are distinct. However, again, the effects found were relatively small, and generally those therapists who obtained better outcomes with minority (or REM) patients also obtained better outcomes with majority (White) patients, which seems to indicate that there is overlap between the two competencies.

There are many methodological approaches to the question of therapist uniformity, but all the studies mentioned above operationally defined therapist effectiveness or competence on the basis of client outcomes, as recommended in the literature (e.g., Wampold, 2005). In our study we build on this analytic approach and propose that psychotherapists’ global competence across outcome domains should be defined as the psychotherapist’s ability to achieve positive psychotherapy outcomes across domains. In addition, in our study we propose some new analytic tools to examine the extent to which therapist effects depend on patient outcome domain. First, we investigated outcome measures with subscales (or domains) that are relevant for all patients (i.e., symptom distress, work functioning, interpersonal problems), irrespective of clinical diagnosis. This may be important because some problem types (like mania) occur at lower base rates, which will attenuate associations at the therapist level and increase the likelihood of therapists failing to show effectiveness in a given problem area (when this may simply be due to not encountering patients with a given difficulty, or not addressing that area of difficulty during treatment). Second, the study uses data from two different treatment settings (even different cultures), including a primary sample (Study 1) with data from the United States and a replication¹ (Study 2) with data from Sweden. We address the question of therapist uniformity across outcomes through the use of two different but commonly used outcome measures (i.e., the OQ-45 and the CORE-OM, see below). We hope this design reduces cultural, site, and instrument effects. Also, particularly in the primary study, we have a large sample size, with many patients per therapist to ensure reliable estimates of therapist effectiveness (see Baldwin, Imel, & Atkins, 2012; Baldwin & Imel, 2013). Finally, by applying a multilevel modeling approach to investigate the question, both when deriving a reliable patient effect size and when addressing effectiveness across domains, we account for the nested structure of the data (with patient level effect sizes in outcome nested within therapists). The aims of the present study were:

1. To investigate the conjecture of therapist uniformity with new analytic methods

2. To investigate the extent to which therapists are effective across outcome domains

¹ When we use the term replication study, we mean replication of the statistical methods used to examine our research question, not a replication of all methodological features, such as inclusion criteria, sample characteristics, treatments and so on.

Method

Participants and Procedures (Study 1)

Data were obtained from the treatment research archive at the counseling center of a large, public university in the Western United States. Data were collected over the course of 18.43 years (from December 1995 to May 2014) on therapists in practice during that period. Psychotherapy at the counseling center was provided without session limits or extra fees beyond academic tuition. Patients completed the Outcome Questionnaire-45 (OQ-45; Lambert et al., 2004) prior to each session. We limited our analyses of the available data in several ways in keeping with studies examining naturalistic psychotherapy data (e.g., Baldwin, Berkeljon, Atkins, Olsen, & Nielsen, 2009). First, we included only outcome data from individual counseling sessions (excluding group and couples therapy). Second, to avoid cross-classification of patients and therapists, we included individuals who met with only one therapist at a time. Third, we included only the first episode of care with this therapist, considering an episode of care as ending if a period of 120 days had elapsed between sessions. Fourth, we included only those patients who attended at least three sessions and completed at least three OQ-45 measures. Fifth, we included only those patients whose first OQ-45 total score was in the clinical range (i.e., 63 or above; Lambert et al., 2004). This was especially important because these data are from a university treatment center; this cut-off is typically used when findings are meant to generalize to broader clinical practice. Lastly, we included only those patients whose therapist had 10 or more cases in the data set. Setting a minimum number of clients per therapist was intended to allow more reliable estimates of therapist-level outcomes (Baldwin et al., 2012). The data set included OQ measurements from a total of 49,600 sessions. Patients attended on average 8.58 sessions (SD = 8.48, range = 3 to 153).

Patients. Based on these requirements, sufficient data were available for 5,828 patients: 3,672 (63.0%) were women and 2,156 were men; average age at intake was 22.63 years (SD = 4.11). Reported ethnicities were 81.9% Caucasian; 6.0% Hispanic; 3.4% Asian; 1.4% Indigenous American; 1.3% Pacific Islander; 0.8% Black; 0.5% Other; and 4.6% gave no report. No diagnostic assessment was conducted, so we do not have information on clinical diagnoses of the patients enrolled. Consent was obtained prior to initiating treatment. Patients had agreed to use of these de-identified records in research, and the university’s human subject review board approved use of these de-identified records.

Therapists. Psychotherapy was provided by 158 therapists, 65 (41.1%) women and 93 men. On average, therapists saw 36.89 patients in the data set (SD = 47.76, range = 10 to 333). Of the psychotherapy sessions provided, 30.5% were provided by trainees, 38.7% were provided by licensed professionals, and 30.8% were provided by the therapists who straddled these two statuses. The majority of therapists described themselves as following an integrative or eclectic approach to treatment, adopting techniques, interventions, and styles as seemed to fit the therapeutic situation.
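
As a rough illustration only, the Study 1 inclusion rules described under Participants and Procedures above could be expressed as in the sketch below. It is not the authors' procedure. The record layout (`patient`, `therapist`, `date`, `oq_total`) and the function names are hypothetical. The sketch also assumes that "only one therapist at a time" can be simplified to "a single therapist in the patient's record", that a gap of 120 days or more ends an episode, that every record carries a completed OQ-45, that group and couples sessions were removed upstream, and that therapist caseloads are counted after the patient-level exclusions.

```python
from collections import defaultdict

CLINICAL_CUTOFF = 63   # first OQ-45 total score must be at or above this value
MIN_SESSIONS = 3       # at least three sessions, each with a completed OQ-45
MAX_GAP_DAYS = 120     # assumed: a gap of this many days ends the episode
MIN_CASELOAD = 10      # therapist must have at least this many included cases


def first_episode(sessions):
    """Return the sessions of the first episode of care, in date order.

    Each session is a dict whose "date" value is a datetime.date.
    """
    ordered = sorted(sessions, key=lambda s: s["date"])
    episode = ordered[:1]
    for prev, cur in zip(ordered, ordered[1:]):
        if (cur["date"] - prev["date"]).days >= MAX_GAP_DAYS:
            break  # a long gap closes the first episode
        episode.append(cur)
    return episode


def select_patients(records):
    """Apply the inclusion rules; return {patient: first-episode sessions}."""
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient"]].append(rec)

    eligible = {}
    for patient, sessions in by_patient.items():
        # Simplified cross-classification rule: exactly one therapist overall.
        if len({s["therapist"] for s in sessions}) != 1:
            continue
        episode = first_episode(sessions)
        if len(episode) < MIN_SESSIONS:
            continue
        if episode[0]["oq_total"] < CLINICAL_CUTOFF:
            continue
        eligible[patient] = episode

    # Count each therapist's included cases, then drop small caseloads.
    caseload = defaultdict(int)
    for episode in eligible.values():
        caseload[episode[0]["therapist"]] += 1
    return {
        patient: episode
        for patient, episode in eligible.items()
        if caseload[episode[0]["therapist"]] >= MIN_CASELOAD
    }
```

The order of the filters matters in this sketch: because caseloads are counted last, a therapist who drops below 10 cases after the patient-level exclusions is removed along with all of that therapist's patients. The article does not state the order the authors used.
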
Exceptions were one therapist who described himself as a practitioner of Rational Emotive Behavior therapy, another who described herself as a psychodynamically oriented therapist, and two others who identified themselves as ACT (Acceptance and Commitment Therapy) therapists. We do not have information on the total amount of clinical experience these therapists had, but as a proxy we know that they had between 0.32 and 17.93 years of data available in the data set, with a mean of 4.91 years (SD = 5.15). Additionally, therapists were on average 33.08 years old (SD = 9.14, range = 22.72 to 70.10) with an average of 5.38 years (SD = 7.71, range = 0.05 to 40.10) since they began their graduate training.

Outcome Measure (Study 1)

In Study 1, patient outcome domains were measured with the 45-item Outcome Questionnaire (OQ-45; Lambert et al., 2004). The OQ was designed to measure progress in therapy by assessing patients’ reports about the previous week. Nine OQ items are positively worded (e.g., Item 1, “I get along well with others”); 36 items describe problems (e.g., Item 2, “I tire quickly”). Respondents rate the frequency at which each event or situation occurred on a 5-point Likert-type scale ranging from Never to Almost Always. The 45 OQ items were developed to assess three domains: Symptom Distress (SD, 25 items, e.g., “I feel no interest in things”), Interpersonal Relationships (IR, 11 items, e.g., “I have frequent arguments”), and Social Role performance (SR, 9 items, e.g., “I feel stressed at work/school”). The measure has been widely used and shown to possess desirable psychometric properties, including adequate internal consistency reliability (αs = .93, .71, and .78 for the SD, SR, and IR subscales in the current sample) and adequate test–retest reliability over a 3-week range (from .78 to .84; Snell, Mallinckrodt, Hill, & Lambert, 2001). The OQ instrument has demonstrated strong evidence of construct validity via associations with measures such as the Symptom Checklist-90 and the Beck Depression Inventory (Lambert et al., 1996/2011), and has evidenced sensitivity to change in psychotherapy (Lambert et al., 1996; Okiishi et al., 2003).

The hypothesized three-factor structure (i.e., SD, IR, and SR) of the 45 items has been reproduced in a variety of samples with moderate to good fit (see Amble et al., 2014). There have, however, been studies failing to demonstrate this solution, instead proposing alternative factors to represent the OQ scores (e.g., Kim, Beretvas, & Sherry, 2010; Rice, Suh, & Ege, 2014). We chose to employ the traditional three-factor solution intended by the scale developers (Lambert et al., 1996) when analyzing therapist effectiveness across subscales. This was done in order to most closely match the way the OQ is typically used in clinical and research contexts. Additional empirical support for this choice is provided within the Statistical Analyses (below) and Results sections.

Participants and Procedures (Study 2)

In the replication study (Study 2), data were collected at primary care units (Holmqvist, Ström, & Foldemo, 2014) and psychiatric specialty clinics, both in Sweden (see Falkenström, Josefsson, Berggren, & Holmqvist, 2016). Except for the criterion of scores within the clinical range at baseline (which if applied would lead to too low a number of therapists to conduct our analyses on therapist uniformity), the patients selected met the same criteria as in Study 1: (a) patients had attended therapy and completed assessments for at least three sessions and (b) patients had been treated by a therapist who had at least 10 patients enrolled in the study. The sample meeting these criteria was 616 outpatients treated by 38 therapists (see below for more information on these participants). The average patient caseload per therapist enrolled in this study was 16.20 patients (ranging from 10–28). The mean number of sessions in the primary care sample was 7.20 sessions (range = 3–55) and in the psychiatric sample 7.65 (range = 3–54). Primary care patients were included during a period of 6 months (November 2009 through April 2010). The psychiatry data were collected from 2011 through 2014. The patient received an envelope containing the CORE-OM at the reception desk, completed it in the waiting room before the session, and delivered it in a closed envelope. The therapist did not see the ratings. The study had received approval from the regional ethical vetting board in Linköping (2012-355–31). Also in this sample, consent was obtained prior to initiating treatment.

Patients. The primary care data consisted of reports from 520 patients treated by 31 therapists. Patients were referred by general practitioners, or self-referred to psychological treatment at these primary care units. The psychological treatment is usually an adjunct treatment available for those patients whom the medical doctors evaluate as being in need of more or other forms of treatment than the medication that the doctor may prescribe. About a third of the patients used psychotropic medication, mostly antidepressants. The secondary care patients (n = 96), treated by seven psychotherapists, were referred from primary care to specialized psychiatric outpatient units. Among the primary care patients, the mean age was 37.3 (SD = 14.3); among the psychiatry patients it was 32.5 (SD = 13.7). Most of the patients in Study 2 were born in Sweden (>90%). The primary care sample was 80% female and the psychiatric sample 58% female. Formal diagnostic assessment was not part of the treatment in the primary care units, so we do not have information on psychiatric diagnoses in this sample. However, we have information on patients’ presenting problems as reported by the therapists in an assessment form. The most common presenting problems in the primary care sample were anxiety, relationship problems, depression, grief, work-related problems, and somatic complaints. The majority of patients had more than one problem. In the psychiatric clinic, assessment was done according to the ICD-10 (10th revision of the International Statistical Classification of Diseases and Related Health Problems; World Health Organization, 2004). The most common diagnoses among these patients were anxiety disorders (30%) and mood disorders (20%). The most common treatment types in the primary care group, which were evenly distributed, were CBT, psychodynamic, and supportive therapies, but crisis intervention, cognitive, behavioral, relational, existential, systemic, and interpersonal therapies were also reported by the therapists. In the psychiatric sample a majority of the patients used psychotropic medication, mostly antidepressants. In this sample, according to the therapist-reported assessment forms, a large proportion suffered from moderate to severe depression (about 40%) or moderate to severe anxiety disorder (about 40%). The most common treatment types according to therapists’ reports were CBT, psychodynamic, and supportive therapy. As in the primary care sample, many therapists reported using more than one therapeutic orientation. The patients
included in Study 2 and treatments they received are viewed as available for the Swedish translation of the questionnaire (Elfström
representative for mental health care in Sweden (see also Hol- et al., 2013). In the current study the internal consistency of six of
mqvist et al., 2014; Falkenström et al., 2016). the seven subscales used was acceptable for all subscales: ␣ ⫽ .72
Therapists. The 31 therapists in primary care were social (Subjective well-being), .80 (Depression), .77 (Anxiety), .66
workers (n ⫽ 20), psychologists (n ⫽ 10), and one psychiatric (Problems in close relationships), .80 (Problems in general func-
nurse. Their mean years of practice experience was 10.0 years. tioning), and .60 (Problems in social relationships). The one ex-
Twenty-eight therapists had postgraduate training in psychother- ception was for Physical symptoms (␣ ⫽ .42), which was likely
apy (in addition to the training included in their basic education), due to the subscale being based on only two items. This subscale
and eight had also received advanced postgraduate psychotherapy was thus excluded from analyses (see below). Again, we relied on
training, leading to certification according to the Swedish system. the factors proposed by the scale developers (Evans et al., 2002)
Seventeen therapists had training in psychodynamic therapy when examining therapist uniformity in these data. As with the
(PDT), 15 in cognitive therapy and nine in cognitive– behavioral OQ-45, factor analyses of the CORE-OM yield a somewhat com-
therapy (CBT) or behavior therapy. Many therapists had training in plex structure; some analyzes show two different factors repre-
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

several methods. In the secondary care sample there were seven senting psychological distress and risk, or three different factors
This document is copyrighted by the American Psychological Association or one of its allied publishers.

therapists. These therapists were psychologists or social workers reflecting positively and negatively keyed items and risk items, or
with basic or advanced training in psychotherapy. Most of them a structure with lower order factors loading on a higher order
had received training in both CBT and PDT. Most therapists in factor latent for psychological distress (Lyne, Barrett, Evans, &
Study 2 were female (80%). Barkham, 2006). We discuss the implications of the pretreatment
factor structure in the present sample in the Results section.
Outcome Measure (Study 2)
In Study 2, patient outcome domains were measured with the Statistical Analyses
Clinical Outcome in Routine Evaluation–Outcome Measure
(CORE-OM; Evans et al., 2002). The CORE-OM comprises 34 Subscale scores of the OQ-45 and CORE-OM were computed as
items developed to assess four main (problematic) aspects of a the mean of the items drawn from that subscale based on published
patient’s life: Subjective Well-Being, Problems, Function, and scoring methods. This was done to support the generalizability of
Risk, which in turn reflect 10 different outcome domains: 1) the current study to other researchers and clinicians using these
Subjective well-being (e.g., “I have felt overwhelmed by prob- same measures. We examined therapist variation in baseline scores
lems”) measured by four items; 2) Depression (e.g., “I have felt on these subscales by computing intraclass correlations (ICCs),
despairing and hopeless”) measured by four items; 3) Anxiety denoting therapist variance in proportion to total variance (patient
(e.g., “I have felt panic or terror”) measured by four items; 4) and therapist variance). Additionally, we examined intercorrela-
Trauma (e.g., “Unwanted images or memories have been distress- tions of the subscales at baseline, both at the within level (patient
ing me”) measured by two items; 5) Physical symptoms (e.g., “I level) and at the between level (therapist level) to provide an
have difficulty getting to sleep or staying asleep”) measured by indication of their overlap and to assess the degree to which they
two items; 6) Problems in close relationships (e.g., “I have felt I were independent at each level.
have no friends”) measured by four items; 7) Problems in general Effect size estimation procedure. To prepare for the therapist
functioning (e.g., “I have been able to do most things I needed uniformity analyses using a multilevel confirmatory factor analytic
to”)² measured by four items; 8) Problems in social functioning framework (see below), we calculated patient-level effect sizes
(e.g., “I have felt humiliated or shamed by other people”) measured which we call multilevel growth ds. In doing so we fitted two-level
by four items; 9) Risk to self (e.g., “I made plans to end my life”) multilevel models (MLMs) with outcome observations nested
measured by four items; and 10) Risk to other (e.g., “I have been within patients predicting scores on OQ-45 or CORE-OM subscales
physically violent to others”) measured by two items. Patients’ from session number (centered at the first session). These
scores on Subscales 1, 2, 3, 6, 7, and 8 were used in the present MLMs were fit using the R programming language (version 3.1.0;
(replication) study only, excluding those of Subscales 4, 9, and 10 R Development Core Team, 2014) and the “nlme” multilevel
(trauma, risk to self, and risk to other), which were relatively modeling package (Pinheiro, Bates, DebRoy, Sarkar, & the R
infrequently reported and thus not relevant to most patients, and Development Core Team, 2013). Separate models were constructed
Subscale 5 (Physical symptoms) due to poor internal consistency. for each OQ-45 and CORE-OM subscale. First, simpler
Respondents rated the frequency at which each event or situation models were constructed that included only a fixed effect for
occurred within the last week on a 5-point Likert-type scale session number. These models were then compared with more
ranging from 0 (Never) to 4 (Almost all the time). The CORE-OM complex models that included additional random slope parameters
has demonstrated good psychometric properties by all relevant for session number. Formal model comparison was used to assess
criteria—reliability (i.e., the scores demonstrate high internal and the contribution of random slopes. An example of these second
test-retest reliability), convergent validity (i.e., meaningful correlations more complex models is as follows:

$$Y_{ij} = \beta_{00} + \beta_{10}(\text{Session \#})_i + [b_{00j} + b_{10j}(\text{Session \#})_i + e_{ij}] \qquad (1)$$

with similar instruments such as BDI, BAI, SCL-90 and
IIP-32), and discriminant validity (i.e., scores discriminate well
between clinical and nonclinical samples)—as well as demonstrating
sensitivity to change (in terms of substantial and statistically
significant improvements on all scores in three different treatment
settings, see Evans et al., 2002). Similar psychometric support is

² This and seven other items of the CORE-OM are reverse scored.
Where $Y_{ij}$ is the OQ-45 or CORE-OM subscale score (e.g., OQ
Symptom Distress; CORE-OM Anxiety) for a given patient ($j$) at
a given session ($i$). The fixed components of the model are outside
the brackets and include $\beta_{00}$, reflecting the fixed intercept (i.e.,
conceptually the overall group baseline OQ-45 or CORE-OM
score), and $\beta_{10}$(Session #), the fixed slope (i.e., overall group
change per session). The random components of the model (within
brackets) include $b_{00j}$, reflecting the within-person random intercept
(i.e., deviation from the group mean baseline OQ-45 or
CORE-OM score), and $b_{10j}$, reflecting the within-person random
slope (i.e., deviation from the group mean slope over time). The
final random term ($e_{ij}$) is the error or residual term reflecting the
unexplained variation in OQ-45 or CORE-OM scores.
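To make the decomposition in Equation 1 concrete, the following is a minimal sketch (not the authors' R/nlme code) of how a model-implied score is assembled from the fixed and random components, leaving out the residual term. The class and function names, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FixedEffects:
    """Group-level (fixed) parameters of Equation 1."""
    intercept: float  # beta_00: overall group baseline score
    slope: float      # beta_10: overall group change per session


@dataclass
class PatientEffects:
    """Patient-level (random) deviations of Equation 1."""
    intercept_dev: float  # b_00j: deviation from the group baseline
    slope_dev: float      # b_10j: deviation from the group slope


def predicted_score(fixed, patient, session_index):
    """Model-implied score at one session, omitting the residual e_ij.

    session_index is the session number centered at the first session,
    so the first session has index 0.
    """
    fixed_part = fixed.intercept + fixed.slope * session_index
    random_part = patient.intercept_dev + patient.slope_dev * session_index
    return fixed_part + random_part


# Hypothetical values for illustration only (not taken from the study).
fixed = FixedEffects(intercept=40.0, slope=-1.0)
patient = PatientEffects(intercept_dev=5.0, slope_dev=-0.5)
trajectory = [predicted_score(fixed, patient, s) for s in range(4)]
print(trajectory)  # [45.0, 43.5, 42.0, 40.5]
```

In this sketch the patient starts above the group baseline and improves faster than the group average, which is the kind of between-patient variation in slopes that the random slope term is meant to capture.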
Both the fixed and random slope components were used in order
to compute an estimate of overall change. The fixed slope estimate
was first added to each patient’s random slope estimate with the
sum then multiplied by the number of sessions each patient attended
minus one (as the final OQ-45 or CORE-OM assessment
took place prior to the last session). This product reflects each
patient’s overall change in OQ from pre- to posttreatment. In order
to provide a more interpretable index of change in standardized
units, these estimates were then divided by the standard deviation
of the given OQ-45 or CORE-OM scale at pretreatment.
This way, a metric of change in units comparable to Cohen’s d
(i.e., change in standard deviation units; Cohen, 1988) was
derived using all available patient-level data. We call this
metric multilevel growth $d_j$:

$$\text{Multilevel Growth } d_j = \frac{[\beta_{10}(\text{Session \#})_i + b_{10j}(\text{Session \#})_i] \times [(\text{\# of Sessions})_i - 1]}{\text{OQ}\,SD_{\text{Pre}}} \qquad (2)$$

Multilevel factor analysis. Patient-level estimates of change
(multilevel growth ds) were then used to fit two multilevel confirmatory
factor analytic models (CFAs; one for the OQ-45 subscales
and one for the CORE-OM subscales) within a structural
equation modeling (SEM) framework. These analyses were
deemed suitable to examine if a model specifying a latent factor at
the therapist level (the therapist global effectiveness factor, or
g-factor) would represent the covariation among the subscale effect
sizes at the patient level, while simultaneously modeling the
patient-level correlations between these subscales. To visualize
this model, see Figure 1.

Figure 1. Example of model of therapist uniformity using a multilevel
confirmatory factor framework. Note: T_g is the latent global therapist
effectiveness factor. d_subscale is the multilevel growth d (patient-level
effect size) for any given outcome domains (subscales) that are indicators
of the T_g. The single arrows are factor loadings, the double arrows are
correlations, in keeping with standard structural equation notation. The line
divides the analyses conducted on the between (therapist) level and the
within (patient) level.

In these CFAs, therapists represented the “between” level and
patients the “within” level of analysis, the multilevel growth d
estimates for the OQ-45 and CORE-OM subscales constituted the
input files, and the therapist identification number was the cluster
variable. The model command involved specifying a latent therapist
effectiveness factor (called the therapist g_factor) at the between
(therapist) level indicated by the multilevel growth ds for
the OQ-45 subscales (called d_SD, d_SR, and d_IR) or the
CORE-OM subscales (d_WB, d_A, d_D, etc.). At the within
(patient) level, we specified unrestricted correlations between
these indicators. This way we were able to assess whether one
latent factor across subscales at the therapist level accounted for
the covariation between the patient level effect sizes or not, that is,
to what extent therapist effectiveness can be conceived of as a
global construct. The statistical program Mplus (Muthén &
Muthén, 2011) was used to conduct the analyses. The Bayesian
estimator was applied due to the relatively small sample size at the
therapist level, relative to the within-level (patient) sample size in
both samples.

Results

Pretreatment Scale Structure

To interpret the results of the multilevel factor analysis of
treatment effects, which investigates the structure of treatment
changes at the therapist level, it is useful to examine the structure
at pretreatment at both the patient level and the therapist level.
Many instruments, including the OQ and the CORE-OM, attempt
to assess independent domains of psychological functioning. This
is a challenge for two reasons: (a) comorbidity is the norm not the
exception (Gadermann, Alonso, Vilagut, Zaslavsky, & Kessler,
2012), and (b) there is evidence that psychopathology itself has a
predominant general factor (Caspi et al., 2014). Nevertheless, it is
instructive to compare these baseline correlations of the subscales
of the OQ and CORE-OM to those reported later for correlations
among change indices, as even attributes that are relatively highly
correlated at a single time point need not change at similar rates.

The degree to which the within-therapist structure and the
between-therapist structure are similar depends on the magnitude
of therapist-level variance (i.e., therapist effects). If there is no
therapist variability (i.e., the ICC for therapist at baseline is
approaching zero), there is little variance at the therapist level, and
correspondingly little covariance at the therapist level (see below).

OQ. The ICCs for the OQ were .003, .004, and .005 (p >
.001) for the SD, SR, and IR subscales, respectively, indicating
that there were no therapist effects at baseline (see Table 1). This
makes sense given that patients in this sample were quasirandomly
assigned to therapists and with certainty were not assigned based
on OQ scores or initial levels of severity. That is, the severity level
of each therapist’s caseload was approximately equal. Moreover,
the correlations among subscales within therapists and between
therapists are also approximately equal (to each other and to the
total correlations, which are those that ignore therapists). The three
OQ subscales were found to correlate as follows: rs = .27, .44, and
.17 for SD with IR, SD with SR, and IR with SR, respectively
(mean correlation = .29). Note that the between-level correlations
for the baseline scores of the OQ could not be modeled, due to the
low ICCs of therapists at baseline, which essentially means that
there was no covariance at the therapist level.

CORE-OM. The ICCs for the CORE-OM are also given in
Table 1. Here the ICCs are quite large, ranging from .057 to .100,
indicating that in this sample, the patients might have been assigned
to therapists based on initial severity, as reflected by their
CORE-OM scores. Thus, the correlations among the subscales
may well differ depending on whether they are calculated within
therapists or between therapists. The correlations for each level are
presented in Table 2. As can be seen, the correlations range from
.27 to .88 (M = .61). Generally the correlations at the within
therapist level are slightly smaller than at the between (therapist)
level (means of .57 and .65, respectively). Hence, in the CORE-OM,
we do assess more closely related dimensions of functioning
than in the OQ, just as the ICCs reflect.

Table 1
Intraclass Correlation Coefficients (ICCs) of Baseline Scores of
CORE-OM and OQ-45

Baseline scores     ICCs
CORE subscale
  WB                .091
  A                 .100
  D                 .086
  CR                .080
  GF                .074
  SR                .057
OQ subscale
  SD                .003
  SR                .004
  IR                .005

Note. WB = Well Being; A = Anxiety; D = Depression; CR = Close
Relationships; GF = General Functioning; SR = Social Relationships
(= CORE-OM); SD = Symptom Distress; SR = Social Role; IR =
Interpersonal Relationships (= OQ); ICCs = therapist intraclass correlations
indicating proportion of therapist variability to total variability.

Table 2
Intercorrelations Between Baseline Values of CORE-OM Subscales

Subscale   WB        A         D         CR        GF        SR
Level      W/B       W/B       W/B       W/B       W/B       W/B
WB         1
A          .71/.86   1
D          .79/.84   .72/.88   1
CR         .52/.60   .43/.58   .54/.70   1
GF         .68/.52   .58/.64   .62/.48   .48/.27   1
SR         .49/.67   .50/.78   .56/.71   .49/.62   .45/.57   1

Note. WB = Well Being; A = Anxiety; D = Depression; CR = Close
Relationships; GF = General Functioning; SR = Social Relationships;
W/B = Correlation at within-level/correlation at between-level.

Patient-Level Change

Study 1. Prior to producing estimates of patient-level change,
it was first necessary to define a best fitting MLM for computing
multilevel growth ds. A formal model comparison was conducted
between a simpler MLM that did not allow patients to vary in their
change in OQ (or CORE-OM) scores across session number with
a more complex MLM that did allow patients to vary (i.e., random
intercept only model compared with a random intercept and random
slope model). The addition of a random slope parameter
significantly improved model fit for the OQ total score (log-likelihood
ratio = 7360.26, p < .001) as well as all three OQ
subscales (log-likelihood ratios = 7443.94, 3822.40, and 3748.18
for SD, IR, and SR subscales, all p < .001). Thus, the MLM
equation defined above (Equation 1) represents the model used to
estimate random slope parameters. Patient-level random slopes
were then converted into multilevel growth ds (Equation 2).

Patients showed substantial change in OQ scores from baseline to
posttreatment as indexed by the magnitude of their multilevel growth
d effect sizes. Effect sizes for the OQ total score and three OQ
subscales were as follows: d_total: mean = -0.74, 95% CI =
[-0.76, -0.72], SD = 0.67; d_SD: mean = -0.74, [-0.76, -0.72],
SD = 0.65; d_SR: mean = -0.48, [-0.50, -0.46], SD = 0.54; d_IR:
mean = -0.38, [-0.40, -0.36], SD = 0.48. None of the 95% CIs
contained zero, indicating that patient-level change across time differed
significantly from zero.

Intraclass correlation coefficients (ICCs) were computed to estimate
the proportion of variance in each outcome attributable to
the therapist. In this instance, the ICC represents the proportion of
between-therapist variance in OQ within-person ds relative to total
variance (see Table 1). ICCs were as follows: d_Total = .0189,
d_SD = .0198, d_IR = .0125 and d_SR = .0191, indicating that,
typically, under 2% of the variance in OQ effect sizes was accounted
for by therapists.

Study 2. The same procedure was used in producing within-person
ds for the six CORE-OM subscales: d_WB (Well-being):
M = -0.52 [-0.56, -0.49], SD = 0.47; d_A (Anxiety):
M = -0.52 [-0.55, -0.48], SD = 0.47; d_D (Depression):
M = -0.55 [-0.59, -0.51], SD = 0.50; d_CR (Close relationships):
M = -0.24 [-0.27, -0.22], SD = 0.35; d_GF (General
functioning): M = -0.30 [-0.33, -0.27], SD = 0.34; d_SR
(Social relationships): M = -0.30 [-0.32, -0.27], SD = 0.37. The
95% CIs did not contain zero, indicating again that patient-level
change across time differed significantly from zero.

Therapist intraclass correlation coefficients (ICCs) for the multilevel
growth d change scores in the six CORE-OM subscales were
also computed, and were as follows: d_WB (Well-being): 0.1; d_A
(Anxiety): 0.1; d_D (Depression): d_CR (Close relationships): 0.2
d_GF (General functioning); 0.02 and d_SR (Social relationships):
0.1. Hence, in the Swedish sample, therapists accounted for around
1% of the variance in patient change in the CORE-OM.

Multilevel Factor Analysis

In order to investigate therapist uniformity across the outcome
domains of the OQ-45 and the CORE-OM, we fit two CFAs
modeling the covariation between the indicators (i.e., change in an
OQ-45 subscale) as loading on a latent effectiveness factor at the
therapist level while simultaneously modeling the patient level
correlations (see Figure 2). To accomplish this we used the Bayesian
estimator³ in both data sets, which applies Markov Chain
Monte Carlo (MCMC) simulation. MCMC simulation is less sensitive
to sample size issues (Hamaker & Klugkist, 2011). Two
MCMC chains were run, using 50,000 iterations to achieve a stable
convergence. Convergence was checked using the potential scale
reduction factor (<1.05) and the Kolmogorov-Smirnov test for
differences between the two MCMC chains (should be nonsignificant
for all parameters). Mplus default priors were used, with
infinite variances to make them noninformative in order to base
estimates only on the data. When Bayesian estimation is used, a
model fit test statistic is obtained even if the model is ‘just
identified’ (i.e., with 0 degrees of freedom), as it was in the
primary (U.S.) study with only three indicators loading on one
latent factor. In this case, model fit tests the distributional assumptions
of the model (Asparouhov & Muthén, 2010). In the replication
(Swedish) study, the model fit can be interpreted directly since
it models one latent factor based on six indicators. See detailed
results from the two CFAs below.

Study 1. The standardized output from the model in the U.S.
sample yielded large and significant factor loadings for all subscales
of the OQ: λs = .97, .94, and .62 for SD, SR, and IR,
respectively (all ps < .001), indicating that a latent general factor
was well represented by the three subscale effect sizes.

The patient-level correlations that were fit in the same model
were also significant, albeit smaller than the factor loadings (SD
with IR = .68, p < .001; SD with SR = .71, p < .001 and SR with
IR = .53, p < .001). The R-squared indices at the therapist level
showed significant explained variance for all subscales (SD = .93,
p < .001; SR = .88, p < .001, and IR = .38, p < .01; see Figure
2). The model fit was excellent, with a posterior predictive p value
of .46 (this value is .50 if fit is perfect).

Study 2. Replicating Study 1, in the model of therapist uniform
effectiveness across outcome domains, the standardized factor
loadings of the indicators (i.e., subscale effect sizes) were large;
four were statistically significant (Well-being, Depression, Close
relationships, Social relationships, p < .05) while two (Anxiety
and General functioning) were not. R-squared indices for the
therapist level were all statistically significant at p < .001, ranging
from .50 (General functioning) to .90 (Depression). As in Study 1,
patient-level correlations were also statistically significant and
slightly smaller in size (ranging from .50 to .90) than the therapist-level
factor loadings. In this model of six indicators loading on one
latent factor, model fit was also very good, with a posterior
predictive p value of .40. See Figure 3 for a visualization of the
MLM CFA of the Swedish data. See also Table 3 with intercorrelations
between subscales at the within (patient) level in this
model.

To sum up, both Studies 1 and 2 suggest that therapists effective (or
ineffective) within one outcome domain are also effective within
the other domains of these two commonly used instruments.

Discussion

Although there is accumulating evidence to suggest that therapists
on average differ in effectiveness (Baldwin & Imel, 2013),
few studies have investigated the extent to which this differential
efficacy systematically depends on patient problem domain (e.g.,
Kraus et al., 2011; Wampold & Brown, 2005). This study, comprising
one primary study and one replication, utilized new methods
to investigate change and to assess the therapist uniformity
conjecture. Using two common outcome measures (OQ-45,
CORE-OM), we assessed the interrelationships between their separate
problem domains reflecting different outcomes. The findings
of the main study indicate that these domains are relatively distinct.
In the second study they are less distinct and we discuss this
in the limitations. To measure therapist uniformity across outcome
domains, we computed a multilevel growth d which takes advantage
of all available assessment points, accounts for treatment
length, and yields an effect size in Cohen’s d units.
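As a minimal sketch of the multilevel growth d defined in the Method section (Equation 2), the snippet below follows the verbal description there: the fixed slope is added to a patient's random slope, the sum is multiplied by the number of sessions attended minus one, and the product is divided by the pretreatment standard deviation of the scale. The function name and all input values are hypothetical; in the study the slope estimates came from multilevel models fit in R, which is not reproduced here.

```python
def multilevel_growth_d(fixed_slope, random_slope, n_sessions, sd_pre):
    """Standardized pre-to-post change for one patient (cf. Equation 2).

    fixed_slope  : group-level change per session (beta_10)
    random_slope : this patient's deviation from the group slope (b_10j)
    n_sessions   : number of sessions the patient attended
    sd_pre       : pretreatment standard deviation of the scale
    """
    if n_sessions < 1:
        raise ValueError("n_sessions must be at least 1")
    if sd_pre <= 0:
        raise ValueError("sd_pre must be positive")
    # Total raw change: per-session slope times the number of session intervals.
    raw_change = (fixed_slope + random_slope) * (n_sessions - 1)
    # Express the change in pretreatment standard deviation units.
    return raw_change / sd_pre


# Hypothetical inputs for illustration only (not values from the study).
d = multilevel_growth_d(fixed_slope=-1.0, random_slope=-0.5,
                        n_sessions=9, sd_pre=16.0)
print(d)  # -0.75
```

Because lower scores indicate less distress on these measures, a negative d in this sketch corresponds to improvement, in line with the negative mean effect sizes reported in the Results.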
[Figure 2 diagram: T_g loads on d_SD (.97), d_SR (.94), and d_IR (.62) at the
between-therapists level; within-therapists correlations are .68 (SD with IR),
.71 (SD with SR), and .53 (SR with IR).]

Figure 2. Model of therapist uniformity from the multilevel confirmatory
factor analysis (MLM CFA) using OQ subscales (Study 1). Note: d_SD,
d_SR and d_IR are patient-level estimates of change in the subscales of
Symptom distress (SD), Social role performance (SR) and Interpersonal
relationships (IR) from OQ-45 (Lambert et al., 2004); T_g is the latent
therapist “g-factor.” Bold characters mark latent factor and factor loadings.
All factor loadings and intercorrelations are significant, p < .001. Bayesian
estimation was used in model fitting. The model showed excellent fit
(posterior predictive value = .46).

³ In the main study, we first estimated the model using the default
estimator, Maximum Likelihood with robust standard errors (MLR), but
the results showed a nonpositive definite covariance matrix due to a
negative (but nonsignificant) variance estimate (for Symptom distress).
This was likely due to an overly complex model on the between-level in
relation to the between-level sample size.

In the multilevel confirmatory factor analyses, modeling these
multilevel ds as indicators of one latent therapist factor, fit indices
suggested an excellent model fit in both studies. At the patient
level, large correlations between the multilevel ds of the outcome
domains indicated that patients who improve more in one domain
also improve in the other domains. At the therapist level, which
directly addresses the therapist uniformity conjecture, factor loadings
were large for the latent factor (the “therapist g-factor”) of
therapist effectiveness across domains, providing support for the
notion of therapist uniformity. Large loadings at the therapist level
were present even when simultaneously modeling the relationships
between the subscale multilevel ds at the patient level in both data
sets. This implies that those therapists who are effective (or ineffective)
within one domain are also effective (or ineffective) with
the other domains. Importantly, this appears to hold across two
different measures, samples, treatment settings, and even cultural
contexts.

[Figure 3 diagram: T_g with six between-therapists indicators (d_WB, d_A, d_D,
d_CR, d_GF, d_SR); loadings printed in the diagram: .89, .81, .95, .85, .70, .91.]

Figure 3. Model of therapist uniformity from the multilevel confirmatory
factor analysis (Study 2) using the CORE-OM. Note: d_WB, d_A, d_D,
d_CR, d_GF and d_SR are patient-level estimates of change in the subscales
of Well-being, Anxiety, Depression, Close Relationships, General
Functioning and Social Functioning. Factor loadings for all subscales
except Anxiety and General Functioning were significant at p < .05, and
intercorrelations at the patient level (see intercorrelations in Table 3) were
significant, p < .001. Bayesian estimation was used in model estimation.
The model showed very good fit (posterior predictive value = .40).

Table 3
Intercorrelations Between Effect Sizes of the CORE-OM Subscales

Effect sizes
of CORE-OM   d_WB   d_A    d_D    d_CR   d_GF   d_SR
d_WB         1
d_A          .86    1
d_D          .89    .86    1
d_CR         .68    .61    .69    1
d_GF         .79    .72    .72    .56    1
d_SR         .68    .67    .72    .62    .59    1

Note. WB = Well Being; A = Anxiety; D = Depression; CR = Close
Relationships; GF = General Functioning; SR = Social Relationships.

Due to the wide-ranging implications of the question of therapist
uniform effectiveness for mental health administrators, practicing
therapists, and training institutions as well as for the scientific
understanding of therapeutic change, the intention in the current
work was to examine the extent to which therapist effectiveness is
uniform across well-known and widely used outcome domains.
We intended to use analytic methods that we believe are especially
suited to examine the uniformity conjecture and which, to our
knowledge, have not been used before. First, we used outcome
domains (i.e., symptom distress, work functioning) that are relevant
for almost all patients regardless of clinical diagnosis or
presenting problem. Thus, if change were occurring across therapists’
caseloads, it could feasibly be detected on our outcome
measures. Second, we accounted for dependencies in the data
caused by nesting (Hox, 2010; Snijders & Bosker, 2011) by
applying a multilevel modeling (MLM) framework, both when
deriving an estimate of psychological change and when fitting a
multilevel confirmatory factor analysis with patient effect sizes
nested within therapists. Third, we maintained variability in outcomes
by using continuous (rather than categorical) estimates of
patient-level change. Fourth, besides the use of MLMs, applying a
SEM modeling approach allowed us to estimate adequacy of
model fit. Fifth, by having a relatively large number of patients per
therapist (37 on average in the primary study), we increased the
reliability of therapist level estimates (see Adelson & Owen, 2012;
Baldwin et al., 2012; Baldwin & Imel, 2013). Lastly, we studied
two different samples of therapists/patients from different cultures.
With regard to the potential impact of culture, while we have no a
priori reason to expect that therapists are less uniformly effective
in Sweden versus the United States, it is nonetheless comforting to
see results replicate across cultural contexts. This cross-cultural
replication arguably strengthens the external validity of our findings.

If we take the findings of the current study alongside previous
findings suggestive of therapist uniformity (e.g., Green et al.,
2014; Wampold & Brown, 2005), one might wonder what makes
effective therapists uniformly effective. One possibility is that
effective therapists are effective because they possess a flexibility
to adapt their treatment approach to fit the unique needs and
presenting concerns of the patient that they treat at any given point.
To demonstrate this level of clinical acumen, highly effective
therapists may possess a heightened sensitivity to perceive the
signals of the patient, and a responsiveness to patients’ reactions to
their therapeutic interventions or interpersonal manner (cf. Stiles,
Honos-Webb, & Surko, 1998). These therapists may show a willingness
to take actions to self-correct when required. It may be that
these qualities are more important than specific techniques in
treating a given disorder or problem domain.

Such an interpretation is in line with a number of recent findings
from work examining therapist effects and therapist characteristics
that relate to patient outcomes: Effective therapists form alliances
with a wide range of patients (e.g., Del Re, Flückiger, Horvath,
Symonds, & Wampold, 2012; Zuroff, Kelly, Leybman, Blatt, &
Wampold, 2010), demonstrate high levels of facilitative interpersonal
skills (e.g., Anderson et al., 2009; Schottke et al., 2015), are
willing to admit doubt and shortcomings as a therapist (e.g.,
Najavits & Strupp, 1994; Nissen-Lie et al., 2013), and to use
feedback and adjust treatment accordingly (see Tracey, Wampold,
Lichtenberg, & Goodyear, 2014; Wampold & Imel, 2015).

In terms of theoretical implications, our findings can be viewed
in a wider context as supporting a contextual model of psychotherapy
(see Wampold & Imel, 2015), in which psychotherapy is
viewed as effective due to its common factors (the quality of the
therapeutic alliance, general change principles; Goldfried, 2009)
and correspondingly to more global characteristics of the psychotherapist
(e.g., the ability to form a good alliance with a variety of
patients) as opposed to the specific ingredients (e.g., treatment
methods, specific interventions) which would require a set of
specific skills of the therapist in treating a given mental health
problem or clinical disorder. In accordance with this, a puzzling
yet replicated finding is that therapist effectiveness seems unrelated
to professional variables such as theoretical orientation, type,
and amount of training and professional experience or degree of
adherence to a protocol— even competence in delivering certain specific diagnoses rather than the more general outcome domains
therapeutic interventions (Barber, 2009; Beutler et al., 2004; Gold- that were used. Despite the evidence of therapist uniformity, there
berg et al., 2016; Tracey et al., 2014). is a chance that therapist effectiveness can be both global (i.e., cut
An additional line of basic research in support of this notion can across different outcome domains) and specific (i.e., depend on
be found within the study of psychopathology. A growing body of some aspect of patients’ problems or pathology; Kraus et al.,
evidence suggests that clinical disorders or outcome domains do 2011). Future work might shed more light on this possibility.
not represent distinct categories; indeed, comorbidity is the rule A related limitation of this study is that the subscales of the OQ
rather than the exception (Gadermann et al., 2012). Furthermore, and CORE-OM at pretreatment were correlated. The correlations
there is evidence that mental disorders are sequentially comorbid among the three subscales of the OQ were relatively low (viz., .29)
and dimensional, rather than distinct categories; one general psy- but they were larger for the CORE-OM (viz., .57 within therapists
chopathology dimension, the “p factor,” explained the variation and .65 between therapists). One could argue that the CORE
among patients with different mental health concerns in a recent outcomes measured were not sufficiently distinct at pretreatment;
study (see Caspi et al., 2014). Evidence drawn from genetics however, these correlations are similar to the correlations of well-
likewise supports the possibility of shared genetic risk factors accepted subscales including the SCL-90 (Hafkenscheid, 1993)
underlying multiple forms of psychopathology (e.g., Hyman, and the TOP (Kraus et al., 2005).
2010). Higher scores on such a general pathology factor represent Moreover, outcomes in two out of six subscales of the
more life impairment, worse prognosis, and early life damage CORE-OM did not yield factor loadings that reached statistical
(Caspi et al., 2014). If this proposed p factor is valid, even though significance, and thus might not seem to fully support the therapist
we did not study therapist effectiveness across clinical diagnoses, uniformity conjecture. However, it should be noted that the two
our findings are even less surprising; there could well be a match insignificant loadings (for Anxiety and General functioning) were
between patients’ general pathology and therapist global effective- large (i.e., .81 and .70) and that failing to reach significance may
ness, rather than a match between specific problem areas and skills be related to low statistical power, since in this study the number
in treating each of them. The observations of Caspi et al. (2014) on of therapists was rather modest (n ⫽ 38).
the patient level correspond in turn with findings of Saxon and A third limitation is that the U.S. data are from one single
Barkham (2012) suggesting that therapists differences increase university center, which may limit generalizability of the findings.
with increasing patient severity of problems, rather than with type A general lack of detailed information of participants is a problem
of problem. in this kind of study. In order to be better able to determine
As mentioned before, with regards to differential therapist ef- whether a sample is representative we would have wanted to know
fectiveness with cultural minorities (Imel et al., 2011; Hayes et al., the level of education, income status, gender/sexual orientation,
2015), although they are indicative of a specific cultural compe- and clinical diagnoses of the patients.
tence, it seems that this competence should be better conceived of Notwithstanding the limitations of the current study, we suggest
as a form of Multicultural Orientation (MCO; Owen, 2013) char- that the contributions to the literature of this work are: (a) probing
acterized for example by therapists’ cultural humility, openness further the question of therapist uniformity; (b) proposing analytic
and cultural comfort rather than a specific skill or cultural knowl- methods to investigate therapist uniformity; and (c) providing
edge; that is, a way of being, rather than a way of doing (Owen, further evidence suggesting that therapists are uniformly effective
2013). These ideas are receiving empirical support (e.g., Owen et across the subscales of two widely used outcome measures (i.e.,
al., 2015) and to us seem to align with an understanding that highly OQ, CORE-OM). We hope our effort can stimulate other research-
effective therapists have an openness and a flexibility that allow ers to further our collective knowledge on the nature of therapist
them to tailor interventions to suit the specific needs of a client and effectiveness and its interplay with client factors.
a sensitivity toward context. This corresponds well with the notion
of therapist appropriate responsiveness (see Stiles et al., 1998), References
defined as the therapist’s ability to achieve optimal benefit for the
client by continually modifying responses to the state of the client Adelson, J. L., & Owen, J. (2012). Bringing the psychotherapist back:
Basic concepts for reading articles examining therapist effects using
and the interpersonal interaction (Hatcher, 2015). This promising
multilevel modeling. Psychotherapy, 49, 152–162. http://dx.doi.org/10
concept, may likely distinguish more from less effective therapists .1037/a0023990
in general. Amble, I., Gude, T., Stubdal, S., Øktedalen, T., Skjorten, A. M., Andersen,
The current study has several important limitations. One limi- B. J., . . . Wampold, B. E. (2014). Psychometric properties of the
tation may be that the patient outcome domains that were mea- Outcome Questionnaire-45.2: The Norwegian version in an international
sured were somewhat global in nature (i.e., generally not disorder- context. Psychotherapy Research, 24, 504 –513, http://dx.doi.org/10
specific) and may thus not have detected the specificity of therapist .1080/10503307.2013.849016
effectiveness, if specificity indeed exists. There is reason to be- Anderson, T., Ogles, B. M., Patterson, C. L., Lambert, M. J., & Ver-
lieve that there may be a trade-off between two contrasting con- meersch, D. A. (2009). Therapist effects: Facilitative interpersonal skills
siderations in testing the uniformity conjecture: the advantage of a as a predictor of therapist success. Journal of Clinical Psychology, 65,
755–768. http://dx.doi.org/10.1002/jclp.20583
wide spectrum of patient problems, parts of which are likely
Asparouhov, T., & Muthén, B. O. (2010). Bayesian analysis using Mplus:
irrelevant for a sizable portion of a patient sample, or having a Technical implementation. Retrieved from http://citeseerx.ist.psu.edu/
more limited set of generally relevant outcome domains, with a viewdoc/download?doi⫽10.1.1.310.3903&rep⫽rep1&type⫽pdf
loss of problem area specificity. It is possible that therapist spec- Baldwin, S. A., Berkeljon, A., Atkins, D. C., Olsen, J. A., & Nielsen, S. L.
ificity in effectiveness could have emerged if we applied the (2009). Rates of change in naturalistic psychotherapy: Contrasting dose-
current statistical methods to examine therapist uniformity across effect and good-enough level models of change. Journal of Consulting
and Clinical Psychology, 77, 203–211. http://dx.doi.org/10.1037/a0015235
Baldwin, S. A., & Imel, Z. E. (2013). Therapist effects: Findings and methods. In M. J. Lambert (Ed.), Bergin and Garfield’s handbook of psychotherapy and behavior change (6th ed., pp. 258–297). Hoboken, NJ: Wiley.
Baldwin, S. A., Imel, Z. E., & Atkins, D. C. (2012). The influence of therapist variance on the dependability of therapists’ alliance scores: A brief comment on “The dependability of alliance assessments: The alliance-outcome correlation is larger than you think” (Crits-Christoph et al., 2011). Journal of Consulting and Clinical Psychology, 80, 947–951. http://dx.doi.org/10.1037/a0027935
Barber, J. P. (2009). Toward a working through of some core conflicts in psychotherapy research. Psychotherapy Research, 19, 1–12. http://dx.doi.org/10.1080/10503300802609680
Benton, S. A., Robertson, J. M., Tseng, W. C., Newton, F. B., & Benton, S. L. (2003). Changes in counseling center client problems across 13 years. Professional Psychology: Research and Practice, 34, 66–72. http://dx.doi.org/10.1037/0735-7028.34.1.66
Beutler, L. E., Malik, M., Alimohamed, S., Harwood, T. M., Talebi, H., Noble, S., & Wong, E. (2004). Therapist variables. In M. J. Lambert (Ed.), Bergin & Garfield’s handbook of psychotherapy and behavior change (5th ed., pp. 227–306). New York, NY: Wiley.
Caspi, A., Houts, R. M., Belsky, D. W., Goldman-Mellor, S. J., Harrington, H., Israel, S., . . . Moffitt, T. E. (2014). The p factor: One general psychopathology factor in the structure of psychiatric disorders? Clinical Psychological Science, 2, 119–137. http://dx.doi.org/10.1177/2167702613497473
Chambless, D. L., & Hollon, S. D. (1998). Defining empirically supported therapies. Journal of Consulting and Clinical Psychology, 66, 7–18. http://dx.doi.org/10.1037/0022-006X.66.1.7
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Crits-Christoph, P., & Mintz, J. (1991). Implications of therapist effects for the design and analysis of comparative studies of psychotherapies. Journal of Consulting and Clinical Psychology, 59, 20–26. http://dx.doi.org/10.1037/0022-006X.59.1.20
Del Re, A. C., Flückiger, C., Horvath, A. O., Symonds, D., & Wampold, B. E. (2012). Therapist effects in the therapeutic alliance-outcome relationship: A restricted-maximum likelihood meta-analysis. Clinical Psychology Review, 32, 642–649. http://dx.doi.org/10.1016/j.cpr.2012.07.002
Elfström, M. L., Evans, C., Lundgren, J., Johansson, B., Hakeberg, M., & Carlsson, S. G. (2013). Validation of the Swedish version of the Clinical Outcomes in Routine Evaluation Outcome Measure (CORE-OM). Clinical Psychology & Psychotherapy, 20, 447–455. http://dx.doi.org/10.1002/cpp.1788
Evans, C., Connell, J., Barkham, M., Margison, F., McGrath, G., Mellor-Clark, J., & Audin, K. (2002). Towards a standardised brief outcome measure: Psychometric properties and utility of the CORE-OM. The British Journal of Psychiatry, 180, 51–60. http://dx.doi.org/10.1192/bjp.180.1.51
Falkenström, F., Josefsson, A., Berggren, T., & Holmqvist, R. (2016). How much therapy is enough? Comparing dose-effect and good-enough models in two different settings. Psychotherapy, 53, 130–139. http://dx.doi.org/10.1037/pst0000039
Gadermann, A. M., Alonso, J., Vilagut, G., Zaslavsky, A. M., & Kessler, R. C. (2012). Comorbidity and disease burden in the National Comorbidity Survey Replication (NCS-R). Depression and Anxiety, 29, 797–806. http://dx.doi.org/10.1002/da.21924
Goldberg, S. B., Rousmaniere, T., Miller, S. D., Whipple, J., Nielsen, S. L., Hoyt, W. T., & Wampold, B. E. (2016). Do psychotherapists improve with time and experience? A longitudinal analysis of outcomes in a clinical setting. Journal of Counseling Psychology, 63, 1–11. http://dx.doi.org/10.1037/cou0000131
Goldfried, M. R. (2009). Searching for therapy change principles: Are we there yet? Applied & Preventive Psychology, 13, 32–34. http://dx.doi.org/10.1016/j.appsy.2009.10.013
Green, H., Barkham, M., Kellett, S., & Saxon, D. (2014). Therapist effects and IAPT Psychological Wellbeing Practitioners (PWPs): A multilevel modelling and mixed methods analysis. Behaviour Research and Therapy, 63, 43–54. http://dx.doi.org/10.1016/j.brat.2014.08.009
Hamaker, E. L., & Klugkist, I. (2011). Bayesian estimation of multilevel models. In J. J. Hox & J. K. Roberts (Eds.), Handbook for advanced multilevel analysis (pp. 137–161). New York, NY: Routledge/Taylor & Francis Group.
Hatcher, R. L. (2015). Interpersonal competencies: Responsiveness, technique, and training in psychotherapy. American Psychologist, 70, 747–757. http://dx.doi.org/10.1037/a0039803
Hayes, J. A., Owen, J., & Bieschke, K. J. (2015). Therapist differences in symptom change with racial and ethnic minority clients. Psychotherapy, 52, 308–314.
Hafkenscheid, A. (1993). Psychometric evaluation of the Symptom Checklist (SCL-90) in psychiatric patients. Personality & Individual Differences, 14, 751–756.
Holmqvist, R., Ström, T., & Foldemo, A. (2014). The effects of psychological treatment in primary care in Sweden—A practice-based study. Nordic Journal of Psychiatry, 68, 204–212. http://dx.doi.org/10.3109/08039488.2013.797023
Horvath, A. O., Del Re, A. C., Flückiger, C., & Symonds, D. (2011). Alliance in individual psychotherapy. Psychotherapy, 48, 9–16. http://dx.doi.org/10.1037/a0022186
Hox, J. (2010). Multilevel analysis. Techniques and applications (2nd ed.). Mahwah, NJ: Erlbaum.
Hyman, S. E. (2010). The diagnosis of mental disorders: The problem of reification. Annual Review of Clinical Psychology, 6, 155–179. http://dx.doi.org/10.1146/annurev.clinpsy.3.022806.091532
Imel, Z. E., Baldwin, S., Atkins, D. C., Owen, J., Baardseth, T., & Wampold, B. E. (2011). Racial/ethnic disparities in therapist effectiveness: A conceptualization and initial study of cultural competence. Journal of Counseling Psychology, 58, 290–298. http://dx.doi.org/10.1037/a0023284
Kim, D., Wampold, B., & Bolt, D. (2006). Therapist effects in psychotherapy: A random-effects modeling of the National Institute of Mental Health Treatment of Depression Collaborative Research Program data. Psychotherapy Research, 16, 161–172. http://dx.doi.org/10.1080/10503300500264911
Kim, S., Beretvas, S. N., & Sherry, A. R. (2010). A validation of the factor structure of OQ-45 scores using factor mixture modeling. Measurement and Evaluation in Counseling and Development, 42, 275–295. http://dx.doi.org/10.1177/0748175609354616
Kraus, D. R., Castonguay, L., Boswell, J. F., Nordberg, S. S., & Hayes, J. A. (2011). Therapist effectiveness: Implications for accountability and patient care. Psychotherapy Research, 21, 267–276. http://dx.doi.org/10.1080/10503307.2011.563249
Kraus, D. R., Seligman, D. A., & Jordan, J. R. (2005). Validation of a behavioral health treatment outcome and assessment tool designed for naturalistic settings: The Treatment Outcome Package. Journal of Clinical Psychology, 61, 285–314. http://dx.doi.org/10.1002/jclp.20084
Lambert, M. J., Burlingame, G. M., Umphress, V., Hansen, N. B., Vermeersch, D. A., Clouse, G. C., & Yanchar, S. C. (1996). The reliability and validity of the Outcome Questionnaire. Clinical Psychology & Psychotherapy, 3, 249–258. http://dx.doi.org/10.1002/(SICI)1099-0879(199612)3:4<249::AID-CPP106>3.0.CO;2-S
Lambert, M. J., Morton, J. J., Hatfield, D., Harmon, C., Hamilton, S., Reid, R. C., . . . Burlingame, G. B. (2004). Administration and scoring manual
for the Outcome Questionnaire-45. Orem, UT: American Professional Credentialing Services.
Laska, K. M., Gurman, A. S., & Wampold, B. E. (2014). Expanding the lens of evidence-based practice in psychotherapy: A common factors perspective. Psychotherapy, 51, 467–481. http://dx.doi.org/10.1037/a0034332
Lutz, W., Leon, S. C., Martinovich, Z., Lyons, J. S., & Stiles, W. B. (2007). Therapist effects in outpatient psychotherapy: A three-level growth curve approach. Journal of Counseling Psychology, 54, 32–39. http://dx.doi.org/10.1037/0022-0167.54.1.32
Lyne, K. J., Barrett, P., Evans, C., & Barkham, M. (2006). Dimensions of variation on the CORE-OM. The British Journal of Clinical Psychology, 45, 185–203. http://dx.doi.org/10.1348/014466505X39106
Muthén, L. K., & Muthén, B. O. (1998–2011). Mplus user’s guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
Najavits, L., & Strupp, H. H. (1994). Differences in the effectiveness of psychodynamic therapies: A process-outcome study. Psychotherapy: Theory, Research, Practice, Training, 31, 114–123. http://dx.doi.org/10.1037/0033-3204.31.1.114
Nissen-Lie, H. A., Monsen, J. T., Ulleberg, P., & Rønnestad, M. H. (2013). Psychotherapists’ self-reports of their interpersonal functioning and difficulties in practice as predictors of patient outcome. Psychotherapy Research, 23, 86–104. http://dx.doi.org/10.1080/10503307.2012.735775
Norcross, J. C., & Beutler, L. E. (2000). A prescriptive eclectic approach to psychotherapy training. Journal of Psychotherapy Integration, 10, 247–261. http://dx.doi.org/10.1023/A:1009444912173
Nordberg, S. S., Boswell, J. F., Kraus, D., Castonguay, L., Hayes, J. A., & Wampold, B. E. (2010, June). Therapist effectiveness treating depression with and without co-morbid substance abuse. Paper presented at the annual meeting of the Society for Psychotherapy Research, Asilomar, CA.
Okiishi, J., Lambert, M., Nielsen, S., & Ogles, B. (2003). Waiting for supershrink: An empirical analysis of therapist effects. Clinical Psychology & Psychotherapy, 10, 361–373. http://dx.doi.org/10.1002/cpp.383
Owen, J. (2013). Early career perspectives on psychotherapy research and practice: Psychotherapist effects, multicultural orientation, and couple interventions. Psychotherapy, 50, 496–502. http://dx.doi.org/10.1037/a0034617
Owen, J., Drinane, J., Tao, K. W., Adelson, J. L., Hook, J. N., Davis, D., & Fookune, N. (2015). Racial/ethnic disparities in client unilateral termination: The role of therapists’ cultural comfort. Psychotherapy Research. Advance online publication. http://dx.doi.org/10.1080/10503307.2015.1078517
Owen, J., Wong, Y. J., & Rodolfa, E. (2009). Empirical search for psychotherapists’ gender competence in psychotherapy. Psychotherapy: Theory, Research, Practice, Training, 46, 448–458. http://dx.doi.org/10.1037/a0017958
Pinheiro, J., Bates, D., DebRoy, S., Sarkar, D., & the R Development Core Team. (2013). nlme: Linear and nonlinear mixed effects models [Computer software manual]. Retrieved from http://cran.r-project.org/
R Development Core Team. (2014). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org/
Rice, K. G., Suh, H., & Ege, E. (2014). Further evaluation of the Outcome Questionnaire–45.2. Measurement and Evaluation in Counseling and Development, 47, 102–117. http://dx.doi.org/10.1177/0748175614522268
Saxon, D., & Barkham, M. (2012). Patterns of therapist variability: Therapist effects and the contribution of patient severity and risk. Journal of Consulting and Clinical Psychology, 80, 535–546. http://dx.doi.org/10.1037/a0028898
Schottke, H., Flückiger, C., Goldberg, S. B., Eversmann, J., & Lange, J. (2015). Predicting psychotherapy outcome based on therapist interpersonal skills: A five-year longitudinal study of a therapist assessment protocol. Psychotherapy Research. Advance online publication. http://dx.doi.org/10.1080/10503307.2015.1125546
Snell, M. N., Mallinckrodt, B., Hill, R. D., & Lambert, M. J. (2001). Predicting counseling center clients’ response to counseling: A 1-year follow-up. Journal of Counseling Psychology, 48, 463–473.
Snijders, T. A., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed.). London, UK: Sage.
Stiles, W. B., Honos-Webb, L., & Surko, M. (1998). Responsiveness in psychotherapy. Clinical Psychology: Science and Practice, 5, 439–458. http://dx.doi.org/10.1111/j.1468-2850.1998.tb00166.x
Tracey, T. J. G., Wampold, B. E., Lichtenberg, J. W., & Goodyear, R. K. (2014). Expertise in psychotherapy: An elusive goal? American Psychologist, 69, 218–229. http://dx.doi.org/10.1037/a0035099
Wampold, B. E. (2005). What should be validated? The psychotherapist. In J. C. Norcross, L. E. Beutler, & R. F. Levant (Eds.), Evidence-based practices in mental health: Debate and dialogue on the fundamental questions (pp. 200–208, 236–238). Washington, DC: American Psychological Association.
Wampold, B. E., & Brown, G. S. (2005). Estimating variability in outcomes attributable to therapists: A naturalistic study of outcomes in managed care. Journal of Consulting and Clinical Psychology, 73, 914–923. http://dx.doi.org/10.1037/0022-006X.73.5.914
Wampold, B. E., & Imel, Z. E. (2015). The great psychotherapy debate: The evidence for what makes psychotherapy work (2nd ed.). New York, NY: Routledge.
World Health Organization. (2004). International statistical classification of diseases and related health problems, 10th revision (ICD-10). Retrieved from http://apps.who.int/classifications/apps/icd/icd10online2004/fr-icd.htm
Zuroff, D. C., Kelly, A. C., Leybman, M. J., Blatt, S. J., & Wampold, B. E. (2010). Between-therapist and within-therapist differences in the quality of the therapeutic relationship: Effects on maladjustment and self-critical perfectionism. Journal of Clinical Psychology, 66, 681–697.

Received October 8, 2015
Revision received February 16, 2016
Accepted February 17, 2016
