School of Social Work
SW 8602 Direct Practice Evaluation
Jane F. Gilgun, Ph.D., LICSW
December 2009
Choosing Assessment and Evaluation Tools for Direct Practice
Assessment and evaluation tools can contribute to practice effectiveness if social
service professionals choose them well. In this essay I provide guidelines for choosing
tools for practice. The first section discusses standardized instruments; that is, instruments
that have known psychometric properties of reliability and validity. The second section is
on instruments that practitioners construct themselves or that they help clients construct.
The third section is brief but points out some of the complicated issues involved in
practitioner use of instruments. In the discussion, I state the importance of practitioner
involvement in the development, use, and modification of any tools that agencies may
require and also point out that funders prefer to sponsor programs that demonstrate
effectiveness.
Are They Useful?
Usefulness is the most important question to ask about practice tools. If you use
these tools, will they help you do your job better? Tools that are useful have the following
characteristics. I’ve arranged them in rough order of importance for social work practice.
• They have good face validity. Face validity is the most important validity in
assessment and evaluation tools. Face validity means that when knowledgeable
professionals read the tools, they find that the tools cover important areas of
practice.
• They have good content validity. Content validity is an estimate of whether
instruments cover relevant areas. It is similar to face validity in that experts decide
whether tools have adequate coverage. There is no index for content validity.
Drawing upon multiple sources of data helps to ensure content validity. In social
work and other applied disciplines, content validity is more likely when the sources
of items are research, theory, and practice wisdom that draws upon direct
experience with clients and their issues. Sometimes representatives of client groups
contribute to the ideas and items of a tool. Item‐total analysis often helps streamline
instruments because it helps to eliminate items that quantitative analysis shows are
unrelated to other items in the tool. In item‐total analysis, the score on each item is
correlated with the total score. Items with very low correlations are eliminated. If
many items have high correlations—above .9—tool developers then inspect these
items for redundancy and eliminate those that duplicate others.
o They are culturally sensitive. Instruments that are useful draw upon
information that is culturally sensitive. Practitioners can check for cultural
sensitivity by finding information about the samples on which instrument
developers draw for the ideas and items that compose the instrument. These
samples ideally match the culture, social class, and other important social
identities of the individuals who compose practitioners’ caseloads. If the
sample differs, the instruments may still be useful if practitioners modify
them in consultation with knowledgeable persons. Cultural sensitivity is part
of content validity.
• They are practice guidelines. Good tools provide practice guidelines in the sense
that they help you keep important things about clients in mind. It is only human to
have our own favorite ideas about what is important. Useful tools alert you to
things that you might not otherwise have thought about.
• They help you formulate treatment goals. Tools that provide practice guidelines
can do this. Treatment goals, in turn, can help you gauge whether your work is
helping clients. If you use the tools periodically, they will also keep you focused on
important practice principles. Of course, as you work with clients, you may
formulate new treatment goals and find some goals that tools helped you develop
are not appropriate for particular clients.
• They are short, easy to use, and modifiable. Most useful tools have these qualities.
If they are long and cumbersome, practitioners may not want to use them because
they take time away from direct client contact. Useful tools are modifiable in the
sense that when some items do not work, practitioners can modify them to fit their
practice. It is better to modify them in consultation with other knowledgeable
professionals in case you are missing something important that the tools provide.
• They have good indices of internal consistency, which is sometimes called
reliability. The index should reach or come close to .90 in clinical assessment and
evaluation tools. Some tools can have lower indices of reliability or none at all and
still be useful if they have face validity and are useful in other ways. Cronbach’s
alpha is the most common index of internal consistency; it is gauged on a scale from
0‐1. A good alpha and good face validity suggest a potentially useful tool.
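Cronbach's alpha can be computed directly from its definition: alpha equals k/(k−1) times (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items. Below is a small Python sketch with hypothetical ratings; a statistical package would normally do this calculation.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).

    `responses` is a list of rows, one per respondent, one column per item.
    """
    k = len(responses[0])
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings on a four-item tool (higher = more of the trait measured).
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Because the four hypothetical items rise and fall together across respondents, the alpha here comes out above .90, which is what the guideline above asks for.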
• They have good indices of interrater reliability, which is another indicator of
consistency. When an instrument has an inter‐rater reliability score, this means
that two or more practitioners have completed the instrument on the same client or
clients. If there are two raters, then the number of clients should be at least 15‐20. If
there are 15‐20 raters or more, then the rating can be done on fewer clients. The
higher the index, on a scale from 0‐1, the more reliable the scale is. The closer to 1
the rating is, the more the raters have agreed. A scale with a high rating or one with
a low rating may nonetheless have poor face validity, leading practitioners to decide not to use it. High
face validity and high inter‐rater reliability are good indicators of potential
usefulness. An issue with inter‐rater reliabilities, however, is that practitioners
who fill out the instrument may have different perspectives, ideas, and training on
the concepts that underlie the instruments. Raters, therefore, should understand
the theory, research, and practice wisdom on which the tools are based. An
excellent tool could receive a low inter‐rater reliability score because the raters did
not understand the concepts on which the tool is based.
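The essay does not prescribe a particular index, but one common way to quantify agreement between two raters on categorical judgments is Cohen's kappa, which corrects simple percent agreement for the agreement expected by chance. The sketch below uses hypothetical ratings by two practitioners on sixteen clients; kappa is one option among several inter‐rater indices.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick a category independently.
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - chance) / (1 - chance)

# Hypothetical "low"/"high" risk judgments by two practitioners on 16 clients.
rater_a = ["low", "low", "high", "high", "low", "high", "low", "low",
           "high", "low", "high", "high", "low", "low", "high", "low"]
rater_b = ["low", "low", "high", "low", "low", "high", "low", "low",
           "high", "low", "high", "high", "low", "high", "high", "low"]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

The two raters disagree on two of sixteen clients, so kappa lands in the mid .70s here, lower than the raw 87.5% agreement because some agreement is expected by chance.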
• They have adequate test‐retest reliability (TRR). Test‐retest reliability arises
when a group of practitioners fills out an instrument on a group of clients and days
or weeks later fills out the same instrument on the same group of clients. The scores
on the two different occasions are correlated. The indices that are closest to 1 are
those that indicate the best TRRs. Test‐retest reliabilities cannot be done if the
clients are receiving services because any intervention could affect the second set of
scores. Face validity in combination with the reliabilities already discussed suggests
a potentially useful instrument.
• They have indices of construct validity, which help researchers and practitioners
understand what the tools measure. To evaluate for construct validity, researchers
have practitioners fill out two instruments that are thought to measure the same
things. One of the instruments already has known psychometric properties of
reliability and validity. The scores of the two instruments are correlated. The
higher the score, the more valid the construct is thought to be. An instrument with
face validity, construct validity, and good reliabilities is potentially useful.
• When the issue is prediction, they have good predictive validity, which is useful in
some tools, such as risk assessments, whose purpose is to identify individuals at risk
for some conditions. Their predictive usefulness is based upon how well they
predict future behaviors. Child abuse risk assessments are examples. These can be
useful tools because they typically are based upon research and theory and
practitioner expertise. They can provide practice guidelines that help practitioners
formulate treatment goals that, if met, can reduce the risk for the targeted behaviors
to occur. These kinds of instruments have scores from 0‐1, like the other indices of
reliability and validity. They are one of two types of criterion‐related validities. The
other is concurrent validity. “Criterion” refers to the idea that the instrument is
correlated with another external instrument.
• When concurrent validity is an issue, they have good concurrent validity, which is
a score that researchers calculate when they correlate two or more instruments that
they administer at the same time, with one assumed to be a predictor of another.
This test is not much used in direct practice, and it is not the same thing as construct
validity. It is used more to get as complete a picture as possible of whatever
administrators of the instruments want to know about future performance.
• Factor analysis indicators can be helpful in some cases. Factor analyses are
similar in some ways to Cronbach’s alpha in that the results of a factor analysis
indicate which items of the instruments correlate with each other. Items that clump
together form factors, which researchers name based on which items belong to
which cluster or factor.
Self‐Constructed Instruments
In some cases, practitioners may find self‐constructed instruments to be helpful to
their practice. Self‐constructed instruments typically have anchors on both sides of a
continuum and therefore are often called self‐anchored scales. Some call them
individualized rating scales.
One of the main advantages of self‐constructed instruments is that they are by
definition tailor‐made to fit particular, individual treatment situations. Practitioners
construct them to evaluate themselves, to evaluate clients, and to evaluate any influences
on the relationship between clients and practitioners. Often when practitioners evaluate
clients, they base their evaluations on clients’ behaviors while in the presence of
practitioners as well as on client reports of their behaviors in other settings. Practitioners can
also help clients to construct instruments that track clients’ progress on goals.
Anchors typically have the least desirable behavior on the left side of a continuum
and the most desirable behavior on the other. The items themselves can range from very
concrete to very general. When clients construct their own instruments, they also choose
their own treatment goals.
In group treatment in a women’s prison, a woman wanted to stop threatening other
women when she felt they threatened her. She had some insight that her threatening
behavior was based on fear, beliefs, and trauma that she had experienced in the past.
Although she had the beginnings of an understanding of the complexity of her issues
related to threatening others, the behavior she chose to monitor was specific and concrete.
This is the self‐constructed instrument she designed for herself.
When I felt threatened this past week I
The woman reviewed this simple scale at the beginning of each group. It provided a way
for her to focus on the behavioral manifestation of a complex issue. She or the group
facilitators could have made a rating scale out of this instrument, starting with 0 at “Hit”
and 5 for “Talked to Someone.” These scores could be graphed to show any changes over time.
Such graphing is not necessary. What is helpful is the focus that the simple scale provided
to the client and the deep roots of such a simple scale.
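If the 0-5 rating scale described above were used, the weekly scores could be tracked with something as simple as the following Python sketch; the ratings shown are hypothetical, and no charting library is needed to see a trend.

```python
# Hypothetical weekly self-ratings on the 0-5 scale described above
# (0 = "Hit", 5 = "Talked to Someone"); higher is more desirable.
weekly_ratings = [1, 0, 2, 2, 3, 4, 3, 5]

# A simple text graph of change over time: one row of '#' marks per week.
for week, score in enumerate(weekly_ratings, start=1):
    print(f"week {week}: {'#' * score or '.'} ({score})")
```

A practitioner or group facilitator could keep such a log on paper just as easily; the point is only that the scale produces numbers that make change visible over time.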
There are other ways to construct instruments tailored to particular clients and
practice settings. The references at the end of this essay provide more information.
Importance of Practitioner Buy‐In
Direct practitioners will not use tools that do not help in their practice. If their
administrators insist they use tools that do not help them, they will comply with directives
to fill out the tools but the ideas of the tools may not have much effect. The chances for
practitioner buy‐in are increased when practitioners
• see the value of the tool for their practice effectiveness, such as helping them
set goals, give direction for interventions, and gauge progress on goals;
o see that the tools fit their practice. For example, if practitioners are
involved in dealing with crises and concerned that clients do not have
basic life skills, such as knowing how to brush their teeth, it is unlikely
that they will find tools to be helpful when the tools encourage skill
development that is beyond what their clients are able to attain;
• have input into the items of the instruments, how they use the instruments,
and whether and how the instruments are modified to better fit practice;
• have training on the ideas and concepts on which the tools are based;
• are not swamped with paperwork demands that they find cut down on the
time they have for direct client contact.
These issues are stated in simple terms, but they are complex and require much
thought and planning on the part of administrators in consultation with front‐line
practitioners.
Discussion
This essay provides information on how to choose assessment and evaluation tools
in social work direct practice. Standardized and self‐constructed instruments have many
advantages, but social workers will not use them if they do not find the tools helpful.
Administrators have the responsibility to involve front‐line workers in the construction,
modification, and procedures for using instruments. They also must allow for training of
practitioners so that they have an appreciation of the research, theory, and practice
wisdom on which tools are based. Finally, practitioners require time to use the tools and to
interpret the information that the tools produce. If the practitioners experience the
instruments as add‐ons to an already heavy caseload in which they have little if any
involvement and investment, the tools will be of little use.
At their best, assessment and intervention tools provide practice guidelines useful in
understanding the complexities of clients’ lives, information on what is working and not
working, focus for clients on client‐selected goals, insight for practitioners on what they are
doing and how they can do better, and evidence that the efforts of social workers
have outcomes that can be shared with others. Funders prefer to sponsor programs that
show effectiveness.
References
APA Task Force on Evidence‐Based Practice (2006). Evidence‐based practice in psychology.
American Psychologist, 61(4), 271‐285.

Bloom, Martin, Joel Fischer, & John G. Orme (2009). Evaluating practice: Guidelines for the
accountable professional. Boston: Pearson.

Bordelon, Thomas D. (2006). A qualitative approach to developing an instrument for
assessing MSW students’ group work performance. Social Work with Groups, 29(4), 75‐91.

Gilgun, Jane F. (2005). The four cornerstones of evidence‐based practice in social work.
Research on Social Work Practice, 15(1), 52‐61.

Gilgun, Jane F. (2004). Qualitative methods and the development of clinical assessment
tools. Qualitative Health Research, 14(7), 1008‐1019.

Mokuau, Noreen et al. (2008). Development of a family intervention for Native Hawaiian
women with cancer: A pilot study. Social Work, 53(1), 9‐19.

Wenborn, Jennifer et al. (2008). Assessing the reliability and validity of the Pool Activity
Level (PAL) Checklist for use with older people with dementia. Aging and Mental Health,
12(2), 202‐211.
About the Author

Jane F. Gilgun, Ph.D., LICSW, is a professor, School of Social Work, University of
Minnesota, Twin Cities, USA. See Professor Gilgun’s other articles, children’s books, and
articles on Amazon Kindle, scribd.com/professorjane, and stores.lulu.com/jgilgun.