
Behavior Rating Scales

Definition
Types
Construction Issues
Weaknesses
Strengths
Selection Considerations
Specific Scales: Conners; CBCL; BASC; others


Definition


Rating scale: any paper-and-pencil device whereby one individual (usually a caretaker such as a parent or teacher, though not excluding peers) assesses the behavior of another individual, based on his or her observations of the child or adolescent over an extended period of time (usually more than a month)

Martin, Hooper & Snow, 1986

Types of Rating Scales


 

Range of constructs from general functioning to concrete behaviors
Personality: Personality Inventory for Children-Revised (PIC-2); Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A)
Behavior Checklists: Child Behavior Checklist (CBCL); Conners Rating Scales-Revised; Behavior Assessment System for Children (BASC); Devereux Scales of Mental Disorders
Specific Disorders: Children's Manifest Anxiety Scales; Beck Depression Inventory; Children's Depression Inventory

Summary of Construction Issues


Checklist vs Dichotomy vs Continuum
Item choice
Ability to Sum Scores
Anchors
Description of Behavior/Construct


Checklist vs Dichotomy vs Continuum




Checklists: rater checks the item if the behavior exists; can be used in screening for specific DSM-IV disorders
Dichotomy: rater indicates whether the behavior exists or does not exist; forced dichotomy; Yes/No
Continuum: 1 2 3 4 5
Reliability increases with more steps (plateaus after 11 steps, with little further gain)
An odd number of steps allows for a neutral, middle step, but can create a response set

Item Choice


  

 

Subjectivity of instrument is a function of the level of analysis; type of item; manner scaled
Sufficient number of items to sample the construct
Face validity of items
Specificity of behavior: "Is delinquent" vs "Lies; steals; violates curfew"
Too specific may lead to trivial information, excessive length
Time frame identified, e.g. "Within the last two weeks"

Various strategies used to develop items and scales


 

Factor analysis: placing in a factor items that cluster together
Empirical keying: using selected items to distinguish one group from another
Theoretical constructs: using selected items to measure the theoretical constructs underlying the construction of the test
Content analysis: using experts to select items to measure the trait or diagnostic category of interest

Ability to Sum Scores


Construction of some tests allows for sum scores across scales, which increases the reliability of the instrument
Broad-band factors have higher reliability than narrow-band, e.g. Internalizing & Externalizing have higher reliabilities than individual scales such as Social Withdrawal or Aggression
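Why summing raises reliability can be illustrated with the Spearman-Brown formula from classical test theory. The slides do not name this formula; it is offered here as a sketch that assumes the combined scales behave like parallel measures of one broad construct. The narrow-band reliability of .70 and the choice of three scales are made-up illustration values.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a measure is lengthened by `length_factor`.

    Assumes the added parts are parallel to the original (an idealization).
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical values: a narrow-band scale with reliability .70,
# and a broad-band composite built by summing three such scales.
narrow_band = 0.70
broad_band = spearman_brown(narrow_band, 3)
print(round(broad_band, 3))  # 0.875
```

Under these assumptions the composite is more reliable than any single scale that feeds it, which matches the broad-band versus narrow-band pattern described above.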


Anchors
    

End points on a scale
Numerical (Likert scale)
Degrees of agree/disagree
Adjectives such as good/bad; carefree/anxious; impulsive/reflective
Actual behavior to typify a type of attitude such as religion: attends church 1 time per month; 2 times per month; weekly; biweekly
This may be specific to the construct; may not represent equal intervals; may be difficult to find discrete specific behaviors
Comparison to norm or product scales

Description of Behavior/Construct
Scales need to be defined
Based on theory
Behaviors which fall under one construct on one test may be assigned to a different construct on another test


Summary of Weaknesses
Disadvantages
Considerations for Misuse
Safeguards


Disadvantages


Four sources of variation in assessment data summarize the disadvantages of rating scales: source variance, setting variance, temporal variance, and instrument variance (Martin, Hooper & Snow, 1986)

Source Variance
Primary source of error in rating scale data is the informant
Knowledge of subject for at least 2 months
Perceptions of rater
Tolerance of behavior
Stress level of respondent
Choice of informant may slant results
Internalizing behaviors or low-rate behaviors may not be observed
May not recognize the usefulness of the scale
Reading level of informant (30-40% of the population does not read at a fifth-grade level)
Response bias

Response Bias
 

Science identifies truth as the convergence of data
Respondents may differ in perception, normative life experiences (e.g. urban/suburban; poverty/wealth), response style, and desired outcome
Teacher may want the child in a program
Teacher/parent may not have an objective view in relation to normal peers
Parent may have an ulterior motive such as custody or monetary benefits
Respondents are sometimes biased without awareness

Reasons for Inadvertent Bias


Complexity of the mental processes required for a response leads to bias (Cooper, 1981):
1. Observation of the action
2. Observation encoding, aggregation, & storage in short-term memory
3. Short-term memory decay
4. Transfer to long-term storage and aggregation
5. Long-term memory decay
The above can be influenced by the expectations of the respondent

Reasons for Inadvertent Bias (cont.)


6. Presentation of categories to be rated
7. Observation and impression retrieval from long-term storage
8. Recognition of observations and impressions relevant to rating category
9. Comparison of observations and impressions to rater's standards
10. Incorporation of extraneous considerations
11. Making the rating: weighing the behavior

Types of Response Bias


May be due to the respondent's intentions or characteristic way of responding to an item regardless of content
Halo effect
Leniency or severity
Central tendency or range restriction
Response acquiescence
Response deviance
Social desirability

Halo Effect


"A rater's failure to discriminate among distinct and independent aspects of a ratee's behavior" (Saal, 1980)
Cognition: rater rates the child positively on emotional or behavioral issues because the child is smart
Socially adept: rater assumes the child must be emotionally or cognitively adept because of positive social behaviors (always helpful, smiles)
Other raters may report conflicting information

Leniency or severity


Occurs when ratings are consistently higher or lower than are warranted
Inferred when a rater uses predominantly one extreme or the other on the scale
Cannot be verified unless an independent observation or other party disagrees, e.g. parent sees child as hyperactive while few others see him as such

Central tendency or range restriction




Rater restricts range of all ratings to average, or to above or below average (may revert to leniency or severity bias)
Rater may choose the middle response because they feel they do not know the full universe of possible occurrences of the behavior (e.g. "I don't know how he is with his friends; I only see him at school/home") and therefore cannot rate as Always True/False, etc.
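A simple way to screen for the leniency, severity, and central-tendency patterns described above is to look at how one rater's responses are distributed across the scale steps. The sketch below is illustrative only: the function name, the 1-5 scale, and the sample ratings are assumptions, and as noted above a lopsided distribution can only suggest a bias, not verify it.

```python
from collections import Counter


def response_style_summary(ratings, low=1, high=5):
    """Proportion of one rater's responses at each end and at the midpoint."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < low or r > high for r in ratings):
        raise ValueError("rating outside the scale range")
    n = len(ratings)
    counts = Counter(ratings)
    summary = {
        "prop_lowest": counts[low] / n,    # heavy use may suggest one extreme
        "prop_highest": counts[high] / n,  # heavy use may suggest the other
    }
    # A true midpoint exists only when the scale has an odd number of steps.
    if (low + high) % 2 == 0:
        summary["prop_midpoint"] = counts[(low + high) // 2] / n
    return summary


# Hypothetical teacher ratings on ten 5-point items.
teacher = [3, 3, 3, 3, 2, 3, 4, 3, 3, 3]
print(response_style_summary(teacher))
```

A high midpoint proportion like this one would be a prompt to ask the rater how well they know the child, not a finding in itself.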

Response acquiescence & response deviance


Response acquiescence: rater tends to agree with each item
Response deviance: rater tends to respond in a deviant, unfavorable, uncommon, or unusual way


Social desirability
Rater interprets the test items so as to provide the most favorable view of the child
Rater may not be aware of the tendency to underrate problematic responses
Rater may be hesitant to endorse items that suggest the presence of a particular disorder (e.g. Beck Depression Inventory)


Methods to minimize bias


 

 

 

Use a lie scale or faking-good scale
Switch left and right for positive responses
Use bipolar adjectives
Response scaling: many problem behaviors occur in all children, so a dichotomy is not adequate (most children yell, cry, hit at least sometimes)
Provide clear instructions
Limit number of response categories to reduce confusion, lack of focus, length
Identify at the beginning what the scales mean and the time frame for rating
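When the positive end of the scale is switched from side to side across items, the switched items have to be reverse-scored before any scores are summed. The following is a minimal sketch of that step; the item names, the 1-5 range, and the helper names are hypothetical and not taken from any published scale.

```python
def reverse_score(response: int, low: int = 1, high: int = 5) -> int:
    """Flip a response so that `low` becomes `high` and vice versa."""
    if not low <= response <= high:
        raise ValueError("response outside the scale range")
    return low + high - response


def total_score(responses, reversed_items, low=1, high=5):
    """Sum item responses, flipping the items listed in `reversed_items`."""
    return sum(
        reverse_score(value, low, high) if item in reversed_items else value
        for item, value in responses.items()
    )


# Hypothetical three-item scale where item2 is worded in the opposite direction.
responses = {"item1": 5, "item2": 1, "item3": 4}
print(total_score(responses, reversed_items={"item2"}))  # 5 + 5 + 4 = 14
```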

Setting Variance
Interaction with the environment can affect results, i.e. home/school/clinic
Interventions used
Consider whether the instrument is sensitive across settings or specific to one setting


Temporal Variance
Change in behavior over time
Medication issues
Intervention
Maturation
Significant events: deaths, divorce, illness, trauma


Instrument Variance
Sloppy construction
Definition of construct
Qualitative technical aspects
Quantitative: depth of information as well as breadth


Considerations for Misuse




May be convenient and efficient for assessor, but may not be for the informant
Provide feedback and explain the instrument
Inappropriate use of instrument for screening, diagnosis, intervention development, program evaluation
Choice of an instrument to sway identification of a specific condition

Safeguards
Aggregate principle: collect data on same construct over varied settings with varied instruments to increase reliability by controlling the sources of variance
Test over several time periods
Use several instruments
Use several raters
Multi-setting, multi-source, multi-instrument design
Variations in responses may be due to setting, activity, or rater
Can lead to hypothesis development
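One way to picture the aggregate principle is to place the same construct's T scores from several informants side by side, then look at both the average and the spread. The sketch below is an illustration under assumptions: the informant labels, the scores, and the 10-point discrepancy threshold (one T-score standard deviation) are all hypothetical, not a rule from any test manual.

```python
from statistics import mean


def summarize_informants(t_scores, discrepancy_threshold=10.0):
    """Average T scores across informants and flag a large spread."""
    if not t_scores:
        raise ValueError("at least one informant is required")
    values = list(t_scores.values())
    spread = max(values) - min(values)
    return {
        "mean_t": mean(values),
        "spread": spread,
        # Hypothetical cut-off: treat a spread of one SD or more as worth exploring.
        "discrepant": spread >= discrepancy_threshold,
    }


# Hypothetical T scores on one scale from three sources.
scores = {"parent": 72, "teacher": 58, "self": 61}
print(summarize_informants(scores))
```

A flagged spread is a starting point for hypotheses about setting, activity, or rater effects, not evidence that any one informant is wrong.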

Strengths
Rating scale is a derivative of the unstructured interview, an evolution of the interview in the direction of increasing structure
The interview has more variability across interviewers; does not cover all areas; problems may be missed; clients are not always willing and articulate; reporting may be inaccurate; reliability and validity may be poor
Rating scale can identify strengths and weaknesses
Validate the referent's concern
Evaluate the severity and range of the concern
Assess atypical patterns
Part of multi-source, multi-method evaluation

Strengths (cont.)
Several assumptions allow for the comparison of raters' responses:
1) Informants can describe or rate the child
2) Items have the same or similar meaning for all respondents
3) Respondents report their thoughts, feelings, & behaviors openly and honestly
4) Measures have adequate reliability and validity

Strengths (cont.)


Rating scales can tap behaviors you may not be able to quantify in other tests
Convenience: time- and cost-efficient for assessor; multiple viewpoints
Comprehensive scales can ensure a range of problem areas is touched on, unlike interviews, which may delve into one problem but miss others
Structured response format and operationalizing behavior can reduce subjectivity
Increase ecological validity of the assessment (normal environment)

Strengths (cont.)


Teacher ratings have high predictive power: teacher has formal training, a structured setting, and comparison to other children
Biases evidenced between settings or individuals can be used in assessment and intervention: identify the real problem (child or referent), parenting style differences, influence of setting
Some rating scales ask the informant to identify the most concerning problem
Child may not be able to interact with or respond to assessment, e.g. infants, severely impaired children

Strengths (cont.)


Use of caretaker as informant is a strength: parents have observed the child since birth; parents are motivated; part of natural environment
More objective and reliable than projective techniques and interviews; can be less biased than self-report
Can provide information on strengths as well as concerns

Selection Considerations


 

Technical considerations: norms, validity, reliability, constructs sampled, test construction
Informant, situation, time, client
Scope of instrument: narrow and/or broad category of behaviors; choose for what you need and want; strengths (competencies) and weaknesses
Purpose or use: screening, diagnosis, placement, intervention, program evaluation
Clinical utility: ease of administration; useful clinical information; sensitive to effects of intervention

Specific Scales
BASC
CBCL
Conners
Others


BASC: Behavior Assessment System for Children


Teacher Rating Scale: Preschool 4-5 yrs (109 items); Child 6-11 yrs (148 items); Adolescent 12-18 yrs (138 items)
Parent Rating Scale: Preschool 4-5 yrs (105 items); Child 6-11 yrs (138 items); Adolescent 12-18 yrs (126 items)
Self-Report Scale: Child 8-11 yrs (152 items); Adolescent 12-18 yrs (186 items)
Each takes about 30 minutes to complete

BASC (cont.)
Scores
Teacher and Parent forms have a 4-point response (never, sometimes, often, almost always)
Self-Report has true/false
T scores and %ile ranks
Scored by hand on carbonless forms or by computer
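For readers unfamiliar with the metrics, a T score is a standard score with a mean of 50 and a standard deviation of 10 in the norm group. The sketch below shows the linear conversion from a raw score and a percentile rank computed under a normal-curve assumption. Published scales rely on their own norm tables, which need not follow the normal curve, so this is an approximation for illustration; the raw score, norm mean, and norm SD are made-up values.

```python
from statistics import NormalDist


def t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Linear T score: mean 50, SD 10 in the norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50 + 10 * (raw - norm_mean) / norm_sd


def percentile_rank(t: float) -> float:
    """Percentile rank of a T score, assuming normally distributed scores."""
    return 100 * NormalDist(mu=50, sigma=10).cdf(t)


# Hypothetical norms: raw mean 20, raw SD 5; a child scores 30.
t = t_score(30, norm_mean=20, norm_sd=5)
print(t)                             # 70.0
print(round(percentile_rank(t), 1))  # 97.7
```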

BASC (cont.)
Standardization
2,084 children ages 6-11 and 1,090 adolescents ages 12-18 for parent scale
1,259 children ages 6-11 and 809 adolescents ages 12-18 for teacher scale
5,413 children ages 8-11 and 4,448 adolescents ages 12-18 for Self-Report
Collected 1988-1991, matching 1986 U.S. Census
Separate norms for males, females, and clinical samples
About 70% of clinical samples were males with a diagnosis of conduct or behavior disorder

BASC (cont.)
Reliability
Internal consistency: reliabilities for the 3 scales in the school-age sample range from .62 to .95 for TRS; .58 to .94 for PRS; and .61 to .89 for Self-Report
Interrater: PRS reliabilities are generally low, .35 to .73; TRS from .29 to .70 for preschool and .44 to .93 for school-age; none available for adolescents
Test-retest: PRS over a 2- to 8-week interval ranges from .41 to .94; TRS .59 to .95; Self-Report .57 to .81 for children and .67 to .81 for adolescents
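The practical meaning of a reliability range can be seen through the standard error of measurement from classical test theory, SEM = SD x sqrt(1 - reliability). The sketch below applies it to the two ends of the PRS internal-consistency range quoted above (.58 and .94), using the T-score SD of 10. The observed T score of 65 and the 95% band are illustrative assumptions, not values from the BASC manual.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM for a score with the given SD and reliability."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def confidence_band(score: float, sd: float, reliability: float, z: float = 1.96):
    """Approximate band around an observed score (z = 1.96 for about 95%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


# Hypothetical observed T score of 65 on scales at each end of the range.
for r in (0.58, 0.94):
    low, high = confidence_band(65, sd=10, reliability=r)
    print(r, round(low, 1), round(high, 1))
# 0.58 52.3 77.7
# 0.94 60.2 69.8
```

With a reliability of .58 the band is wide enough to span both average and elevated scores, which is why low-reliability scales call for cautious interpretation.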

BASC (cont.)
Validity
Construct validity for internalizing and externalizing dimensions of the BASC scales is supported by factor and structural equation analyses
Criterion-related validity is satisfactory for the 3 scales, as shown by acceptable correlations with other similar measures

BASC (cont.)
 

  

Integrative approach across multiple informants
Strength is in assessment of children ages 6 to 11 years, particularly in externalizing behaviors
Separation of Attention & Hyperactivity; Depression & Anxiety
Limited psychopathology and personality domains
Comparison across child and adult forms is difficult
Readability of Self-Report may be too high

Child Behavior Checklist (CBCL), Teacher's Report Form (TRF) & Youth Self-Report (YSR)

Parent Rating: Preschool 2-3 yrs (99 items); School-age 4-18 yrs (120 items)
Teacher Rating Form: Caregiver/Teacher 2-5 yrs (99 items); School-age 6-18 yrs (120 items)
Youth Self-Report: ages 11-18 yrs (119 items); requires 5th-grade reading level; about 30 minutes to complete
Parent and Teacher forms take 10-15 minutes to complete

CBCL, TRF & YSR (Cont.)

Scores
3-point response (not true; somewhat true or sometimes true; often true)
T scores and %ile ranks
Scored by templates, scannable answer sheets, or computer

CBCL, TRF & YSR (Cont.)


Standardization
1,200 males and females ages 4-11 and 1,168 adolescents ages 12-18 for parent scale
713 children ages 5-11 and 678 adolescents ages 12-18 for teacher scale
637 males and 678 females for Self-Report
Collected 1989, matching 1990 U.S. Census
Separate norms for males & females

CBCL, TRF & YSR (Cont.)


Reliability
Internal consistency: reliabilities for the parent form from .56 to .92; for teacher .63 to .96; and .59 to .90 (males) & .59 to .89 (females) for Self-Report
Interrater: Parent .26 to .86; Teacher from -.05 to .81; none available for adolescents
Test-retest: Parent for 1-week interval ranges from .63 to .97; Teacher .82 to .95 for males & .43 to .99 for females; Self-Report .47 to .81 for 50 children ages 11 to 18
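Interrater figures like those above are correlations between two informants' scores for the same children. As a rough sketch of how such a coefficient can be computed, the function below implements the Pearson correlation by hand. Test manuals may use other coefficients or corrections, and the mother/father T scores here are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical T scores for five children rated by two parents.
mother = [60, 65, 70, 55, 80]
father = [58, 70, 66, 57, 75]
print(round(pearson_r(mother, father), 2))
```

A coefficient near zero or below, such as the -.05 reported for some teacher pairs, means the two raters' rank orderings of the children are essentially unrelated.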

CBCL, TRF & YSR (Cont.)


Validity
Concurrent validity for parent, teacher, and YSR forms is satisfactory: acceptable correlations with the Conners scales
Discriminant validity is acceptable for the parent and teacher forms and satisfactory for the YSR, as shown by significant differences in scores between referred and nonreferred samples

CBCL, TRF & YSR (Cont.)

Does not provide validity scales
Supports cross-informant assessment
Low levels of reliability, suggesting caution in interpretation and application
Broad-based screening measure rather than a precise measure of disorder


Conners Rating Scales-Revised

Parent and teacher versions are designed for ages 3-17
Self-report is for ages 12-17 years
Short forms (about 27 items) and long forms (59-87 items) are available

Conners Rating Scales-Revised (cont.)

Scores
4-point response (not true at all, just a little true, pretty much true, & very much true)
T scores
Scored by self-scoring sheet or computer-scored with interpretive report

Conners Rating Scales-Revised (cont.)

Standardization
8,000 individuals drawn from 1993 to 1996 from 45 U.S. states and 10 Canadian provinces
Norms are provided separately for males and females by age levels
Does not match U.S. Census, as there are more Euro-Americans than in the general population

Conners Rating Scales-Revised (cont.)

Reliability
Internal consistency: reliabilities for the parent and teacher forms from .73 to .96; for adolescent .75 to .92
Test-retest: Parent and teacher forms are variable for long and short forms, with better reliabilities for the short form over a 6-8 week retest; self-report form ranges from .72 to .89 between the two forms

Conners Rating Scales-Revised (cont.)

Validity
Construct validity is satisfactory, based on the factor analysis used to construct the scales
Convergent validity is good: high correlations between long and short forms
Criterion validity is good: high correlations between various versions of the scales
Discriminant validity for parent and teacher forms is good: significant differences in scores between referred and nonreferred samples

Conners Rating Scales-Revised (cont.)

Improvement over previous scales
Standardization samples are small for any age group or gender
Adequate to good reliability and adequate validity, with informant versions strong in evaluating externalizing problems
Self-report is useful for measuring general distress

Others


Devereux Scales of Mental Disorders: good reliability but limited validity; limited in its evaluation of psychopathology; some items include content that is difficult for parents and teachers to evaluate; not clearly aligned to DSM-IV, although this was an objective
Scales specific to ADHD or other diagnoses: many have limited sample size and limited utility

References


Knoff, H. M. (2002). Best practices in personality assessment. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology IV, Vol. 2. Bethesda, MD: National Association of School Psychologists.
Martin, R., Hooper, S., & Snow, J. (1986). Behavior rating scale approaches to personality assessment in children and adolescents. In H. Knoff (Ed.), The assessment of child and adolescent personality. New York: Guilford Press.
Sattler, J. M. (2002). Assessment of children: Behavioral and clinical applications (4th ed.). San Diego: Jerome M. Sattler, Publisher, Inc.
