3 13 2003

Behavior Rating Scales
Definition
Types
Construction Issues
Weaknesses
Strengths
Selection Considerations
Specific Scales: Conners; CBCL; BASC; others

Definition

Rating Scale: any paper-and-pencil device whereby one individual (usually a caretaker such as a parent or teacher, though not excluding peers) assesses the behavior of another individual, based on his or her observations of the child or adolescent over an extended period of time (usually more than a month)
(Martin, Hooper & Snow, 1986)
Types of Rating Scales

Range of constructs from general functioning to concrete behaviors
Personality: Personality Inventory for Children, Second Edition (PIC-2); Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A)
Behavior Checklists: Child Behavior Checklist (CBCL); Conners Rating Scales-Revised; Behavior Assessment System for Children (BASC); Devereux Scales of Mental Disorders
Specific Disorders: Children's Manifest Anxiety Scale; Beck Depression Inventory; Children's Depression Inventory
Summary of Construction Issues

Checklist vs Dichotomy vs Continuum
Item choice
Ability to Sum Scores
Anchors
Description of Behavior/Construct

Checklist vs Dichotomy vs Continuum

Checklists: rater checks the item if the behavior exists; can be used in screening for specific DSM-IV disorders
Dichotomy: rater indicates whether the behavior exists or does not exist; forced dichotomy; Yes/No
Continuum: 1 2 3 4 5
Reliability increases with more steps (plateaus after 11 steps, with little further gain)
An odd number of steps allows for a neutral, middle step, but can create a response set
Item Choice

Subjectivity of an instrument is a function of the level of analysis, the type of item, and the manner in which it is scaled
Sufficient number of items to sample the construct
Face validity of items
Specificity of behavior: "Is delinquent" vs "Lies; steals; violates curfew"
Items that are too specific may lead to trivial information and excessive length
Time frame identified, e.g. "Within the last two weeks"
Various strategies used to develop items and scales

Factor analysis: placing in a factor items that cluster together
Empirical keying: using selected items to distinguish one group from another
Theoretical constructs: using selected items to measure the theoretical constructs underlying the construction of the test
Content analysis: using experts to select items to measure the trait or diagnostic category of interest
Ability to Sum Scores

Construction of some tests allows scores to be summed across scales, which increases the reliability of the instrument
Broad-band factors have higher reliability than narrow-band scales, e.g. Internalizing & Externalizing have higher reliabilities than individual scales such as Social Withdrawal or Aggression
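
One way to see why summing raises reliability is the Spearman-Brown prophecy formula. The formula is not named in these notes and is brought in here only as an illustration. It assumes the combined parts are parallel (equal reliability and equal intercorrelations), which real subscales only approximate, and the reliability value below is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened by length_factor.

    Assumes the added parts are parallel to the original scale,
    an idealization that actual subscales rarely meet exactly.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a narrow-band scale with reliability .70
# combined with two parallel scales into one broad-band composite.
narrow = 0.70
broad = spearman_brown(narrow, 3)
print(f"narrow-band: {narrow:.3f}, composite of three: {broad:.3f}")
```

Under these assumptions the composite reliability is always at least as high as that of a single part, which is the pattern described for broad-band versus narrow-band scores.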

Anchors

End points on a scale
Numerical (Likert scale)
Degrees of agree/disagree
Adjectives such as good/bad; carefree/anxious; impulsive/reflective
Actual behavior to typify a type of attitude, such as religion: attends church 1 time per month; 2 times per month; weekly; biweekly
This may be specific to the construct; may not represent equal intervals; it may be difficult to find discrete, specific behaviors
Comparison to norm or product scales
Description of Behavior/Construct
Scales need to be defined
Based on theory
Behaviors that fall under one construct on one test may be assigned to a different construct on another test

Summary of Weaknesses
Disadvantages
Considerations for Misuse
Safeguards

Disadvantages

Four sources of variation in assessment data summarize the disadvantages of rating scales: source variance, setting variance, temporal variance, and instrument variance (Martin, Hooper & Snow, 1986)
Source Variance
Primary source of error in rating scale data is the informant
Knowledge of subject for at least 2 months
Perceptions of rater
Tolerance of behavior
Stress level of respondent
Choice of informant may slant results
Internalizing behaviors or low-rate behaviors may not be observed
Informant may not recognize the usefulness of the scale
Reading level of informant (30-40% of the population does not read at a fifth grade level)
Response bias
Response Bias

Science identifies truth as the convergence of data
Respondents may differ in perception, normative life experiences (e.g. urban/suburban; poverty/wealth), response style, and desired outcome: teacher may want the child in a program; teacher/parent may not have an objective view in relation to normal peers; parent may have an ulterior motive such as custody or monetary benefits
Respondents are sometimes biased without awareness
Reasons for Inadvertent Bias

Complexity of the mental processes required for a response leads to bias (Cooper, 1981)
1. Observation of the action
2. Observation encoding, aggregation, & storage in short-term memory
3. Short-term memory decay
4. Transfer to long-term storage and aggregation
5. Long-term memory decay
The steps above can be influenced by the expectations of the respondent
Reasons for Inadvertent Bias (cont.)

6. Presentation of categories to be rated
7. Observation and impression retrieval from long-term storage
8. Recognition of observations and impressions relevant to the rating category
9. Comparison of observations and impressions to the rater's standards
10. Incorporation of extraneous considerations
11. Making the rating: weighing the behavior
Types of Response Bias

May be due to the respondent's intentions or characteristic way of responding to an item regardless of content
Halo effect
Leniency or severity
Central tendency or range restriction
Response acquiescence
Response deviance
Social desirability
Halo Effect

A rater's failure to discriminate among distinct and independent aspects of a ratee's behavior (Saal, 1980)
Cognition: rater rates the child positively on emotional or behavioral issues because the child is smart
Socially adept: rater assumes the child must be emotionally or cognitively adept because of positive social behaviors (always helpful, smiles)
Other raters may report conflicting information
Leniency or severity

Occurs when ratings are consistently higher or lower than are warranted
Inferred when a rater uses predominantly one extreme or the other on the scale
Cannot be verified unless an independent observation or other party disagrees, e.g. parent sees child as hyperactive while few others see him as such
Central tendency or range restriction

Rater restricts the range of all ratings to average, or to above or below average (may revert to leniency or severity bias)
Rater may choose the middle response because they feel they do not know the full universe of possible occurrences of the behavior (e.g. "I don't know how he is with his friends; I only see him at school/home") and therefore cannot rate it as Always True/False, etc.
Response acquiescence & response deviance

Response acquiescence: rater tends to agree with each item
Response deviance: rater tends to respond in a deviant, unfavorable, uncommon, or unusual way
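
A rough screening sketch for the response styles described above, assuming one rater's answers on a 1-5 continuum. The 80% cutoff and the flag wording are hypothetical, and a flag is only a prompt to seek independent observation: as noted for leniency and severity, the pattern alone cannot verify bias. Whether the high extreme reflects leniency or severity depends on how the items are worded.

```python
from collections import Counter


def response_style_flags(ratings, low=1, high=5, cutoff=0.8):
    """Flag possible response styles in one rater's answers on a low..high scale.

    cutoff is a hypothetical proportion of answers at a single step
    above which the pattern is worth a second look.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(not low <= r <= high for r in ratings):
        raise ValueError("rating outside the scale range")

    counts = Counter(ratings)
    n = len(ratings)
    midpoint = (low + high) / 2  # falls on a step only when the scale has an odd number of steps

    flags = []
    if counts[high] / n >= cutoff:
        flags.append("predominantly high extreme")
    if counts[low] / n >= cutoff:
        flags.append("predominantly low extreme")
    if counts[midpoint] / n >= cutoff:
        flags.append("predominantly middle step")
    return flags


# Hypothetical rater who marks almost every item at the top of the scale.
print(response_style_flags([5, 5, 5, 5, 4, 5, 5, 5, 5, 5]))
```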

Social desirability
Rater interprets the test items so as to provide the most favorable view of the child
Rater may not be aware of the tendency to underrate problematic responses
Rater may be hesitant to endorse items that suggest the presence of a particular disorder (e.g. Beck Depression Inventory)

Methods to minimize bias

Use a lie scale or faking-good scale
Switch left and right for positive responses
Use bipolar adjectives
Response scaling: many problem behaviors occur in all children, so a dichotomy is not adequate (most children yell, cry, hit at least sometimes)
Provide clear instructions
Limit number of response categories to reduce confusion, lack of focus, length
Identify at the beginning what the scales mean and the time frame for rating
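
When the positive end of the scale is switched on some items, those items have to be reverse-scored before a total is computed. A minimal sketch, assuming a 1-5 continuum; the item names and the choice of which item is reversed are hypothetical and do not come from any published instrument.

```python
def reverse_score(response, low=1, high=5):
    """Flip a response on a low..high continuum (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}..{high}")
    return low + high - response


def total_score(responses, reversed_items, low=1, high=5):
    """Sum item responses after reverse-scoring the items named in reversed_items."""
    total = 0
    for item, response in responses.items():
        if not low <= response <= high:
            raise ValueError(f"response {response} is outside {low}..{high}")
        if item in reversed_items:
            total += reverse_score(response, low, high)
        else:
            total += response
    return total


# Hypothetical items: "shares" is positively worded, so it is reversed
# to make a higher total always mean more problem behavior.
answers = {"yells": 4, "shares": 2, "hits": 3}
print(total_score(answers, reversed_items={"shares"}))
```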
Setting Variance
Interaction with the environment can affect results, e.g. home/school/clinic
Interventions used
Consider whether the instrument is sensitive across settings or specific to one setting

Temporal Variance
Change in behavior over time
Medication issues
Intervention
Maturation
Significant events: deaths, divorce, illness, trauma

Instrument Variance
Sloppy construction
Definition of construct
Qualitative technical aspects
Quantitative: depth of information as well as breadth

Considerations for Misuse

May be convenient and efficient for the assessor, but may not be for the informant
Provide feedback and explain the instrument
Inappropriate use of instrument for screening, diagnosis, intervention development, program evaluation
Choice of an instrument to sway identification of a specific condition
Safeguards
Aggregate principle: collect data on the same construct over varied settings with varied instruments to increase reliability by controlling the sources of variance
Test over several time periods
Use several instruments
Use several raters
Multi-setting, Multi-source, Multi-instrument Design
Variations in responses may be due to setting, activity, or rater
Can lead to hypothesis development
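
A minimal sketch of the aggregate principle, assuming each rater and setting yields a T score for the same construct. The source labels, the scores, and the 10-point discrepancy threshold are all hypothetical; the threshold is used only because 10 T-score points equal one standard deviation.

```python
from statistics import mean


def aggregate_ratings(t_scores, discrepancy_threshold=10.0):
    """Average T scores across sources and list sources far from that average.

    t_scores maps a source label (rater and setting) to a T score.
    Returns (overall_mean, sorted list of discrepant source labels).
    """
    if not t_scores:
        raise ValueError("need at least one rating")
    overall = mean(t_scores.values())
    discrepant = sorted(
        source
        for source, score in t_scores.items()
        if abs(score - overall) >= discrepancy_threshold
    )
    return overall, discrepant


# Hypothetical hyperactivity T scores from four sources.
ratings = {
    "parent_home": 72,
    "teacher_school": 58,
    "aide_school": 56,
    "clinician_clinic": 54,
}
overall, discrepant = aggregate_ratings(ratings)
print(overall, discrepant)
```

A discrepant source is not treated as an error here. It is a starting point for a hypothesis about setting, activity, or rater.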
Strengths
Rating scale is a derivative of the unstructured interview, an evolution of the interview in the direction of increasing structure
The interview has more variability across interviewers; does not cover all areas; problems may be missed; clients are not always willing and articulate; reporting may be inaccurate; reliability and validity may be poor
Rating scale can identify strengths and weaknesses
Validate the referent's concern
Evaluate the severity and range of the concern
Assess atypical patterns
Part of multi-source, multi-method evaluation
Strengths (cont.)
Several assumptions allow for the comparison of raters' responses:
1) Informants can describe or rate the child
2) Items have the same or similar meaning for all respondents
3) Respondents report their thoughts, feelings, & behaviors openly and honestly
4) Measures have adequate reliability and validity
Strengths (cont.)

Rating scales can tap behaviors you may not be able to quantify in other tests
Convenience: time- and cost-efficient for assessor, multiple viewpoints
Comprehensive scales can ensure that the full range of problem areas is touched on, unlike interviews, which may delve into one problem but miss others
Structured response format and operationalizing behavior can reduce subjectivity
Increase ecological validity of the assessment, normal environment
Strengths (cont.)

Teacher ratings have high predictive power: the teacher has formal training, a structured setting, and a basis for comparison to other children
Biases evidenced between settings or individuals can be used in assessment and intervention: identify the real problem (child or referent), parenting style differences, influence of setting
Some rating scales ask the informant to identify the most concerning problem
Useful when the child may not be able to interact with or respond to assessment, e.g. infants, severely impaired children
Strengths (cont.)

Use of the caretaker as informant is a strength in that parents have observed the child since birth; parents are motivated; part of natural environment
More objective and reliable than projective techniques and interviews; can be less biased than self-report
Can provide information on strengths as well as concerns
Selection Considerations

Technical considerations: norms, validity, reliability, constructs sampled, test construction
Informant, situation, time, client
Scope of instrument: narrow and/or broad category of behaviors; choose for what you need and want; strengths (competencies) and weaknesses
Purpose or use: screening, diagnosis, placement, intervention, program evaluation
Clinical utility: ease of administration; useful clinical information; sensitive to effects of intervention
Specific Scales
BASC
CBCL
Conners
Others

BASC: Behavior Assessment System for Children

Teacher Rating Scale (TRS): Preschool 4-5 yrs (109 items); Child 6-11 yrs (148 items); Adolescent 12-18 yrs (138 items)
Parent Rating Scale (PRS): Preschool 4-5 yrs (105 items); Child 6-11 yrs (138 items); Adolescent 12-18 yrs (126 items)
Self-Report Scale: Child 8-11 yrs (152 items); Adolescent 12-18 yrs (186 items)
Each takes about 30 minutes to complete
BASC (cont.)
Scores
Teacher and Parent scales have a 4-point response (never, sometimes, often, almost always)
Self-Report has true/false
T scores and percentile ranks
Scored by hand on carbonless forms or by computer
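
T scores rescale a raw score so that the norm group has a mean of 50 and a standard deviation of 10. The sketch below shows the linear conversion and a percentile rank computed under a normal-distribution assumption. Published scales read T scores and percentiles from norm tables rather than from a formula, and the norm mean and standard deviation used here are hypothetical.

```python
from statistics import NormalDist


def t_score(raw, norm_mean, norm_sd):
    """Linear T score: 50 plus 10 times the z score in the norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return 50 + 10 * z


def percentile_rank(t):
    """Percentile rank of a T score, assuming scores are normally distributed."""
    return 100 * NormalDist(mu=50, sigma=10).cdf(t)


# Hypothetical norm group: mean raw score 14, standard deviation 6.
t = t_score(raw=26, norm_mean=14, norm_sd=6)
print(f"T = {t:.0f}, percentile rank = {percentile_rank(t):.1f}")
```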
BASC (cont.)
Standardization
Parent scale: 2,084 children ages 6-11 and 1,090 adolescents ages 12-18
Teacher scale: 1,259 children ages 6-11 and 809 adolescents ages 12-18
Self-Report: 5,413 children ages 8-11 and 4,448 adolescents ages 12-18
Collected 1988-1991, matching 1986 U.S. Census
Separate norms for males, females, and clinical samples
About 70% of clinical samples were males with a diagnosis of conduct or behavior disorder
BASC (cont.)
Reliability
Internal consistency: reliabilities for the 3 scales in the school-age sample range from .62 to .95 for the TRS, .58 to .94 for the PRS, and .61 to .89 for the Self-Report
Interrater: PRS reliabilities are generally low, .35 to .73; TRS from .29 to .70 for preschool and .44 to .93 for school age; none available for adolescents
Test-retest: PRS over a 2- to 8-week interval ranges from .41 to .94; TRS .59 to .95; Self-Report .57 to .81 for children and .67 to .81 for adolescents
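
Internal consistency is commonly reported as coefficient alpha. These notes do not say which coefficient the figures above use, so treating them as alpha is an assumption. A minimal sketch of the computation, with a made-up response matrix in which rows are children and columns are items:

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of per-respondent lists of item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(rows[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item, and of each respondent's total score.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up data: four children rated on three items.
data = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 2, 1],
    [3, 2, 2],
]
print(round(cronbach_alpha(data), 2))
```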
BASC (cont.)
Validity
Construct validity for internalizing and externalizing dimensions of the BASC scales is supported by factor and structural equation analyses
Criterion-related validity is satisfactory for the 3 scales, as shown by acceptable correlations with other similar measures
BASC (cont.)

Integrative approach across multiple informants
Strength is in assessment of children ages 6 to 11 years, particularly in externalizing behaviors
Separation of Attention & Hyperactivity; Depression & Anxiety
Limited psychopathology and personality domains
Comparison across child and adult forms is difficult
Readability of Self-Report may be too high
Child Behavior Checklist (CBCL), Teacher's Report Form (TRF) & Youth Self-Report (YSR)
Parent Rating: Preschool 2-3 yrs (99 items); School-age 4-18 yrs (120 items)
Teacher Rating Form: Caregiver/Teacher 2-5 yrs (99 items); School Age 6-18 yrs (120 items)
Youth Self-Report: Ages 11-18 yrs (119 items)
Requires 5th grade reading level; about 30 minutes to complete
Parent and Teacher forms take 10-15 minutes to complete
CBCL, TRF & YSR (Cont.)
Scores
3-point response (not true, somewhat true or sometimes true, & often true)
T scores and percentile ranks
Scored by templates, scannable answer sheets, or computer
CBCL, TRF & YSR (Cont.)

Standardization
Parent scale: 1,200 males and females ages 4-11 and 1,168 adolescents ages 12-18
Teacher scale: 713 children ages 5-11 and 678 adolescents ages 12-18
Self-Report: 637 males and 678 females
Collected 1989, matching 1990 U.S. Census
Separate norms for males & females

Reliability
Internal consistency: reliabilities range from .56 to .92 for the parent form; .63 to .96 for the teacher form; and .59 to .90 (males) & .59 to .89 (females) for the Self-Report
Interrater: Parent .26 to .86; Teacher -.05 to .81; none available for adolescents
Test-retest: Parent over a 1-week interval ranges from .63 to .97; Teacher .82 to .95 for males & .43 to .99 for females; Self-Report .47 to .81 for 50 children ages 11 to 18
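
Interrater figures like these are typically correlations between two informants' scores for the same children. The notes do not state which coefficient was used, so the Pearson correlation below is an assumption, and the T scores are made up.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores from one rater do not vary")
    return sxy / sqrt(sxx * syy)


# Made-up T scores for five children, each rated by two informants.
rater_a = [55, 60, 65, 70, 75]
rater_b = [50, 62, 61, 72, 70]
print(round(pearson_r(rater_a, rater_b), 2))
```

A value near zero or below, like the low end of the teacher range above, means the two informants rank the children very differently on that scale.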

Validity
Concurrent validity for parent, teacher, and YSR forms is satisfactory, with acceptable correlations with the Conners scales
Discriminant validity is acceptable for the parent and teacher forms and satisfactory for the YSR, as shown by significant differences in scores between referred and nonreferred samples
CBCL, TRF & YSR (Cont.)
Does not provide validity scales
Supports cross-informant assessment
Low levels of reliability, suggesting caution in their interpretation and application
Broad-based screening measure rather than a precise measure of disorder

Conners Rating Scales-Revised

Parent and teacher versions are designed for ages 3-17
Self-report is for ages 12-17 years
Short forms (about 27 items) and long forms (59-87 items) are available

Scores
4-point response (not true at all, just a little true, pretty much true, & very much true)
T scores
Scored by self-scoring sheet, or computer scored with interpretive report

Standardization
8,000 individuals drawn from 1993 to 1996 from 45 U.S. states and 10 Canadian provinces
Norms are provided separately for males and females by age levels
Does not match U.S. Census, as there are more Euro-Americans than in the general population

Reliability
Internal consistency: reliabilities range from .73 to .96 for the parent and teacher forms and from .75 to .92 for the adolescent form
Test-retest: parent and teacher reliabilities vary between long and short forms, with better reliabilities for the short form over a 6-8 week retest interval; the self-report form ranges from .72 to .89 across the two forms

Validity
Construct validity is satisfactory, based on the factor analysis used to construct the scales
Convergent validity is good, with high correlations between long and short forms
Criterion validity is good, with high correlations between various versions of the scales
Discriminant validity for parent and teacher forms is good, with significant differences in scores between referred and nonreferred samples

Improvement over previous scales
Standardization samples are small for any given age group or gender
Adequate to good reliability and adequate validity, with informant versions strong in evaluating externalizing problems
Self-report is useful for measuring general distress
Others

Devereux Scales of Mental Disorders: good reliability but limited validity; limited in its evaluation of psychopathology; some items include content that is difficult for parents and teachers to evaluate; not clearly aligned to DSM-IV, although this was an objective
Scales specific to ADHD or other diagnoses: many have limited sample size and limited utility
References

Knoff, H. M. (2002). Best practices in personality assessment. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology IV (Vol. 2). Bethesda, MD: National Association of School Psychologists.
Martin, R., Hooper, S., & Snow, J. (1986). Behavior rating scale approaches to personality assessment in children and adolescents. In H. Knoff (Ed.), The assessment of child and adolescent personality. New York: Guilford Press.
Sattler, J. M. (2002). Assessment of children: Behavioral and clinical applications (4th ed.). San Diego: Jerome M. Sattler, Publisher, Inc.
