Definition; Types; Construction Issues; Weaknesses; Strengths; Selection Considerations; Specific Scales (Conners, CBCL, BASC, others)
Definition
Rating Scale: any paper-and-pencil device whereby one person (usually a caretaker such as a parent or teacher, though not excluding peers) assesses the behavior of a child or adolescent based on his or her observations of that individual over an extended period of time (usually more than a month).
Types
Range of constructs from general functioning to concrete behaviors.
Personality: Personality Inventory for Children-Revised (PIC-2); Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A).
Behavior checklists: Child Behavior Checklist (CBCL); Conners Rating Scales-Revised; Behavior Assessment System for Children (BASC); Devereux Scales of Mental Disorders.
Specific disorders: Children's Manifest Anxiety Scales; Beck Depression Inventory; Children's Depression Inventory.
Checklists: rater checks the item if the behavior exists; can be used in screening for specific DSM-IV disorders.
Dichotomy: rater indicates whether the behavior exists or does not exist; forced choice, Yes/No.
Continuum: 1 2 3 4 5. Reliability increases with more steps (it plateaus after 11 steps, with little further gain). An odd number of steps allows for a neutral middle step but can create a response set.
Item Choice
Subjectivity of the instrument is a function of the level of analysis, the type of item, and the manner of scaling.
Sufficient number of items to sample the construct.
Face validity of items.
Specificity of behavior: "Is delinquent" vs. "Lies; steals; violates curfew." Items that are too specific may lead to trivial information and excessive length.
Time frame identified, e.g., "Within the last two weeks."
Factor analysis: placing in a factor items that cluster together.
Empirical keying: using selected items to distinguish one group from another.
Theoretical constructs: using selected items to measure the theoretical constructs underlying the construction of the test.
Content analysis: using experts to select items to measure the trait or diagnostic category of interest.
Anchors
End points on a scale.
Numerical (Likert scale).
Degrees of agree/disagree.
Adjectives such as good/bad; carefree/anxious; impulsive/reflective.
Actual behavior that typifies an attitude, such as religion: attends church 1 time per month; 2 times per month; weekly; biweekly. This may be specific to the construct, may not represent equal intervals, and it may be difficult to find discrete, specific behaviors.
Comparison to norm or product scales.
Description of Behavior/Construct
Scales need to be defined.
Based on theory.
Behaviors that fall under one construct on one test may be used under another construct on another test.
Summary of Weaknesses
Disadvantages; Considerations for Misuse; Safeguards
Disadvantages
Four sources of variation in assessment data summarize the disadvantages of rating scales: source variance, setting variance, temporal variance, and instrument variance (Martin, Hooper, & Snow, 1986).
Source Variance
The primary source of error in rating scale data is the informant.
Knowledge of subject for at least 2 months.
Perceptions of rater.
Tolerance of behavior.
Stress level of respondent.
Choice of informant may slant results.
Internalizing behaviors or low-rate behaviors may not be observed.
Informant may not recognize the usefulness of the scale.
Reading level of informant (30-40% of the population does not read at a fifth-grade level).
Response bias.
Response Bias
Science identifies truth as the convergence of data.
Respondents may differ in perception, normative life experiences (e.g., urban/suburban; poverty/wealth), response style, and desired outcome: a teacher may want the child in a program; a teacher or parent may not have an objective view in relation to normal peers; a parent may have an ulterior motive such as custody or monetary benefits.
Respondents are sometimes biased without being aware of it.
Halo Effect
A rater's failure to discriminate among distinct and independent aspects of a ratee's behavior (Saal, 1980).
Cognition: rater rates the child positively on emotional or behavioral issues because the child is smart.
Socially adept: rater assumes the child must be emotionally or cognitively adept because of positive social behaviors (always helpful, smiles).
Other raters may report conflicting information.
Leniency or severity
Occurs when ratings are consistently higher or lower than are warranted.
Inferred when a rater uses predominantly one extreme or the other on the scale.
Cannot be verified unless an independent observation or another party disagrees, e.g., a parent sees the child as hyperactive while few others see him as such.
Rater restricts the range of all ratings to average, or to above or below average (may revert to leniency or severity bias).
Rater may choose the middle response because they feel they do not know the full universe of possible occurrences of the behavior (e.g., "I don't know how he is with his friends; I only see him at school/home") and therefore cannot rate an item as Always True/False, etc.
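The leniency, severity, and middle-response patterns described above can be screened for by looking at how one rater distributes responses across a form. The sketch below is illustrative only: the function name `response_style`, the 1-5 scale, and the 0.8 cutoff are assumptions, not values taken from any published scale. A flag only suggests a pattern; as noted above, it cannot be verified without an independent observation or another rater.

```python
from collections import Counter

def response_style(ratings, low=1, high=5, cutoff=0.8):
    """Flag a possible response style in one rater's item ratings.

    ratings: list of integers between low and high (hypothetical 1-5 scale).
    cutoff:  assumed proportion of responses needed to raise a flag.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    n = len(ratings)
    # A midpoint exists only when the scale has an odd number of steps.
    mid = (low + high) // 2 if (low + high) % 2 == 0 else None
    p_low = counts[low] / n
    p_high = counts[high] / n
    p_mid = counts[mid] / n if mid is not None else 0.0
    # Whether the high extreme means leniency or severity depends on
    # how the items are worded, so the labels stay neutral.
    if p_high >= cutoff:
        return "high extreme"
    if p_low >= cutoff:
        return "low extreme"
    if p_mid >= cutoff:
        return "central tendency"
    return "no flag"
```

For example, `response_style([3, 3, 3, 3, 3, 2])` returns "central tendency" because five of the six responses sit on the midpoint.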
Social desirability
Rater interprets the test responses so as to provide the most favorable view of the child.
Rater may not be aware of the tendency to underrate problematic responses.
Rater may be hesitant to endorse items that suggest the presence of a particular disorder (e.g., Beck Depression Inventory).
Use a lie scale or faking-good scale.
Switch left and right for positive responses.
Use bipolar adjectives.
Response scaling: many problem behaviors occur in all children, so a dichotomy is not adequate (most children yell, cry, or hit at least sometimes).
Provide clear instructions.
Limit the number of response categories to reduce confusion, lack of focus, and length.
Identify at the beginning what the scales mean and the time frame for rating.
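When the positive end of the scale is switched from left to right on some items, those items have to be reverse-keyed before a total is computed. The following is a minimal sketch of that scoring step; the item names, the 1-5 scale, and the helper names `reverse_score` and `total_score` are hypothetical.

```python
def reverse_score(response, low=1, high=5):
    """Mirror a response around the scale midpoint (1<->5, 2<->4 on a 1-5 scale)."""
    return low + high - response

def total_score(responses, reversed_items, low=1, high=5):
    """Sum item responses, reverse-keying the items named in reversed_items.

    responses:      dict mapping item name -> integer response
    reversed_items: set of item names whose scale direction was switched
    """
    total = 0
    for item, response in responses.items():
        if not low <= response <= high:
            raise ValueError(f"{item}: response {response} is outside {low}-{high}")
        if item in reversed_items:
            response = reverse_score(response, low, high)
        total += response
    return total
```

With `responses = {"q1": 5, "q2": 1, "q3": 3}` and `reversed_items = {"q2"}`, the reversed item counts as 5, so the total is 13.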
Setting Variance
Interaction with the environment can affect results, e.g., home, school, or clinic.
Interventions used.
Consider whether the instrument is sensitive across settings or specific to one setting.
Temporal Variance
Change in behavior over time.
Medication issues.
Intervention.
Maturation.
Significant events: deaths, divorce, illness, trauma.
Instrument Variance
Sloppy construction.
Definition of construct.
Qualitative technical aspects.
Quantitative: depth of information as well as breadth.
Considerations for Misuse
May be convenient and efficient for the assessor, but may not be for the informant.
Provide feedback and explain the instrument.
Inappropriate use of the instrument for screening, diagnosis, intervention development, or program evaluation.
Choice of an instrument to sway identification of a specific condition.
Safeguards
Aggregate principle: collect data on the same construct over varied settings with varied instruments to increase reliability by controlling the sources of variance.
Test over several time periods.
Use several instruments.
Use several raters.
Multi-setting, multi-source, multi-instrument design.
Variations in responses may be due to setting, activity, or rater.
Can lead to hypothesis development.
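One way to see why aggregation raises reliability is the Spearman-Brown formula, which predicts the reliability of a composite of k measures from the reliability r of a single measure: r_k = k r / (1 + (k - 1) r). The sketch below is an illustration, not part of the source slides, and it assumes the measures are parallel (equally reliable and measuring the same construct), which ratings from different raters or settings often are not.

```python
def spearman_brown(r, k):
    """Predicted reliability of the average of k parallel measures.

    r: reliability of a single measure, between 0 and 1
    k: number of measures aggregated (raters, occasions, or instruments)
    """
    if not 0 <= r <= 1:
        raise ValueError("r must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r / (1 + (k - 1) * r)
```

Under these assumptions, a single rating with reliability .50 aggregated over three raters gives `spearman_brown(0.5, 3)`, which is 0.75.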
Strengths
The rating scale is a derivative of the unstructured interview, an evolution of the interview in the direction of increasing structure.
The interview has more variability across interviewers; it does not cover all areas; problems may be missed; clients are not always willing or articulate; reporting may be inaccurate; reliability and validity may be poor.
A rating scale can identify strengths and weaknesses.
Validate the referent's concern.
Evaluate the severity and range of the concern.
Assess atypical patterns.
Part of a multi-source, multi-method evaluation.
Strengths (cont.)
Several assumptions allow for the comparison of raters' responses:
1) Informants can describe or rate the child.
2) Items have the same or similar meaning for all respondents.
3) Respondents report their thoughts, feelings, and behaviors openly and honestly.
4) Measures have adequate reliability and validity.
Strengths (cont.)
Rating scales can tap behaviors you may not be able to quantify with other tests.
Convenience: time- and cost-efficient for the assessor; multiple viewpoints.
Comprehensive scales can ensure that the range of problem areas is covered, unlike interviews, which may delve into one problem but miss others.
Structured response format and operationalized behaviors can reduce subjectivity.
Increase the ecological validity of the assessment by drawing on the child's normal environment.
Strengths (cont.)
Teacher ratings have high predictive power: the teacher has formal training, a structured setting, and a basis for comparison with other children.
Biases evidenced between settings or individuals can be used in assessment and intervention: they help identify the real problem (child or referent), differences in parenting style, and the influence of setting.
Some rating scales ask the informant to identify the most problematic or concerning behavior.
The child may not be able to interact with or respond to the assessment, e.g., infants or severely impaired children.
Strengths (cont.)
Use of a caretaker as informant is a strength in that parents have observed the child since birth, are motivated, and are part of the natural environment.
More objective and reliable than projective techniques and interviews; can be less biased than self-report.
Can provide information on strengths as well as concerns.
Selection Considerations
Technical considerations: norms, validity, reliability, constructs sampled, test construction.
Informant, situation, time, client.
Scope of instrument: narrow and/or broad category of behaviors; choose for what you need and want; strengths (competencies) and weaknesses.
Purpose or use: screening, diagnosis, placement, intervention, program evaluation.
Clinical utility: ease of administration; useful clinical information; sensitive to effects of intervention.
Specific Scales
BASC; CBCL; Conners; others
BASC (cont.)
Scores
Teacher and Parent forms have a 4-point response (never, sometimes, often, almost always).
Self-Report has true/false responses.
T scores and percentile ranks.
Scored by hand on carbonless forms or by computer.
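T scores place a raw score on a scale with a mean of 50 and a standard deviation of 10 in the norm group. The sketch below shows that linear conversion and an approximate percentile rank. The norm mean and standard deviation in the usage note are made-up numbers, and the percentile step assumes normally distributed scores; published scales report T scores and percentile ranks from their own norm tables, which this sketch does not reproduce.

```python
from statistics import NormalDist

def t_score(raw, norm_mean, norm_sd):
    """Linear T score: mean 50, SD 10 relative to the norm group."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50 + 10 * (raw - norm_mean) / norm_sd

def percentile_from_t(t):
    """Approximate percentile rank for a T score, assuming a normal distribution."""
    return 100 * NormalDist(mu=50, sigma=10).cdf(t)
```

With a hypothetical norm mean of 20 and standard deviation of 5, a raw score of 30 gives `t_score(30, 20, 5)`, which is 70.0, two standard deviations above the norm-group mean.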
BASC (cont.)
Standardization
Parent scale: 2,084 children ages 6-11 and 1,090 adolescents ages 12-18.
Teacher scale: 1,259 children ages 6-11 and 809 adolescents ages 12-18.
Self-Report: 5,413 children ages 8-11 and 4,448 adolescents ages 12-18.
Collected 1988-1991, matching 1986 U.S. Census.
Separate norms for males, females, and clinical samples.
About 70% of clinical samples were males with a diagnosis of conduct or behavior disorder.
BASC (cont.)
Reliability
Internal consistency reliabilities for the 3 scales in the school-age sample range from .62 to .95 for TRS, .58 to .94 for PRS, and .61 to .89 for Self-Report.
Interrater reliabilities: PRS are generally low, .35 to .73; TRS from .29 to .70 for preschool and .44 to .93 for school-age; none available for adolescents.
Test-retest: PRS for a 2- to 8-week interval range from .41 to .94; TRS .59 to .95; Self-Report .57 to .81 for children and .67 to .81 for adolescents.
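Coefficient (Cronbach's) alpha is one common index of internal consistency; the slides do not say which coefficient the figures above are based on, so the sketch below is only an illustration of how such a value can be computed. The function name and the data layout (one row per respondent, one column per item) are assumptions.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of respondents' item-score lists.

    item_scores: list of rows, one per respondent, each with k item scores.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    if len(item_scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every respondent needs the same number of items")
    # Sample variance of each item (column) and of the total score (row sums).
    item_vars = [variance(col) for col in zip(*item_scores)]
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Two items that rank respondents identically, as in `cronbach_alpha([[1, 1], [2, 2], [3, 3]])`, give an alpha of 1.0; less consistent items give lower values.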
BASC (cont.)
Validity
Construct validity for the internalizing and externalizing dimensions of the BASC scales is supported by factor and structural equation analyses.
Criterion-related validity is satisfactory for the 3 scales, as shown by acceptable correlations with other similar measures.
BASC (cont.)
Integrative approach across multiple informants.
Strength is in assessment of children ages 6 to 11 years, particularly in externalizing behaviors.
Separation of Attention & Hyperactivity; Depression & Anxiety.
Limited psychopathology and personality domains.
Comparison across child and adult forms is difficult.
Readability of Self-Report may be too high.
Child Behavior Checklist (CBCL), Teacher's Report Form (TRF) & Youth Self-Report (YSR)
Parent Rating: Preschool 2-3 yrs (99 items); School-age 4-18 yrs (120 items).
Teacher Rating Form: Caregiver/Teacher 2-5 yrs (99 items); School-age 6-18 yrs (120 items).
Youth Self-Report: Ages 11-18 yrs (119 items); requires 5th-grade reading level; about 30 minutes to complete.
Parent and Teacher forms take 10-15 minutes to complete.
CBCL, TRF & YSR (cont.)
Scores
3-point response (not true; somewhat true or sometimes true; often true).
T scores and percentile ranks.
Scored by templates, scannable answer sheets, or computer.
CBCL, TRF & YSR (cont.)
Does not provide validity scales.
Supports cross-informant assessment.
Low levels of reliability, suggesting caution in interpretation and application.
Broad-based screening measure rather than a precise measure of disorder.
Conners
Improvement over previous scales.
Standardization samples are small for any age group or gender.
Adequate to good reliability and adequate validity, with informant versions strong in evaluating externalizing problems.
Self-report is useful for measuring general distress.
Others
Devereux Scales of Mental Disorders: good reliability but limited validity; limited in its evaluation of psychopathology; some items include content that is difficult for parents and teachers to evaluate; not clearly aligned to DSM-IV, although this was an objective.
Scales specific to ADHD or other diagnoses: many have limited sample size and limited utility.
References
Knoff, H. M. (2002). Best practices in personality assessment. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology IV (Vol. 2). Bethesda, MD: National Association of School Psychologists.
Martin, R., Hooper, S., & Snow, J. (1986). Behavior rating scale approaches to personality assessment in children and adolescents. In H. Knoff (Ed.), The assessment of child and adolescent personality. New York: Guilford Press.
Sattler, J. M. (2002). Assessment of children: Behavioral and clinical applications (4th ed.). San Diego: Jerome M. Sattler, Publisher, Inc.