Content
Develop a pool of items that fully measures the construct
Example: depression. What items should be included in the pool?
Format
Dichotomous (true/false)
Polychotomous (multiple choice)
Likert scales (degree of agreement)
Many others
Data Collection
Clinicians are concerned with the accuracy of their collected data. The degree of accuracy is reflected in whether test results are consistent and in the degree to which they measure the correct construct.
Reliability
Consistency of the observations or measurements. Reliability is inversely related to the degree of error in the instrument:
High measurement error translates to low reliability
Low measurement error translates to high reliability
What !?
What does this mean!? High measurement error translates to low reliability.
Easy example: a broken scale.
There will be high measurement error on a broken scale, correct?
How consistent are the weights likely to be on a broken scale?
Is a broken or a working scale going to have more error?
Is the broken or the working scale going to be more reliable?
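The broken-scale thought experiment can be sketched numerically. In this minimal simulation (all numbers invented for illustration), a "working" scale adds a little random error to the true weight and a "broken" scale adds a lot; the spread of repeated readings shows which one is more consistent, i.e. more reliable.

```python
import random

random.seed(0)

TRUE_WEIGHT = 70.0  # hypothetical true weight in kg

def weigh(n, error_sd):
    """Simulate n repeated weighings with a given measurement-error spread."""
    return [TRUE_WEIGHT + random.gauss(0, error_sd) for _ in range(n)]

def spread(readings):
    """Sample standard deviation: a simple index of (in)consistency."""
    mean = sum(readings) / len(readings)
    var = sum((x - mean) ** 2 for x in readings) / (len(readings) - 1)
    return var ** 0.5

working = weigh(50, error_sd=0.2)  # working scale: small measurement error
broken = weigh(50, error_sd=5.0)   # broken scale: large measurement error

print(f"working-scale spread: {spread(working):.2f} kg")
print(f"broken-scale spread:  {spread(broken):.2f} kg")
```

The broken scale's readings vary far more from weighing to weighing, which is exactly what "high measurement error, low reliability" means.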
Types of Reliability
Inter-rater reliability (relevant to observational systems and psychological assessments requiring ratings or judgment)
Test-retest reliability
Internal consistency (split-half)
Note: not every form of reliability is equally important for every assessment method
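Internal consistency via the split-half method can be illustrated with simulated data. This sketch (respondent data invented; a shared latent trait drives all the items so they should "hang together") splits a 10-item dichotomous test into odd and even halves, correlates the half-scores, and applies the standard Spearman-Brown correction to estimate full-test reliability.

```python
import random

random.seed(1)

def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def simulate_responses(n_people=100, n_items=10):
    """Simulate dichotomous (true/false) responses driven by one latent trait."""
    data = []
    for _ in range(n_people):
        trait = random.gauss(0, 1)
        data.append([1 if random.gauss(trait, 1) > 0 else 0
                     for _ in range(n_items)])
    return data

responses = simulate_responses()

# Split-half: score the odd-numbered and even-numbered items separately,
# then correlate the two half-scores across respondents.
odd = [sum(row[0::2]) for row in responses]
even = [sum(row[1::2]) for row in responses]
r_half = pearson(odd, even)

# Spearman-Brown correction: estimates the reliability of the full-length test
# from the correlation between its two halves.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")
```

The corrected value is always higher than the raw half-test correlation, because a longer test is more reliable than either of its halves.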
Inter-rater Reliability
Degree of correspondence between two raters
Inter-rater reliability of diagnoses based on DSM criteria improved with DSM-III and the development of operational criteria for most of the mental disorders
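Inter-rater agreement on categorical judgments such as DSM diagnoses is commonly quantified with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch (the two clinicians, the patients, and their diagnoses are all hypothetical):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given each rater's own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n ** 2
    return (observed - expected) / (1 - expected)

# Two hypothetical clinicians diagnosing the same 10 patients.
a = ["MDD", "MDD", "GAD", "GAD", "MDD", "PTSD", "GAD", "MDD", "PTSD", "GAD"]
b = ["MDD", "MDD", "GAD", "MDD", "MDD", "PTSD", "GAD", "MDD", "PTSD", "PTSD"]
print(f"Cohen's kappa = {cohens_kappa(a, b):.2f}")
```

Here the raters agree on 8 of 10 cases (raw agreement 0.80), but kappa is lower (about 0.70) because some of that agreement would occur by chance alone.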
Test-Retest Reliability
The consistency of results over time: the consistency of the results for the same test administered at two different points in time.
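Test-retest reliability is usually indexed by the correlation between the two administrations. A minimal sketch with invented scores (a hypothetical anxiety scale given to 8 clients twice, two weeks apart):

```python
def pearson(x, y):
    """Pearson correlation: the usual test-retest reliability index."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical anxiety-scale scores for 8 clients at time 1 and time 2.
time1 = [12, 18, 9, 22, 15, 30, 7, 25]
time2 = [13, 17, 10, 21, 16, 28, 9, 24]
print(f"test-retest r = {pearson(time1, time2):.2f}")
```

Because each client's two scores are close and the rank order is preserved, the correlation is very high, indicating good test-retest reliability.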
Validity
A test can be reliable (consistently give the same results) but not valid. Why? If the test does not measure the correct construct, then it is not useful even if its results are consistent.
Validity
The degree to which a test measures what it is designed or intended to measure.
Types of Validity
Face validity
Content validity
Criterion validity (predictive and concurrent)
Discriminant validity
Construct validity
Face Validity
A judgment about the relevance of test items. A type of validity judged from the perspective of the test taker rather than the test user.
Example: personality tests. An introversion-extroversion test will be perceived as a highly (face) valid measure of personality functioning; an inkblot test may not be perceived as a (face) valid measure of personality functioning.
Content Validity
The degree to which the measure covers the full range of the (personality) construct, and the degree to which it excludes factors that are not representative of the construct.
Criterion Validity
The degree to which the test results (from your measure) are correlated with a related criterion measure. WHAT!? For example: the degree to which scores on an intelligence test are correlated with school performance or achievement.
For example:
Concurrent: the correlation of SAT score with G.P.A. at the time of taking the SAT in high school.
Predictive: the correlation of SAT score taken in high school with final G.P.A. upon graduating from college.
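The SAT example can be sketched as two correlations computed on the same test scores, differing only in when the criterion is measured. All numbers below are invented for illustration:

```python
def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data for 6 students (all numbers invented).
sat = [1050, 1200, 980, 1350, 1100, 1480]
hs_gpa = [3.0, 3.4, 2.8, 3.8, 3.1, 3.9]       # G.P.A. at the time of the SAT
college_gpa = [2.9, 3.2, 2.6, 3.6, 3.3, 3.7]  # final G.P.A. years later

# Concurrent validity: criterion measured at the same time as the test.
print(f"concurrent validity (SAT vs HS GPA):      {pearson(sat, hs_gpa):.2f}")
# Predictive validity: criterion measured later.
print(f"predictive validity (SAT vs college GPA): {pearson(sat, college_gpa):.2f}")
```

The computation is identical in both cases; the concurrent/predictive distinction lies entirely in the timing of the criterion.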
Discriminant Validity
The degree to which the score on a measure of a personality trait does not correlate with scores on measures of traits that are unrelated to the trait under investigation.
For example (from text):
Trait being measured: phobia
Unrelated trait: intelligence
You would not expect the score on your phobia scale to be correlated with the score on an intelligence test.
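The phobia/intelligence example can be sketched by simulating the two traits independently, mirroring the textbook claim that they are unrelated, and confirming that their correlation is near zero (all data simulated):

```python
import random

random.seed(42)

def pearson(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Simulate 300 people: phobia severity and IQ drawn independently,
# so by construction the traits are unrelated.
phobia = [random.gauss(50, 10) for _ in range(300)]
iq = [random.gauss(100, 15) for _ in range(300)]

print(f"phobia x IQ correlation = {pearson(phobia, iq):.2f}")
```

A near-zero correlation here is evidence of discriminant validity: the phobia scale is not accidentally measuring intelligence.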
Construct Validity
The degree to which the measure reflects the structure and features of the hypothetical construct being measured. Assessed by combining all of the above aspects of validity.
Exercise: Reliability and Validity applied to the Edinburgh Postnatal Depression Scale (EPDS)
Let's consider reliability and validity in the context of a real measure: the EPDS.
Handout Questions