
Please use this handout for the following:
o Introduction to Key Contributors to Intelligence Theory and Testing: Spearman, Binet, Terman, Wechsler, Cattell, Horn, Gardner, & Sternberg
o History of Intelligence Testing
o An overview of Wechsler's Intelligence Scales
o Wechsler's concept of intelligence & Wechsler's key contributions
o Key Concepts in Intelligence Testing (e.g., mental age, deviation IQ)
o WISC-IV
o Standardisation (including important ideas such as item difficulty, item discriminability)
o Criticism of the Wechsler Scales

WISC-IV and Related Ideas!
Dr. Anuradha J. Bakshi

Introduction to David Wechsler (1896-1981) (American psychologist)

Education
o Columbia University, M.A. (1917)
o Columbia University, Ph.D. in experimental psychology (1925)

Career
o Army Psychologist assigned to Camp Logan, Texas (1917)
o Sent by the Army to the University of London to work with Spearman and Pearson (1918)
o Clinical Psychologist at the Bureau of Child Guidance, New York City (1922-1925)
o Clinical psychology private practice (1925-1932)
o Chief Psychologist, Bellevue Psychiatric Hospital (1932-1967)

Testing Career
Published several assessment instruments, including:
o Wechsler-Bellevue Scale of Intelligence (1939)
o Wechsler Memory Scale (WMS) (1945/1997)
o Wechsler Intelligence Scale for Children (WISC) (1949/2003)
o Wechsler Adult Intelligence Scale (WAIS) (1955/1997)
o Wechsler Preschool and Primary Scale of Intelligence (WPPSI) (1967/2002)

History of Intelligence Testing

Charles Edward Spearman (1863-1945) (English psychologist)

Spearman (1904) emphasized a single, underlying construct of intelligence as largely responsible for an individual's performance on all mental tasks. He identified this construct as the g-factor. All specific factors (e.g., numerical reasoning, vocabulary, mechanical skill) were positively correlated and explained by g, reflecting a common pool of mental energy. Early intelligence tests emphasized the classification of individuals based on their overall level of cognitive functioning.

Alfred Binet (1857-1911) (French psychologist)

In 1905, Binet and Simon published an intelligence scale in response to a French government commission that aimed to develop methods for identifying children who would not benefit from regular education, that is, children who needed to learn how to learn. Binet developed tasks to measure judgment, attention, and reasoning; he believed these were expressions of intelligence. He decided to use these tasks to measure general mental ability.

The Binet-Simon test consists of a variety of items intended to reflect knowledge and skills the average French school child of a given age would have (http://users.ipfw.edu/abbott/120/IntelligenceTests.html). These items are graded in difficulty according to age, so that, for example, items the average twelve-year-old would be able to answer, a younger child would tend to miss (http://users.ipfw.edu/abbott/120/IntelligenceTests.html).

The scoring of the test produces a number called the child's mental age. The mental age reflects the level at which the child performed on the test: if the child performed at the level of the average 10-year-old, for example, then the child would be assigned a mental age of 10, regardless of the child's chronological age (physical age) (http://users.ipfw.edu/abbott/120/IntelligenceTests.html).

One compares the child's mental age to his or her chronological age:
o If the mental age is the same as the chronological age, then the child is average.
o If the mental age is higher than the chronological age, then the child is mentally "advanced" or gifted.
o If the mental age is lower than the chronological age, then the child is mentally "retarded," or behind his or her peers in intellectual development.
(http://users.ipfw.edu/abbott/120/IntelligenceTests.html)

The test is administered individually, one-on-one, by a person trained to do so, and requires upwards of two hours to complete.

Examples of items (1905 version): Item 14 required a person to define familiar objects such as a fork. Item 30 (the most difficult item) required a person to define and distinguish between paired abstract terms (e.g., sad and bored).

Examples of items (1908 version):
Age level 3
o Point to various parts of the face.
o Repeat two digits forward.
Age level 12
o Repeat seven digits forward.
o Provide the meaning of pictures.

The Binet-Simon test and its successors measure intelligence by assessing intellectual skills and knowledge. It is assumed that the individual has had the opportunity to learn these skills and knowledge; if the person had the opportunity to learn them and did not, then this is assumed to reflect a deficit in intelligence. On the other hand, if the person has not had the exposure needed to learn these things, the failure to demonstrate knowledge of them says nothing about the person's intelligence. Ignoring this truth has led to some unwarranted conclusions being drawn based on test results (http://users.ipfw.edu/abbott/120/IntelligenceTests.html).

This first intelligence test, referred to today as the Binet-Simon scale, became the basis for the intelligence tests still in use today (http://psychology.about.com/od/psychologicaltesting/a/int-history.htm). However, Binet himself did not believe that his psychometric instruments could be used to measure a single, permanent, and inborn level of intelligence (Kamin, 1995). Binet stressed the limitations of the test, suggesting that intelligence is far too broad a concept to quantify with a single number. Instead, he insisted that intelligence is influenced by a number of factors, changes over time, and can only be compared among children with similar backgrounds (Siegler, 1992).

Lewis Terman (1877-1956) (American psychologist)

Lewis Terman and his colleagues at Stanford (Terman, 1916) introduced a well-standardized revision and extension of the Binet-Simon scale to the USA. This adapted test, first published in 1916, was called the Stanford-Binet Intelligence Scale and soon became the standard intelligence test used in the USA. The Stanford-Binet intelligence test used a single number, known as the intelligence quotient (or IQ), to represent an individual's score on the test.

This score was calculated by dividing the test taker's mental age by their chronological age, and then multiplying this number by 100. For example, a child with a mental age of 12 and a chronological age of 10 would have an IQ of 120 (12 / 10 × 100 = 120). The primary focus of intelligence testing continued to be the identification of intellectual deficiency.
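The ratio IQ calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `ratio_iq` is an assumption made for this example and does not come from any test manual.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    # Multiply first so that whole-number ages give exact results.
    return 100 * mental_age / chronological_age


# The handout's example: mental age 12, chronological age 10.
print(ratio_iq(12, 10))  # 120.0
# A child performing exactly at age level scores 100.
print(ratio_iq(10, 10))  # 100.0
```

Because the result is a ratio of ages, the same two-year lead means something different at different ages (12/10 gives 120, but 17/15 gives about 113), which is one of the problems with mental age computations that the deviation IQ was later designed to avoid.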

Army Alpha and Beta

The entry of the USA into World War I created a need to screen recruits. The Army Alpha, which included a large verbal component, was created in 1917. The Army Alpha was a group-administered test that measured verbal ability, numerical ability, ability to follow directions, and knowledge of information (http://official-asvab.com/history_coun.htm). The limited literacy of some recruits created the need for a nonverbal measure of intelligence, leading to the development of the Army Beta in 1918. The Army Beta was a nonverbal counterpart to the Army Alpha. It was used to evaluate the aptitude of illiterate, unschooled, or non-English-speaking draftees and volunteers (http://official-asvab.com/history_coun.htm).

Sample Items: Army Alpha
http://official-asvab.com/armysamples_coun.htm

1. A company advanced 6 miles and retreated 2 miles. How far was it then from its first position?
2. A dealer bought some mules for $1,200. He sold them for $1,500, making $50 on each mule. How many mules were there?
3. Thermometers are useful because
   1. They regulate temperature
   2. They tell us how warm it is
   3. They contain mercury
4. A machine gun is more deadly than a rifle, because it
   1. Was invented more recently
   2. Fires more rapidly
   3. Can be used with less training

For these next two items, examinees first had to unscramble the words to form a sentence, and then indicate if the sentence was true or false.
5. happy is man sick always a
6. day it snow does every not

The next two items required examinees to determine the next two numbers in each sequence.
7. 3 4 5 6 7 8
8. 18 14 17 13 16 12

A portion of the Army Alpha required examinees to solve analogies.
9. shoe : foot. hat : kitten, head, knife, penny
10. eye : head. window : key, floor, room, door

In these next two examples, examinees were required to complete the sentence by selecting one of the four possible answers.
11. The apple grows on a: shrub, vine, bush, tree
12. Denim is a: dance, food, fabric, drink

Other portions of the test required examinees to follow instructions in performing paper-and-pencil tasks.

Sample Items: Army Beta
Examinees were asked to identify what was missing from each picture.
[Figure: Army Beta picture-completion items (images not reproduced)]

Over the course of World War I, some 1.5 million recruits were given these tests to identify those who were capable of serving, to classify them into military jobs, and to select those who appeared to be candidates for leadership positions (http://official-asvab.com/history_coun.htm). Lewis Terman, Robert Yerkes, and others developed the Army Alpha and Beta tests. It appeared that the average recruit had a mental age of around 13, a mild level of retardation (http://users.ipfw.edu/abbott/120/IntelligenceTests.html). The reason for this had to do mainly with the level of education of the recruits rather than low native intelligence, but Yerkes and others concluded incorrectly that the intelligence deficit was real, sounding alarm bells about the "menace of the feeble-minded" (http://users.ipfw.edu/abbott/120/IntelligenceTests.html).

Wechsler Scales

Addressing the need for verbal and nonverbal measures of intelligence, Wechsler's original intelligence test, the Wechsler-Bellevue Intelligence Scale (Wechsler-Bellevue; Wechsler, 1939), yielded scores for both verbal and performance scales in addition to an overall composite score.

Advancements in the 1950s: Raymond B. Cattell (1905-1998) & John L. Horn (1929-2006)

As the special education system began to expand in the 1950s, so did the need to identify and diagnose the nature of learning disabilities in children. Concurrent advances in factor-analytic techniques were applied to measures of mental abilities to further clarify the nature of intelligence. Intelligence testing began to focus on measuring more discrete aspects of an individual's cognitive functioning.

Cattell, a student of Spearman, introduced a theory that intelligence was composed of two general factors, fluid intelligence (Gf) and crystallized intelligence (Gc; Cattell, 1941, 1957).
o Fluid intelligence: Abilities that allow us to reason, think, and acquire new knowledge. Or: Abilities that allow us to learn and acquire new information.
o Crystallized intelligence: The knowledge and understanding that we have acquired. The actual learning that has occurred.

Horn later expanded Cattell's original Gf-Gc theory to include the following factors:
o Visual perception
o Short-term memory
o Long-term storage and retrieval
o Speed of processing
o Auditory processing ability
o Quantitative ability
o Reading and writing ability

More Recent Work

Much of the debate over intellectual assessment during the last 60 years has focused on whether there is an underlying, global aspect of intelligence that influences an individual's performance across cognitive domains. One set of psychologists (e.g., David Wechsler) advocates a hierarchical structure of intelligence, with specific abilities under a global, higher-order intelligence. Another set of psychologists denies the existence of an underlying, global intelligence. This set, led by scholars such as Howard Gardner and Robert Sternberg, advocates multiple intelligences.

Wechsler's Concept of Intelligence

Wechsler (1944) defined intelligence as "the capacity of the individual to act purposefully, to think rationally, and to deal effectively with his environment" (p. 3). He avoided defining intelligence in purely cognitive terms because he believed that cognitive factors comprised only a portion of intelligence. He believed that noncognitive aspects also contributed to intelligent behavior:
o planning and goal awareness
o enthusiasm
o field dependence and independence
o impulsiveness
o anxiety
o persistence

Such attributes are not directly tapped by standardized measures of intellectual ability, yet they influence a child's performance on these measures and his or her effectiveness in daily living and in meeting the world and its challenges (Wechsler, 1975).

Wechsler (1975) noted: "What we measure with tests is not what tests measure - not information, not spatial perception, not reasoning ability. These are only a means to an end. What intelligence tests measure is something much more important: the capacity of an individual to understand the world about him and his resourcefulness to cope with its challenges" (p. 4).

Wechsler's Position on Intelligence (the following are his conclusions)

Despite continuing debate over the existence of a single, underlying construct of intelligence, the results of factor-analytic research converge in the identification of 8 to 10 broad domains of intelligence. However, the trend toward an emphasis on multiple, more narrowly defined cognitive abilities has not resulted in rejection of an underlying, global aspect of general intelligence. Currently, intelligence is widely viewed as having a hierarchical structure, with more specific abilities comprising several broad cognitive domains.

Contributions of Wechsler

Just two years after the monumental revision of the Stanford-Binet Scale, the Wechsler-Bellevue Intelligence Scale made a phenomenal mark in the field of intelligence testing, challenging the supremacy of the Binet-inspired scale. With so many different and varied abilities associated with intelligence, Wechsler objected to the single score offered by the 1937 Stanford-Binet Scale. Although Wechsler's test did not directly measure nonintellective factors, it took these factors into careful account in its underlying theory. Wechsler concluded that the items on the Stanford-Binet scale lacked validity for use with adults.

He correctly noted that the Stanford-Binet Scale's emphasis on speed, with timed tasks scattered throughout the scale, unduly handicapped older adults (Kaplan & Saccuzzo, 2005, p. 254). The mental age norms did not apply to adults (Kaplan & Saccuzzo, 2005, p. 254).

Point scale concept: On the earlier Stanford-Binet Scale, if a person was required to correctly respond to three out of four tasks in order to receive credit, then correctly responding to only two tasks produced no credit. In a point scale, a person is credited for each item to which (s)he correctly responds.

Scoring content areas: By arranging items according to content and assigning specific points to each item, Wechsler devised a test that permitted not only a total overall score but also scores for each content area.

The performance scale concept: The early Stanford-Binet Scale had been criticized for its dependence on language and verbal skills. Wechsler included an entire performance scale (items requiring a person to copy symbols or point to a missing detail, rather than only orally answer questions).

The Deviation IQ: The Deviation IQ replaces the need to compute a mental age and therefore circumvents all the problems associated with mental age computations. (In other words, it replaces a ratio IQ.) A deviation IQ allows for comparison across age levels. Deviation IQ scores have a mean of 100 and a standard deviation of 15. Thus, a child or an adult who has a deviation IQ of 130 has an intelligence level which is 2 standard deviations above the mean of people of the same age in the standardisation sample.
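The idea behind the deviation IQ can be sketched as a rescaled z-score. Note that the published Wechsler scales derive composite scores from norm tables in the test manual, not from a formula like the one below; the linear conversion, the function name `deviation_iq`, and the example norm-group values are all assumptions made purely for illustration.

```python
def deviation_iq(score: float, group_mean: float, group_sd: float,
                 iq_mean: float = 100.0, iq_sd: float = 15.0) -> float:
    """Express a score relative to its same-age norm group on a scale
    with mean 100 and standard deviation 15 (illustrative sketch only)."""
    if group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (score - group_mean) / group_sd  # distance from the mean in SD units
    return iq_mean + iq_sd * z


# Hypothetical age group with a mean of 50 and an SD of 10 on some test:
print(deviation_iq(70, 50, 10))  # 130.0, i.e. 2 SDs above the mean
print(deviation_iq(50, 50, 10))  # 100.0, i.e. exactly average
print(deviation_iq(35, 50, 10))  # 77.5,  i.e. 1.5 SDs below the mean
```

Because each person is compared only with the norm group for their own age, a score of 130 carries the same relative meaning at age 8 as at age 40, which is what allows comparison across age levels.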

Statistics

The Deviation IQ: Percentiles and Percentile Ranks

In test theory, the percentile rank of a raw score is the percentage of examinees in the norm group who scored below the score of interest (Crocker & Algina, 1986).
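The definition above translates directly into a short calculation. The sketch below shows two things: the percentile rank of a score within a small norm group, and the approximate percentile rank of a deviation IQ. The second part assumes that deviation IQs follow a normal distribution with mean 100 and SD 15; that normality assumption, the function names, and the toy norm group are all illustrative and are not taken from the handout's sources.

```python
from statistics import NormalDist


def percentile_rank(score: float, norm_scores: list) -> float:
    """Percentage of examinees in the norm group who scored below `score`."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def iq_percentile(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Approximate percentile rank of a deviation IQ, ASSUMING normality."""
    return 100.0 * NormalDist(mean, sd).cdf(iq)


# Toy norm group of five scores: one of them lies below 85.
print(percentile_rank(85, [70, 85, 100, 115, 130]))  # 20.0

# Under the normality assumption, an IQ of 100 sits at the 50th percentile
# and an IQ of 130 (2 SDs above the mean) at roughly the 98th.
print(round(iq_percentile(100), 1))
print(round(iq_percentile(130), 1))
```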

The evolution of the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) began with the original Wechsler-Bellevue. Wechsler based this test on the premise that intelligence is a global entity because it characterizes the individual's behaviour as a whole, and it is also specific because it is composed of elements or abilities that are distinct from each other. Based on his clinical expertise, Wechsler selected and developed subtests that highlighted the cognitive aspects of intelligence he thought were important to measure:
o Verbal comprehension
o Abstract reasoning
o Perceptual organization
o Quantitative reasoning
o Memory
o Processing speed

Nature of Test
o Individual (not a group test)
o Clinical instrument
o Measures intelligence from ages 6 through 16 years, 11 months.
o 15 subtests
o Yields 4 major indices, and a full-scale IQ
o Items are arranged in order of difficulty
o Items are grouped by content
o Use of teaching items that are not scored
o Test instructions are such that the examiners have more latitude to encourage a child to perform as well as possible.

Testing Time
o 50% of the normative sample completed the test within 67 minutes, 90% within 94 minutes
o 50% of children identified as intellectually gifted completed the core subtests within 79 minutes, 90% within 104 minutes
o 50% of children diagnosed with mental retardation completed the core subtests within 50 minutes, 90% within 73 minutes

Test Indices and Test Structure

The WISC-IV yields five scores: a global intelligence score and four index scores.
o FSIQ: Full Scale IQ
o VCI: Verbal Comprehension Index
o PRI: Perceptual Reasoning Index
o WMI: Working Memory Index
o PSI: Processing Speed Index

There are 10 core subtests and five optional subtests, as follows:
o VCI has three core subtests and two optional subtests
  Core: Similarities, Vocabulary, Comprehension
  Optional: Information, Word Reasoning
o PRI has three core subtests and one optional subtest
  Core: Block Design, Picture Concepts, Matrix Reasoning
  Optional: Picture Completion
o WMI has two core subtests and one optional subtest
  Core: Digit Span, Letter-Number Sequencing
  Optional: Arithmetic
o PSI has two core subtests and one optional subtest
  Core: Coding, Symbol Search
  Optional: Cancellation

The VCI is composed of subtests measuring verbal abilities utilizing reasoning, comprehension, and conceptualization.
The PRI is composed of subtests measuring perceptual reasoning and organization.
The WMI is composed of subtests measuring attention, concentration, and working memory.
The PSI is composed of subtests measuring the speed of mental and graphomotor processing.

VCI Core Subtests

VCI: Similarities
The child is presented with two words that represent common objects or concepts and describes how they are similar.

VCI: Vocabulary
For Picture Items, the child names pictures that are displayed in the Stimulus Book. For Verbal Items, the child gives definitions for words that the examiner reads aloud.

VCI: Comprehension
The child answers questions based on his or her understanding of general principles and social situations.

VCI Optional Subtests

VCI: Information
The child answers questions that address a broad range of general knowledge topics.

VCI: Word Reasoning
The child identifies the common concept being described in a series of clues.

PRI Core Subtests

PRI: Block Design
While viewing a constructed model or a picture in the Stimulus Book, the child uses red-and-white blocks to recreate the design within a specified time limit.

PRI: Picture Concepts
The child is presented with two or three rows of pictures and chooses one picture from each row to form a group with a common characteristic.

PRI: Matrix Reasoning
The child looks at an incomplete matrix and selects the missing portion from five response options.

PRI Optional Subtest

PRI: Picture Completion
The child views a picture and then points to or names the important part missing within the specified time limit.

WMI Core Subtests

WMI: Digit Span
For Digit Span Forwards, the child repeats numbers in the same order as presented aloud by the examiner. For Digit Span Backwards, the child repeats numbers in the reverse order of that presented aloud by the examiner.

WMI: Letter-Number Sequencing
The child is read a sequence of numbers and letters and recalls the numbers in ascending order and the letters in alphabetical order.

WMI Optional Subtest

WMI: Arithmetic
The child mentally solves a series of orally presented arithmetic problems within a specified time limit.

PSI Core Subtests

PSI: Coding
The child copies symbols that are paired with simple geometric shapes or numbers. Using a key, the child draws each symbol in its corresponding shape or box within a specified time limit.

PSI: Symbol Search
The child scans a search group and indicates whether the target symbol(s) matches any of the symbols in the search group within a specified time limit.

PSI Optional Subtest

PSI: Cancellation
The child scans both a random and a structured arrangement of pictures and marks target pictures within a specified time limit.

Scoring and Interpretation
o Careful scoring of each item using a point scale, for each subtest.
o Totalling item scores across a subtest. This yields a raw score for each subtest.
o Converting raw scores to scaled scores (using the Appendices in the test manual).
o Summing up scaled scores to yield five Sums of Scaled Scores (VCI, PRI, WMI, PSI, FSIQ).
o Converting the sums of scaled scores to composite scores. These composite scores are Deviation IQ scores with a Mean of 100 and a Standard Deviation of 15.
o Composite Scores have percentile ranks for further easy interpretation.

Standardisation: Conceptual Development

Extensive literature reviews
o Intelligence theory
o Intellectual assessment
o Cognitive development
o Cognitive neuroscience

Marketing research

Expert guidance from professionals from diverse fields such as
o School psychology
o Clinical psychology
o Psychological assessment
o Psychometrics
o Clinical neuropsychology

An Advisory Panel composed of nationally recognized experts in school psychology and clinical neuropsychology worked with the research team throughout the project. Clinical Measurement Consultants from The Psychological Corporation also assisted the research team.

The Advisory Panel, Clinical Measurement Consultants, and Examiners guided each of the major stages of research:
o Pilot
o National tryout
o Standardization

In the preliminary stage, 45 assessment professionals from 8 major cities met in a focus group with members of a marketing research firm to assist in the formulation of the scale's working blueprint. A telephone survey (n=308) was conducted with users of the WISC-III as well as with professionals in child and adolescent development. The research team, advisory panel, and clinical measurement consultants reviewed the feedback from the focus groups and telephone survey. Based on their findings, the working blueprint was established and the first research version of the scale was developed for the initial pilot study.

Semi-structured surveys of other experts and examiners were conducted at all stages of test development. These surveys allowed experts and examiners to rate the research versions of the scale on the following qualities:
o Developmental appropriateness
o User-friendliness
o Clinical utility

Standardisation: Pilot Stage

The main goal of the pilot stage was to produce a version of the scale for use in the national tryout stage. Five pilot studies were conducted (n=255, 151, 110, 389, and 197). Each of these pilot studies used a research version of the scale which included retained subtests of the WISC-III and new, experimental subtests. Research questions focused on aspects such as:
o Content and relevance of items
o Adequacy of subtest floors and ceilings
o Clarity of instructions to the examiner and child
o Identification of response processes
o Administration procedures
o Scoring criteria
o Item bias

Standardisation: National Tryout Stage

The national tryout utilized a version of the scale with 16 subtests. Data were obtained from a stratified sample of 1,270 children, who reflected key demographic variables in the national population:
o Age
o Sex
o Race
o Parent educational level
o Geographic region

Research questions from the pilot phase were reexamined, and additional issues were addressed:
o Sequencing of items
o Item difficulty estimations
o Exploratory and confirmatory factor analyses

Data were collected from a number of special groups:
o Children identified as intellectually gifted
o Children with mental retardation or learning disorders
o Children with ADHD

An oversample of 252 African American children and 186 Hispanic children was collected to examine item bias.

Standardisation: Standardisation Stage

After reviewing the accumulated evidence from the pilot and national tryout studies, a standardization edition of the WISC-IV was created.
o All preceding research questions were reexamined
o Additional research questions:
  Derivation of norms
  Provision of evidence of validity, reliability, and clinical utility

The WISC-IV normative data were established using a sample collected from August 2001 to October 2002. The sample was stratified on key demographic variables (i.e., age, sex, race/ethnicity, parent education level, and geographic region) according to the March 2000 U.S. Census data.
o Data were obtained from a stratified sample of 2,200 children aged 6:0-16:11
o 200 children × 11 age groups (4 month span)
o 100 girls and 100 boys in each group
o Data were collected from samples of children from various special groups

Standardisation: Final Scale

Members of the research team, the advisory panel, and clinical measurement consultants again evaluated the psychometric results of the standardization studies along with reviews completed by experts and examiners. Based on the cumulative evidence from the entire research program, the final test framework was determined in order to assemble and evaluate the final, published version of the scale.

Standardisation: A word on Item Analysis

To guarantee that a sufficient number of quality items remained after item selection, WISC-IV research editions included more items than were necessary for the final subtests. Items were evaluated throughout the development process and retained, modified, or deleted. Thus, the subtest item sets had already been evaluated on several occasions prior to the standardization of the scale. The final item selection decisions were based on data from the standardization sample:
o Redundant items were dropped
o Items that were too difficult or too easy were dropped

More on Item Analysis (Janda, 1998)

Item difficulty

Because psychological tests are designed to measure individual differences, it is important that each item has a difficulty level greater than 0% and less than 100%. Often denoted as p, difficulty level is simply the proportion of people who answer a question in the correct, or keyed, direction. The ideal p value is halfway between the expected lowest and highest values. Generally, the optimal p value is .50, with an acceptable range of .30 to .70. For multiple-choice items with four options, the optimal p value is about .62. Why? The expected lowest score on a multiple-choice item is not 0: with four options, guessing alone yields p = .25, and the midpoint between .25 and 1.00 is .625. In contrast, for some purposes a p value of .25 may be optimal, for example when the aim is to make fine discriminations at higher levels of ability.
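The two calculations in this paragraph, the difficulty level p of an item and the midpoint rule for the optimal p, can be sketched as follows. The function names and the sample responses are assumptions made for this illustration; they do not come from Janda (1998) or from the WISC-IV manual.

```python
from typing import Optional


def item_difficulty(responses: list) -> float:
    """Difficulty level p: proportion answering in the keyed direction.
    `responses` holds 1 (keyed/correct) or 0 (non-keyed) per test-taker."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


def optimal_difficulty(n_options: Optional[int] = None) -> float:
    """Midpoint between the lowest expected p (chance level) and 1.0.
    With no answer options to guess among, the chance level is taken as 0."""
    chance = 0.0 if n_options is None else 1.0 / n_options
    return (chance + 1.0) / 2


# Hypothetical item answered correctly by 6 of 10 test-takers:
print(item_difficulty([1, 1, 0, 1, 0, 1, 1, 0, 1, 0]))  # 0.6

print(optimal_difficulty())   # 0.5   (open-ended item, no guessing)
print(optimal_difficulty(4))  # 0.625 (four-option multiple choice)
```

The four-option result, .625, is the value the handout gives as .62.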

Item discriminability

For a given item, who is likely to answer it correctly (in the keyed direction) and who is likely to answer it incorrectly (in the non-keyed direction)?

The index of discrimination D:
o First, identify two groups of people: low scorers and high scorers on the test as a whole, that is, the 27% of test-takers with the lowest total scores and the 27% with the highest.
o Let A be the proportion of the low-scoring group, and B the proportion of the high-scoring group, who answer the item in the keyed direction.
o D = B - A. D can range from -1 to 1.
o The higher the D, the better the item discriminability.

Item discriminability can also be ascertained through contrasting groups:
o MMPI authors administered a large pool of items to a normative group (people with no history of a psychological disorder) and various criterion groups (e.g., those diagnosed with schizophrenia).
o MMPI authors selected items based on their ability to distinguish a normative group from a particular pathological group.
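The index of discrimination can be sketched in code as follows. This is a minimal illustration under stated assumptions: the function name, the rounding used to turn 27% into a whole number of test-takers, the handling of ties (simple sort order), and the sample data are all choices made for this example and are not specified in the handout.

```python
def discrimination_index(total_scores: list, item_correct: list,
                         fraction: float = 0.27) -> float:
    """D = B - A, where A and B are the proportions of the lowest- and
    highest-scoring groups (by total test score) who answered the item
    in the keyed direction."""
    n = len(total_scores)
    if n == 0 or n != len(item_correct):
        raise ValueError("inputs must be non-empty and of equal length")
    k = max(1, round(n * fraction))  # group size (assumed rounding rule)
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    a = sum(item_correct[i] for i in low) / k   # proportion correct, low group
    b = sum(item_correct[i] for i in high) / k  # proportion correct, high group
    return b - a


# Ten hypothetical test-takers with total scores 1..10.
totals = list(range(1, 11))

# An item answered correctly only by the five highest scorers.
print(discrimination_index(totals, [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]))  # 1.0
# An item answered correctly only by the five lowest scorers.
print(discrimination_index(totals, [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))  # -1.0
# An item everyone answers correctly does not discriminate at all.
print(discrimination_index(totals, [1] * 10))  # 0.0
```

A negative D flags an item that low scorers answer correctly more often than high scorers, which usually signals a flawed or mis-keyed item.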

Uses and Application of the Wechsler Intelligence Scales

The scales have demonstrated their clinical utility for such purposes as:
o Identification of mental retardation
o Identification of learning disabilities
o Placement in specialized programs
o Clinical intervention
o Neuropsychological evaluation

Wechsler Adult Intelligence Scale (WAIS-III)
o The WAIS-III standardization sample consisted of 2,450 adults divided into 13 age groups from 16-17 to 85-89.
o Stratified sample based on 1995 US Census data.

Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III)
o Age range: 2 years 6 months to 6 years of age

Criticism of Wechsler's Intelligence Scales

The scales incorporate modern multidimensional theories of intelligence. However, they do not incorporate the concept of multiple intelligences, for example Howard Gardner's Multiple Intelligences:
o Linguistic intelligence
o Logical-mathematical
o Bodily-kinesthetic
o Spatial
o Musical
o Interpersonal
o Intrapersonal
o Naturalistic
o Philosophical/existential

Another example of contemporary intelligence theory not acknowledged in the Wechsler Scales is Robert Sternberg's Triarchic Model (analytical/academic intelligence, practical intelligence, and creative intelligence) or Sternberg's WICS model (Wisdom, Intelligence, Creativity, Synthesized). The Wechsler intelligence scales are an excellent measure of academic or analytical intelligence.
