
This article was downloaded by: [University of Otago] On: 31 December 2014, At: 11:16 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK


The Journal of Experimental Education


The Advanced Raven’s Progressive Matrices

Steven M. Paul, University of California, Berkeley. Published online: 16 Apr 2014.

To cite this article: Steven M. Paul (1986) The Advanced Raven’s Progressive Matrices, The Journal of Experimental Education, 54:2, 95-100, DOI: 10.1080/00220973.1986.10806404


Downloaded by [University of Otago] at 11:17 31 December 2014

The Advanced Raven's Progressive Matrices: Normative Data for an American University Population and an Examination of the Relationship with Spearman's g

STEVEN M. PAUL

University of California, Berkeley

ABSTRACT

Normative data for the Advanced Raven's Progressive Matrices are presented based on 300 University of California, Berkeley, students. Correlations with the Wechsler Adult Intelligence Scale and the Terman Concept Mastery Test are reported. The relationship between the Advanced Raven's Progressive Matrices and Spearman's g is explored.

THE RAVEN'S PROGRESSIVE MATRICES (RPM) are the best known and most widely used culture-reduced tests of mental ability. British geneticist Lionel Penrose and British psychologist J. C. Raven were the first to present perceptual analogy and inductive reasoning problems in the form of a matrix. In their matrices the perceptual analogies simultaneously involve both horizontal and vertical transformations. The variety of figures, relationships, and transformations is virtually limitless. Figures may increase or decrease in size; elements may be added or subtracted, shaded or unshaded, flipped, rotated, mirror imaged, or show many other progressive changes in pattern. In each case, the lower right corner of the total matrix is missing, and the subject must select the best one of the six or eight multiple-choice alternatives to fill the empty corner.

Raven described the Progressive Matrices as "a test of a person's present capacity to form comparisons, reason by analogy, and develop a logical method of thinking, regardless of previously acquired information" (Raven, 1938, p. 12). He was responsible for publishing the first Progressive Matrices Test and its subsequent improvements and extensions (Raven, 1938, 1947, 1960). There are three forms of the RPM now in use: Standard Progressive Matrices (SPM), Colored Progressive Matrices (CPM), and Advanced Progressive Matrices (APM). Considerable research has been conducted involving the SPM and CPM, but little information is available concerning the APM. Adequate standardization norms are lacking in the United States. Research is notably absent in relation to university students, the type of population the APM is best suited to measure.

The Standard Progressive Matrices consists of 60 items grouped in five sets (A, B, C, D, E) of 12 items each. Each set involves different principles of matrix transformation, and within each set the items become progressively more difficult. It was designed to cover the widest possible range of mental ability and to be equally useful with persons of all ages, whatever their education, nationality, or physical condition. The scale is intended to cover the entire range of intellectual development, starting with the time a child is able to grasp the idea of finding a missing piece to complete a pattern. It is sufficiently long to assess a person's maximum capacity to form comparisons and reason by analogy without being overly taxing or unwieldy. A person's total score provides an index of his intellectual ability. The scores obtained by adults tend to cluster in the upper half of the scale.

The Colored Progressive Matrices, Sets A, Ab, and B, were devised as a test for young children and old people, for anthropological studies, and for clinical work. It can be used with people who, for whatever reason, cannot understand or speak the English language, suffer from physical disabilities, are intellectually subnormal, or have deteriorated. To make the test independent of verbal instructions, the problems are printed on colored backgrounds, and the scale is arranged so that it can be presented in the form of illustrations printed in a book or as boards with movable pieces. Success in Set Ab depends on the comprehension of discrete figures as spatially related "wholes" and, in combination with Sets A and B, adequately covers the cognitive processes of which children under 11 years of age are usually capable.

The Advanced Progressive Matrices, Sets I and II, were constructed as a test of intellectual efficiency that can be used with people of more than average intellectual ability and that will differentiate clearly between individuals of even superior ability. The difficulty level of the APM is such as to make it unsuitable for persons scoring below a raw score of about 50 on the SPM. For the general adult population the APM has too small a range of scores to be useful. The APM is intended for intellectually superior youths and adults, university students, and others for whom the SPM is too easy.

The APM was originally created in 1943 for use at the War Office Selection Boards. In 1947 a revision was prepared for general use as a nonverbal test of the intellectual efficiency with which a person is able to form comparisons between figures and to develop a logical method of reasoning. Based on Foulds's experimental work with the 1947 edition (Foulds & Raven, 1950) and an item analysis carried out by Forbes (1964), the 1962 edition of the APM dropped from Set II the 12 problems that made no contribution to the score distributions for adults of more than average intellectual ability and arranged the remaining problems in order according to the frequency with which they were solved as the total score on the revised set increased from 0 to 36.

Raven arranged the 1962 edition so that it could be used without a time limit, in order to assess a person's total capacity for observation and clear thinking, or with a time limit, in order to assess the examinee's intellectual efficiency. It consists of two sets of tests. In Set I there are 12 problems designed to introduce a person to the method of working and cover all the intellectual processes needed for success in Set II. The 36 problems in Set II are identical in presentation and argument with those in Set I. They only increase in difficulty more steadily and become considerably more complex.

To assess a person's total capacity for observation and clear thinking, Raven suggests that the examinee be shown the problems of Set I as examples to explain the principle of the test. The subject can then be allowed to work through Set II at his own speed from beginning to end without interruption. To assess a person's intellectual efficiency, Set I can be given as a short practice test followed by Set II as a speed test. The most common time limit is 40 minutes.

Examination of the literature reveals a preference for the administration of the APM without a time limit. Yates, in particular, states that even the shorter 1962 edition has not overcome the problem of power and speed contamination when given a 40-minute time limit. In a study involving 960 freshman university students, he found that the number of persons not attempting to solve problems increases with the later items of the test. Consequently, the difficulty levels of the items cannot be determined (Yates, 1966). Unlimited working time practically eliminates unattempted items and enables a determination of the true difficulty of each item, unconfounded by differences in speed of working.

The present study was undertaken to provide normative information about the APM administered to an American university population. Comparisons to other mental ability tests are presented, and the relationship between the APM and Spearman's g is explored.

Method

Subjects

Three hundred students (190 female, 110 male) from the University of California, Berkeley, served as subjects. Their average age was 252 months (21 years) with a standard deviation of 32 months.

Procedure

Each subject was tested individually. The basic procedure of the matrices test was explained by the experimenter using examples (problems A1 and C5) from the SPM. Subjects were instructed to put some answer down for every question and were given a loose time limit of 1 hour. If the subject was not finished in an hour, an additional 10 to 15 minutes was given to complete the test. A subject's score was the total number of items answered correctly.

One hundred fifty of the subjects were also individually given the Terman Concept Mastery Test (CMT), a high-level test of verbal ability. A different set of 62 subjects out of the 300 were also individually administered the Wechsler Adult Intelligence Scale (WAIS).

Results

The mean total score for the sample of 300 students was 27.0 with a standard deviation of 5.14. The median total score was also 27.0. The mean total score of the normative group of 170 university students presented by Raven (1965) was only 21 (SD = 4). Gibson (1975) also reported APM scores significantly higher than the published university norms: the mean total score of 281 applicants to a psychology honors course at Hatfield Polytechnic in Great Britain was 24.28 (SD = 4.67).

Table 1 presents the absolute frequency, cumulative frequency percentile, t score, and normalized t score for the total APM score values based on the sample of 300 students. The 95th percentile corresponds to a total score between 34 and 35 for this sample. The 95th percentile value based on Raven's normative group with similar ages is between 23 and 24. The Berkeley sample scored much higher overall than the normative sample of Raven's 1962 edition of the APM.

The internal consistency reliability based on the Kuder-Richardson formula (KR-20) is .83. That is, approximately 83% of the variance in total test scores is attributable to true score variance, i.e., to what the APM is actually testing.

There is strong agreement between the rank order of the items, according to the frequency with which they are solved, presented by Raven and that determined for this sample (r = .94). However, there is one noteworthy exception. The item Raven ranked 13th turned out to be much more difficult for the Berkeley students than would have been expected. It ranked as only the 22nd most frequently solved item. The item involves changes in three variables: object shape (diamond, square, circle), number of internal lines (one, two, three), and slant of internal lines (45°, 90°, 135°). The majority of subjects who did not choose the correct response (#2) were attracted to a distractor (#5) that ignored the necessary change in the slant of the internal lines.
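The KR-20 coefficient reported for this sample can be illustrated with a minimal sketch. The function name `kr20`, the toy response matrix, and the use of the population (divide-by-N) variance for total scores are assumptions made for illustration; the article does not say which variance convention was used, and the toy data are not the study's data.

```python
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]]) -> float:
    """Kuder-Richardson formula 20 for items scored 0 (wrong) or 1 (right).

    responses[i][j] is the score of person i on item j.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Proportion of people passing each item (p); q = 1 - p.
    p = [sum(row[j] for row in responses) / n_people for j in range(n_items)]
    sum_pq = sum(pj * (1.0 - pj) for pj in p)

    # Variance of total scores (population form, an assumption of this sketch).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical toy data: 4 people, 3 items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(toy)
```

Applied to a 300 by 36 matrix of right/wrong scores, the same function would give the kind of coefficient reported above; the toy matrix only demonstrates the arithmetic.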
Information beyond what is provided by total score values alone can sometimes be found in an examination of the incorrect responses to the APM (Thissen, 1976). The selection of distractors (incorrect multiple-choice alternatives) for each of the problems of the APM was examined to determine whether patterns developed that would aid in the discrimination between subjects. Two subgroups of the total sample of 300 were formed. The low group comprised the bottom 24% of the sample, with total scores less than or equal to 23 (n = 72). The high group comprised the top 26%, with scores greater than or equal to 31 (n = 78). A comparison was made between the two groups to see if distractors chosen by the high group were different from or perhaps better (i.e., closer to the correct response) than the incorrect responses chosen by the low group. No differences between the two groups were found.

TABLE 1-Absolute Frequency, Cumulative Frequency Percentile, t Score, and Normalized t Score for Total APM Score Values (N = 300)
[Columns: total score, absolute frequency, cumulative frequency percentile, t score, normalized t score]

Unlike most studies of the Raven's Progressive Matrices, this study found a significant difference (α = .05) between the average total scores of males and females. In this sample the males (M = 28.40, SD = 4.85, n = 110) outscored the females (M = 26.23, SD = 5.11, n = 190). Four percent of the variance in APM total scores can be explained by the difference between the sexes. The sex differences occasionally reported in the literature are thought to be attributable to sampling errors. No true sex differences have been reliably demonstrated (Court & Kennedy, 1976).

One hundred fifty of the subjects who took the Raven's were also individually given the Terman Concept Mastery Test. There was a moderate positive relationship (r = .44) between the total scores on the two tests (APM: M = 27.24, SD = 5.14; CMT: M = 81.69, SD = 32.80).

Sixty-two of the subjects were also administered the WAIS. Full Scale IQ scores of the WAIS correlated .69 with the APM total scores. When this correlation is corrected for restriction of range, based on the population WAIS IQ SD of 15, by the method given by McNemar (1949, p. 127), it becomes .84 (APM: M = 28.23, SD = 5.08; WAIS: M = 122.84, SD = 9.30). In a similar study, McLaurin et al. (1973) reported a correlation of .55 (.74 corrected for restriction of range) between the APM and the WAIS based on 131 students at the University of Alabama in Birmingham.
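The range-restriction correction can be checked with a short sketch. It assumes that the method cited from McNemar is the standard correction for direct restriction on one variable, r_c = r(S/s) / sqrt(1 - r^2 + r^2 (S/s)^2), where s is the sample SD and S the population SD; the article does not print the formula, and the function name is hypothetical.

```python
import math


def correct_for_range_restriction(r: float, sd_restricted: float, sd_unrestricted: float) -> float:
    """Estimate the correlation in the unrestricted population.

    Assumes direct restriction on the variable whose SDs are supplied
    (here WAIS Full Scale IQ); this is an assumption of the sketch.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if sd_restricted <= 0 or sd_unrestricted <= 0:
        raise ValueError("standard deviations must be positive")
    ratio = sd_unrestricted / sd_restricted
    return r * ratio / math.sqrt(1.0 - r * r + r * r * ratio * ratio)


# Values taken from the text: r = .69, sample WAIS SD = 9.30, population SD = 15.
corrected = correct_for_range_restriction(0.69, 9.30, 15.0)
print(round(corrected, 2))  # 0.84
```

Under that assumption the sketch gives the corrected value of .84 reported for this sample. The McLaurin et al. figure cannot be checked the same way because their sample SD is not given here.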

These results indicate that the APM, CMT, and the WAIS are tapping some of the same general ability. The possible nature of that ability is examined in the follow- ing section.


Spearman's g

One of the most solidly established phenomena in psychology is that scores on all mental ability tests, no matter how diverse the mental skills or areas they cover, are positively intercorrelated when they are obtained in a representative sample of the general population. It was Spearman who first hypothesized that there is some "general factor" of mental ability that is measured in common by all of the intercorrelated mental tests. He gave the label "g" to this general factor.

Spearman developed the mathematical method known as factor analysis, which enabled him to extract the g from all the intercorrelations among a collection of diverse tests and show the correlation between each test and the hypothetical general ability factor. The correlation of a particular test with the g factor common to all tests in the analysis is called the test's g loading. The square of a test's g loading indicates the proportion of the total variance in the scores on the test that is due to individual differences in this general ability.

It is important to note that the g factor may not show up on some tests given to highly selected groups, such as the often tapped pool of university students, although these tests show moderate g loadings when given to the general population. The explanation is that these groups have already been highly selected on g-loaded tests, such as college entrance exams, and therefore their scores indicate less individual variation on the g factor. This limits the intercorrelations among the various tests and thereby prevents the g factor from showing up strongly in a factor analysis of the matrix of intercorrelations.

Spearman originally hypothesized that each test measures only g plus some specific ability, s, which is tapped only by the particular test. This theory that any given test score is composed of only g + s, as well as measurement error, was soon refuted by the finding that there are other common factors besides g in many mental ability tests. However, they cannot be considered general factors because they do not enter into all tests, as does g, but enter only into certain groups of tests. In a factor analysis of a large number of various mental tests, the first unrotated factor (or principal component) is g or general mental ability. It usually accounts for almost half of the total variance in a large battery of diverse tests. The several other smaller factors, the group factors, show highly differential loadings on tests that are often characterized as verbal, numerical, spatial, or involving memory.

Factor analysis by itself does not and cannot explain the basis for the existence of g. Spearman himself stated that factor analysis cannot reveal the essential nature of g but only reveals where to look for it. Examination of the characteristics of a wide variety of tests in connection with their g loadings can provide some descriptive generalizations about the common features that characterize tests that have relatively high g loadings as compared with tests that have relatively low g loadings.

Spearman originally tried to get at the psychological nature of g by factor analyzing more than 100 tests, each fairly homogeneous in content, and then comparing their g loadings (Spearman & Jones, 1950). He characterized the most g-loaded tests essentially as those requiring "the eduction of relations and correlates," that is, perceiving relationships, inducing the general from the particular, and deducing the particular from the general. Such tests require inductive or inventive as contrasted to reproductive or rule-applying behavior. The most g-loaded test in the whole battery was the Raven's Progressive Matrices (RPM), which, as previously mentioned, depends almost entirely on perceiving key features and relationships and discovering the abstract rules that govern the differences among the elements in the matrix.

There is much more test material available now than was available to Spearman more than 50 years ago. This has led to broader generalizations about g. The g factor is manifested in tests to the degree that they involve mental manipulation of the input elements, choice, decision, invention in contrast to selection, meaningful memory in contrast to rote memory, long-term memory in contrast to short-term memory, and distinguishing relevant information from irrelevant information in solving complex problems (Jensen, 1979).

Task complexity and the amount of conscious mental manipulation required seem to be the most basic determinants of the g loading of a task. There are many examples in which a slight increase in task complexity is accompanied by an increase in the g loading of the task. Virtually any task involving mental activity that is complex enough to be recognized as involving some kind of conscious mental effort is substantially g loaded. It is the task's complexity rather than its content that is most related to g. An almost infinite variety of test items, regardless of sensory modality, substantive or cultural content, or the form of effector activity involved in the required response, is capable of measuring g. This observation led to Spearman's principle of "the indifference of the indicator," meaning that the manifestation of g is not limited to any particular types of information or item types.

Previous research suggests that the Raven's Progressive Matrices administered to the general population measures g and little else (Burke, 1958). The occasional loadings found on other factors, independently of g, are mostly trivial and inconsistent from one analysis to another. Although many other tests measure g to a similar extent, unlike the Raven, they also have loadings on the major group factors such as verbal, numerical, spatial, and memory. The RPM does not measure perceptual ability or spatial-visualization ability as is commonly believed. In fact, the Raven has very small loadings on these factors when g is excluded.

Factor analysis of the RPM at the item level should result in only a single factor. Some investigations that have found more than one factor have employed improper orthogonal rotations of the principal components. This method can artificially create the appearance of several factors even in correlation matrices that are artificially constructed so as to contain only one factor plus random error (Jensen, 1980). Some of the small spurious factors that emerge from factor analysis of the inter-item correlations are not really ability factors at all but are "difficulty" factors, due to varying degrees of restriction of variance on items of widely differing difficulty levels and to nonlinear regression of item difficulties on age and ability (McDonald, 1965). When these psychometric artifacts are taken into account, the RPM seems to measure only a single factor of mental ability, which can be termed g.

The APM test results of this study were factor analyzed at the item level. The first principal factor of the intercorrelation matrix of the 36 items, scored correct or incorrect, accounts for 15% of the total inter-item variance. A factor loading correlation matrix was created from this first principal factor and subtracted from the original inter-item correlation matrix. The resulting residual matrix was tested and found not significantly different from zero at α = .05. Therefore, the APM can be considered to measure only one factor. That this factor accounts for only 15% of the total inter-item variance indicates that the variance of each item is due mostly to uniqueness, that is, item specificity and error. The items are not highly intercorrelated. However, what they do have in common may indeed be Spearman's g. The first principal factor should not be considered just a difficulty factor.
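The extract-and-subtract procedure described for the item-level analysis can be sketched as follows. This is only an illustration under stated assumptions: it uses the first principal component (found by power iteration) as a stand-in for the article's "first principal factor," which may have been extracted with communality estimates; the function names and the 3 by 3 toy correlation matrix are hypothetical; and the significance test of the residual matrix is not reproduced.

```python
import math
from typing import List

Matrix = List[List[float]]


def first_component_loadings(corr: Matrix, max_iter: int = 1000, tol: float = 1e-12) -> List[float]:
    """Loadings on the first principal component of a correlation matrix."""
    n = len(corr)
    vec = [1.0 / math.sqrt(n)] * n  # uniform start vector of unit length
    eigenvalue = 0.0
    for _ in range(max_iter):
        product = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("matrix maps the start vector to zero")
        vec = [x / norm for x in product]
        # Rayleigh quotient: eigenvalue estimate for the current unit vector.
        estimate = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n))
        converged = abs(estimate - eigenvalue) < tol
        eigenvalue = estimate
        if converged:
            break
    if sum(vec) < 0:  # fix the arbitrary sign so loadings are mostly positive
        vec = [-x for x in vec]
    return [math.sqrt(eigenvalue) * x for x in vec]


def residual_matrix(corr: Matrix, loadings: List[float]) -> Matrix:
    """Original correlations minus those reproduced by one factor."""
    n = len(corr)
    return [[corr[i][j] - loadings[i] * loadings[j] for j in range(n)] for i in range(n)]


def proportion_of_variance(loadings: List[float]) -> float:
    """Share of total standardized variance carried by the factor."""
    return sum(x * x for x in loadings) / len(loadings)


# Hypothetical toy matrix: three items that all intercorrelate .50.
toy_corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
toy_loadings = first_component_loadings(toy_corr)
toy_residuals = residual_matrix(toy_corr, toy_loadings)
toy_share = proportion_of_variance(toy_loadings)
```

With the study's 36 by 36 inter-item matrix in place of the toy matrix, `proportion_of_variance` would correspond to the share of inter-item variance attributed to the first factor, and the off-diagonal entries of `residual_matrix` are what a test of "no remaining common variance" would examine.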
The correlation between the item loadings on the first principal factor and item difficulty levels (percent passing) is -.36. (Only 13% of the variance in the first principal factor loadings can be explained by differences in item difficulty.) The correlation between the item loadings on the first principal factor and item variance is .41. (Sixteen percent of the variance in the first principal factor loadings can be explained by differences in item variance.) The correlation between item difficulty and item variance is -.87. When item variance is held constant, i.e., partialled out, the correlation between item loadings on the first principal factor and difficulty levels is .01.

The loadings of each item on the first principal factor and the correlations of each item with total score are shown in Table 2. The total score on the APM can be considered a reasonable measure of general mental ability. This notion is at the very least intuitively appealing. There is near perfect agreement between the correlations of each item with total score and the correlations of each item with the hypothesized g factor. The correlation between the 36 item-total point-biserial correlations and the 36 g loadings, with the effect of item variance partialled out, is .99. This evidence supports the claim that the first principal factor of this analysis is not just a difficulty factor and that it measures general mental ability.

TABLE 2-APM Item Correlations with Total Score and First Principal Factor (N = 300)

Item   Total score   First principal factor
 1     .08           .04
 2     .20           .24
 3     .25           .29
 4     .38           .43
 5     .34           .37
 6     .17           .17
 7     .24           .24
 8     .35           .37
 9     .34           .41
10     .34           .37
11     .30           .34
12     .42           .46
13     .24           .18
14     .30           .29
15     .24           .20
16     .45           .47
17     .39           .36
18     .35           .31
19     .30           .26
20     .27           .23
21     .50           .49
22     .41           .36
23     .56           .57
24     .53           .50
25     .37           .30
26     .36           .30
27     .50           .44
28     .45           .38
29     .50           .44
30     .45           .38
31     .47           .41
32     .39           .33
33     .43           .37
34     .53           .51
35     .49           .45
36     .43           .37

Further evidence that the APM measures g comes from a closer look at the relationship between the APM and the WAIS. Two subgroups consisting of 13 items each and matched on item variance were created from the 36 APM test items. The high g group had an average g loading of .46 and an average item variance of .15. The low g group had an average g loading of .27 and an average item variance of .15. Correlations were obtained based on the 62 subjects who took both tests. The correlation between the high g items and WAIS Full Scale IQ scores was .67. Low g items and WAIS Full Scale IQ scores correlated .56. Although the two correlations are not significantly different from each other (t = 1.39, α = .05), a trend was apparent. Despite the fact that the items of the APM and the WAIS are drastically different in content, those items correlating highest with the hypothesized g factor derived from the APM show a stronger relationship to WAIS Full Scale IQ scores than those items with a low correlation with the hypothesized APM g. Since WAIS Full Scale IQ scores have been shown to be highly g loaded in previous research (Matarazzo, 1972; Jensen, 1980), the pattern found here can be interpreted to indicate that the hypothesized g of the APM is the same g that is measured by the WAIS.

In summary then, the distribution of scores for a large cross section of University of California students on the Advanced Raven's Progressive Matrices is markedly higher than the estimated score distribution of university students that accompanies the 1962 version of the test. A moderate positive association exists between the APM and the Terman Concept Mastery Test. There is an even stronger positive relationship between the APM and the Wechsler Adult Intelligence Scale. Examination of the internal factor structure of the items of the test indicates that the APM measures only one factor. This factor is not just a difficulty factor. The results support the notion that the APM provides a measure of Spearman's g.
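The claim that the factor is "not just a difficulty factor" rests on a partial correlation, which can be sketched with the first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function name is hypothetical. Two of the inputs (-.36 and .41) are stated in the text; the third, the correlation between item difficulty and item variance, is taken here as -.87, which is an assumption: the digit is damaged in the available copy, and -.87 is the reading consistent with the near-zero partial correlation the article reports.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z held constant (first-order partial)."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# x = first-factor loading, y = item difficulty, z = item variance.
r_loading_difficulty = -0.36
r_loading_variance = 0.41
r_difficulty_variance = -0.87  # assumed reading of a damaged digit

partial = partial_correlation(
    r_loading_difficulty, r_loading_variance, r_difficulty_variance
)
```

With these two-decimal inputs the result is within about .01 of zero, in line with the reported value; the exact sign and last digit depend on the unrounded correlations, which are not available here.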

REFERENCES

Burke, H. R. (1958). Raven's Progressive Matrices: A review and critical evaluation. Journal of Genetic Psychology, 93, 199-228.
Court, J. H., & Kennedy, R. J. (1976). Sex as a variable in Raven's Standard Progressive Matrices. Proceedings of the 21st International Congress of Psychology, Paris, France.
Forbes, A. R. (1964). An item analysis of the Advanced Matrices. British Journal of Educational Psychology, 34, 1-14.
Foulds, G. A., & Raven, J. C. (1950). An experimental survey with Progressive Matrices (1947). British Journal of Educational Psychology, 20, 4-10.
Gibson, H. B. (1975). Relations between performance on the Advanced Matrices and the EPI in high-intelligence subjects. British Journal of Social and Clinical Psychology, 14, 363-369.
Jensen, A. R. (1979). g: Outmoded theory or unconquered frontier? Creative Science and Technology, 2, 16-29.
Jensen, A. R. (1980). Bias in mental testing. New York: The Free Press.
Matarazzo, J. D. (1972). Wechsler's measurement and appraisal of adult intelligence (5th ed.). Baltimore: Williams & Wilkins.
McDonald, R. P. (1965). Difficulty factors and nonlinear factor analysis. British Journal of Mathematical and Statistical Psychology, 18, 11-23.
McLaurin, W. A., Jenkins, J. F., Farrar, W. E., & Rumore, M. C. (1973). Correlation of IQ's on verbal and nonverbal tests of intelligence. Psychological Reports, 33, 821-822.
McNemar, Q. (1949). Psychological statistics. New York: Wiley.
Raven, J. C. (1938). Progressive Matrices: A perceptual test of intelligence, 1938, Individual form. London: H. K. Lewis.
Raven, J. C. (1947). Coloured Progressive Matrices. London: H. K. Lewis.
Raven, J. C. (1960). Guide to the Standard Progressive Matrices. London: H. K. Lewis.
Raven, J. C. (1965). Advanced Progressive Matrices Sets I and II. London: H. K. Lewis.
Spearman, C., & Jones, L. L. W. (1950). Human ability. London: Macmillan.
Thissen, D. M. (1976). Information in wrong responses to the Raven Progressive Matrices. Journal of Educational Measurement, 13, 201-214.
Yates, A. J. (1966). A note on Progressive Matrices (1962). Australian Journal of Psychology, 18, 281-283.