

The Development of a Test on Critical Thinking: A Test Development Project*

Jaime Jose G. Nicdao, MA


Ateneo de Manila University

Fe Josefa G. Nava, Ph.D.


Cesar Baltazar
Sara Ma
University of the Philippines - Diliman

The measurement of psychological traits of students such as critical thinking is important to educational practice.

Critical thinking is described as purposeful, reasoned, and goal-directed thinking – the kind of thinking involved in solving problems, formulating inferences, calculating likelihoods, and making decisions (Halpern, 1989).

Experts agree that “critical thinking is a purposeful, self-regulatory judgment which results in interpretation, analysis, evaluation, and inference, as well as explanations of the evidential, conceptual, methodological, criteriological, or contextual considerations upon which judgment is based,” and that it is essential as a tool of inquiry (American Philosophical Association, 1990).

It is a liberating force in education and a powerful resource in one’s personal and civic life (Facione, 1998). In this sense, critical thinking cuts across specific subjects or disciplines; it has applications in all areas of life and learning (Halpern, 1989).

This paper describes the procedures followed in the development of a Critical Thinking Test.

Through a review of the literature, important components of critical thinking were identified.
*Paper presented at the 1st National Conference on Educational Measurement and Evaluation (NCEME) – Developing a Culture of Assessment in Learning Organizations, August 6-7, 2008, College of St. Benilde International Conference Center.
The Delphi Report, Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction (Facione, 1990), identified the skills that constitute the core of critical thinking. It explained that experts were virtually unanimous (N > 95%) in including analysis, evaluation, and inference as central to critical thinking, and that a strong consensus (N > 87%) exists that interpretation, explanation, and self-regulation are also central. On the other hand, the California Critical Thinking Skills Test (Facione et al., 1990, 1998) provided scores on subscales named analysis, inference, evaluation, deductive reasoning, and inductive reasoning.

The current study adopted the California Critical Thinking Skills Test (CCTST) categories because they cover analysis, inference, and evaluation, which provide alternative ways of dealing with the challenging situations of today, while also including the time-tested ways of reasoning: deduction and induction.

In this context, the CCTST Form 2000 defines the skill categories in this way: Inductive reasoning happens when a person decides that the evidence at hand means that a given conclusion is probably true. Deductive reasoning happens when a person decides that, given the truth of the evidence at hand, it is impossible that the conclusion he is considering is false. Analysis is pulling apart arguments and points of view to show why a person thinks what he or she thinks; it is separating the premises and assumptions a person is using from the claim or conclusion the person is reaching. Inference happens when a person draws conclusions based on reasons and evidence; the person might be using his deductive reasoning skills or his inductive reasoning skills, and inference can apply to all sorts of things, including beliefs, opinions, facts, conjectures, principles, and assumptions. Evaluation happens when a person decides how strong or how weak another person’s arguments are, or when he determines the believability of a given statement.

The Table of Specifications contained five subparts, namely Analysis, Inference, Evaluation, Deduction, and Induction.

Analysis was subdivided into a) able to distinguish relevant information from irrelevant information to solve a problem, b) able to identify supporting details for the main idea, and c) able to identify cause-and-effect relationships.

Inference was subdivided into a) able to see the implications of a position someone is advocating, or to draw out meaning from the elements in a situation, b) able to determine a person’s trait or emotion based on the details given, and c) able to make an appropriate prediction based on the information given.

Evaluation was subdivided into a) able to judge whether a given idea or opinion is relevant or applicable, b) able to make fair judgments as an impartial observer, and c) able to assess alternatives or courses of action.

Deduction was subdivided into a) able to apply general rules to specific problems to come up with logical answers, b) able to reach a valid conclusion by examining orderly relationships among terms (linear ordering), and c) able to reach a valid conclusion by examining contingency relationships in “if-then” statements.

Induction was subdivided into a) able to combine separate pieces of information, or specific answers to problems, to form general rules or conclusions, b) able to come up with a logical explanation for why seemingly unrelated events occur together, and c) able to determine whether a conclusion is incorrect by examining the given premises.

A test on critical thinking was developed using a multiple-choice format. From a review of the literature on the domains of critical thinking, an item pool of more than 20 questions was developed. The number of items for each main category was set at 4, for a total of 20 items. Four scenarios were created, out of which the 20 questions were generated. The scenarios depicted situations that grade-school children would typically encounter in school, at home, and in the community.

The following is an example of an Analytic Question based on a given scenario:

After school, Roger played with his neighbors. On his way home, he
discovered that the expensive ball pen given to him as a gift by his mother
was missing.

Which of the following pieces of information would best help him find his pen?

A. The names of his neighbors
B. The time he played with his neighbors
C. The time he last used the pen
D. The places at school where he went

Here is an example of an Evaluative Question based on a given scenario:

During a test in class, Brian asked his seatmate Eric to pass a piece of
paper to Greg. Seeing Eric do it, the teacher got hold of the paper and tore
it to pieces.

If Brian is a fair student, which of the following statements would best describe the action he will take?

A. He will keep quiet about what happened.
B. He will apologize to the teacher.
C. He will apologize to Greg.
D. He will pick up the pieces of paper.

The first pilot testing of the instrument was conducted on seventy-six (n = 76) Grade 6 students in an exclusive private school for boys in Metro Manila. The reliability of the test was low, with a Cronbach’s alpha value of 0.123. An additional sample from a more heterogeneous group was therefore selected: seventy-six (n = 76) Grade 6 pupils from a public elementary school responded to the test.

One of the most important conditions of good psychological measures is reliability. Reliability is the characteristic of a test that indicates the extent to which it measures the trait with consistency. The reliability of a test is a necessary precondition to validity: if a test is not reliable, it may not be able to measure the trait as it should.

Reliability can be estimated using different techniques. Two straightforward methods involve giving the same test twice (test-retest or stability) or giving parallel versions of the same test (equivalence). However, these approaches require two test administrations. In contrast, internal consistency procedures require only one test administration.
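
Cronbach’s alpha, the internal consistency estimate reported in this study, can be computed directly from an item-score matrix. The sketch below is illustrative only; the responses are hypothetical, not the study’s data, and assume dichotomously scored (0/1) multiple-choice items:

```python
def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents,
    each a list of item scores (e.g., 0/1 for multiple-choice items)."""
    k = len(scores[0])                                    # number of items
    item_vars = sum(variance(list(col)) for col in zip(*scores))
    total_var = variance([sum(row) for row in scores])    # total-score variance
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical responses: 4 pupils x 3 items
responses = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 1, 1]]
print(round(cronbach_alpha(responses), 3))  # → 0.632
```

The formula rewards items that vary together: when item variances are small relative to the variance of the total scores, alpha approaches 1.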

The variability of the public school sample is reflected in its large total score range, standard deviation, and variance, indicating the heterogeneity of that sample. After the test was administered to the public school pupils, the resulting reliability index increased to 0.674.

Variability is an important factor affecting measurement reliability (Anastasi, 1982). It refers to the nature of the sample, or the range of individual differences, on which reliability is measured. Reliability is influenced by the degree to which respondents or examinees vary, so it is often recommended that reliability estimates be computed on a heterogeneous sample. Fan and Yin (2001) drew the following conclusions in their study: 1) sample variability with regard to the trait being measured has an obvious effect on measurement reliability, with reliability being reduced by restriction of group variability; and 2) group performance level also appears to affect measurement reliability, with measurement error tending to be smaller for high-performance samples than for low-performance samples.
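
The range-restriction effect described by Fan and Yin can be demonstrated with constructed data. In this hypothetical sketch, a full-range sample with a Guttman-patterned response matrix yields a much higher alpha than a subsample restricted to the middle ability levels; the data are illustrative assumptions, not the study’s:

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    k = len(scores[0])
    item_vars = sum(variance(list(col)) for col in zip(*scores))
    return (k / (k - 1)) * (1 - item_vars / variance([sum(r) for r in scores]))

# Full-range sample: abilities 0..5 on a 5-item test (Guttman pattern:
# a pupil of ability a answers the a easiest items correctly)
full = [[1 if item < ability else 0 for item in range(5)] for ability in range(6)]

# Range-restricted sample: only middle-ability pupils (total scores 2 and 3)
restricted = [row for row in full if sum(row) in (2, 3)] * 2

a_full, a_restricted = cronbach_alpha(full), cronbach_alpha(restricted)
assert a_restricted < a_full   # restricting variability lowers reliability
```

This mirrors the study’s own experience: the homogeneous private-school sample produced an alpha of 0.123, while the more heterogeneous public-school sample raised it to 0.674.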

Validity is another vital aspect of an instrument. Validity is the characteristic of a test that indicates the extent to which it measures what it is supposed to measure (Anastasi, 1982).

One type of validity evidence is content validity, which is essentially an expert’s qualitative evaluation of the items vis-à-vis the table of specifications. A professor of psychology reviewed the items and gave comments and suggestions to improve the instrument. Another way of establishing validity is through factor analysis, a method developed to identify psychological traits and thereby provide evidence of construct validity (Anastasi, 1982); it is a refined statistical technique for analyzing the interrelationships of behavioral data. Coefficient alpha reliability is used to provide evidence of unidimensionality: if a test does not measure a single construct, it cannot measure the particular construct it is intended to measure (Cunningham, 1998).

Following the subparts in the Table of Specifications, a 5-factor solution did not clearly distinguish among the factors; hence, a 4-factor solution was applied. The four factors were: Factor 1 – Inferential Thinking (9 items), Factor 2 – Evaluative Thinking (4 items), Factor 3 – Logical Thinking (3 items), and Factor 4 – Analytical Thinking (4 items).

Two items (Items 10 and 11) were deleted. These items were chosen on the basis of their effect on reliability: they yielded the greatest increase in the reliability coefficient when deleted. The final reliability index for the 18 items was thus 0.755. The final form measures four dimensions (Table 1).
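
The deletion criterion used here corresponds to the standard “alpha if item deleted” statistic: recompute alpha with each item removed in turn and flag the items whose removal raises it most. A hypothetical sketch (the data and the offending item are illustrative, not the study’s Items 10 and 11):

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(scores):
    k = len(scores[0])
    item_vars = sum(variance(list(col)) for col in zip(*scores))
    return (k / (k - 1)) * (1 - item_vars / variance([sum(r) for r in scores]))

def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item removed: list of (item_index, alpha)."""
    k = len(scores[0])
    return [(j, cronbach_alpha([[v for i, v in enumerate(row) if i != j]
                                for row in scores])) for j in range(k)]

# Hypothetical data: item 3 runs against the other three items,
# dragging alpha down; deleting it should raise alpha the most.
data = [[1, 1, 1, 0], [1, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
worst_item, new_alpha = max(alpha_if_item_deleted(data), key=lambda t: t[1])
print(worst_item, round(new_alpha, 3))  # → 3 1.0
```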

Table 1. Subparts of the Final Critical Thinking Test

Subparts (Factors)        Objective   Item #          # of Items
1. Analytical Thinking    1a          15, 1, and 4         3
                          1b          20                   1
2. Inferential Thinking   2a          18, 13, and 3        3
                          2b          7 and 16             2
                          2c          12, 14, and 2        3
3. Evaluative Thinking    3b          9                    1
                          3c          8 and 19             2
4. Logical Thinking       4a          6                    1
                          4c          17                   1
                          5a          5                    1
TOTAL (Alpha = 0.755)                                     18

References

American Philosophical Association. (1990). Critical Thinking: A Statement of Expert Consensus for Purposes of Educational Assessment and Instruction ("The Delphi Report"). Committee on Pre-College Philosophy. (ERIC Document ED 315 423).

Anastasi, A. (1982). Psychological Testing (5th ed.). New York: Macmillan Publishing Co.

Cunningham, G. K. (1998). Constructing and Interpreting Tests. London: Falmer Press.

Facione, P. A., et al. (1990, 1998). The California Critical Thinking Skills Test, Forms A and B, and the CCTST Test Manual. Millbrae, CA: California Academic Press.

Facione, P. A., & Facione, N. C. (2000). The Disposition toward Thinking: Its Character, Measurement, and Relationship to Critical Thinking Skills. Santa Clara, CA: California Academic Press.

Fan, X., & Yin, P. (2001). Sample Characteristics and Measurement Reliability: An Empirical Exploration. Paper presented at the Annual Meeting of the American Educational Research Association, Seattle, WA, April 10-14, 2001.

Halpern, D. F. (1989). Thought and Knowledge: An Introduction to Critical Thinking. Hillsdale, NJ: Lawrence Erlbaum Associates.

McMillan, J. H. (2001). Classroom Assessment: Principles and Practice for Effective Instruction. MA: Allyn and Bacon.
