JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms
Springer is collaborating with JSTOR to digitize, preserve and extend access to Educational
Technology Research and Development
This content downloaded from 202.92.130.213 on Thu, 24 Jan 2019 04:37:25 UTC
All use subject to https://about.jstor.org/terms
Validity in Quantitative Content Analysis
Liam Rourke
Terry Anderson
Over the past 15 years, educational technologists have been dabbling with a research technique known as quantitative content analysis (QCA). Although it is characterized as a systematic and objective procedure for describing communication, readers find insufficient evidence of either quality in published reports. In this paper, it is argued that QCA should be conceived of as a form of testing and measurement. If this argument is successful, it becomes possible to frame many of the problems associated with QCA studies under the well-articulated rubric of test validity. Two sets of procedures for developing the validity of a QCA coding protocol are provided: (a) one for developing a protocol that is theoretically valid and (b) one for establishing its validity empirically. The paper is concerned specifically with the use of QCA to study educational applications of computer-mediated communication.

The primary role of networked computers in higher education has shifted from presenting structured, preprogrammed learning materials to facilitating communication. In turn, the role of educational technology researchers has expanded to include the role of communication researcher. In the late 1980s, studies began to appear that incorporated new perspectives, new methods, and new techniques. One of the most promising was quantitative content analysis (QCA).

Berelson (1952) defined QCA as "a research technique for the systematic, objective, and quantitative description of the manifest content of communication" (p. 18). In this context, description is a process that includes segmenting communication content into units, assigning each unit to a category, and providing tallies for each category. Bullen (1998), for instance, studied participation and critical thinking in an online discussion by counting the number of times each student contributed to the discussion and by assigning each contribution to one of three categories of critical thinking.
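Berelson's three-step description (segment, categorize, tally) can be sketched in a few lines of Python. The three-level scheme and the keyword-based coder below are illustrative stand-ins, not Bullen's actual protocol, which relied on trained human raters.

```python
from collections import Counter

# Illustrative three-level scheme (a stand-in, not Bullen's actual categories).
CATEGORIES = {1: "minimal", 2: "moderate", 3: "extensive"}

def code_message(message: str) -> int:
    """Toy stand-in for a rater's judgment: assign a unit to a category (1-3)."""
    # Real protocols rely on trained human coders, not keyword matching.
    markers = ("because", "therefore", "evidence")
    hits = sum(marker in message.lower() for marker in markers)
    return min(3, 1 + hits)

def tally(units: list[str]) -> Counter:
    """Segment -> categorize -> tally: the descriptive core of QCA."""
    return Counter(CATEGORIES[code_message(u)] for u in units)

transcript = [
    "I agree with Pat.",
    "I disagree, because the evidence points the other way.",
]
print(tally(transcript))  # frequency of each category across the transcript
```

The output of `tally` is exactly the kind of frequency count that QCA reports present; everything that follows in the paper concerns what may legitimately be inferred from such counts.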
By 1999, enough of these types of studies had been conducted in the field of educational tech- […]
ETR&D, Vol. 52, No. 1
[…] simply written down content that was already stored in memory in more or less the form presented here. (p. 13)

In this argument, it becomes apparent that assessing the surface characteristics of written composition (something the writing teacher does) and measuring the cognitive processes that underlie written composition (something a cognitive psychologist might want to do) are two different things. Bereiter and Scardamalia (1987) rejected the possibility of learning about one through descriptions of the other. For the content analyst engaged in a purely descriptive […] on how many times they formulate a proposition that proceeds from previous statements.

Psychometricians would disagree. Much of test theory and the day-to-day practice of testing and measuring is the attempt to make accurate judgments about unobservable constructs based on observable behavior: Candidates' potential for success in graduate school is gauged through their scores on paper-and-pencil aptitude tests; experimental subjects' attitudes are assessed using researchers' questionnaires; and applicants' suitability for jobs is predicted using personality measures, for instance.

Test theory accepts Bereiter and Scardamalia's (1987) argument as fundamental, and prescribes that if there is a gulf between that which one wishes to study and that which is directly observable, then some sort of correspondence between the two must be established before inference begins. The standards of testing and measurement corollary to Messick's (1989) definition of validity set out the steps for accomplishing this. To begin our discussion of these steps, we will first show that QCA is a form of testing and measurement.

A test, according to Crocker and Algina (1986), is "a standard procedure for collecting a sample of behavior from a specified domain" (p. 4). Their definition is deliberately general because the class of things encompassed by the term test is diverse. It subsumes an assortment of procedures and aims that range from standardized achievement tests to teacher-made multiple-choice quizzes, from published personality inventories to researchers' questionnaires, and beyond.

One specific form of testing that Crocker and Algina (1986) depicted is "a standard schedule and a list of behaviors that can be used by an observer who codes behavior displayed by subjects in a naturalistic setting" (p. 4). This depiction […] definitions, which they were to look for in the messages that students posted to an educational computer conference. Similarly, the cognitive skills section of Henri's (1991) coding protocol consists of 18 behaviors that coders seek in the paragraphs within students' postings to an online educational discussion (Hara, Bonk, & Angeli, 2000). Each of the elements and processes in Crocker and Algina's depiction of testing is apparent in these two examples. The naturalistic setting in both examples is the online class discussion. The lists of behaviors are the 16- and 18-item lists that are purportedly indicative of critical thinking and cognitive skills as these constructs manifest themselves in online discussions. These lists are provided to the observers or, as they are referred to in QCA, coders or raters; and the standard schedules that accompany these protocols are the individual messages or paragraphs within the messages in the online discussion. Note that physical or syntactical units of analysis (e.g., conference messages or paragraphs within the messages) replace temporal units (e.g., 30-sec intervals) when observation moves from the face-to-face classroom to […] specific process that Crocker and Algina portrayed. Add to this the corollary process of measurement, assigning numbers to properties of objects […]
[The messages posted by] the students were sorted into one of three categories of critical thinking. The categories and corresponding scores were as follows: (3) extensive use of critical thinking skills, (2) moderate use of critical thinking skills, (1) minimal use of critical thinking skills. (p. 14)

The objects or events that Bullen focused on were the messages posted by undergraduate students to their computer conference. The rules that were used to regulate the assignment of numbers to these messages included three things: (a) an overarching theory of critical thinking (Norris & Ennis, 1989) with which observers were familiarized, (b) a 16-item set of behaviors indicative of how three levels of critical thinking manifest themselves in text-based online discussion, and (c) an incremental numbering system corresponding to the hierarchical conceptualization of critical thinking. The manner in which numbers were assigned to the students' messages reflected differences in the types of critical thinking, and subsequent frequency counts of each of the categories reflected differences in the amount of critical thinking […]

[…] known how to proceed in a valid manner. Locating QCA under the rubric of testing and measurement provides a well-articulated model of how to move, in a defensible manner, from frequency counts of directly observable behavior to insights about the complex constructs that they allegedly signify. The basic elements of this model are presented in the next section.

DEVELOPING A THEORETICALLY VALID PROTOCOL

The steps to developing a theoretically valid protocol are:

* Identifying the purpose of the coding data
* Identifying behaviours that represent the construct
* Reviewing the categories and indicators
* Holding preliminary tryouts
* Developing guidelines for administration, scoring, and interpretation of the coding scheme
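One way to see what those steps must produce is to write the finished artifact down as a data structure. The class and the critical-thinking entries below are illustrative assumptions, not a published protocol; they simply make explicit what a complete protocol has to specify before coding begins.

```python
from dataclasses import dataclass

@dataclass
class CodingProtocol:
    """Hypothetical container for the artifacts a finished QCA protocol specifies."""
    construct: str                    # what the protocol claims to measure
    purpose: str                      # descriptive vs. decision-making use
    categories: dict[int, str]        # code -> category label
    indicators: dict[int, list[str]]  # code -> observable behaviors per category
    guidelines: str = ""              # administration and scoring instructions

protocol = CodingProtocol(
    construct="critical thinking",
    purpose="basic research (description and inference)",
    categories={1: "minimal", 2: "moderate", 3: "extensive"},
    indicators={
        1: ["restates another participant's point"],
        2: ["offers a reason for a claim"],
        3: ["weighs evidence for competing claims"],
    },
    guidelines="Code each message as a whole; resolve doubts to the lower level.",
)

# Every category should have indicators, and vice versa (content coverage).
assert set(protocol.categories) == set(protocol.indicators)
```

Keeping the categories and indicators in one structure makes the later review steps (checking for left-out or misplaced behaviors) a matter of inspecting a single object.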
[…]nitive, and metacognitive dimensions of communication in computer conferences.

Identifying the purpose of the data informs decisions about scaling, score interpretation models, and the types of validity evidence that are required. QCA protocols used in CMC studies typically use nominal scales of measurement. In these scales, numbers are used to categorize segments of transcripts, with the numbers reflecting nothing about the segments other than that they are different. Using Gunawardena, Lowe, and Anderson's (1997) scheme for coding social knowledge construction, messages are coded as 1 (statement of opinion), 2 (statement of agreement), 3 (corroborating example), 4 (clarifying detail), or 5 (problem definition). Statement of opinion (1) does not mean less social construction of knowledge than statement of agreement (2); the difference between 1 and 2 and 3 and 4 does not represent an equal difference in level of social knowledge construction; and 0 does not represent the absolute absence of social knowledge construction.

The purpose of the data also influences the selection of score interpretation models. Criterion-referenced score interpretations focus on the classification of test takers (pass-fail, master-nonmaster) in conventional testing. Norm-referenced interpretations focus on relative comparisons of test takers (average, above average, below average). The former model is prevalent in applied decision-making contexts (selection, placement, and so forth) and requires established criteria and justified cut-scores with which one can judge mastery. So far, criteria have not been proposed for the issues that educational technologists study with QCA (e.g., interaction, participation, procession through the problem-solving process).

Norm-referenced interpretations characterize the whole of QCA studies in our domain. Generally, one transcript or one segment of a transcript is positioned relative to another. For instance, Chou (2001) compared levels of learner-learner interaction in synchronous versus asynchronous communication modes, and concluded, in a norm-referenced fashion, that there was a higher percentage of social-emotional interaction in synchronous mode than in asynchronous mode. It is not logically necessary […]pretation models; it is simply conventional. However, there are implications for both of these decisions and therefore they should be made thoughtfully.

The purpose of the coding data also determines the types of validity evidence that need to be gathered. In keeping with the previous discussion, Cronbach (1990) distinguished between using a test to describe and to make decisions about a person. Deciding which candidate should receive a scholarship or which applicants should be admitted to a graduate program is consequential and necessitates a thorough investigation of several elements of an assessment procedure, including its relevance, utility, social consequences, value implications, and construct validity. In the context of basic research, in contrast, the researcher may neglect some of these criteria without placing hopeful students in any peril. In this latter situation, Messick (1989) and Cronbach (1971) encouraged test developers to direct finite resources into a systematic investigation of construct validity. More will be said about this type of validity in subsequent sections.

Identifying Behaviours That Represent the Construct

Once the purpose of coding data has been determined, the next step is to identify behaviors that represent the construct. The goal in this step is to ensure that a coding protocol neither leaves out behaviors that should be included, nor includes behaviors that should be left out. This is particularly important in QCA because the technique is essentially observational; therefore, in an operationist and behaviorist sense, the construct, at one level, comes to be defined by observable behavior. The precariousness of this enterprise can be illustrated with a simple example. One construct that has received continued attention from CMC researchers is participation, which, in QCA studies, is regularly defined by a single representative behavior: posting a message to a conference. If this specific behavior is interesting in itself, perhaps in a human-computer interaction study, then this operational definition is satisfactory […]
[…] or discarded, and (c) conducting validity studies for the final form of a coding protocol. This step signals the conclusion of the first stage of establishing validity in QCA studies and the first section of this article. In this section we translated some of the […]

[…] indicators […] conceptually misaligned were […] appropriate categories. Similar […] accrued with Henri's (1991) instrument, although no changes have been formally […]
[…] the power of the coding scheme to provide a tally of salutations, compliments, and other positive indicators in the protocol. It is with the inference that what the raters have observed, categorized, and counted is communication that supports critical discourse. Further investigation is required to warrant such an interpretation.

Messick (1989) discussed several types of investigations that should be conducted to establish the validity of any test. Of these, three are particularly germane to the development of coding protocols whose purpose is to collect descriptive data and generate inferential information in research contexts. These are:

1. Correlational analyses
2. Examinations of group differences, and
3. Experimental or instructional interventions.

In this section we will examine all three. To illustrate the discussion, we will draw primarily on examples from our own work, with which we are most familiar and which provides some of the few available examples.

Correlational Analyses

[…] more traditional method, a 5-item semantic differential scale anchored by the adjectives warm, friendly, close, trusting, disinhibiting, and personal (Gunawardena & Zittle, 1998; Short, Williams, & Christie, 1976). The authors found that correlations between the frequency of the 15 indicators and the students' ratings of social presence were weak (r = 0.4, approximately). Furthermore, significant correlations were observed only between a subset of the indicators and a subset of the social presence dimensions represented on the semantic differential scale.

These results point to difficulties in at least two areas of validity. First, content representativeness: the coding scheme was not measuring all of the dimensions of social presence. Second, content relevance: some of the indicators did not correlate with any of the dimensions of social presence. Ultimately this presents a challenge to the appropriateness of the inferences that one would want to make from the coding data; that is, that observing, categorizing, and counting the occurrence of the 15 indicators would allow one to infer how well students […]
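A minimal sketch of the correlational check just described, with invented numbers: per-student indicator frequencies produced by a coding protocol are correlated with the same students' self-reported ratings of the construct.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented data for six students: coded indicator frequencies from the
# transcripts, and self-reported ratings on a semantic differential scale.
indicator_freq = [2, 5, 1, 7, 4, 3]
self_report = [3.0, 4.5, 2.5, 4.0, 4.0, 3.5]

r = pearson(indicator_freq, self_report)
# A weak r between the two measures, as in the study described above, would
# question whether the indicators and the ratings tap the same construct.
```

With real data one would also test the correlation's significance and examine each indicator separately, which is how the content-relevance problem above was detected.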
Group Differences

Coding schemes for constructs such as critical thinking, group problem solving, or social communication are essentially embodiments of what their authors think the ideal process would look like. This belief gives rise to a type of study in which an ideal group engaging in the target process is compared to a group that is much less ideal. One hopes the instrument distinguishes between them appropriately. There are two forms of this type of study, cross-sectional and longitudinal. In a cross-sectional study, an effective protocol for coding social communication should distinguish between a cohort group in the final year of their program and a zero-history group. In a longitudinal study, the same protocol should be able to distinguish between the initial weeks and the final weeks of a computer conference. Demonstrating this ability is an important early step in validating a QCA protocol.

Anderson et al. (2001) were able to provide this type of evidence for their teaching presence instrument. Composed of the categories direct instruction, instructional design, and facilitating discourse, their instrument was able to distinguish between weeks of discussion led by students and weeks of discussion led by the instructor. As would be hypothesized by their model, instructor-led discussion contained more evidence of direct instruction and instructional design while student-led discussion contained more evidence of facilitating discourse (Rourke & Anderson, 2002).

[…] community college students enrolled in upgrading classes scored lowest, and the remaining two groups (high school tutors working with mentally retarded students and education students studying statistics) scored in the middle.

Experimental and Instructional Intervention

In the group differences scenario, naturally occurring criterion groups are identified that are expected to differ with respect to the construct being measured (e.g., teachers vs. students on moderating ability). In experimental or instructional intervention studies, researchers deliberately manipulate the groups or their environment to induce an alteration in the construct under study. An attempt is made to modify behavior in theoretically predicted ways to determine whether the coding protocol is sensitive to these changes. For example, Jonassen and Kwon (2001) studied students' problem-solving skills using a content analysis protocol developed by Poole and Holmes (1995). If valid, such an instrument should be sensitive to instructional interventions such as training students in the problem-solving process or providing expert assistance while students are engaged in problem-solving tasks. The following year, Jonassen and Kwon (2002) conducted such a study. Undergraduate economics students were divided into small teams and asked to resolve economics problems online. To communicate with each other, some teams used a standard […]
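Both the group-differences and intervention checks come down to comparing category tallies across two conditions. A minimal sketch, with invented counts loosely modeled on the teaching presence categories named above:

```python
# Do two conditions (e.g., instructor-led vs. student-led weeks, or pre- vs.
# post-intervention) show different distributions over the coding categories?
# The category names follow Anderson et al. (2001); the counts are invented.

categories = ["direct instruction", "instructional design", "facilitating discourse"]
instructor_led = [30, 25, 15]  # hypothetical tallies per category
student_led = [10, 12, 28]

def chi_square(obs_a, obs_b):
    """Pearson chi-square statistic for a 2 x k table of category counts."""
    total_a, total_b = sum(obs_a), sum(obs_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(obs_a, obs_b):
        col = a + b
        exp_a = total_a * col / grand  # expected count if conditions don't differ
        exp_b = total_b * col / grand
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat

# A large statistic, judged against the chi-square distribution with
# k - 1 = 2 degrees of freedom, supports the claim that the instrument
# distinguishes the two conditions in the theoretically predicted way.
stat = chi_square(instructor_led, student_led)
```

The direction of the differences matters as much as the statistic: the model predicts not just *a* difference but more direct instruction and instructional design in the instructor-led weeks.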
[…]aise of epic proportions" (¶ 23). We have argued merely that attention to test validity will strengthen the claims of QCA studies and increase their information yield.

Our argument is relevant to the content analyst engaged in a purely descriptive study, but it is directed specifically at those who include inferential processes in the collection, analysis, and interpretation of their data. Some researchers continue to use QCA in the manner in which it was originally conceived. They systematically identify, categorize, and count the objective elements of communication and provide audiences with a summary of this data. The procedure is sound, the analysis leaves little room for counter interpretation, and the results of descriptive studies are valuable, especially when they concern relatively new educational phenomena such as the use of CMC in teaching and learning.

[…] headlong into the tangle of issues discussed in this paper. Gall, Borg, and Gall's (1996) suggestion, located in the context of a broader discussion of research design issues, is imme[diately] more appropriate: "Consider employing a coding system that has been used in previous research" (p. 359). Jonassen and Kwon (2001, 2002) took this approach in their studies of problem solving in CMC. Rather than postponing their study while they undertook an […] process of instrument development and validation, the authors proceeded directly with their investigation after selecting an instrument that had been developed by Poole and Holmes (1995).

Unfortunately, few researchers appear to be interested in conducting their studies with existing instruments. Those who do accomplish several things: They contribute to the accu[mulating] […]
[…] validity of an existing proced[ure], compare their results with a gr[…] normative data, and leapfrog the instrument construction process. In o[ur own pro]gram, we dedicated two years and a considerable proportion of our research funds to the development of three QCA protocols (Community of Inquiry, 2002). Only then were we able to begin using the protocols to investigate the phenomena that had originally captured our attention. In our case, the trade-off was justified because instrument development was a central […] current studies. This is an exceptional case.

The purpose of this discussion has been to prompt some reflection about what is required before one can make inferences from frequency counts of communicative behavior. The discussion is incomplete. We have talked about correlational analyses, but have not mentioned factor analysis (or path analysis, structural equation modeling, or hierarchical linear modeling). We have touched on protocol analysis and chronometric analysis but not computational or mathematical modeling. Of Crocker and Algina's (1986) 10 steps of test construction, we have discussed only 6. And we have drawn minimally from Messick's (1989) 102-page chapter. There is a rich body of literature that researchers can refer to when developing a protocol and making inferences from coding data. Some preliminary considerations are outlined in this paper, but much work remains.

Liam Rourke [lrourke@ualberta.ca] is a Ph.D. candidate in the Department of Educational Psychology at the University of Alberta, Edmonton, AB.

Terry Anderson [terrya@athabascau.ca] is Professor and Canada Research Chair in Distance Education at Athabasca University in Alberta.

REFERENCES

Anderson, T., Rourke, L., Garrison, D.R., & Archer, W. (2001). Assessing teaching presence in a computer conferencing environment. Journal of Asynchronous Learning Networks, 5(2). Retrieved March 6, 2002, from http://www.aln.org/alnweb/journal/jaln-vol5issue2v2.htm

Bakeman, R., & Gottman, J.M. (1997). Observing interaction: An introduction to sequential analysis (2nd ed.). New York: Cambridge University Press.

Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence Erlbaum Associates.

Bullen, M. (1998). Participation and critical thinking in online university distance education. Journal of Distance Education, 13(2), 1-32.

Chou, C. (2001, November). A model of learner-centered computer-mediated interaction for collaborative distance learning. Paper presented at the Annual Meeting of the Association for Educational and Communications Technology, Atlanta, GA.

Community of Inquiry. (2002). Critical thinking in a text-based environment: Computer conferencing in higher education. Retrieved March 6, 2002, from http://www.atl.ualberta.ca/cmc

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Toronto: Harcourt Brace Jovanovich College Publishers.

Cronbach, L. (1971). Test validation. In R.L. Thorndike (Ed.), Educational measurement (2nd ed.). Washington, DC: American Council on Education.

Cronbach, L. (1990). Essentials of psychological testing (5th ed.). New York: Harper & Row.

Curtis, D., & Lawson, M. (2001). Exploring collaborative learning online. Journal of Asynchronous Learning Networks, 5(1). Retrieved March 6, 2002, from http://www.aln.org/alnweb/journal/Vol5_issue1/Curtis/curtis.htm

Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458-467.

Embretson, S. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179-197.

Ericsson, K., & Simon, H. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.

Fahy, P. (2001). Addressing some common problems in transcript analysis. International Review of Research in Open and Distance Learning, 1(2). Retrieved March 20, 2002, from http://www.irrodl.org/content/v1.2/research.html

Fahy, P. (2002a). Epistolary and expository interaction patterns in a computer conference transcript. Retrieved March 1, 2002, from http://cde.athabascau.ca/softeval/reportjde.pdf

Fahy, P. (2002b). Evaluating critical thinking in a com- […]