
Poetics 20 (1991) 559-575 North-Holland

559

The empirical study of literary reading: Methods of data collection *


Gerard J. Steen

Empirical research on literary reading is dependent on the systematic and controlled collection of data. In this article, techniques of data collection are discussed with reference to a sample of publications from the German journal SPIEL. In the current empirical study of literature, the most popular techniques by far are those collecting verbal data. Verbal data can be divided according to three criteria: their on-line or off-line nature, the degree of control a researcher has over their structure, and their written or spoken nature. Examples of each of these aspects are provided from the publications in SPIEL, and some connections are made with the reliability and validity of empirical research on literary reading.

1. Introduction

Literary reading is a slippery object of research. Traditionally it has not been studied as a phenomenon by itself. Instead it has been used in diverging ways as a tool for the investigation of literary texts. This long-standing line of research is not concerned with literary reading as an object for scientific investigation, but with the ascribed intentions, meanings, or effects of the literary text. Since the arrival of reception-theory, however, the literary reading process has been accorded much more attention of its own. Even then, ideas and approaches have differed widely (cf. e.g. Holub 1984). Indeed, there is a fundamental division in reception-research to be seen today, i.e. between empirical and other approaches to literary reading. It is true that the latter may be variously designated as analytical, hermeneutical, semiotic, and pragmatic, but they have one thing in common: they are not primarily concerned with the controlled testing of their claims.
Correspondence to: Gerard J. Steen, Dept. of General Literary Studies, Faculty of Letters, Vrije Universiteit, P.O. Box 7161, 1007 MC Amsterdam, The Netherlands.

* This article is an extensively rewritten version of the paper I presented at the 11th PALA Conference on Stylistics and the Empirical Study of Literature, which was held in Lancaster (September 1991). I am grateful for the reactions by the audience, particularly by Mick Short and Cees van Rees, who have helped me clarify the aim of the present paper. I am also highly indebted to Jan Hoeksma for his provocative comments and, at a later stage, for suggestions, so there will be fewer errors now for which I will have to take responsibility myself.

0304-422X/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

Why study literary reading in an empirical fashion? There are several answers that could be given to this query, but I will single out two. First, literary reading is one instance of reading in general, and it is desirable that the scientific agenda for research on reading includes such an important and specific variant of text-processing as literary reading. Research on the way people read is carried out by psychologists, of course, but psychologists have not manifested a spectacular interest in literary reading. Still, presuming that literary reading is a well-identifiable object of research, it would be interesting to see how it differs from other kinds of reading when it comes to actual processing strategies and their results. Literary reading thus has to be studied empirically because it is an interesting and important part of the range of observable reading behaviours. A beginning of this kind of interest in literary reading from the side of psychology is represented by Colin Martindale, Art Graesser, Sara Davis, Steen Larsen, Norbert Groeben, László Halász and others, all of whom have contributed to the Second International Conference for the Empirical Study of Literature (Ibsch et al. 1991). The second answer to the question why study literary reading empirically has to do with the quality of scientific knowledge. This aspect of the issue is highlighted when we reconsider the traditional haven of literary reading, i.e. literary studies. If empirical knowledge is defined as testable knowledge, then the presumption is that testable knowledge is preferable to other kinds of knowledge. And if this presumption is generally held in all scientific research, it has a funny position in literary studies.
On the one hand, many literary historians and critics formulate their claims in such a way that they can be tested in principle; moreover, literary scholars often argue with such enthusiasm about those claims that the presumption of a principle of testability is vividly dramatised. But on the other hand, when it comes to acknowledging this principle as a guideline for conducting research, most literary scholars retreat to less well-defined positions. There is a decided ambivalence in literary studies about the possibilities and use of attaining testability. There are many explanations for this situation in literary studies, but I will not go into them; a good discussion of most of the important issues can be found in Livingston (1988). Rather, in this article I will set out from the other end, and attempt to show that empirical research on literary reading is possible, interesting, and fruitful. Fortunately, there are many other researchers of literature who have gone empirical regarding the study of the literary reading process. At the above-mentioned conference, contributions were made by Andringa, Schram, Segers, Van Peer, and Zwaan, to name only the Dutch participants with a psychological orientation (Ibsch et al. 1991). In this attempt to illustrate the possibilities for empirical research on literary reading, I will concentrate on the aspect of data collection. This is

one of the cornerstones of empirical research. Different methods of data collection can be used to investigate diverging aspects of the reading process, and I would like to indicate which methods are currently used to which ends. That way, traditional literary scholars may find that some of their prejudices against measurement are based on only one or two ways of going empirical, other techniques of data collection possibly being more attractive to them than they thought. Psychologists of reading, by comparison, may be surprised by the possibilities for asking interesting questions about the literary reading process, and decide that literary reading offers a valuable ground for testing hypotheses about text-processing. In order to stay close to actual practice, I have adopted an empirical procedure. I have chosen to restrict myself to the methods for data collection used by the researchers who have published their work in SPIEL. This is a German-oriented journal which reflects the state of the empirical art on the European Continent. Contributions are in English and German, and have been made by empirical researchers of literature from all over the world. Although this sample of articles (23 in all, from 10 volumes) may contain some bias towards the Continental rather than the American side of empirical research on literature, I do not regard the distortion as very large. Where necessary, I will broaden my scope.

2. General considerations of data collection

Testability being the criterion for empirical research, first a few words have to be devoted to this important aspect of science. Testability is a characteristic of theories and hypotheses, or in short, of propositions. Although many claims in literary studies may be testable in principle, this does not imply that they are all of the same empirical value. Only those statements that have been subjected to a controlled procedure of testing and subsequent evaluation may be said to express empirically validated knowledge. This controlled procedure of testing and evaluation is known as the empirical cycle. Empirical research begins with problems and ends with propositions which may raise new problems for investigation. For instance, the presumption that literary reading itself is different from other kinds of reading may be subjected to empirical testing. A theory about the difference between literary and other kinds of reading will have to be put forward, in order to make clear which conception of literary reading the researcher is addressing. If the social-systems theory of Schmidt (1980) is adhered to, then literary reading will be defined in terms of text-processing that abides by the so-called aesthetic and polyvalence conventions. ¹ Hypotheses are then derived from
1 I do not have space to discuss these conventions here; I presume that their labels speak for themselves.

this theory about aspects of text-processing as guided by these two conventions; for example, it can be predicted that the text-representations resulting from literary text-processing manifest more aesthetic and polyvalent qualities than usual. In order to test this hypothesis, variables of behaviour will have to be defined and operationalised so that relevant data collection and analysis can follow. To continue our example, the difference between literary and journalistic text-processing may then be operationalised as the difference between the resulting text-representations in memory in terms of their aesthetic and polyvalent nature. An experimental design may then be constructed to the effect that the same text is presumed to be processed aesthetically and polyvalently when it is offered as literary to one group of readers, but that it is processed factually and monovalently when it is offered as journalistic to another group of readers. Next, a series of aesthetic and polyvalent interpretations and a series of factual and monovalent interpretations of the experimental text can be constructed and offered in mixed order as memory recognition items to the two groups of subjects, in order to collect recall data about the text read in the literary and journalistic reading conditions. If data analysis is able to show that the literary group selects more aesthetic and polyvalent interpretations of the text than the journalistic group, i.e. that their recall of the text is more aesthetic and polyvalent, then this is good evidence for the hypothesis that literary text-processing is guided by the aesthetic and polyvalence conventions as opposed to the factual and monovalence conventions prevailing in journalistic discourse. In fact, this example is a simplified version of an experiment carried out by Meutsch (1987).
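To make the logic of such a recognition test concrete, here is a minimal analysis sketch in Python. The figures are invented for illustration only (they are not Meutsch's data): each number is how many of ten recognition items a subject selected from the aesthetic/polyvalent series, and a hand-rolled Welch's t statistic compares the two presentation conditions.

```python
import math

# Hypothetical recognition scores: how many of 10 recognition items each
# subject selected from the aesthetic/polyvalent interpretation series.
# (Invented figures for illustration -- not Meutsch's (1987) data.)
literary_group = [7, 8, 6, 7, 9, 6, 8, 7]      # text presented as literary
journalistic_group = [4, 3, 5, 4, 2, 5, 3, 4]  # text presented as journalistic

def mean(xs):
    return sum(xs) / len(xs)

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

print(f"mean, literary condition:     {mean(literary_group):.2f}")
print(f"mean, journalistic condition: {mean(journalistic_group):.2f}")
print(f"Welch's t: {welch_t(literary_group, journalistic_group):.2f}")
```

A large positive t would count as evidence that the literary condition yields more aesthetic and polyvalent recall; an actual study would of course also compute a p-value and check the test's assumptions.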
The results of the analysis confirmed the hypothesis that there was a difference between literary and journalistic reading in the intended sense as an effect of mode of text-presentation and concomitant attitude of the reader. From this example, it may be concluded that empirical testing proceeds by measurement. Measurement is not the prerogative of the experimental researcher: it is part of all kinds of empirical research, ranging from observation to correlational and experimental investigation. These differences between kinds of research will be ignored here. Furthermore, measurement does not by necessity suggest quantification: it is a technical term for the controlled observation of the value of a variable. In less intimidating jargon, measurement is the classification of data into a number of distinct categories and classes. In the above example, measurement is simple and consists of assigning each observed case of a selected memory recognition item to the classes of aesthetic and factual, or polyvalent and monovalent interpretation. However, the success of measurement (and of testing) is highly dependent upon the operationalisation of the hypothesis or theory under investigation. Thus, although an operationalisation of reading in terms of memory for the text is relatively simple (but not unproblematic, cf. Meutsch 1987), casting the


theoretical difference between literary and journalistic processing in terms of determinate aesthetic and polyvalent interpretations as opposed to factual and monovalent ones is no easy matter. Let me devote some more attention to this aspect of empirical testing. Operationalisation may be circumscribed as follows. When a researcher aims to study a particular problem, he or she has to specify that problem in terms of observable phenomena. Operationalisation thus determines the process of data collection in that it directs the researcher to a particular domain in the world in which one may expect to find facts that can be measured and that are relevant to an evaluation of the hypothesis. Both the use of the term 'facts' and the method of going out into the world are anathema in present-day mainstream literary studies of the kind mentioned in the introduction. However, if the term 'facts' is disconnected from the unnecessary prejudice that it implies some form of observer-independence (cf. Ibsch 1989), and if the literary researcher takes seriously the idea that knowledge of the literary reading process has to be intersubjectively testable, then there is no alternative to the method of going out there in order to gather facts. The systematic and controlled collection of data on the reading process is one foundation for any empirical theory of reading.

3. Verbal and nonverbal data

Data are of various types, and they are collected to diverging ends through different techniques. The discussion below will be organised according to one crucial distinction between empirical data: verbal and nonverbal data. Verbal data are data in the form of linguistic expressions, and nonverbal data are data in the form of other phenomena. Examples of the former kind are thinking-out-loud (TOL) data by readers about texts; of the latter, reading-time measurements of the duration of readers' text-processing activities. The distinction between verbal and nonverbal data is fundamental for two reasons. First, verbal data can generally be elicited much more easily than nonverbal data, which by contrast depend upon the availability of a psychological laboratory and the expertise to handle sophisticated experimental equipment. However, there is a trade-off: many verbal data are difficult to analyse because of their complexity, while reading-time measurements are simple to average and compare. A second reason for the importance of the distinction between nonverbal and verbal data is that laboratory techniques, such as reading-time measurements, are excellent means for the study of reading in progress, whereas most verbal data, for instance in the form of questionnaires and interviews, are better suited to elicitation before or after the actual process of reading, and

hence are used to collect data about attitudes, preferences, knowledge, memories, and so on. Thus nonverbal methods, all of them being on-line methods of data collection, seem to have the advantage of providing a more direct access to the object of study over most of the verbal methods, with the conspicuous exception of thinking out loud (see section 4.2 below). However, there is a trade-off here, too. The task of pressing a button upon finishing a sentence may be suitable for the study of only a small number of aspects of the reading process, such as the cognitive energy spent while reading. Direct access to the object of study, the phrase used somewhat too uniformly just now for referring to literary reading, turns out to relate to a restricted range of specific phenomena when we deal with nonverbal methods. Other aspects of literary reading, such as the actual cognitive, emotive, or moral contents of a reading process, are accessible only to verbal methods such as thinking-out-loud. In sum, the two different methodology-types have contrary advantages and disadvantages: verbal data are easier to collect than nonverbal data, but they are more difficult to analyse; and with nonverbal data the researcher has on-line access to the reading process, but their relation to the reading process as a whole is extremely complex. Given this situation, it is perhaps not surprising that many empirical researchers of literature have tended to prefer verbal to nonverbal data. The reasons for this preference may now be easily identified, but they should all be taken with a degree of caution. First of all, verbal data seem easier to collect. However, it should not be overlooked that there are many preconditions to be fulfilled before such data collection is adequate, for instance, that all interview situations are identical between subjects. Another reason for preferring verbal data may be the fact that they do not require the object of study, i.e. the reading process, to be placed in an unnatural environment such as the psychological laboratory. It may be wondered, though, if the kinds of processes that are tapped by reading-time measurements can be negatively affected by such environmental influences. Finally, verbal data can be subjected to familiar methods of linguistic and textual analysis. The drawback here is that such data may be more complex, and hence less reliably analysable from a methodological point of view. The alleged advantages of collecting verbal data should hence at least be relativised somewhat when they are compared to nonverbal data. All the same, the difficulties for each of these three points apparently have seemed less important than their advantages to those empirical researchers who have had their original training in the arts departments. Indeed, my survey of the publications in SPIEL provides a dramatic illustration of the predominance of verbal data in empirical research on literary reading (see table 1). Only two publications out of a total of 23 make use of nonverbal data such as reading-time and response-time measurements, i.e. Oakhill (1989) and

Table 1
SPIEL-publications on literary reading using verbal and nonverbal data. (a)

Method
Nonverbal
  Reaction times / Reading times    2
Verbal
  Questionnaire                    15
  Thinking out loud                 5
  Interview                         6
  Group discussion                  7
  Other                             2
Total                              30

(a) The review of the Copenhagen studies of reader response by Dollerup et al. (1990) deserves special mention at this point. Unfortunately, it was not always clear from the summary how many studies had been conducted, and which methods were used. Therefore it is excluded from the figures. However, the whole gamut of verbal data is used by this group of researchers: personality tests, questionnaires, semantic differentials, interviews, reading out loud, thinking out loud, and so on.

László (1986). All of the other data are verbal data, collected by means of well-known social-scientific techniques such as questionnaires and interviews. (The fact that the total of the techniques outlined in table 1 exceeds 23 is explained by the use of more than one technique in a large number of publications.) To my knowledge, other nonverbal techniques than reaction-time measurements have not been used widely in the empirical study of literature. For instance, the recording of eye movements might be highly informative about the pattern of information-processing in reading small but condensed texts, such as poetry (cf. Just and Carpenter 1984). But so far, empirical researchers of literature have clearly tended to prefer verbal data.
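The claim above that reading-time data are simple to average and compare can be shown in a few lines of Python. The millisecond values and condition labels below are invented for illustration, not taken from Oakhill (1989) or László (1986).

```python
# Hypothetical per-sentence reading times in milliseconds, as a button-press
# paradigm might record them (invented values, not from any cited study).
reading_times = {
    "foregrounded sentences": [812, 905, 798, 960, 871],
    "background sentences":   [540, 610, 575, 588, 602],
}

# Averaging and comparing per condition is all the analysis requires:
for condition, times in reading_times.items():
    avg = sum(times) / len(times)
    print(f"{condition}: mean {avg:.0f} ms over {len(times)} sentences")
```

The contrast with verbal protocols is exactly the trade-off described in the text: the numbers are trivially analysable, but they speak to only one narrow aspect of the reading process.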

4. Verbal data

We will begin our detailed discussion of verbal data with a further division of the techniques that are currently in use. This division relates to the link between the moment of data collection and the reading process: does data collection take place before, during, or after reading? This is a common division in the psychology of reading. For example, the German psychologist and empirical researcher of literature Norbert Groeben has organised his overview of the field into two books: one pertaining to the reading process proper (Groeben 1982), and one pertaining to the pre-discourse stage of reading motivation and the post-discourse stage of effect (Groeben and Vorderer 1988). Similarly, in a handbook devoted exclusively to verbal data collection and analysis, which is not restricted to the study of reading, Huber

Table 2
Perspectives upon verbalisations and degree of structuredness. (From Huber and Mandl 1982: 23; my tr., GS)

Pre-actional
  - Questionnaire
  - Rating scales
  - Structured interview
  - Listing of thoughts
  - Focused interview
  - Narrative interview
  - Reflection in natural context: group discussion, planning, pers. discussion, diary entry

Peri-actional
  - TOL (a) upon interruption by researcher
  - Sampling of thoughts with concurrent TOL
  - TOL upon specific contents
  - Reflection in natural context: group discussion, note-taking, self-discussion

Post-actional
  - Questionnaire
  - Rating scales
  - Structured interview
  - Retrospective TOL
  - Listing of thoughts stimulated with recall
  - Focused interview
  - Narrative interview
  - Reflection in natural context: group discussion, report, pers. discussion, diary entry

(a) TOL stands for Thinking Out Loud.
and Mandl (1982: 23) present a systematic overview of verbal data techniques that are classified according to this criterion (see table 2). The three columns in table 2 indicate that data collection takes place before, during, or after the process under investigation. The vertical ordering of the techniques from top to bottom suggests another criterion for classification, i.e. the decreasing degree of structuredness of the verbal data collected; we will turn to this aspect in a moment. Note that, according to this overview, the only reasonably controlled instrument for collecting on-line verbal data is thinking-out-loud. I will argue below that this is incorrect for the study of reading. Having inspected the publications in SPIEL, I arrived at the conclusion that the differences between verbal techniques for data collection may be discussed with more insight through a variant of Huber and Mandl's (1982) parameter 'degree of structuredness of verbal data'. For degree of structuredness may also be regarded in terms of the degree of control that the researcher has over the kinds of verbal data to be collected. My alternative division of the techniques presented in table 2 is to be based on three levels of control: maximal control, medium control, and minimal control. Table 3 shows the effect of using this criterion upon the classification of the techniques for data collection. Assignment of the techniques to each of these levels will be motivated in the following subsections.

Table 3
Written and spoken verbal data and researcher control.

Maximal control
  Written: Questionnaire - rank ordering, rating, closed questions, calibrated test
  Spoken: Reading out loud; closed-question answering

Medium control
  Written: Questionnaire - underlining, cloze procedure, other completion tasks

Minimal control
  Written: Questionnaire - open questions, free writing
  Spoken: Thinking Out Loud - cued, free; Interview - structured, semi-/unstructured; Group discussion

As can be seen, an additional distinction is made between the spoken and written nature of the verbal data thus classified. To neglect this feature of verbal data would be highly inappropriate. I will come back to it in section 4.2. The taxonomy presented in table 3 will now be used in the more detailed discussion of verbal techniques in sections 4.1 to 4.3.

4.1. Maximally controlled verbal data

I will devote this section to a consideration of the written data only; spoken data will be included in section 4.2. The highest degree of control is achieved by providing the subject with verbal data constructed by the researcher: questionnaires with forced answers in the form of multiple choice items are the prototypical example of this kind of technique. The obvious advantage of such a procedure is the easy comparability among subjects over all items, and over the way they are responded to. Moreover, quantifiability is unproblematic. However, the obvious disadvantage is the off-line nature of the procedure: it can only tap motivation or effect, in Groeben's terms, but not the process of literary reading itself. In addition, there is no way the subject can influence the nature of the data to his or her own liking. This implies that much of the validity of this kind of testing depends upon the construction of the questionnaire's items.
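The unproblematic quantifiability of forced-answer questionnaires can be illustrated with a small Python sketch: because every subject responds to the same closed items, comparison across subjects reduces to counting. The response letters below are invented for illustration.

```python
from collections import Counter

# Hypothetical forced-choice answers: one option (a-d) per subject per item.
# Each row is one subject's answer sheet (invented data for illustration).
responses = [
    ["a", "c", "b"],
    ["a", "d", "b"],
    ["b", "c", "b"],
    ["a", "c", "a"],
]

# Every subject answered the same closed items, so tabulation is trivial:
for item_index in range(len(responses[0])):
    counts = Counter(row[item_index] for row in responses)
    print(f"item {item_index + 1}: {dict(sorted(counts.items()))}")
```

Open questions or free writing, by contrast, would first have to pass through content analysis before any such tabulation became possible.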

A specific instance of the highly controlled questionnaire is the standardised test, such as the intelligence test. However, this kind of test has not been used very often in the empirical study of literature. There are only three cases in the SPIEL-publications: tests of reading-ability were used by Oakhill (1989); Willenberg and Lange (1989) applied part of an intelligence test; and Schram and Van Wieringen (1986) made use of a personality test determining authoritarian personality traits. The results on these tests were subsequently used to relate reader characteristics to performance on particular reading tasks. For instance, Schram and Van Wieringen found that readers with a higher score on the authoritarian personality scale manifested a greater aversion to stories depicting a homosexual relationship than readers with a lower score on this scale; such readers also find these stories less important and less eligible for distribution. A more popular form of the questionnaire than the standardised test is the rating scale: subjects are offered materials such as texts or metaphors for assessment according to some predetermined, verbally expressed characteristic on a numerical scale. For example, texts can be judged as to their relative degree of novelty, interest, beauty, and so on. This technique can be used to compare judgments of different groups of readers. More often, however, it is used to study the literary text. This issue lies outside the scope of the present article, which concentrates on data collection regarding literary reading. Rating scales may stand on their own or be part of more encompassing questionnaires also containing, for example, closed and open questions. A well-known instance of the relatively self-contained use of rating scales is the Semantic Differential Technique (cf. Snider and Osgood 1972). The advantage of this kind of scaling over multiple-choice questionnaires is that one receives ordinal data rather than nominal data,
which facilitates the use of stronger statistical tests. The same rationale lies at the basis of the rank ordering tasks used by Andringa (1984) and Van Peer (1987). To illustrate from the latter, readers were offered a list of phrases and lines from four poems for ranking according to their degree of importance. This way, verbal data concerning the importance of text passages were collected by using the predetermined but verbal criterion of placing one line above another according to its estimated degree of importance. Rank-ordering of these importance scores was then compared with the rank-order of the same passages according to their degree of foregroundedness as analysed independently by the researcher. It turned out that there was a high degree of correlation, showing that the theoretical criterion of foregrounding was related to the readers' experience of textual importance. Note, however, that strictly speaking we are dealing with an aspect of the text, not of reading here (see above). Rank-ordering of texts as a means to compare differences between A-level and undergraduate students was used by Andringa in a much more complicated design.
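A rank-ordering comparison of this kind is typically evaluated with Spearman's rank correlation coefficient, which is one of the "stronger statistical tests" that ordinal data make available. The sketch below computes it with the standard formula for untied ranks; the two rankings are invented for illustration, not Van Peer's (1987) results.

```python
# Hypothetical ranks for six text passages: the readers' aggregated importance
# ranking versus the researcher's independent foregrounding ranking.
# (Invented rankings in the spirit of Van Peer (1987), not his data.)
reader_rank     = [1, 2, 3, 4, 5, 6]
foreground_rank = [2, 1, 3, 5, 4, 6]

def spearman_rho(r1, r2):
    """Spearman's rank correlation for two equal-length rankings without ties."""
    n = len(r1)
    d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

print(f"rho = {spearman_rho(reader_rank, foreground_rank):.3f}")
```

A rho close to 1 would indicate that readers' importance judgments closely track the analyst's foregrounding ranking; with tied ranks or larger designs, a statistics library routine would be used instead of this textbook formula.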

4.2. Minimally controlled verbal data

In order to contrast as sharply as possible the criteria of my classification, I will now first deal with the verbal data exhibiting the lowest degree of researcher control. For written data, this is exhibited by the group of techniques including questionnaires with open questions and free writing. This kind of written data requires the subsequent application of what is usually called content analysis, because the data are not pre-packaged into pertinent units of analysis, as in the previous cases. There is more to content analysis than can be done justice to here, but Van Assche (1991) offers a useful introduction. Returning to data collection rather than data analysis, it is important to realise that the issue about high versus low control over written data is not new. An important aspect in this connection is their presumed relation to the features of reliability and validity. Schon (1990) makes an insightful remark about this point:
    The debate regarding open or closed questions is long-standing [...]. In truth, however, the debate has not been one of methodology, but of diverging knowledge interests. Closed (quantitative, objective) and open (qualitative) procedures are both legitimate in their own right for each question. For a long time the methodological problem has not been the competition between the two, but the extent to which closed procedures achieve their often impressive reliability at the cost of their validity; it also is and will be the debate concerning their integration. (Schon 1990: 237; my tr., GS)

Apparently researcher control over verbal data has been felt to involve a trade-off between data analysability and manipulability on the one hand, and natural richness and validity on the other. The more control one has over the structure of the verbal data, e.g. in the form of closed questionnaires, the more reliably these data can be analysed, or so the argument goes. Simultaneously it is suspected that the degree of natural richness of the data will be decreased, and that this is directly connected to the alleged lower validity of one's conclusions. As is rightly pointed out by Schon, though, this opposition is misleading. It can be discussed better when we include the role of spoken data. Typically, spoken data are characterised by the total control of the test-subject rather than the researcher over the nature of the verbal data delivered: with the exception of two instances, ² all SPIEL-studies using
2 The first is Oakhill (1989), who used a spoken instead of a written form of posing closed questions to subjects; this change has to do with the age of her subjects, who were 8-year-olds participating in comprehension tests. The second is Willenberg and Lange (1989), who had their subjects read out loud the stimulus text in order to record reading errors and relate them to performance on subsequent comprehension tasks. Neither of these procedures is very typical of present-day techniques of data collection in the empirical study of literary reading.

spoken data belong in the low-control category of our taxonomy. Even though thinking-out-loud tasks and interviews may be focused, cued, or structured, it still is the reader's activity of verbalisation which defines the clarity, aptness, variety, and extent of the response. As a result, this group of techniques may indeed yield a store of rich and natural data. Conclusions drawn from these studies may have a relatively high validity in one sense of the term. A typical example of this position is found in Andringa (1990). She presents an encompassing proposal for TOL-protocol analysis, which is explicitly motivated by honouring the complexity of the object of research through qualitative research, which is therefore tacitly regarded as at least as valid as quantitative piecemeal research: 'When we think of the age-long reflections on literary functions and meanings, and of the innumerable text and reader variables which play a role in literary reading, we are confronted with an overwhelming complexity and variety, which seems to be hardly appropriate for strictly quantitative designs' (Andringa 1990: 232). I believe, however, that this suggestion is misleading. ³ The point is that validity does not only pertain to imitating the object in all its complexity. In testing, validity refers to the question whether measurement indeed captures what it purports to measure. That way, closed questions can be just as valid as interviews, as long as it is clear what the object of investigation is. Thus, it may indeed be a matter of research questions rather than methodological procedures whether low or high control data facilitate the drawing of valid conclusions: for the question is, conclusions about what?
Note that my argument is supported willy-nilly by the sequel to the passage quoted from Andringa (1990: 232) above, for she shifts the argument from the (im)possibility of applying quantitative methods to literary reading to the desirability of investigating literary reading in full: 'Although it might seem possible to isolate and operationalize certain aspects, it seems at least desirable to study the potential richness and variety of reading processes in a manner as open and natural as possible.' Thus the issue turns into one of priorities for research, something which I will not go into in this article. All the same, we cannot ignore the other important methodological question, namely the one about data analysis, raised before. For valid conclusions do depend upon prior reliable analysis, and this is where low-control verbal data are difficult to handle.4 In my opinion, part of the problem experienced with the reliable analysis of low-control spoken data lies especially in the felt need to analyse all of the data exhaustively. The

3 Moreover, such an argument would invalidate all quantitative research on human psychology, which cannot be the goal of the author, because she does rely on psychological models of reading that are based on quantitative research, such as Van Dijk and Kintsch (1983). For an illuminating discussion of some of the fallacies causing the qualitative-quantitative divide, see Hammersley (1992).


obvious alternative is to restrict the analytical use of these low-control techniques for verbal data collection to just one purpose. Reliability may be improved when subjects think out loud, or are interviewed, with an eye to investigating only one aspect of reading at a time. Moreover, when subjects are given a whole series of small reading tests, reading a number of short texts rather than one long text, control may increase considerably. For example, think of offering a series of quatrains that are varied in one crucial aspect, say metrical structure: it would be interesting and easy to observe how many readers verbalised that they noticed the difference between the quatrains having structure X and those having structure Y, both in thinking out loud and in interviews. Thus hypotheses about the varying role of metrical structure in poems could be tested in a reliable, quantitative, and valid manner.

4.3. Medium-control verbal data

The last group of techniques that may be distinguished on the axis of researcher control over the nature of the written data collected is not discussed by Huber and Mandl (1982). This is a group of techniques that may be characterised as standing midway between the high-control group, in which all of the verbal data were predetermined by the researcher, and the low-control group, in which it is the subject who has control over the nature of the data collected. In the group in between, the nature of the verbal data is partly determined by the researcher, and partly by the subject. I will briefly consider these techniques now. It is easiest to begin with the prototypical example of this group, the so-called cloze-procedure. Unfortunately, it is not represented in our sample of empirical publications in SPIEL. There are, however, at least two empirical researchers of literature who have recently made use of this technique in book-length studies, i.e.
Van Peer (1986) and Hoffstaedter (1986). In a cloze-procedure, readers receive a text in which every fifth word has been deleted. The entire content of a text may be covered when five conditions are used, in which alternating series of fifth words are deleted. The range of insertions produced by a group of readers can then be compared with the original words of the text for various purposes. For instance, the degree of comprehensibility and surprisingness of a text may be ascertained with this method; but it is also possible to compare different groups of readers as to the range and nature of their responses.
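The deletion scheme just described is easy to make concrete. The following sketch (the function name and the underscore blank marker are my own illustrative choices, not taken from Van Peer or Hoffstaedter) generates the five cloze conditions in which alternating series of fifth words are deleted:

```python
def cloze_versions(text, n=5):
    """Generate n cloze versions of a text: version k blanks out words
    k, k+n, k+2n, ..., so that across all n versions every word of the
    text is deleted exactly once."""
    words = text.split()
    versions = []
    for offset in range(n):
        blanked = [
            "_____" if i % n == offset else w
            for i, w in enumerate(words)
        ]
        versions.append(" ".join(blanked))
    return versions

versions = cloze_versions("the quick brown fox jumps over the lazy dog again")
print(versions[0])
# _____ quick brown fox jumps _____ the lazy dog again
```

Each subject then receives only one of the five versions, so that across the five conditions every word of the text is deleted for some group of readers.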
4 Moreover, while think-aloud data are provided by one speaker, interviews involve face-to-face interaction, and group discussions multiple interaction: these are severe complications for analysis, because they add layers of social interaction to the ordinary literary reading process.

There are other completion tasks belonging in this category, but they are less controlled in that they do not require the insertion of just one word. Van Dijk and Kintsch (1983: 172ff.), for instance, presented subjects with constructed text-fragments of varying lengths, and added an incomplete new sentence beginning with the personal pronoun he or she. Thus they were able to investigate which factors determined the readers' choice of the antecedent in their verbal completion of the incomplete sentence by manipulating previous passages. That this type of task is less under the control of the researcher is shown by the fact that some of the responses had to be discarded because they were uninterpretable or ambiguous. However, this involved less than 5% of the data. To my knowledge, this medium-control method has not been used for literary studies. A third technique belonging in this midway category, which is represented in the SPIEL-publications, is the underlining task. The reason that this technique comes in the medium-control category is that, although all of the verbal stimuli are provided by the researcher, it is the act of selection by the reader which determines which parts of the stimuli end up in the pool of data to be analysed. The only example in SPIEL of the utilization of underlining is Van Peer's (1987) study of foregrounding mentioned before. By requesting readers to underline passages they found particularly striking, Van Peer was able to test his analysis of poetic language as foregrounded or not against his informants. The advantage of the midway group of techniques is that they come much closer to the process of reading than a questionnaire can. However, they do not come as close to the undisturbed reading process as the on-line laboratory techniques discussed before. For this, the degree of conscious and directed activity by the reader is too high. For instance, although the underlining tasks referred to above do show that readers can pick out certain types of language as important upon request, this does not mean that they do so spontaneously, all the time, and with the same degree of awareness. In other words, underlining tasks do not tap literary text-processing in the automated sense, although they do provide a measure of literary reading in a communicative sense. It should also be observed that this group of methods does provide a medium-controlled alternative to thinking out loud as the allegedly only verbal method with access to on-line processing.
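By way of illustration of the kind of count an underlining task yields, the following sketch tallies, for each line of a poem, how many readers underlined it as striking, and compares the underlining rates for lines an analyst predicted to be foregrounded against the rest. All data here (line numbers, reader ids, predictions) are invented for illustration and are not Van Peer's:

```python
# Line number -> set of reader ids who underlined that line (invented data).
underlinings = {
    1: {"r1", "r2", "r3", "r4"},
    2: {"r2"},
    3: {"r1", "r3", "r4", "r5"},
    4: set(),
}
predicted_foregrounded = {1, 3}   # lines the analyst marked as foregrounded
n_readers = 5

# Per-line underlining rate, labelled by the analyst's prediction.
for line in sorted(underlinings):
    share = len(underlinings[line]) / n_readers
    marked = "foregrounded" if line in predicted_foregrounded else "background"
    print(f"line {line}: {share:.0%} underlined ({marked})")

# A simple summary statistic: mean underlining rate for predicted
# foregrounded vs. background lines.
fg = [len(underlinings[l]) / n_readers for l in predicted_foregrounded]
bg = [len(underlinings[l]) / n_readers for l in underlinings
      if l not in predicted_foregrounded]
print(sum(fg) / len(fg), sum(bg) / len(bg))
```

A clear gap between the two means would support the foregrounding analysis; with real data one would of course also test the difference statistically.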

5. Verbal data in SPIEL

The distribution of all of these techniques in the SPIEL-sample is shown in table 4. It should be noted that four out of the six publications marked for open questions used closed questions as well. Moreover, the fact that there are no rating scale publications is meant to indicate that there are no


independent rating scale investigations. At least two of the general questionnaire studies did include rating scales of some kind, however. (The mismatch between the totals of table 1 and table 4 is explained by my splitting up of the category of the questionnaire into a number of subcategories, which on a number of occasions were used as data collection techniques beside each other in the same publications.) The following trends may be observed. The spoken, low-control technique of thinking out loud is predominantly used for on-line reading research. Medium-control, written alternatives to this technique are the cloze-procedure and the underlining task. However, there is no high-control technique for verbal data collection during reading, unless it would be the reading-out-loud task, which could be used for analysing intonation, reading mistakes, and so on. Turning to the techniques used for pre- and post-process data collection, it may be noted that questionnaires and interviews are most popular here. Questionnaires are suited to more controlled forms of data collection, which is an advantage for precise and quantified testing, while interviews are typically apt for collecting data more fully in control of the reader, providing the researcher with rich data for exploration. However, exploratory research may also be conducted with the help of questionnaires posing open questions and requesting free writing on the part of the subjects, and hypothesis-testing

Table 4
SPIEL-publications reporting empirical research utilizing different kinds of verbal data.

Control   Written                             Spoken
Max       Rank ordering              2        Reading out loud                2
          Rating scales              0        Closed questions                1
          Closed questions           5
          Calibrated test            3
          Total                     10

Med       Underlining                1
          Cloze procedure            0
          Other completion task      0

Min       Open question              6        Cued TOL                        1
          Free writing               0        Free TOL                        4
                                              Struct. interview               1
                                              Semistruct./unstr. interview    5
                                              Group discussion                2
                                              Total                          13

can be executed without difficulty by means of low-controlled verbal data as long as they are used selectively.
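Since the reliability of low-control verbal data ultimately comes down to whether independent coders agree on the analysis of the protocols, it may help to recall how such agreement is commonly quantified. The sketch below (the category labels and codings are invented for illustration, not taken from any of the studies discussed) computes Cohen's kappa, a standard chance-corrected agreement measure, for two coders who each assigned one category to every protocol segment:

```python
def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders who assigned one
    category to each protocol segment."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    # Observed proportion of segments on which the coders agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, given each coder's category frequencies.
    cats = set(codes_a) | set(codes_b)
    expected = sum(
        (codes_a.count(c) / n) * (codes_b.count(c) / n) for c in cats
    )
    return (observed - expected) / (1 - expected)

a = ["meta", "para", "meta", "eval", "para", "meta"]
b = ["meta", "para", "eval", "eval", "para", "meta"]
print(round(cohens_kappa(a, b), 2))  # prints 0.75
```

Values near 1 indicate near-perfect agreement; by convention, values around .80 are often taken as satisfactory for protocol coding.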

6. Conclusion

The empirical study of literary reading requires the controlled and systematic collection of data in order to facilitate hypothesis testing. This paper has discussed a number of different methods for data collection that are current in psychologically oriented research on literary reading. It appears that those methods collecting nonverbal data are not very popular in the empirical study of literary reading. The explanation for this state of affairs is probably to be found in the complex relation of this kind of data to the reading process, and in the institution-based preference for verbal data in the arts departments. To what extent nonverbal data are inherently unsuitable for the study of the literary reading process remains a moot point. Verbal data may be used for a wide range of purposes in the empirical study of literary reading. Their use depends on the degree of reliability and validity researchers aim for at varying stages of research. It is no general truth that there is an increase of reliability when researchers have more control over the verbal data collected and analysed in research: thinking out loud and interviews may be used to collect or study data on particular aspects of reading which admit of highly reliable analysis. But when data are collected for exploratory purposes, it may be that validity is served by using low-control verbal data. The choice of a data collection technique thus depends on the goal of the empirical study.

References

Andringa, E., 1984. Wandel der literarischen Identifikation. Eine experimentelle Untersuchung. SPIEL 3, 27-66.
Andringa, E., 1990. Verbal data on literary understanding: A proposal for protocol analysis on two levels. Poetics 19, 231-257.
Dollerup, C., I. Reventlow and C. Rosenberg Hansen, 1990. The Copenhagen studies in reader response. SPIEL 9, 413-436.
Groeben, N., 1982. Leserpsychologie: Textverständnis - Textverständlichkeit. Münster: Aschendorff.
Groeben, N. and P. Vorderer, 1988. Leserpsychologie: Lesemotivation - Lektürewirkung. Münster: Aschendorff.
Hammersley, M., 1992. What's wrong with ethnography? Methodological explorations. London/New York: Routledge.
Hoffstaedter, P., 1986. Poetizität aus der Sicht des Lesers. Hamburg: Buske.
Holub, R.C., 1984. Reception theory: A critical introduction. London/New York: Methuen.


Huber, G.L. and H. Mandl, 1982. Verbalisationsmethoden zur Erfassung von Kognitionen im Handlungszusammenhang. In: G.L. Huber and H. Mandl (eds.), Verbale Daten. Eine Einführung in die Grundlagen und Methoden der Erhebung und Auswertung. Weinheim/Basel: Beltz.
Ibsch, E., 1989. Facts in the empirical study of literature: The United States and Germany - A comparison. Poetics 18, 389-404.
Ibsch, E., D. Schram and G. Steen (eds.), 1991. Empirical studies of literature: Proceedings of the Second IGEL-Conference, Amsterdam 1989. Amsterdam: Rodopi.
Just, M.A. and P.A. Carpenter, 1984. Using eye fixations to study reading comprehension. In: D.E. Kieras and M.A. Just (eds.), New methods in reading comprehension research. Hillsdale, NJ: Lawrence Erlbaum.
László, J., 1986. Same story with different point of view. SPIEL 5, 1-22.
Livingston, P., 1988. Literary knowledge: Humanistic inquiry and the philosophy of science. Ithaca, NY: Cornell University Press.
Meutsch, D., 1987. Literatur verstehen. Eine empirische Studie. Braunschweig/Wiesbaden: Vieweg.
Oakhill, J., 1989. Children's comprehension difficulties: The nature of the deficit. SPIEL 8, 25-48.
Schön, E., 1990. Die Entwicklung literarischer Rezeptionskompetenz. Ergebnisse einer Untersuchung zum Lesen bei Kindern und Jugendlichen. SPIEL 9, 229-276.
Schmidt, S.J., 1980. Grundriß der empirischen Literaturwissenschaft. Braunschweig/Wiesbaden: Vieweg.
Schram, D.H. and P.C.W. van Wieringen, 1986. Die dogmatische Person und die Rezeption normwidriger Information in als literarisch und als nicht-literarisch präsentierten Texten. SPIEL 5, 49-82.
Snider, J.G. and C.E. Osgood (eds.), 1972. Semantic differential technique: A sourcebook. Chicago, IL: Aldine Publishing Company.
Van Assche, A., 1991. Content analysis and experimental methods in literary study: Scientific twins or opponents? In: E. Ibsch, D. Schram and G. Steen (eds.), Empirical studies of literature: Proceedings of the Second IGEL-Conference, Amsterdam 1989. Amsterdam: Rodopi.
Van Dijk, T.A. and W. Kintsch, 1983. Strategies of discourse comprehension. New York: Academic Press.
Van Peer, W., 1986. Stylistics and psychology: Investigations of foregrounding. London: Croom Helm.
Van Peer, W., 1987. Empirical studies and their relationship to the theory of literature. SPIEL 6, 146-162.
Willenberg, H. and B. Lange, 1989. Verstehensfortschritte von Schülern nach einem Jahr. Versuch einer neuropsychologisch begründeten Lerntheorie. SPIEL 8, 49-76.
