Common Flaws in Research

Common flaws 1
Running head: COMMON FLAWS
Scientific Research in Education:
Common Flaws
John Koetsier
University of British Columbia

Abstract
Research studies are difficult to do right and easy to do wrong. There are
many potholes to avoid, and many factors can impact a study’s validity and
reliability. To find and understand some of the common problems, I’m going
to look at three different types of studies, see what the researchers did, how
they did it, and what problems they encountered. The studies are Beck and
Fetherston’s The effects of incorporating a word processor into a year three
writing program (2003), Miller, Schweingruber, and Brandenburg’s Middle School
Students’ Technology Practices and Preferences: Re-examining Gender
Differences (2001), and Hayse’s A comparison of fifth graders’ frequency
using web-based activities versus traditional activities for self-directed
enrichment (2003).

In the first study, Natalie Beck and Tony Fetherston examined the effects of
teaching writing with a word processor in primary grades. For six weeks, they
studied both how students felt about using word processing technology
versus paper and pencil, and what effects the technology had on the quality of
their writing. They concluded that students who used word
processors wrote significantly better than students using pencil and paper.
Unfortunately, the quality of the study was severely
undermined by several design and procedural decisions. Together those flaws
cause it to miss the standard for research that is generalizable to other
settings and can be counted upon when creating programs and curricula.
In brief, the problems with the study include a very small sample size
(only seven students), which basically eliminates any opportunity for external
validity. The sample cannot possibly be representative enough. And, not
that it matters much with such a small sample, the researchers used
convenience sampling rather than random sampling.
In addition, the short six-week study ensured that researchers could not
compensate for the effects of novelty: any new technique employed in an
educational setting might result in a temporary bump in performance as the
sheer newness galvanizes student attention and effort. Oddly, in what must
be a rare problem for a study with a novelty issue, maturation was also a
problem, since the students had apparently used the word processing software
before the study began.
Finally, and perhaps most importantly, the design was pre-experimental.
There was no control group receiving a placebo or an equal but different
treatment. The sample group essentially was its own control group.
In the second study, Miller, Schweingruber, and Brandenburg looked at
middle school students’ use of technology in America, specifically at
male/female differences. They administered a 512-question survey to
students in Texas middle schools, and used the results to argue that
historical differences are disappearing as technology, particularly the web,
becomes more prevalent.
The conclusions are valid and supported by subsequent research, but the
methodology (particularly the sampling) could have been significantly
improved. Therefore, the study is not as generalizable as it could have been,
and follow-up research was required.
Problems included a significantly skewed urban/suburban mix that is heavily
weighted in favor of urban students and against rural students, who were
entirely excluded. In addition, ethnicity was a factor that was not addressed
at all in the study, even though the schools from which students were drawn
were in a city and state that over-represented certain racial groups. A final
sampling complication was the fact that the schools from which subjects were
drawn were significantly undersized compared to the average middle school.
In addition to sampling concerns, the authors’ assertion that the web is
predominantly responsible for male/female technology preferences becoming
more similar is problematic, as there are many potentially confounding
variables. Finally, high mortality in the course of the study due to data
collection problems adds yet another question mark.
In the third study, teacher Karen Hayse engaged in action research to guide
a school district’s recommended practice with regard to using web-based
versus traditional enrichment resources. Working with a single class of fifth
graders over a period of 10 weeks, Hayse introduced 15 web resources and
15 traditional resources as activities that students could explore and use
during non-graded personal enrichment time every third school day.
Students self-reported which resources they used.
Hayse discovered that students preferred web resources to traditional
resources most of the time, with web resources being the clear leader
initially, trailing off halfway through the 10 weeks, and then regaining
popularity in the final few weeks. Hayse also noticed, anecdotally, that giving
students a choice between different types of enrichment activities seemed to
result in students choosing to engage in enrichment more often, regardless
of which type they chose.

There are a number of concerns with this study, starting with sampling.
Specifically, Hayse has apparently used convenience sampling, probably with
her own class. Clearly, there are no guarantees of representativeness.
Another is a lack of pre-testing. It would be important to know whether
before the study started there was already a student preference for
technology and web-based resources, and a simple survey could have
provided helpful insight when interpreting the study data.
Another concern I have is that there may have been a novelty effect:
students who had previously only been exposed to traditional enrichment
activities in the classroom may have chosen web activities simply due to their
newness. A longer study would have reduced any novelty effects that might
be operating.
Hayse mentions that she has controlled for a number of factors, including
types of activities and opportunities to work with peers, but I wonder if the
web resources were as potentially social or perceived as potentially social by
the students as the traditional resources. The spike in traditional
resource usage came after a student asked friends to play a trivia game; was
a similar thing possible with the web resources? It’s difficult to say without
being able to examine the actual websites.
A number of other questions suggest themselves: while “neither or both”
were options, as students could choose to use any combination of resources
including no resources, they do not show up in the data in Table 1. It seems
unlikely that over 10 weeks these options were never chosen. Also, students
self-reported use of activities at the end of the day: this may not be the most
accurate method of collecting data. And finally, while not necessarily to be
expected in action research, it would still be ideal to have a better study
design than pre-experimental.
Looking over these three studies, it seems clear that sampling is an
enormous challenge and frequent source of external validity concerns. Each
of the studies had, to varying degrees, sampling problems. This probably
shouldn’t be too much of a concern, as finding subjects and convincing them
to participate in studies is difficult, time-consuming, and potentially
expensive. However, it is worth researchers’ while to expend considerable
time and effort on this specific facet of their studies, since without a good
random or appropriately stratified sample, results are not generalizable
anyway. In other words, garbage in, garbage out.
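To make the idea of an "appropriately stratified" sample concrete, here is a minimal sketch in Python. It is only an illustration: the roster, the stratum labels (urban, suburban, rural), and the 20% sampling fraction are all hypothetical and are not taken from any of the three studies. The function draws the same fraction at random from every stratum, so that no group is left out the way rural students were in the second study.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Randomly draw the same fraction of members from every stratum."""
    rng = random.Random(seed)

    # Group the population by stratum (for example, school setting).
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    # Sample without replacement inside each stratum, at least one per stratum.
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical roster: 60 urban, 30 suburban, and 10 rural students.
roster = (
    [("urban", i) for i in range(60)]
    + [("suburban", i) for i in range(30)]
    + [("rural", i) for i in range(10)]
)

sample = stratified_sample(roster, stratum_of=lambda s: s[0], fraction=0.2, seed=1)
```

With these made-up numbers the sample keeps the 6:3:1 proportions of the roster, whereas a convenience sample drawn from a single classroom offers no such guarantee.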
Secondly, an appropriate degree of control over the variables in the study is
critical. Knowing that students had used the particular type of word
processing software that they were testing should have impelled Beck and
Fetherston to find other subjects. And while I can’t prove it without access to
the resources that Hayse used, I suspect that while the web and traditional
resources were equivalent on the surface, and in terms of topic, they may not
have been equivalent in terms of presentation and use by students.

In conclusion, academic research is a difficult process to do well, and pitfalls
exist at every stage. These three studies illuminate some of the common
issues, and provide insight for researchers about what to avoid and minimize
in study design in order to maximize internal and external validity.

References
Beck, N., & Fetherston, T. (2003). The effects of incorporating a word
processor into a year three writing program. Information Technology
in Childhood Education Annual, 2003, 139-161.
Hayse, K. (2003). A comparison of fifth graders’ frequency using web-based
activities versus traditional activities for self-directed enrichment.
Retrieved March 5, 2008, from
http://www.smsd.org/custom/curriculum/ActionResearch2003/Hayse.htm
Miller, L. M., Schweingruber, H., & Brandenburg, C. L. (2001). Middle school
students’ technology practices and preferences: Re-examining gender
differences. Journal of Educational Multimedia & Hypermedia, 10(2),
125-140.
