
Evaluating the Efficacy of Clicker-Based Peer Instruction Across Multiple Courses

at a Single Institution
Jack Kinne, Eric Misner, Adam S. Carter, and Sharon M. Tuttle
{drk10, em1909, adam.carter, sharon.tuttle}@humboldt.edu
Humboldt State University

ABSTRACT
Peer Instruction (PI) is an evidence-based interactive teaching method, popularized by Harvard physics professor Eric Mazur, that is seeing increased adoption in computer science education. This paper examines the efficacy of PI at a
medium-sized university whose primary focus is on undergraduate education. By
employing instruments used in prior studies by other researchers, we are able to situate
our findings within the broader research space. While our data is not as overwhelmingly positive as that reported in prior work, we still find that a large majority (83%) of
students find value in PI. Furthermore, we explore the impact that instructor experience,
grading questions on correctness, and the re-polling of questions have on student
satisfaction.

INTRODUCTION
PI has seen use in classrooms across the country and is gaining popularity with Computer Science educators [4, 6, 7]. Studies have shown that students value PI and have more learning opportunities compared with a traditional lecture experience [2, 7, 8]. PI has also been shown to be associated with lower failure rates and increased retention within the major, an area with which the discipline of Computer Science has historically struggled [7, 8]. Indeed, the US Education Department's National Center for Education Statistics reports that 69% of bachelor's degree candidates in a STEM field left STEM during their course of study [1].
While past results find PI to be both effective and popular among students, a lack of standardized evaluation makes it difficult to consider PI more broadly in the context of computing education and to draw general conclusions. To this end, this paper is a partial replication of a SIGCSE Best Paper awardee [5]. As was the case in Porter et al. [5], we find PI to be a generally successful pedagogical tool. In addition, this paper examines the impact of re-polling, which we find increases students' perceived usefulness of PI and decreases the perceived difficulty of clicker questions.

BACKGROUND & RELATED WORK


PI is characterized by posing conceptual questions during a classroom lecture and giving students time to think about each question individually. Once students commit to an answer, they have the opportunity to discuss their reasoning in small groups and to re-answer based on their new understanding. This immediate feedback on student comprehension allows the instructor either to cover the material again, highlighting new points, or to move forward.

© CCSC, (2018). This is the author's version of the work. It is posted here by permission
of CCSC for your personal use. Not for redistribution. The definitive version was
published in The Journal of Computing Sciences in Colleges, 34(1), October 2018,
http://dl.acm.org/.
This paper is a replication and expansion of the work of Porter et al. [5], which
examines student perception and implementation of PI. With the exception of one trial,
the reported results were overwhelmingly positive: an average of 92% of students found PI to have value. The lowest result (72% average) was hypothesized to stem from the instructor in that particular trial awarding participation points based on correctness, thereby somewhat souring students on PI.

METHODOLOGY
Data was collected from Humboldt State University (HSU) during the fall 2016
and spring 2017 semesters from two instructors. Both instructors have approximately 5
years of experience in using clickers in their respective classrooms and both use clickers
as a means to increase student engagement during class sessions. Responses were
collected for credit (10% of course grade) with partial credit being awarded for incorrect
responses. However, whereas Instructor B was consistent in PI pedagogy across semesters, Instructor A began re-polling after discussion in the spring 2017 semester. The courses in which data were collected, along with the responsible instructor, are listed in Table 1. To assess the efficacy of clickers, we employed the same survey used by Porter et al. [5].

Table 1: Courses considered in this study


Course    Description                                 Term; Instructor    # Responses
CS 100    Critical Thinking with Computers            F; B                26
CS 111    Computer Science Foundations                F/S; A              36/17
CS 232    Python                                      S; B                18
CS 279    Introduction to Linux Operating Systems     F; B                21
CS 325    Database Design                             F; A                39
CS 328    Web Apps Using Databases                    S; A                32
CS 346    Telecommunications and Networks             F; B                34
CS 449    Computer Security                           S; B                32
CS 458    Software Engineering                        F; A                31
CS 461    Computational Models                        S; B                34

RESULTS & DISCUSSION


Table 2 presents the percentage of students in each class who agreed with each statement. In keeping with Porter et al. [5], cells highlighted in Table 2 are those that fall below an 80% agreement threshold. The first eight questions were asked on a 7-point Likert scale (strongly disagree to strongly agree), with agreement counted for responses of 5 or higher. The remaining four questions were asked using either a 3- or 5-point scale, with agreement counted for the middle response (e.g., "about right"). The final four columns of Table 2 report weighted averages for Instructor A, Instructor B, all courses, and the weighted averages reported in Porter et al.
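As a concrete illustration of the scoring just described, the short Python sketch below computes per-question agreement rates and a response-weighted average; the helper functions and the response lists are hypothetical examples for exposition only, not our actual data or analysis scripts.

# Hypothetical sketch of the agreement scoring described above. Questions 1-8
# use a 7-point Likert scale where responses of 5 or higher count as agreement;
# questions 9-12 use a 3- or 5-point scale where only the middle response
# ("about right") counts as agreement.
def agreement_rate(responses, scale_points=7):
    # Fraction of responses counted as agreement under the rules above.
    if scale_points == 7:
        agree = [r for r in responses if r >= 5]
    else:
        middle = (scale_points + 1) // 2
        agree = [r for r in responses if r == middle]
    return len(agree) / len(responses)

def weighted_average(rates_and_counts):
    # Weight each course's agreement rate by its number of responses.
    total = sum(n for _, n in rates_and_counts)
    return sum(rate * n for rate, n in rates_and_counts) / total

# Made-up responses for one question from two courses (26 and 36 students).
course_a = [7, 6, 5, 4, 6] * 5 + [3]
course_b = [6, 5, 5, 7, 4, 6] * 6
rate_a, rate_b = agreement_rate(course_a), agreement_rate(course_b)
print(weighted_average([(rate_a, len(course_a)), (rate_b, len(course_b))]))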
We see strong similarities when comparing our results to those obtained by Porter
et al. In general, students appear to find value in PI pedagogy, with 76% indicating that
they would recommend that other instructors adopt PI. While this is a fairly positive
result, it is still below the 89% agreement recorded by Porter et al. Indeed, we see notable deviation (more than 5 percentage points in either direction) between our findings and those of Porter et al. on questions 1, 2, 3, 6, 7, 8, 9, and 12.
Table 2: Student Agreement. Agreement values under 80% are highlighted. Shaded Header Indicates Instructor A.
Question / Columns (in order): CS 100 (F), CS 111 (F), CS 111 (S), CS 232 (S), CS 279 (F), CS 325 (F), CS 328 (S), CS 346 (F), CS 449 (S), CS 458 (F), CS 461 (S), Inst. A AVG, Inst. B AVG, AVG, Porter AVG
1. Thinking about clicker questions on my own,
before discussing with people around me, helped 88% 92% 100% 89% 81% 82% 78% 74% 84% 77% 85% 84% 84% 84% 90%
me learn the course material.
2. Most of the time my group actually discusses the
50% 72% 100% 67% 76% 82% 88% 65% 72% 87% 91% 73% 87% 77% 94%
clicker question.
3. The immediate feedback from clickers helped
me focus on weaknesses in my understanding of 88% 81% 94% 89% 85% 79% 84% 68% 78% 84% 88% 82% 84% 83% 92%
the course material.
4. Knowing the right answer is the only important
31% 22% 41% 11% 29% 13% 38% 18% 21% 39% 29% 25% 28% 26% 21%
part of the clicker question.
5. Generally, by the time we finished with a
92% 86% 100% 94% 95% 87% 91% 79% 90% 90% 85% 88% 91% 89% 88%
question and discussion, I felt pretty clear about it.
6. Clickers helped me pay attention in this course
76% 77% 100% 94% 76% 77% 84% 65% 75% 71% 76% 77% 79% 78% 87%
compared to traditional lectures.
7. Clickers with discussion is valuable for my
88% 86% 88% 94% 86% 74% 84% 62% 91% 77% 79% 83% 78% 82% 91%
learning.
8. I recommend that other instructors use this
approach (reading quizzes, clickers, in-class 80% 83% 82% 83% 76% 62% 81% 61% 81% 74% 79% 78% 70% 76% 89%
discussion) in their courses.
9. From the point of helping me learn, the content
84% 88% 64% 83% 85% 97% 68% 79% 78% 90% 85% 74% 75% 74% 85%
of clicker questions was (too hard, okay, too easy)
10. In general, the instructor gave us enough time
to read and understand the questions before the first 100% 92% 82% 94% 76% 90% 72% 91% 68% 90% 85% 82% 89% 83% 81%
vote: (too short, about right, too long)
11. The amount of time generally allowed for peer
96% 92% 94% 78% 90% 90% 66% 97% 75% 90% 88% 85% 89% 86% 83%
discussion was: (too short, about right, too long)
12. In general, the time allowed for class-wide
discussion (after the group vote) was: (too short, 92% 94% 94% 78% 81% 90% 78% 85% 71% 84% 82% 83% 89% 85% 73%
about right, too long)
It seems reasonable to deduce a correlative relationship among these questions: if students are not participating in group discussions, clickers would not be as effective at helping students pay attention or identify gaps in their knowledge. Furthermore, if students are not actively participating in PI activities, they likely find little value in the overall pedagogical approach, thereby reducing the likelihood that they would recommend the approach to other instructors.
Porter et al.'s results contain an outlier case in which a single course ended up
reporting much lower results than all other courses under consideration. In their
discussion, the authors hypothesize that the lower values were attributable to the fact that the instructor for the course was new to PI, did not sufficiently explain PI pedagogy at the beginning of the class, and tied some course credit to answer correctness. Porter et al. place the most emphasis on this last factor: tying course credit to answer correctness. They hypothesize that doing so creates a stressful quiz-like
environment in which the student is tasked with playing a game of "guess what the
professor is thinking." In contrast, the two instructors considered in this paper have
extensive background in PI and explain the purpose of PI in the syllabus and during the
first lecture. Yet, both instructors do assign credit based on clicker responses. Therefore,
we thought it might be interesting to more directly compare our instructors with Porter et
al.'s outlier. Table 3 provides this comparison.

Table 3: Comparing Student Agreement between our Average Response Rates and those Reported by Porter.

Question #   Our Average   Porter Outlier (% diff vs ours)   Porter Average Less Outlier (% diff vs ours)   Porter Overall Average (% diff vs ours)
1 84% 62% (-22%) 94% (+10%) 90% (+06%)
2 87% 90% (+03%) 95% (+08%) 94% (+07%)
3 83% 74% (-09%) 95% (+12%) 92% (+09%)
4 26% 37% (+11%) 17% (-09%) 21% (-05%)
5 89% 69% (-20%) 92% (+03%) 88% (+01%)
6 78% 58% (-20%) 93% (+15%) 87% (+09%)
7 82% 74% (-08%) 95% (+13%) 91% (+09%)
8 76% 71% (-04%) 93% (+17%) 89% (+13%)
9 74% 76% (-02%) 86% (+08%) 85% (+07%)
10 83% 72% (+11%) 83% (+00%) 81% (-02%)
11 86% 89% (-03%) 81% (-05%) 83% (-03%)
12 89% 70% (+19%) 74% (-15%) 73% (-16%)

As is often the case in educational research, our data is not perfectly suited to testing Porter et al.'s hypothesis: our instructors are experienced with PI and justify clicker usage to their students, yet, like Porter et al.'s outlier instructor, they grade partly on correctness. Given that our instructors thus fall between Porter et al.'s outlier instructor and their remaining instructors, it seems sensible that our results also sit somewhere in between. It seems reasonable to
conclude that assigning credit based on answer correctness may indeed lower the efficacy
of PI. Furthermore, it seems equally plausible that an instructor well-versed in PI pedagogy who communicates the purpose behind PI increases its efficacy. Indeed, on questions 9-12, which center on the proper pacing of PI, our experienced instructors generally outperform the results reported by Porter et al., whose courses were mainly taught by instructors new to PI (only 2 of 7 had taught using PI more than once).
In directly comparing our two instructors, we find that Instructor A tends to be
less effective at promoting group discussions (question 2). However, students taught by Instructor A report a higher recommendation rate (question 8). These differences illustrate
that even when most pedagogical factors are held constant, individual instructor
differences may impact the overall effectiveness of PI.
Recall from the Methodology section that, in accordance with best practices (see [3]), Instructor A began polling each question twice (once before and once after discussion) in spring 2017. While this does introduce a minor confound in our attempt at replication, it allows us to more closely study the effects of this single intervention using both between-subjects (comparing the fall and spring offerings of CS 111) and within-subjects (comparing CS 325 vs. CS 328) analyses.
CS 111 is HSU's introductory-programming course and fits the typical mold of a
CS1 course. In considering the between-subjects difference in CS 111, linear regression reveals a significant increase in agreement on question 2 ("Most of the time my group actually discusses the clicker question."; F(1, 51) = 10.25, p < 0.05, Adj. R2 = .15), an increase on question 6 ("Clickers helped me pay attention…"; F(1, 51) = 5.20, p < 0.05, Adj. R2 = .08), and a decrease on question 9 ("Clicker content was too easy"; F(1, 51) = 4.63, p < 0.05, Adj. R2 = .07). These results would
seem to indicate that re-polling has a positive impact on the frequency of group discussions, which may in turn have a positive cascading effect on the ability of PI to hold students' attention. Also noteworthy is that re-polling appears to make questions seem easier. Thus, in classes that utilize re-polling, it may be possible, or even necessary, for instructors to ask more difficult clicker questions.
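To make the preceding analysis concrete, the following Python sketch shows one way such a between-subjects comparison could be run as an ordinary least squares regression of a binary agreement indicator on a re-polling indicator; the data frame, column names, and values here are hypothetical, and the exact specification used in our analysis may differ.

# Hypothetical between-subjects comparison in the spirit of the CS 111 analysis:
# regress a binary agreement indicator (1 = response of 5+ on the 7-point scale)
# on an indicator for the semester in which re-polling was used. All values are
# made up for illustration.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "agree":  [1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1],
    "repoll": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
})

model = smf.ols("agree ~ repoll", data=df).fit()
print(model.fvalue, model.f_pvalue)   # F statistic and its p-value
print(model.rsquared_adj)             # adjusted R^2, as reported above
print(model.params["repoll"])         # sign gives the direction of the change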
CS 325 is HSU's introductory database course. CS 328 is the follow-up to CS
325, introducing both advanced database concepts and basic web programming.
Both courses are required for all majors, and a typical student who enrolls in CS 325 in
the fall will take CS 328 the subsequent spring. Therefore, in studying the differences
between these two classes, we are able to observe how students' opinions of PI are altered
based on the introduction of re-polling. While Table 2 hints at what might be impressive increases (for example, CS 325 has a 62% agreement rate for the question "I recommend that other instructors use this approach" versus CS 328's 81%), linear regression was unable to detect any statistically significant differences between the two groups.
Nevertheless, we do see an upward trend in positive outlook towards PI (questions 1-8),
while we see a similar decrease in satisfaction related to timing. Note that the same is
observed when comparing offerings of CS 111. A possible explanation is that being
asked the same question twice may make certain students restless.

CONCLUSION & FUTURE WORK


In this paper, we investigate the efficacy of PI at Humboldt State University
across 10 courses. While we find wide variability across these courses, the general
consensus is that our students find value in PI pedagogy. Furthermore, by collecting data
using the same instrument as prior research [5], we are better able to situate our work within the broader space of PI instruction in computing education. In doing so, we strengthen the hypothesis that awarding credit based on correctness and instructor familiarity with PI are two important factors affecting the effectiveness of PI as a pedagogy.
During our data collection, one of our instructors implemented the practice of re-
polling students after the discussion period. Statistical analysis suggests that this
pedagogical change can have significant positive impacts on students' perception of PI.
In addition, we witnessed a decline in satisfaction related to course pacing. This suggests that when employing re-polling, instructors may need to reduce the time spent between questions and/or develop a more nuanced question bank.
Overall, we agree with prior research that the PI pedagogy has merit within computing education. In future work, we plan to examine the relationship between course grades and perception of PI effectiveness. Furthermore, we
hope to collect additional data so that a more rigorous statistical analysis may be
performed in the future. With these efforts, we hope to ultimately contribute to the
growing set of best practices for applying PI in computing education.

ACKNOWLEDGEMENTS
We would like to thank Leo Porter for answering several of our questions and for
providing us with his survey. This work was funded by an RSCA award provided by the chancellor of the California State University.

REFERENCES
[1] Chen, X. and Soldner, M. 2013. STEM Attrition: College Students’ Paths Into and
Out of STEM Fields. US Department of Education.
[2] Crouch, C.H. and Mazur, E. 2001. Peer instruction: ten years of experience and
results. Am. J. Phys. 69, 9 (2001).
[3] Derek Bok Center 2006. Interactive Teaching DVD: Promoting Better Learning
Using Peer Instruction and Just-In-Time Teaching. Addison-Wesley.
[4] Liao, S.N. et al. 2016. Lightweight, Early Identification of At-Risk CS1 Students.
Proceedings of the 2016 ACM Conference on International Computing Education
Research (Melbourne, VIC, Australia, 2016), 123–131.
[5] Porter, L. et al. 2016. A Multi-institutional Study of Peer Instruction in Introductory
Computing. Proceedings of the 47th ACM Technical Symposium on Computing
Science Education (New York, NY, USA, 2016), 358–363.
[6] Porter, L. et al. 2011. Peer instruction: do students really learn from peer discussion
in computing? Proceedings of the Seventh International Workshop on Computing Education Research. ACM, 45–52.
[7] Zingaro, D. 2014. Peer Instruction Contributes to Self-efficacy in CS1. Proceedings
of the 45th ACM Technical Symposium on Computer Science Education (Atlanta,
Georgia, USA, 2014), 373–378.
[8] Zingaro, D. and Porter, L. 2014. Peer Instruction: A Link to the Exam. Proceedings
of the 2014 Conference on Innovation and Technology in Computer Science Education
(Uppsala, Sweden, 2014), 255–260.
