Traugott and Wlezien in POQ

Public Opinion Quarterly, Vol. 73, No. 5, 2009, pp. 866-894
THE DYNAMICS OF POLL PERFORMANCE DURING THE 2008 PRESIDENTIAL NOMINATION CONTEST
MICHAEL W. TRAUGOTT
CHRISTOPHER WLEZIEN
MICHAEL W. TRAUGOTT is with the Center for Political Studies, 426 Thompson Street, University
of Michigan, Ann Arbor, Michigan. CHRISTOPHER WLEZIEN is with the Department of Political
Science, Temple University, 1115 West Berks Street, Philadelphia, Pennsylvania. The authors wish
to thank Sunshine Hillygus, the Guest Editor of this Special Issue, for her careful reading and
incisive suggestions that contributed greatly to the final form of this article. We are also grateful to
Michael Hagen and Costas Panagopoulos and two anonymous reviewers for their comments and
suggestions on an earlier version of the manuscript. Our work on this manuscript was stimulated by
our participation as members of the Ad Hoc Committee on the 2008 Presidential Primary Polling
appointed by the American Association for Public Opinion Research. Address correspondence to
Michael W. Traugott; e-mail: mtrau@umich.edu; or Christopher Wlezien; Wlezien@temple.edu.
doi:10.1093/poq/nfp078

© The Author 2009. Published by Oxford University Press on behalf of the American Association for Public Opinion Research.
All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org
Abstract This analysis focuses on estimation difficulties pollsters had
in the primaries in 2008 in light of recent trends in improved polling
accuracy in general elections. We consider the series of polls that were
conducted in New Hampshire and other states holding primaries, looking
at how the dynamics of the primary contest affected polling accuracy in
those contests. The data come from published state-level results of public
pollsters from the week preceding each primary or caucus for which polls
were conducted; all told, we used 258 polls in thirty-six different Democratic events and 219 polls in twenty-six Republican events. The results
show that the winner's vote share almost always exceeded the poll share
while the race remained competitive, particularly early on in the nomination process. In an unusual perspective made possible by the length of
the contest on the Democratic side in particular, this could be observed
through most of the primaries; it was not the case in the Republican
events after John McCain became the presumptive nominee. The analysis shows there are contextual factors at work that can affect the quality
of the estimates that public pollsters make. Measures of momentum and
viability affect the estimates differently early in the process compared to
later, and there are special factors associated with the insurgent candidacy
of Barack Obama that may also have affected the accuracy of the polls.
We model these factors, investigate their explanatory power, and discuss
the implications for pollsters in future primary sequences.
In the very first presidential primary of 2008, in New Hampshire, the pollsters had serious estimation problems in identifying Hillary Clinton as the winner. In each of thirteen final pre-election estimates, Barack Obama was shown as the favorite by a margin ranging from one to thirteen percentage points; not one poll showed Hillary Clinton in the lead. These results were consistent enough for David Broder to declare on the eve of the election that the race is now Obama's to lose (Broder 2008). The poll estimates of Obama's support in the New Hampshire electorate were generally quite accurate; the problem arose as an underestimate of support for Hillary Clinton. On the Republican side, all but one of the final polls indicated that John McCain would win; and the estimates generally identified his proportion of the vote as well as second-place finisher Mitt Romney's with reasonable accuracy.
After the primary, the commentary in the press about the polling was decidedly negative and focused on the inaccuracy of the estimates in the Democratic race. The Boston Globe, the major newspaper providing regional coverage of the primary, ran a story under the headline "Stunned by N.H., pollsters regroup to seek answers" (Mooney 2008), while The New York Times ran a much-cited op-ed piece by Andrew Kohut of the Pew Research Center under the headline "Getting It Wrong" (Kohut 2008). And one of the leading bloggers on polling in the United States wrote a postelection analysis piece under the headline "N.H.: A Lesson from 1948" (Blumenthal 2008). An analysis of the coverage leading up to and for a week after the primary showed that the framing was directed at the polling industry as a whole rather than toward specific firms and their estimates (Traugott, Krenz, and McClain 2008), raising some consternation among public and academic pollsters alike. Because of the press commentary about the performance of the polls, the American Association for Public Opinion Research (AAPOR) appointed an Ad Hoc Committee on 2008 Presidential Primary Polling to investigate what happened. The research reported here expands on and extends analysis we did as members of that committee. Given the opportunity for extended analysis, we address the question of whether the estimation problems in New Hampshire were unique to that early primary or whether there were other elements of the primary process itself that might explain the observed patterns of estimation difficulties across the primaries.

Background on the 2008 Campaign

The 2008 campaign was notable in several regards. Before it even got underway, it was going to be a significant contest because it was the first since 1952 that was completely open in that neither a sitting president nor vice-president was contesting the nomination. Barack Obama ran the first successful nomination campaign by an African American, beating the early Democratic favorite Hillary Rodham Clinton. By the time the general election campaign was over, the two major party candidates had spent over a billion dollars on their campaigns; Barack Obama alone raised and spent more than $750,000,000 in both the prenomination and general election phases of his successful election campaign. Obama became just the second President since 1988 to win a majority of the votes cast, receiving better than nine million more votes than George Bush in 2004; his 7.2 percentage-point margin of victory was the largest since 1988 and well above the median in the post-World War II period.1 All of these factors might also help to explain the relative accuracy of the final pre-election polls in the general election.
1. For summaries of the 2008 results and other recent elections, see http://www.cnn.com/ELECTION/2008/results/president/ and http://uselectionatlas.org/RESULTS/.
An unusual aspect of the primary contest, at least for the Democrats, was
that it remained competitive until the last event they held in June. The structure
of the current nomination system was adopted formally by both parties for the
1976 election cycle, following reforms made by the Democratic Party after
its dysfunctional nominating convention held in Chicago in 1968. In widely
televised coverage, Mayor Richard J. Daley, the host of the event, had his
police disperse antiwar protesters outside the convention center by employing
riot gear and tear gas. It led to charges that inside the convention hall, the
party's nominee was being selected by party bosses in smoke-filled rooms. In
response, a commission was formed headed by George McGovern, a Senator
from South Dakota, and Donald Fraser, a Representative from Minnesota, to
propose revisions to the nomination process. Their recommendations, including
most notably that candidates had to contest primaries and caucuses to secure
pledged delegates who would be obliged to vote for them at the nominating
convention, were adopted for the 1972 Democratic convention, the one at which
McGovern became the nominee. Shortly thereafter, following the Watergate
disclosures about the solicitation of campaign contributions, Congress passed
the 1974 amendments to the Federal Election Campaign Act, which provided
public financing of presidential campaigns, limited the amount of money that
candidates could raise on their own in order to receive federal funding, and
required the disclosure of candidates' receipts and expenditures. By 1976, both
political parties were operating under these new procedures for financing and
delegate selection.
From the beginning, this new system was beset by a series of unintended and
unanticipated consequences. The strategic use of money was one of them; the
candidates who raised and spent more early generally did better. Candidates
were forced to set up their own fundraising and field staffs because the political
parties did not want to make endorsements among a large number of contenders,
thereby diminishing the role of the parties in selecting their nominees. But most
importantly, candidates who raised more money and won the early contests
received more, and more favorable, press coverage, and all of these factors
combined to produce higher poll ratings and eventual votes (Arterton 1984).
In practice, the media became the main institutional force in establishing and
maintaining candidate momentum; as a result, they also became the primary factor in winnowing the field down to viable candidates who stood a chance of winning in the fall (Bartels 1988). The net consequence of these factors is that the new system rapidly produced a presumptive nominee in the campaigns between 1976 and 2004 who had secured a majority of the pledged delegates by mid-March or early April. Our interest is in how the dynamics of this new primary system impacted the accuracy of pre-election polling in these events. For most primary sequences, the winnowing was so rapid there was not enough polling to consider this issue; but the long Democratic contest in 2008 provided us with an unusual opportunity to pursue this question.
There are a number of ways to approach an analysis of the performance of the polls, based upon past research. One would be to look at the accuracy of individual firms, especially if we focused on methods as the main explanatory factors. Instead, we focus here on the general performance of the polls in the 2008 primaries rather than the work of individual polling firms because of an interest in systemic properties of the estimation. The data come from published results of public pollsters from the week preceding each event for which polls were conducted. Past research has shown that estimates produced nearer to the election date are generally closer to the actual results, in both the primaries and the general election (Felson and Sudman 1975; Crespi 1988; Beniger 1976). A speculation by Mendelsohn and Crespi (1970, 119) suggests, however, that "It seems unlikely that preprimary polls will ever be able to accrue an accuracy record comparable to that of pre-election polls," and that is an important element of the notion that we explore here.
2. This is a not so thinly veiled reference to Charles Dickens's story about the French Revolution, A Tale of Two Cities (1859), that begins "It was the best of times, it was the worst of times. It was the age of wisdom, it was the age of foolishness . . ."

THE ACCURACY OF THE POLLS IN 2008

From the perspective of pre-election pollsters, survey methodologists, and public opinion researchers, the 2008 campaign had a certain Dickensian quality about it.2 While the accuracy of pre-election estimates is not the only, or even the best, way to assess the performance of the pollsters, it is certainly the most common. In the general election campaign, the accuracy of the final nineteen pre-election estimates was as good as it has been since records have been kept. Since the 1948 presidential election, public pollsters have made concerted efforts to improve their survey methods in light of a number of social and political shifts in the American electorate. In fact, the estimates produced before the 2008 general election were as accurate as the pre-election polls have been in the past 60 years (NCPP, 2008; Pollster.com, 2008). This conclusion is supported by a variety of different measures used to assess the accuracy of these data, including those developed by Mosteller et al. (1949) in the review
of the performance of the 1948 polls, two measures used by the National Council on Public Polls reflecting Candidate Error and Total Error (NCPP, 2008), and a newer measure comparing the log odds ratio of the leading candidates' percentages in the election to their percentages in a poll (Martin, Traugott, and Kennedy 2005).
The data presented in table 1 show how accurate the 2008 general election polls were, overall and in comparison to other relevant benchmark elections, using the measure A developed by Martin, Traugott, and Kennedy. In this analysis, the positive sign for the statistic shows an overestimate for the Republican candidate's vote. The 1948 election was, of course, the one in which "Dewey Beats Truman." In the 1996 re-election campaign of Bill Clinton, the polls seriously overestimated his winning margin against Robert Dole, and the sign of A is negative. The 2000 election was one of the closest in American history, and the final pre-election polls very slightly overestimated George W. Bush's win. In fact, Al Gore won the popular vote, but Bush won the electoral vote after the Florida outcome was essentially decided by the U.S. Supreme Court. In 2004, the outcome was again relatively close, but the polls did very well in estimating the outcome. And the accuracy of the final pre-election polls in the 2008 general election, at least by this measure, was as good as it has ever been. The average value of A for nineteen final pre-election estimates in 2008 was +0.012, taking the sign of the values into account; but even using the absolute value, the average of 0.031 was superior to previous elections.

Table 1. How the 2008 General Election Poll Performance Compares Historically (a)

Election year    No. of polls    Average value of A
1948             3               +.278
1996             9               -.084
2000             19              +.063
2004             19              -.026
2008             19              +.012

(a) The average values of A for 1948, 1996, 2000, and 2004 come from Traugott (2005: 648-649) while the value for 2008 was computed by the authors.

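To make the measure A in table 1 concrete, the following is a minimal sketch, assuming the usual statement of the Martin, Traugott, and Kennedy measure as A = ln[(r/d)/(R/D)], where r and d are the poll percentages for the Republican and Democratic candidates and R and D are their percentages of the actual vote. The function names and the three polls are hypothetical and used only for illustration; the vote shares are approximate 2008 figures chosen to match the 7.2-point margin cited earlier.

```python
import math


def accuracy_a(poll_rep, poll_dem, vote_rep, vote_dem):
    """Log odds ratio A = ln[(r/d) / (R/D)].

    r, d: poll percentages for the Republican and Democratic candidates.
    R, D: their percentages of the actual vote.
    A > 0 means the poll overstated the Republican relative to the outcome.
    """
    poll_odds = poll_rep / poll_dem
    vote_odds = vote_rep / vote_dem
    return math.log(poll_odds / vote_odds)


def summarize_a(values):
    """Return (signed mean, mean of absolute values) for a list of A values."""
    signed_mean = sum(values) / len(values)
    absolute_mean = sum(abs(v) for v in values) / len(values)
    return signed_mean, absolute_mean


# Approximate 2008 popular vote shares (McCain, Obama); an assumption used
# here only because it reproduces the 7.2-point margin cited in the text.
VOTE_REP, VOTE_DEM = 45.7, 52.9

# Hypothetical final polls as (Republican %, Democrat %); NOT real estimates.
hypothetical_polls = [(44.0, 51.0), (46.0, 50.0), (43.0, 53.0)]

a_values = [accuracy_a(r, d, VOTE_REP, VOTE_DEM) for r, d in hypothetical_polls]
signed_mean, absolute_mean = summarize_a(a_values)
print(f"signed mean A = {signed_mean:+.3f}, absolute mean A = {absolute_mean:.3f}")
```

The two summaries answer different questions. In the signed mean, errors in opposite directions cancel, so it indicates net partisan bias; the absolute mean indicates typical distance from the outcome regardless of direction. That is why the text reports both figures for 2008.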
The performance of the polls in the 2008 general election was all the more
remarkable because of a number of issues that might have created problems
for estimation. The election was the first in which an African American candidate headed the ticket of one of the major parties, creating concerns about
the possibility of overestimating support for Barack Obama because of social
desirability in the responses leading to a Bradley effect (Langer 2008). Another issue was the growing number of cell-phone only (CPO) individuals who
might not be reachable with standard telephone sampling and interviewing
techniques (Keeter 2008a and Keeter, Dimock, and Christian 2008). Because
of the personal characteristics of such individuals, especially their age, there was concern that their exclusion might have led to an underestimate of support for Barack Obama. Hundreds of thousands of new registrants were put on the books, starting during the hotly contested Democratic primaries, and there was a concern that historical likely voter models used in pre-election surveys would have difficulty incorporating them because of a reliance on a survey item asking about voting in previous presidential elections. But across the campaign year, the public pollsters revised their procedures and developed new ones to deal with these issues.
In the wake of the New Hampshire primary polling controversy, AAPOR established an Ad Hoc Committee to investigate why the polls got it wrong (AAPOR 2009). The committee made a number of observations about the estimates in the primaries, focusing on New Hampshire, South Carolina, California, and Wisconsin, and it evaluated a number of hypotheses about what might explain the errors. An initial observation was that the performance of the polls in the two New Hampshire primaries was not really very different from their performance in the Iowa caucuses just a few days before. That is, they underestimated the Democratic winner's percentage by about 10 percentage points and were within one percentage point of the second-place finishers, on average. And they generally did much better in the Republican race than in the Democratic. The authors of the report noted the difficulty of polling in primary elections:
Polling in primary elections is inherently more difficult than polling in a general election. Usually there are more candidates in a contested primary than in a general election, and this is especially true at the beginning of the presidential selection process. For example, there were a total of 15 candidates entered in the Iowa caucuses and more than 20 names on the New Hampshire primary ballot. Since primaries are within-party events, the voters do not have the cue of party identification to rely upon in making their choice. Uncertainty in the voters' minds can create additional problems for pollsters. Turnout is usually much lower in primaries than in general elections, although it varies widely across events. Turnout in the Iowa caucuses tends to be relatively low compared to the New Hampshire primary, for example. So estimating the likely electorate is often more difficult in primaries than in the general election. Furthermore, the rules of eligibility to vote in the primaries vary from state to state and even within party; New Hampshire has an open primary in which independents can make a choice at the last minute in which one to vote. All of these factors can contribute to variations in turnout, which in turn may have an effect on the candidate preference distribution among voters in a primary election. (5)
The report did produce some conclusions about both likely and not so likely explanations for the polling errors. In New Hampshire, the contraction of the schedule after Iowa meant that pollsters may have picked up some of Obama's momentum, but they probably stopped interviewing too early to capture late shifts in the electorate toward Clinton. There were also indications that
respondents who required more effort to contact (more telephone calls to complete an interview) were more likely to support Clinton, but not many of these respondents were included in the pre-election polling. There might have been problems with weighting and likely voter modeling, although there was not enough detailed information made available from pollsters to draw definitive conclusions here. There did not seem to be a strong social desirability effect or indications of a Bradley effect. And the exclusion of cell phone only individuals did not seem to have an effect on estimation either.
More generally, the polling problems in New Hampshire, contrasted against the improved performance of polls in the general election, raise broader questions about polling accuracy in the primaries. Were the problems in New Hampshire unusual because it was the first primary and came only five days after the Iowa caucus? Or were there difficulties in other primaries as well? And what might explain the difference between estimation problems in the general election compared to the primaries? The main focus of the analysis presented here is whether the estimation difficulties in New Hampshire in 2008 were related to factors in that contest or, in fact, were a more systematic set of issues that plagued the preprimary pollsters. It has not been possible previously to explore the patterns of polling accuracy in the primaries because there was not enough available polling data, reflecting the fact that they had never run their full cycle since the current system was implemented in 1976. Was the underestimation of support for Hillary Clinton in New Hampshire an isolated instance of underreporting the winner's share of the vote? Was the problem of the early primaries one that was not repeated later in the sequence? Was it something that happened only in the Democratic primaries but not on the Republican side? By looking at the pattern of poll estimation compared to the actual outcome of the primaries across a longer sequence of events in the calendar, we can simultaneously address questions about the functioning of the prenomination phase of presidential selection in the United States as well as the performance of the pollsters who support the media coverage of them, and hence levels of public information in the national and state-level electorates, about how the system works to produce the eventual nominees.

Explaining the Performance of the Polls in the 2008 Primaries

The literature on the relationship between elements of the primary process and the accuracy of the polls is virtually nonexistent, although there are extensive bodies of literature on the behavior of candidates and the press in the prenomination phase of a presidential campaign. Prior research on polling in the primaries, its quality and impact on voters, is also quite limited. However, there are reasons to think that the dynamics of the nomination process might have some implications for poll performance, in particular due to the possible effects of momentum and tactical voting.
To begin with, we consider the small amount of scholarship on polling
accuracy during the primaries and the perspective that it adopted. Beniger
(1976) reviewed the relationship between the polls and primary outcomes from
1936 to 1972. His available data consisted of 202 national polls conducted by
Gallup prior to the last primary in each election, compared to the results of 248
primaries or caucuses held by either of the parties across the ten presidential
elections in the series. His main observation was that the best predictor of the
eventual nominee was being the leader in the first polls, a credit to the impact
of name recognition. While his general conclusion was that the primaries
affected the polls more than the polls affected the primaries, the analysis is not
applicable to the system currently in effect. However, it is worth noting that one
of Beniger's anomalous cases was the McGovern nomination in 1972, the first
year of the shift to the new nominating system for the Democrats and one that
involved an insurgent candidate who best understood the new rules of the game.
Only one article explicitly examines the performance of polls in the current
nomination system. Bartels and Broh (1989) analyzed the performance of three
organizations (the CBS News/New York Times poll, the Gallup Organization,
and the Harris Poll) as exemplars of the issues that pollsters faced in the 1988
primaries. Unfortunately, the polling efforts in the 1988 primaries were limited.
For example, Gallup conducted polls in only eight states before their primaries
or caucuses, CBS News/New York Times conducted about twice as many exit
polls (twelve) as preprimary polls (five), and Harris conducted only national
polls at intervals of approximately one month throughout the primary season.
Moreover, Bartels and Broh found inconsistencies in the reporting of the poll
numbers. Checking a nonrandom sample of their respective press releases, the
authors found that Gallup typically reported data from polls conducted about
one week before, while Harris reported on data typically conducted two weeks
before the actual primary date. Nonetheless, in terms of their accuracy, Bartels
and Broh conclude that most of the primary polls underestimated the support for
each candidate (with the exception of Senator Robert Dole) in three politically
relevant samples: among all registered respondents, those planning to vote,
and those most likely to vote.
Although not as directly relevant, Hopkins (2009) briefly examined primary
polls in his investigation of the existence and persistence of a Wilder effect
and a Whitman effect: the tendency of voters to overestimate their support
for African American candidates and underestimate their support for female
candidates in statewide elections for Governor and U.S. Senator across the
period from 1989 to 2006. In his analysis of general election polls he found
there was a tendency to overstate support for African American candidates
early in this period, but that disappeared after 1996; and there never was an
underestimation of support for women. He extended his analysis to the 2008
Democratic primary series, looking specifically at the difference between poll
support for Barack Obama and Hillary Rodham Clinton and their vote shares.
He found that Obama consistently did slightly better in the elections than the
polls suggested. But Hopkins concluded that in primary states with few black
voters, the polls were generally accurate, while in states with many black voters
they consistently understated support for Obama. (See 776 for the discussion
of poll performance in general elections and 778 for the primaries.) This is the
opposite of the Wilder effect that would have been predicted among white
voters. And he did not observe any Whitman effect for Clinton.
Turning to the literature on candidate behavior and the voters' response to
their campaigns, there are two key concepts in prior research on the presidential
nomination process that are useful. The first is momentum, or in its positive
form, the creation of a surge in candidate support from one event to another
in the primaries and caucuses (Aldrich 1980; Bartels 1988). At its simplest, a
candidate's success in one event leads to success in subsequent events. This
partly is a result of the greater and generally more positive news coverage
that accrues to the winner of a primary or caucus. But there is a negative
form of this as well; those who finish poorly get little or no coverage, and what they do get
often contains pejorative characterizations of their effort. Not only does this
coverage affect the subsequent standing of the candidates in the polls; but in
the modern nomination process, it also affects their fundraising ability. The
result can be an upward or a downward spiral in support. And if it is negative
momentum, it often results in the candidate losing the ability to pay field staff
or buy advertising. It is the essential pattern by which winnowing takes place
(Bartels 1988).
This leads to the second key concept, tactical voting. Voters who are most
likely to participate in primaries and caucuses are continually assessing the
prospects of the candidates in the field. They may take candidate viability into
account when deciding how to vote by voting tactically for their second- or
third-ranked candidate if their preferred candidate does not have much of a
chance to win the nomination (Abramson et al. 1992). At the beginning of the
nomination process, supporters may tend to stay with their preferred candidate,
if only due to the uncertainty surrounding early polls. Of course, some might
vote tactically even despite the uncertainty. As things become clearer and more
certain, we may expect more voters to support a less-preferred candidate with
better chances of winning the nomination. If their preferred candidate drops
out of the race, of course, voters have little choice but to vote for their second
or third choice or else not vote at all. The calculation differs for voters when a
presumptive nominee becomes known before the primary is scheduled in their
state. In such circumstances, some voters may decide not to go to the polls at
all, perhaps for tactical reasons as well (Fisher 2001). Others may support a
potential vice-presidential nominee, for instance. Or, in cases where the rules
allow, particularly when one party's presumptive nominee is known and the
other's is not, voters might cross over to vote in the other party's primary with
an eye toward affecting the outcome of that race, such as voting for a weaker
or more extreme candidate in order to try to make the general election contest
less competitive.
Importantly, momentum and tactical voting have implications for poll performance. First, we would expect momentum to complicate the job of pollsters in
producing accurate results. This is because shifts in the electorate's preference
distribution based upon recent outcomes in other states can only be captured if
two conditions are met: (1) the timing of a poll's field period encompasses the
period when a surge might occur after the prior event and (2) the data collection
lasts long enough to measure whether it dissipates. The effects should be especially evident early on in the nomination process, when shifts in momentum
are most pronounced. Because of the rapidity of winnowing in the primaries
since 1976, with the exception of the 2008 contest between Obama and Clinton,
there have not really been any opportunities for late surges. Second, we would
expect polling accuracy to improve over time as the voters' preferences crystallize under growing levels of news coverage and familiarity with the candidates'
policy positions and personalities, especially when the nomination remains
competitive. When the presumptive nominee becomes known, however, polls
may be less predictive of the ultimate outcome because voter behavior may
change; while they almost certainly will get the winner right, the polls may
tend to overstate the winner's vote percentage as some of the leader's supporters
become less likely to vote.
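The two field-period conditions above can be expressed as a simple check on a poll's interviewing dates. This is only a sketch under stated assumptions: the function name, the two-day surge window, and the two-day follow-up window are hypothetical choices made for illustration, not values taken from the article, and the poll dates in the example are invented. The Iowa caucus date is January 3, 2008, five days before the New Hampshire primary.

```python
from datetime import date, timedelta


def captures_momentum(prior_event, field_start, field_end,
                      surge_days=2, settle_days=2):
    """Check the two field-period conditions for one poll.

    surge_days and settle_days are illustrative assumptions, not values
    taken from the article.
    Returns (covers_surge, lasts_long_enough).
    """
    surge_start = prior_event + timedelta(days=1)
    surge_end = prior_event + timedelta(days=surge_days)
    # (1) The field period spans the whole assumed surge window.
    covers_surge = field_start <= surge_start and field_end >= surge_end
    # (2) Interviewing continues for settle_days after the surge window,
    #     long enough to see whether the surge fades.
    lasts_long_enough = field_end >= surge_end + timedelta(days=settle_days)
    return covers_surge, lasts_long_enough


iowa_caucuses = date(2008, 1, 3)
# Hypothetical New Hampshire poll in the field January 4-6 (invented dates).
print(captures_momentum(iowa_caucuses, date(2008, 1, 4), date(2008, 1, 6)))
```

Under these assumed windows, the hypothetical poll satisfies the first condition but not the second: it is in the field while a post-Iowa surge could register, but it stops before any fading of that surge could be observed.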
The widespread availability of press coverage of the process means that
many voters will have a good idea of where the candidates stand in each party's
field, especially the highly interested and motivated voters who participate in
the primaries and caucuses. Such information, in the form of a clear lead in the
polls or suggestions that one candidate is the presumptive nominee, may affect
a voter's likelihood of voting tactically or voting at all.
One aspect of the 2008 contest on the Democratic side was that there was
a fairly rapid narrowing of the field, as has happened in the past, but the
competition between Obama and Clinton ran all the way through the primaries
in Montana and South Dakota on June 3. Obama had to wait until super
delegates began to announce their preferences in sufficient numbers before he
could credibly claim to be the nominee. On the Republican side, John McCain
became the presumptive nominee of the Republican Party shortly after Super
Tuesday; technically he clinched on March 4, based upon a combination of
his own electoral success and the Republicans' winner-take-all allocation of
state delegates. Again, our interest is in whether the estimation problems the
pollsters had in New Hampshire were an isolated occurrence or whether they
appeared more systematically throughout the primary process. And if they did,
the question is: what might explain them? While it is true that estimation in
low information environments is difficult at best, could it be that there are
other contextual factors in the nature of the contests themselves that could have
contributed to the estimation problems?
From a pollster's perspective, estimating outcomes in primary elections is
a complicated process, especially in comparison to the general election. Since
these are intra-party affairs, voters do not have the party cue to rely upon; they
face higher information costs as they try to sort out the candidates in the field. As a result, it is typically only the strongest partisans and most motivated citizens who turn out to vote. Turnout is much lower than in the general election, often less than 10 percent, so estimating who will go to the polls is just as difficult as determining preference. This is also complicated by the rules of the game, as some events are closed to those who have not registered with the party, while others allow anyone or at least independents to choose to participate on Election Day. At the same time, one consequence of winnowing should be that as the field gets smaller and choices more constrained, combined with accumulated press coverage across the series of events, the pollsters should be able to produce better estimates of preferences among the smaller field of candidates later in the primary calendar compared to earlier.
There could also be methodological explanations for the performance of the polls, especially due to the inadequacy of likely voter models developed in the general election context for estimating turnout in the primaries. And there could be differential prospects for turnout among different groups in the electorate who may be attracted to a particular candidate. But pollsters remain reluctant to divulge the details of this aspect of their pre-election methodology, so this area remains unexplored.3
3. These issues are discussed on pages 46 and 47 of the AAPOR report (2008) and are also a focus of the committee's recommendation for changes to the disclosure requirements for public pollsters.
From the voter's perspective, information gathering is a complicated and
time-consuming process, requiring more effort at the beginning of the nomination
process than later on. If you are a prospective voter in Iowa or New Hampshire,
you have a good chance to meet the candidates through the kinds of retail politics
in which they engage in these states. As the field narrows through winnowing,
there are fewer candidates about whose personal background and policy and
issue positions you must learn. Furthermore, there are additional pieces of
information you can obtain from the news coverage of the process, including
estimations of viability and likelihood of securing the nomination. Some of
this information comes from poll results in the next states holding primaries
and caucuses or the level of candidates' fundraising, as well as information
from national polls containing trial heat results for a variety of likely pairings
in the general election. Another obvious piece of information is the number of
pledged delegates that each candidate has secured, especially in relation to the
majority that is required to become the nominee.
These factors are related in one way or another to momentum during the
campaign. There has been more research on momentum in the general election
campaign than in the prenomination phase of the cycle because of the limited
amount of data available. And there has been no research on the impact of
momentum on polling accuracy. For one, the rapid rate at which winnowing
has taken place has limited the time series of polling data points available to
study. In order to investigate the effect of momentum on polling accuracy, a
researcher needs a relatively long time series of polling data and measurements
that suggest shifts in levels of support across the campaign. Secondly, it is only
in recent years that the costs of polling have declined to the point where there is
much more data being collected at the state level, in both the primaries and the
general elections. While this has not completely supplanted the use of national
samples to assess how the candidates match up within and across parties, state
samples do now provide a sufficiently textured view of how different electorates
are responding to the campaign for the nomination. The structure and demands
of news coverage in both contexts have emphasized the need for more such data,
and the rise of public opinion data aggregators such as Fivethirtyeight.com,
Pollster.com, and RealClearPolitics.com has made it more feasible to assemble
such data for analysis.
All of these factors came together in the 2008 primary campaigns to provide
a unique opportunity to evaluate hypotheses about the dynamics of the
nomination process in each party. We do this in the context of the relationship
between the estimates of support for the leading candidates that were generated
in each state where sufficient polling was conducted, in relation to the actual
outcome of the elections. We already have acknowledged the limited prior
research on the subject, but we also have noted that research on primary elections
themselves has implications. Let us summarize our major working hypotheses:
H1. From work on primary elections (Aldrich 1980; Bartels 1988), candidate
momentum from one event to another can produce underestimates in
measured poll support because pollsters will have difficulty capturing late
surges in support among voters, especially early in the calendar of events.
Thus, the winner's vote share will tend to exceed the winner's poll share
leading up to the election, especially when a surging candidate in a large
multicandidate field is attractive to voters. This would also seem to be
more likely when there is a strong insurgent candidate who might have
particular appeal to an identifiable segment of the primary electorate based
upon personal characteristics or specific policy positions.
H2. As the field is winnowed, press coverage of the remaining candidates
accumulates, and information levels in the electorate increase, meaning poll
errors should decline over time across successive events in the calendar.
Voters will have fewer candidates to evaluate, and the information available
about each one should be greater than at the start of the process. This
is likely to be especially true while the race remains competitive.
H3. The gap between the winner's vote share and poll share across states will
vary with the size of the winner's poll share leading up to Election Day.
The bigger the poll lead for the winning candidate in a particular primary
early in the process, the smaller the underestimate of their vote on Election
Day; the bigger the lead later on, when the nominee is known, the more
the candidate underperforms. The patterns are expected to reflect different
processes associated with different stages of the campaign. During early
events, large poll leads, by comparison with small ones, are likely to
already reflect surges in support toward the winner; after the presumptive
nominee becomes known, large leads are likely to encourage tactical
voting or nonvoting itself.
Data Resources
For this analysis we rely on the many state-level polls conducted during the
2008 nomination process that were listed on Pollster.com. Given the importance
of timing discussed above, we rely only on polls that were in the field during
the last week prior to each particular primary or caucus. This means that we do
not have data for all states, particularly on the Republican side after McCain
clinched his nomination. Earlier on, in both the Republican and Democratic
contests, pollsters stopped polling in some states where the result looked to be
a foregone conclusion. On the Democratic side, we also exclude Florida and
Michigan, which were not fully contested because the primaries there were
held in defiance of Democratic National Committee rulings.
All told, we have 258 polls in thirty-six different Democratic events and 219
polls in twenty-six Republican events that were conducted in the week before
each one. While we have never had such extensive data resources before, it is
important to note that the polls are not distributed evenly across states and tend
to be concentrated in the early part of the nomination process and in bigger states
where more delegates are at stake.4 (Appendices A and B list for each party the
states for which we have poll data in the week prior to the primary or caucus
along with the number of polls.) This has some consequence for our analysis,
particularly on the Republican side, as we will see. The polls that we do have
also are not equal, and we recognize the great variation in survey practices,
including survey mode, question wording, likely voter modeling, weighting
procedures, and sample size. However, we do not attempt to take account of
these differences in any way because of the difficulty of obtaining sufficient
information. The poll estimates used in the analysis are simple averages of the
results for each event.
4. This includes all polls that were in the field during the last week, some of which began earlier on,
though the number is small: eighteen for Democratic events and thirteen for Republican ones. A
link to the full listing of the polls can be found at http://www.pollster.com/polls/2008-presidentialprimary-page.php. Additional questions about the data and the sources can be directed to the
authors.
The leading Democratic candidates had two characteristics that highlighted
the unusual nature of their candidacies: one was an African American and the
other was a woman. We incorporated measures of the percentage of African
Americans and women in each state's population as indicators of a potentially
important appeal that these candidates may have had.5 What we do not get
from the publicly available polling data is estimated turnout and likelihood of
voting estimates for such subgroups in each poll's sample. We also included
information on the number of delegates that each candidate had going into
an event and what proportion that was of the total required to secure the
nomination.6
5. Although the percentage of registered voters would be better, reliable data across states are not
available. The data used in the analysis were drawn from the 2000 Census (http://www.census.gov/
main/www/cen2000.html).
6. The delegate data were obtained from two locations at RealClearPolitics, available
at http://www.realclearpolitics.com/epolls/2008/president/democratic_delegate_count.html and
http://www.realclearpolitics.com/epolls/2008/president/republican_delegate_count.html.
Finally, we use a different measure of the accuracy of polls than has been
used in past research.7 All of the current measures are based upon assumptions
of a two-candidate race in a general election. In the new primary system,
there are often large multicandidate fields in the early primaries. In order to
accommodate this, we compute the difference between the leading candidate's
vote share of the total for the top two candidates and the average of the leading
candidate's share of poll support for the top two candidates. The vote share
is obviously based upon the outcome of the election itself. The poll share is
based upon the average for all of the polls conducted in the week prior to the
election, and we count as a poll in this period any one that was in the field on at
least one of the seven days prior to Election Day. The difference measure tells
us whether the winner's vote share exceeded his or her poll share leading up
to the election. When it was underestimated, this measure assumes a positive
value; and when it was overestimated, it has a negative value. When the vote
share and poll share were equal, it has a value of zero.
7. Most prior measures of accuracy are based upon two-candidate races and do not function well in
multicandidate races without specific adjustments. In addition, the Mosteller et al. (1949) measures
are based upon absolute values and hence do not provide any information about the direction of
the differences between the poll estimates and the actual vote shares. This is also true for the two
measures used by the National Council on Public Polls.
Analysis
We start by looking at the distribution of this accuracy measure across the
primary calendar. Figure 1 shows the relationship between the polls and the
actual vote for the series of Democratic events in 2008. This is the first time
since 1976 that it has been possible to construct such a chart on the basis of
a contested sequence that ran across the full calendar of delegate selection
events. We plot the difference between the winner's percentage of the vote for
the two leading candidates in the primary and the average of the winner's share
of the measured support for the same two candidates in the polls the week
before the event. Again, a positive value indicates the candidate's actual vote
share exceeded the pre-election poll share, while a negative value indicates
the candidate underperformed on Election Day. In this plot, a "C" represents a
Clinton victory while an "O" represents an Obama victory. The horizontal axis
represents the chronology of the events, expressed as days into the political
calendar, beginning with the Iowa caucus on January 3. Super Tuesday, on
February 5, was the date on which twenty-four states held primaries or caucuses
that included forty-five contests for the two parties. It is represented by the
vertical set of Cs and Os on the date just short of the fiftieth day of the
chronology. There are only thirty-six data points in this series because polls were
not conducted during the last week in every one of the states where an event
was held, presumably because there was no serious contest there.
[Figure 1 plot: vertical axis, Winner's Vote Share Minus Poll Share; horizontal axis, Number of Days into Election Year; each point is marked C (Clinton win) or O (Obama win).]
Figure 1. The Polls and the Vote in the 2008 Democratic Presidential Primaries.
NOTE: Each entry in the figure is the difference in a state between the Winner's
share of the actual votes cast for the two leading candidates minus the Winner's
average share of support for the two leading candidates in the polls in the week
leading up to the election. The Winner's vote share, number of polls averaged,
and the Winner's poll share are listed in Appendix A.
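
The accuracy measure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names and the example numbers are hypothetical, and the sketch assumes that the two-candidate share is computed within each poll before the simple average is taken (the text does not say whether shares were averaged or computed from averaged results).

```python
from statistics import mean


def two_candidate_share(winner_pct, runner_up_pct):
    """Winner's percentage share of the support going to the top two candidates."""
    return 100.0 * winner_pct / (winner_pct + runner_up_pct)


def vote_minus_poll(vote, polls):
    """Winner's two-candidate vote share minus the winner's average poll share.

    vote  -- (winner_pct, runner_up_pct) from the election result
    polls -- list of (winner_pct, runner_up_pct) pairs, one per final-week poll
    Assumption: each poll's share is computed first, then averaged.
    """
    vote_share = two_candidate_share(*vote)
    poll_share = mean(two_candidate_share(w, r) for w, r in polls)
    return vote_share - poll_share


# Hypothetical event: the winner takes 50 percent to the runner-up's 40,
# after two final-week polls showed 44-44 and 46-42.
gap = vote_minus_poll((50.0, 40.0), [(44.0, 44.0), (46.0, 42.0)])
```

A positive `gap` means the polls underestimated the winner, a negative value means they overestimated the winner, and zero means the vote share and poll share were equal, matching the sign convention in the text.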
Notice in the figure that the winner's vote share almost always exceeded
his or her share in the polls. In only seven of the thirty-six states for which
we have Democratic poll data is the winner's vote share less than his/her poll
share. This is in direct contrast with what we typically observe in the November
general election, where poll leads are almost always reduced in the actual vote
(Wlezien and Erikson 2002).8 The tendency for leads to expand during the 2008
Democratic primaries was especially pronounced early on, before the long gap
between the Mississippi and Pennsylvania primaries. During this period, the
mean gap between the vote and poll shares was 3.2 percentage points; in seven
cases, the difference was greater than 5 points. From this perspective,
New Hampshire was hardly an exception, as the difference in shares there (4.4)
was only a little above average. It was even less exceptional by comparison with
the absolute difference between the poll and vote shares, the mean of which
was 4.3 points.
8. The availability of significantly more polling data in 2008 produced more nuanced assessments
of whether the polls tightened at the end of that race. The results in the national polls seemed to hold
steady, while the results in many individual states seemed to tighten. See http://www.fivethirtyeight.
com/2008/11/todays-polls-111.html for a discussion and analysis of these data.
Clearly there was a lot of late movement in Democratic preferences in various
states, and this was especially true where Obama won. In five of the seven cases
where the winner's poll share exceeded the vote share, Clinton was the winner.9
The mean vote-poll gap was 5.3 percentage points when Obama won and only
0.4 when Clinton won. In one sense, the pattern implies a "reverse Bradley
effect," with the polls understating Obama's ultimate vote share. While another
explanation may be the inadequacy of likely voter models when there is either
an insurgent candidate or one who appeals especially to a certain segment of
the electorate such as African Americans, a point we address in multivariate
analysis below, these effects currently cannot be evaluated empirically because
of a lack of specifics about the likely voter models. More likely, it reveals a
surge in support toward the lesser known candidate as voters encountered him
during the campaign through his advertising and debate performances as well as
through the large amount of positive press coverage he received. Predictably, as
the campaign evolved and Obama became better known, the tendency declined
quite substantially.10
9. The one instance in which Obama's support is underestimated is in Utah, where only a single
poll estimate was available. Of the five states in which Clinton's share was underestimated, in only
one case, South Dakota, was there a single poll. Both of these states have small electorates and
limited polling.
10. Interestingly, while the relationship between the polls and the vote evolved, the polls did not
become substantially more predictive of the final result. The correlation between the campaign
date and the absolute error is a statistically insignificant .10.
Things were different on the Republican side, where the winnowing process
was as rapid as it has been in past primary elections held under the new system,
and McCain became the presumptive nominee by Super Tuesday. In figure 2, we
again display the difference in the winner's vote and poll shares, calculated as a
percentage of the two top vote-getters on Election Day, with "M" representing
a McCain victory and "R" indicating that another Republican candidate won.11
There are fewer data points (twenty-six) than for the Democrats (thirty-six)
because polling on the Republican side stopped in many states after McCain
became the presumptive nominee (see Appendix B). In the figure we again
see the early tendency for the winning candidate's vote share to exceed his
poll share, though to a lesser degree. The pattern is more pronounced when
candidates other than McCain won, which parallels what we saw for Obama
and the Democrats. In contrast with the Democratic side, however, the pattern
changed quickly. On Super Tuesday, the winner's mean vote share actually was
virtually equal to the poll share. Thereafter, the winner's vote share is always
lower, by more than 6.0 percentage points on average.
11. This notation is used so that an "O" for the Republican "other" candidates is not confused with
the "O" for Obama in Figure 1.
[Figure 2 plot: vertical axis, Winner's Vote Share Minus Poll Share; horizontal axis, days into election year (0 to 150); each point is marked M (McCain win) or R (another Republican won).]
Figure 2. The Polls and the Vote in the 2008 Republican Presidential Primaries.
NOTE: Each entry in the figure is the difference in a state between the Winner's
share of the actual votes cast for the two leading candidates minus the Winner's
average share of support for the two leading candidates in the polls in the week
leading up to the election. The Winner's vote share, number of polls averaged,
and the Winner's poll share are listed in Appendix B.
This implies a very different dynamic in the Republican nomination process.
There is reason to suppose that the differences between the Democratic and
Republican primaries reflect differences in the evolution of the contests. On the
Democratic side, a strong insurgent candidate emerged early on, which opened
up the possibility of late surges in support toward his candidacy before each of
the early contests that were not clear in the pre-election polls. The magnitude of
these surges declined as the nomination contest continued and he became better
known and his support crystallized. On the Republican side, a single strong
insurgent did not emerge. While we see some late surges in support toward
other winning candidates, McCain quickly became the leading contender, and
by Super Tuesday his ultimate nomination was clear. Candidate winnowing
happened quickly, though no more so than on the Democratic side, as shown
in figure 3.
What did happen quickly is that McCain's share of delegates increased
dramatically, as shown in figure 4. Leading up to Super Tuesday, McCain's
progress toward his nomination was largely indistinguishable from Obama's.
Thereafter, the difference starts and grows, due largely to the winner-take-all
rules used in the Republican primaries (Polsby 1983), and the contest was
effectively over. As candidates dropped out, McCain polled better, but his
leads increasingly shrank by Election Day. It may be that his supporters were
increasingly less likely to turn out, because their votes did not matter (also see
Fisher 2001), while Ron Paul's supporters continued to vote for him regardless
of his chances.12
12. In this context, consider that as candidates drop out, an increasing number of McCain's
supporters at that time previously supported other candidates.
[Figure 3 plot: vertical axis, Number of Candidates; horizontal axis, days into election year (0 to 150); separate series for Democrats and Republicans.]
Figure 3. Candidate Winnowing during the 2008 Nomination Campaign
NOTE: The entries in the figure represent the number of candidates remaining
in the contest for the nomination on the given date, arrived at by subtracting
each withdrawal from the number of candidates who began active campaigning.
The sources of the data are the summaries in http://en.wikipedia.org/
wiki/Democratic_Party_(United_States)_presidential_primaries,_2008 and
http://en.wikipedia.org/wiki/Republican_Party_(United_States)_presidential_
primaries,_2008; the sources were independently checked.
Table 2 summarizes the relationships depicted in figures 3 and 4 in the
form of simple bivariate correlations between the timing of the primary, the
number of candidates and delegates, and the difference between the vote and
poll shares. In the top half of the table there are strong correlations on the
Democratic side between the election date and the number of candidates and
Obama delegates: the later the date, the fewer the candidates and the more
delegates for Obama. However, it also can be seen that there is relatively little
connection between any of these factors and the poll-vote difference. Although
the correlations with the Democratic poll-vote difference all are in the correct
direction, they are small and not highly reliable. In the bottom half of the table
the correlations between the election date and the number of candidates and
McCain delegates are similar to what we see for the Democrats, but there is a
much more powerful relationship between these factors and the poll-vote gap.
Here the correlations are about 0.5 (in absolute terms) and all are significantly
different from 0. Evidently, the changing competitiveness of the race mattered
a great deal for the relationship between the leader's showing in the polls and
the vote for the Republicans.
[Figure 4 plot: vertical axis, Percentage of Delegates (0 to .8); horizontal axis, days into election year (0 to 150); separate series for Democrats and Republicans.]
Figure 4. Delegate Accumulation during the 2008 Nomination Campaign
NOTE: The entries in the figure are the proportion of pledged delegates
accumulated by Barack Obama (Democrat) and John McCain (Republican)
prior to each event, calculated as the sum of pledged delegates won in preceding
events divided by the total number of delegates at stake. The delegate data
were obtained from two locations at RealClearPolitics, available at http://
www.realclearpolitics.com/epolls/2008/president/democratic_delegate_count.
html and http://www.realclearpolitics.com/epolls/2008/president/republican_
delegate_count.html.
Just as we expect the changing competitiveness of the nomination to structure
the relationship between the polls and the vote in primaries, we expect the
competitiveness of the primary to matter. That is, in states where a candidate is
dominating in the polls, we might not expect the lead to grow by Election Day;
we actually might expect a very big lead to shrink. Scholars focusing on general
elections have noted various reasons to expect such a pattern, including the
polarization of underlying preferences for candidates or decay in the strength
of previous campaign effects (see Wlezien and Erikson 2002). Nonvoting in
primaries is another, as we have noted above with regard to the competitiveness
of the overall contest. One might expect a relationship between the dynamics of
competitiveness and the winner's poll share itself: as the nomination becomes
less competitive, poll results should become more lopsided. This is exactly
what we observe. The correlation between the campaign date and the winner's
pre-election poll share is 0.52 in Democratic primaries and a healthy 0.69 for the
Republicans. As the campaign wore on, candidates dropped out, and McCain's
delegate share mounted, his raw poll share and two-candidate share both grew.
Table 2. Selected Correlates of Election Timing (Two-Tailed p-Values in Parentheses)
Democratic primaries
Number of days
into election year
Number of Obama
delegates
Winner's vote share minus
poll share
Republican primaries
Number of candidates
Number of McCain
delegates
Winner's vote share minus
poll share
Number of
candidates
Number of
Obama/McCain
delegates
0.66 (.00)
0.89 (.00)
0.27 (.12)
0.56 (.00)
0.28 (.10)
.08 (.64)
0.62 (.00)
0.45 (.02)
0.91 (.00)
0.76 (.00)
0.47 (.01)
.53 (.00)
The bivariate analyses are useful, but they only take us part of the way toward
explaining the estimation errors in the preprimary polls; a multivariate analysis
is required. This is the only way to assess the joint effects of a decreasing
field of candidates and the accumulation of delegates on the relative accuracy
of the preprimary polls. Results of one analysis for the Democratic primaries
are displayed in table 3. The first column contains results of a simple baseline
regression containing the election date variable. As expected, we see that the
campaign date has a negative impact on the vote-poll difference: over the course
of the 154-day nomination process, the model predicts that the gap between
winner's vote and poll shares will completely disappear.13 The effect is not
highly reliable, however, and it also is not robust to the inclusion of the black
percentage of the state population, shown in the second column of the table.
Because the dependent variable is the winner's vote share minus poll share,
we multiply the percentage times -1 in the states that Clinton won. That is,
the larger the black population, the more Obama's performance on Election
Day exceeded his poll share and the less Clinton's vote share exceeded her
poll share.14 The result is an important one, and confirms speculation in the
media about stimulated black turnout and its impact on Obama vote margins
that was rife in the wake of the South Carolina Democratic primary (Keeter
2008b; Associated Press 2008).15
13. That is, the gap is predicted to be 154 times -.027 plus 4.20, which equals .04.
14. This specification is supported by diagnostic empirical analysis. That is, the effect of the black
population per se is virtually 0.0.
15. Note that the female percentage of the vote actually is only weakly (and not significantly)
correlated (Pearson's r = .04) with the Clinton vote. This is similar to the finding of Hopkins
(2009) with regard to support for Obama in states with large black electorates and his analysis of
the lack of a "Whitman effect" for Clinton in the primaries.
Table 3. Regressions Predicting Winner's Vote Share Minus Poll Share, Democratic Primaries
Obama delegates (fraction of total)
0.14 (0.03)
0.43 (0.52)
Winner's poll share
4.20 (1.24)
.07
.05
4.29
3.29 (1.19)
.23
.19
3.96
0.32 (0.09)
20.39 (4.77)
.45
.39
3.31
0.31 (0.09)
17.97 (5.71)
.47
.42
3.35
Percentage black (x -1 if a Clinton win)
Intercept
R-squared
Adj. R-squared
Root MSE
0.027 (0.017)
0.006 (0.014)
0.14 (0.03)
0.14 (0.03)
0.01 (0.03)
0.34 (0.09)
20.82 (4.77)
.46
.41
3.37
0.020 (0.016)
0.11 (0.04)
Number of days into election year
N = 36; p < .05. Numbers in parentheses indicate standard errors.
Column 3 of table 3 adds the winner's poll share as an indicator of how well
the winner was expected to do. It is a signal to voters about how likely the
candidate is to win and so should have effects on people's decisions to vote
and how they vote, as discussed above. In the table we see that the winner's
two-candidate poll share does have the expected negative effect on the winner's
vote-poll gap. The coefficient (-.32) should not be taken to imply that poll
leads generally shrink, as we have already seen. Rather, the greater the poll
share, the less the winner's vote share exceeded it: for each additional three
points in poll share, the winner's vote-poll gap declines by one percentage
point. With a poll share of 64 percent, we predict no real difference between
the vote and poll shares.16 With larger shares, we actually would expect the poll
leads to shrink by Election Day. The pattern does not change when substituting
the number of candidates or Obama delegates for the election date, as can be
seen in the fourth and fifth columns of the table. To the extent time mattered
for the relationship between the polls and the vote, therefore, it was mediated
by the effect of the winner's poll share noted earlier. Still, it and the black share
of the population together account for just less than half of the variance in the
vote-poll difference.17
16. This is based on the regression in column 3 ignoring the effects of election date and percentage
black. The poll share at which the expected gap is zero is calculated by dividing the intercept
(20.39) by the absolute value of the coefficient (.32). The estimate is virtually the same
(65 percent) based on a bivariate regression relating the gap and the winner's poll share.
17. Separate analyses suggest that the results are not affected by variation in the intensity of polling
across states. For instance, dropping the seven states with only one poll during the last week does
not substantially change the results. The same is true after dropping the four other states with only
two polls during the last week.
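
The arithmetic behind footnotes 13 and 16 can be checked with a short Python sketch. It is only a worked illustration: the constant and function names are hypothetical, the coefficients are the ones quoted in the text and table 3, and the negative signs on both slopes are taken from the prose (the date has a "negative impact" and the poll share a "negative effect"), since the extracted table does not show them.

```python
# Baseline model (column 1 of table 3): gap = intercept + slope * day.
BASELINE_INTERCEPT = 4.20
BASELINE_SLOPE = -0.027  # per day; sign assumed from the text
LAST_DAY = 154           # length of the nomination calendar in days

# Column 3 of table 3, ignoring election date and percentage black
# as footnote 16 does: gap = intercept + slope * poll_share.
POLL_INTERCEPT = 20.39
POLL_SLOPE = -0.32       # sign assumed from the text


def predicted_gap_by_date(day):
    """Predicted vote-minus-poll gap on a given day of the election year."""
    return BASELINE_INTERCEPT + BASELINE_SLOPE * day


def predicted_gap_by_poll_share(poll_share):
    """Predicted gap for a given two-candidate poll share (in percent)."""
    return POLL_INTERCEPT + POLL_SLOPE * poll_share


def break_even_poll_share():
    """Poll share at which the predicted gap is zero."""
    return -POLL_INTERCEPT / POLL_SLOPE


end_of_calendar_gap = predicted_gap_by_date(LAST_DAY)  # about 0.04 points
break_even = break_even_poll_share()                   # about 64 percent
```

Under these assumptions the baseline gap shrinks from 4.20 points at the start of the year to roughly zero by day 154, and a winner polling above about 64 percent of the two-candidate support is predicted to do worse on Election Day than in the polls.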
Table 4 shows a different structure on the Republican side. In the first column
we can see the association noted earlier between the election date and the vote
poll gap. The effect drops some when the winners poll share is added in the
second column of the table. Thus the effect of time is partly mediated by
poll share. Poll share matters less than for the Democrats, however, and the
relationship between election date and the winners share remains strong, in
contrast with the Democratic side. Incorporating the direct indicators of election
dynamics provides even more purchase. This is clear in the remaining columns
888
Table 4. Regressions Predicting Winners Vote Share Minus Poll Share, Republican Primaries
0.18
(0.07)
0.14
(0.06)
McCain delegates (fraction of total)
McCain delegates (nonlinear)

Winners poll share
Number of days into election year

Intercept
1.38
(0.64)
12.26
(4.65)
0.18
(0.08)
14.58
(4.98)
.36
.30
4.37
0.19
(0.08)
4.42
(6.23)
.35
.30
4.37
0.17
(0.08)
4.83
(5.53)
.40
.35
4.20
5.08
(2.53)
.22
.19
4.69
N = 26; p < .05. Numbers in parentheses indicate standard errors.
26.93
(9.30)
0.17
(0.08)
4.58
(5.42)
.42
.37
4.13
R-squared
Adj. R-squared
Root MSE
889
of table 3. In particular, substituting the number of McCain delegates, either in
linear or nonlinear form, adds to what we would predict based on the date of the
election.18 The growing McCain lead in delegates structured the relationship
between the polls and the vote over the course of the nomination process. This
is as we hypothesized earlier. The effects are both direct and indirect, through
the winner's poll share itself. Together, these factors tell us a good deal about
the polls and the vote in the 2008 Republican nomination process, though they
still leave more than half of the variance unexplained.19
In summary, these results have shown, with both descriptive and multivariate
analysis, that the polling problems in New Hampshire in 2008 were
not the exception, but the rule. The media devoted unusual attention to the
polling problems in New Hampshire because they came early and suggested
that the Democratic primary winner would be Obama instead of Clinton. But
the pattern of underestimating the winner's vote share was present in most of
the primaries. The magnitude of the differences declined over time, and the
patterns were different for the Democrats and the Republicans. The analysis
shows that contextual factors in the primary process themselves had an impact
on the accuracy of the pre-election polls treated as a group, and the effects
differed by the timing and nature of the contest in each party. The implications
of this for our understanding of political campaigns and the conduct of polls in
the prenomination phase of the campaign are discussed in greater detail below.

18. The nonlinear specification involves expressing the delegate share with reference to the critical
.50 value, thereby taking negative values early in the process and positive values later, and then
squaring the differences, retaining the negative or positive sign.
19. When excluding the states where we have only one poll, the models account for approximately
50 percent of the variance in the poll-vote gap.

Discussion and Conclusions

This analysis highlights the role of momentum in the primaries and the problems
that it can cause for pollsters who are estimating the outcome of the contests.
It also reflects indirectly on the impact of information in the primary sequence,
most likely in the form of indicators of candidate. The length of the contest on
the Democratic side, contrasted with the early conclusion on the Republican
side, offered a unique opportunity to observe the varied effects of contextual
factors on the quality of polling estimates.
In general, this analysis demonstrates the challenges pollsters face in
estimating the outcomes of primary elections simply because of exogenous, contextual
factors. Polling in a low information environment like the party primaries is
not easy. Given the number of contests and the large field of contenders, this
is an especially formidable task early in the campaign. But the performance of
the polls improves only slightly as the field is winnowed. This is because other
forces come into play, including strategic voting in terms of both participation
and the expression of candidate preferences. As a result, by the measures that
we employed here, the performance of the polls improved only a little as the
campaign wore on. When there was a presumptive nominee known in one of
the parties, the nature of the estimation problems changed again. There are
other elements of the rules of the game that were not easily tested, such as the
impact of the Republicans' tendency to employ a winner-take-all allocation of
the delegates at stake in the primaries as opposed to the Democrats' use of
proportional allocation, but this is a ripe topic for future research.
The results highlight the impact of shifting candidate fortunes on voter
preferences and poll estimation, at least in relation to the way that they could
be captured with the measures used here. While researchers have long suspected
that a candidate's short- and long-term performance in the primaries
was important, it was usually unmeasured because of the rapidity with which
the presumptive nominees became known. So our results also highlight the
importance of having appropriate resources to run a successful campaign by
enduring beyond the first four to six weeks of the primary calendar. The rate
of winnowing in the early events has always been substantial, but these results
hint at how the reduction in the size of the field, accumulation of delegates,
and the winner's share of the vote in a prior event can affect popular support in
the polls and how this translates into votes at the next event, especially when
the presumptive nominee becomes known. In the 2008 campaign, early closure
only happened on the Republican side, but in subsequent campaigns it might
be possible to observe early underestimation of the winner's share and later
overestimation in either party's contests.
What we do not learn from this analysis is how idiosyncratic the 2008
campaign was on the Democratic side and whether these results would generalize
to other contests or even to the Republicans if they were to witness an
equivalently contested race for their nomination. Looking backward in time, there
were campaigns in the period between 1976 and 1984 where momentum, or
the loss of it, seemed to have been especially important. But we do not have the
storehouse of past polling information, especially at the state level, to evaluate
its impact empirically; it is the proliferation of polls at the state level that made
the current analysis possible. Looking forward, the accumulation of poll data
useful for such analyses may become complicated by the low cost of entry into
the polling business and the uneven data quality that might result. Eventually
it may become necessary to incorporate individual poll estimates and specific
estimates of house effects, if and when such effects become pronounced.
However, this analysis suggests one potentially useful way to consider how well
the nomination process is working and how much pollsters are contributing to
the public's information about the candidates and their understanding of the
nomination process.
While our models have relatively strong explanatory power in social science
terms, they still leave unexplained slightly more than half of the variance.
We have noted above other possible contextual factors that might help to
explain the difference between the primary election results and the pre-election
poll estimates, including the quantity and content of the media coverage each
candidate receives. There also might be more useful measures of candidate
momentum that could be operationalized. All of these research directions are
good ones, though they would require data not readily available. Although this
is an interesting research agenda to pursue, it might prove difficult to assemble
the necessary comparative data in the near future. In 2012 the Democratic
primaries are unlikely to be contested, for example, while the Republicans may
have a very large field and a contest more amenable to analysis. Perhaps in
2016 both prenomination races will be wide open again, and the winnowing
will not be too rapid, enabling researchers to focus on the impact of contextual
factors in those nomination processes.
There is another fruitful area of research that investigates the methods used
by pollsters in the primaries and whether finer adjustments have to be made to
the methods used in the general election. Because of the limited availability of
information provided by the pollsters about their methods, it is not currently
possible to assess the influence of methodological techniques such as the design
of different likely voter models. But this analysis shows that polling in primaries
differs from polling in general elections. The performance of polls varies across
space and time in ways that are beyond the pollsters' control. The structure of
candidate support varies across states and over time in important ways that are
unlikely to be captured by current polling methods, specifically by the stage of
campaign, the size of the field, and the competitiveness of the contest over time
and across states. All of these factors can have a consequence for the success
of pollsters in estimating the outcome of the contest in a specific state.
One central question that this research raises is whether the use of polling
methods developed for general election campaigns can be applied successfully
in the primaries. The first place to start looking for more effective methods
would be with likely voter models, modifying them so that they take into
account the factors that respondents use to assess the current status of the race
in a way that might point to different aspects of possible tactical voting. Of
course, it is possible that individual voters may not be able to express in an
interview the forces that are at work on them in the climate of opinion in which
they are embedded. Pollsters should also devote attention to differentiating the
modeling of likely voting behavior by particular subgroups in the population
that might be affected by particular personal characteristics or policy positions
of the candidates, such as race or gender in the 2008 Democratic campaign.
The difficulty of working on such refinements to current methods does not
mean that pollsters are at a disadvantage, only that their work becomes more
complicated. With additional research, it may even be possible to model the
aggregate effects of momentum and tactical voting on survey estimates. While
this could improve the accuracy of the estimates themselves, it may complicate
the job of pollsters and the journalists who report on their work to explain how
they produce their final estimates of primary outcomes. However, efforts to
improve communication about how they do their work would be justified by
the more accurate results they could produce.
Appendix A:
States with Polls during the Final Week before the Democratic Primary or
Caucus

State                 Primary/caucus date  Winning candidate  Winner's vote share (a)  Winner's poll share (a)
Alabama               February 5           Obama              57.14                    50.24 (2)
Arizona               February 5           Clinton            54.35                    51.14 (3)
California            February 5           Clinton            55.32                    51.17 (12)
Connecticut           February 5           Obama              52.04                    48.89 (3)
Delaware              February 5           Obama              55.79                    48.84 (1)
District of Columbia  February 12          Obama              76.00                    70.00 (1)
Georgia               February 5           Obama              68.04                    59.52 (10)
Illinois              February 5           Obama              66.33                    66.29 (4)
Indiana               May 6                Clinton            51.00                    52.91 (13)
Iowa                  January 3            Obama              56.72                    48.85 (14)
Kentucky              May 20               Clinton            68.42                    67.69 (3)
Massachusetts         February 5           Clinton            57.73                    57.10 (3)
Maryland              February 12          Obama              62.89                    61.45 (6)
Missouri              February 5           Obama              50.52                    48.48 (8)
Mississippi           March 11             Obama              62.24                    58.50 (5)
Montana               June 3               Obama              58.16                    52.17 (1)
Nevada                January 19           Clinton            53.13                    52.88 (5)
New Hampshire         January 8            Clinton            52.00                    47.63 (39)
New Jersey            February 5           Clinton            55.10                    54.00 (12)
New Mexico            February 5           Clinton            50.52                    46.67 (1)
New York              February 5           Clinton            58.76                    59.14 (7)
North Carolina        May 6                Obama              57.14                    54.19 (14)
Ohio                  March 4              Clinton            54.08                    53.53 (13)
Oklahoma              February 5           Clinton            63.95                    68.35 (2)
Oregon                May 20               Obama              59.00                    55.44 (4)
Pennsylvania          April 22             Clinton            55.00                    53.43 (16)
Rhode Island          March 8              Clinton            59.18                    53.16 (1)
South Carolina        January 26           Obama              67.07                    59.26 (14)
South Dakota          June 3               Clinton            55.00                    63.83 (1)
Tennessee             February 5           Clinton            57.84                    61.31 (6)
Texas                 March 4              Clinton            51.52                    50.00 (16)
Utah                  February 5           Obama              59.40                    64.60 (1)
Virginia              February 12          Obama              64.65                    60.33 (7)
Washington            February 9           Obama              68.69                    54.79 (2)
West Virginia         February 5           Clinton            72.04                    72.83 (2)
Wisconsin             February 19          Obama              58.59                    52.84 (5)

NOTE. The numbers in parentheses are the number of polls that were in the field during the week
before the event. The primary and caucus dates come from
http://www.cnn.com/ELECTION/2008/primaries/.
(a) Two-candidate share.
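As an illustration of how the entries in Appendix A relate to the poll-vote gap discussed in the text, the following minimal Python sketch computes the gap for three rows copied from the table. It is not the authors' code. The function names are hypothetical, the sign convention (vote share minus poll share, so that positive values mean the polls understated the winner) is an assumption, and the two-candidate share formula is one plausible reading of the table note rather than a documented procedure.

```python
# Illustrative sketch only; helper names are hypothetical, not the authors' code.

# (state, winner's vote share, winner's poll share), copied from Appendix A.
APPENDIX_A_ROWS = [
    ("Iowa", 56.72, 48.85),
    ("New Hampshire", 52.00, 47.63),
    ("Indiana", 51.00, 52.91),
]


def two_candidate_share(winner_pct, runner_up_pct):
    """Winner's percentage of the support held by the top two candidates.

    Assumed reading of the "two-candidate share" table note.
    """
    return 100.0 * winner_pct / (winner_pct + runner_up_pct)


def poll_vote_gap(vote_share, poll_share):
    """Vote share minus poll share, in points (assumed sign convention)."""
    return round(vote_share - poll_share, 2)


gaps = {state: poll_vote_gap(vote, poll) for state, vote, poll in APPENDIX_A_ROWS}

for state, gap in gaps.items():
    if gap > 0:
        direction = "underestimated"
    elif gap < 0:
        direction = "overestimated"
    else:
        direction = "matched"
    print(f"{state}: gap {gap:+.2f} points, winner {direction}")
```

Under these assumptions the gap is +7.87 points for Iowa and +4.37 for New Hampshire, where the polls understated the winner, and -1.91 for Indiana, where they overstated the winner.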
Appendix B:
States with Polls during the Final Week before the Republican Primary or
Caucus

State                 Primary/caucus date  Winning candidate  Winner's vote share (a)  Winner's poll share (a)
Alabama               February 5           Huckabee           52.86                    44.56 (7)
Arizona               February 5           McCain             58.02                    60.15 (2)
California            February 5           McCain             55.26                    50.42 (11)
Connecticut           February 5           McCain             61.18                    63.21 (3)
Delaware              February 5           McCain             57.69                    53.95 (1)
Florida               January 29           McCain             53.73                    49.12 (29)
Georgia               February 5           Huckabee           51.52                    46.18 (10)
Illinois              February 5           McCain             61.84                    61.94 (4)
Iowa                  January 3            Huckabee           57.63                    52.23 (15)
Massachusetts         February 5           Romney             55.44                    62.64 (4)
Maryland              February 12          McCain             65.48                    69.92 (4)
Michigan              January 15           Romney             56.52                    50.74 (11)
Missouri              February 5           McCain             50.77                    54.34 (8)
Nevada                January 19           Romney             78.46                    77.88 (3)
New Hampshire         January 8            McCain             53.62                    53.49 (30)
New Jersey            February 5           McCain             66.27                    66.93 (9)
New York              February 5           McCain             64.56                    69.64 (10)
Ohio                  March 4              McCain             65.93                    68.76 (8)
Oklahoma              February 5           McCain             52.86                    60.16 (2)
South Carolina        January 19           McCain             52.38                    54.42 (16)
Tennessee             February 5           Huckabee           50.00                    50.00 (7)
Texas                 March 4              McCain             57.30                    62.92 (12)
Utah                  February 5           Romney             94.70                    95.50 (1)
Virginia              February 12          McCain             54.95                    64.75 (5)
Washington            February 19          McCain             52.00                    65.64 (3)
Wisconsin             February 19          McCain             59.78                    59.87 (4)

NOTE. The numbers in parentheses are the number of polls that were in the field during the week
before the event. The primary and caucus dates come from
http://www.cnn.com/ELECTION/2008/primaries/.
(a) Two-candidate share.
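Note 18 describes the nonlinear delegate measure used in the models of the Republican contests: the delegate share is centered on the critical .50 value and the difference is squared while its sign is retained. A minimal Python sketch of that transformation follows. The function name and the example shares are hypothetical, and the sketch assumes the delegate share is expressed as a fraction between 0 and 1.

```python
# Illustrative sketch of the transformation described in note 18.
# The function name and the input values are hypothetical.


def signed_squared_share(delegate_share, pivot=0.50):
    """Square the distance from the pivot while keeping its sign.

    Shares below the pivot give negative values, shares above it give
    positive values, and the measure changes fastest far from the pivot.
    """
    diff = delegate_share - pivot
    return diff * abs(diff)


# Hypothetical delegate shares early, midway, and late in the calendar.
for share in (0.10, 0.50, 0.90):
    print(f"share {share:.2f} -> nonlinear measure {signed_squared_share(share):+.4f}")
```

Multiplying the difference by its own absolute value is simply a compact way to square it without losing the sign, so the measure stays negative until a candidate passes half of the delegates.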
