pollster-ratings/raw-polls: 3

This directory contains the data behind FiveThirtyEight's pollster ratings.

See also:

  • FiveThirtyEight's Pollster Ratings
  • The Death Of Polling Is Greatly Exaggerated
  • The State Of The Polls, 2019
  • The Polls Are All Right
  • The State Of The Polls, 2016
  • How FiveThirtyEight Calculates Pollster Ratings

Past data:

  • 2014
  • 2016
  • 2018
  • 2019
  • 2020

pollster-stats-full.xlsx contains a spreadsheet with all of the summary data and calculations involved in determining the pollster ratings as well as descriptions for each column.

pollster-ratings.csv has ratings and calculations for each pollster. A copy of this data and descriptions for each column can also be found in pollster-stats-full.xlsx.

raw-polls.csv contains all of the polls analyzed to give each pollster a grade. Descriptions for each column are in the table below.

Header Definition
pollno FiveThirtyEight poll ID number
race Election polled
year Year of election (not year of poll)
location Location (state or Congressional district, or "US" for national polls)
type_simple Type of election (5 categories)
type_detail Detailed type of election (this distinguishes between Republican and Democratic primaries, for example, whereas type_simple does not)
pollster Pollster name
methodology Methodology used to conduct this poll. One or more of the following values:
  • Live Phone: Live telephone interviews, which may or may not include calls to cell phones
  • IVR: Interactive voice response, otherwise known as automated polls or "robopolls"
  • Mail: By U.S. mail or other "snail mail" service
  • Online: Poll conducted over the Internet, generally by web browser or by application-based polling of mobile phones
  • Text: Poll conducted by text message
  • Face-to-face: Poll conducted in person
Mixed method polls list the methods involved in the poll separated by slashes.
partisan Flag for internal/partisan poll. "D" indicates Democratic poll, "R" indicates Republican poll, "I" indicates poll put out by independent candidate's campaign. Note that different sources define these categories differently and our categorization will often reflect the original source's definition. In other words, these definitions may be inconsistent and should be used carefully.
polldate Median field date of the poll
samplesize Sample size of the poll. Where missing, this is estimated from the poll's margin of error, or similar polls conducted by the same polling firm. A sample size of 600 is used if no better estimate is available.
cand1_name Name of Candidate #1. Candidates #1 and #2 are defined as the top two finishers in the election (regardless of whether or not they were the top two candidates in the poll). In races where a Democrat and a Republican were the top two finishers, Candidate #1 is the Democrat.
cand1_party Party of Candidate #1.
cand1_pct Candidate #1's share of the vote in the poll.
cand2_name Name of Candidate #2. Candidates #1 and #2 are defined as the top two finishers in the election (regardless of whether or not they were the top two candidates in the poll). In races where a Democrat and a Republican were the top two finishers, Candidate #2 is the Republican.
cand2_party Party of Candidate #2.
cand2_pct Candidate #2's share of the vote in the poll.
cand3_pct Share of the vote for the top candidate listed in the poll, other than Candidate #1 and Candidate #2.
margin_poll Projected margin of victory (or defeat, if negative) for Candidate #1. This is calculated as cand1_pct - cand2_pct. In races between a Democrat and a Republican, positive values indicate a Democratic lead; negative values indicate a Republican lead.
electiondate Date of election
cand1_actual Actual share of vote for Candidate #1
cand2_actual Actual share of vote for Candidate #2
margin_actual Actual margin in the election. This is calculated as cand1_actual - cand2_actual. In races between a Democrat and a Republican, positive values indicate a Democratic win; negative values a Republican win.
error Absolute value of the difference between the polled and actual margins. This is calculated as abs(margin_poll - margin_actual).
bias Statistical bias of the poll. This is calculated only for races in which the top two finishers were a Democrat and a Republican. It is calculated as margin_poll - margin_actual. Positive values indicate a Democratic bias (the Democrat did better in the poll than the election). Negative values indicate a Republican bias.
rightcall Flag to indicate whether the pollster called the outcome correctly, i.e. whether the candidate they had listed in 1st place won the election. A 1 indicates a correct call and a 0 an incorrect call; 0.5 indicates that the pollster had two or more candidates tied for the lead and one of the tied candidates won.
comment Additional information, such as alternate names for the poll.
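
The formulas for margin_poll, margin_actual, error, bias and rightcall in the table above can be sketched in a few lines of Python. This is only an illustration, not FiveThirtyEight's own code: the function names derived_columns and rightcall_top_two are hypothetical, rounding to two decimals is an assumption, and the rightcall helper assumes the poll's leader is either Candidate #1 or Candidate #2 (it ignores the case where a third candidate led the poll). The inputs are the figures from the FM3 Research poll of the 1998 Nevada Senate race listed on this page.

```python
def derived_columns(cand1_pct, cand2_pct, cand1_actual, cand2_actual,
                    dem_vs_rep=True):
    """Recompute the derived columns from the poll and election shares.

    dem_vs_rep is a hypothetical flag: bias is only defined when the top
    two finishers were a Democrat and a Republican.
    """
    margin_poll = cand1_pct - cand2_pct
    margin_actual = cand1_actual - cand2_actual
    error = abs(margin_poll - margin_actual)
    bias = (margin_poll - margin_actual) if dem_vs_rep else None
    return {
        # Rounding to 2 decimals is an assumption made to hide float noise.
        "margin_poll": round(margin_poll, 2),
        "margin_actual": round(margin_actual, 2),
        "error": round(error, 2),
        "bias": None if bias is None else round(bias, 2),
    }


def rightcall_top_two(margin_poll, margin_actual):
    """Simplified rightcall, assuming the poll leader is Candidate #1 or #2."""
    if margin_poll == 0:
        return 0.5  # poll had the two candidates tied
    same_sign = (margin_poll > 0) == (margin_actual > 0)
    return 1.0 if same_sign else 0.0


# Harry Reid (DEM) 49.0 vs. John Ensign (REP) 44.0 in the poll;
# 47.86 vs. 47.77 in the election.
result = derived_columns(49.0, 44.0, 47.86, 47.77)
call = rightcall_top_two(result["margin_poll"], result["margin_actual"])
print(result, call)
```

The recomputed values agree with the stored row: a polled margin of 5.0, an actual margin of 0.09, an error and bias of 4.91, and a rightcall of 1.0.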

Data license: CC Attribution 4.0 License · Data source: fivethirtyeight/data on GitHub · About: simonw/fivethirtyeight-datasette


Column Value
rowid 3
poll_id 26026
question_id 31266
race_id 1736
year 1998
race 1998_Sen-G_NV
location NV
type_simple Sen-G
type_detail Sen-G
pollster FM3 Research
pollster_rating_id 91
methodology Live Phone
partisan D
polldate 10/13/1998
samplesize 488.0
cand1_name Harry Reid
cand1_id 3964
cand1_party DEM
cand1_pct 49.0
cand2_name John Ensign
cand2_id 3965
cand2_party REP
cand2_pct 44.0
cand3_pct (blank)
margin_poll 5.0
electiondate 11/3/1998
cand1_actual 47.86
cand2_actual 47.77
margin_actual 0.09
error 4.91
bias 4.91
rightcall 1.0
advancedplusminus -1.023783833419476
comment for unspecified Democratic sponsor
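
When read from raw-polls.csv, the values in a row like the one above arrive as strings. The following is a minimal sketch of converting a few of them to Python types, under some assumptions: dates follow the month/day/year pattern seen in this row, mixed-method polls separate methods with slashes as the column description states, and empty cells (such as cand3_pct here) should become None. The function name parse_poll_row and the output key names are hypothetical.

```python
from datetime import datetime

# Assumed from values such as "10/13/1998" and "11/3/1998" in this row.
DATE_FORMAT = "%m/%d/%Y"


def parse_poll_row(row):
    """Convert selected string fields of one raw-polls row to Python types."""
    polldate = datetime.strptime(row["polldate"], DATE_FORMAT).date()
    electiondate = datetime.strptime(row["electiondate"], DATE_FORMAT).date()
    cand3 = row.get("cand3_pct", "").strip()
    return {
        # "IVR/Online" -> ["IVR", "Online"]; a single method gives a 1-item list
        "methods": [m.strip() for m in row["methodology"].split("/")],
        # "D", "R" or "I" for partisan polls; None when the cell is empty
        "partisan": row["partisan"].strip() or None,
        "polldate": polldate,
        "electiondate": electiondate,
        "days_before_election": (electiondate - polldate).days,
        "samplesize": float(row["samplesize"]),
        "cand3_pct": float(cand3) if cand3 else None,
    }


example = {
    "methodology": "Live Phone",
    "partisan": "D",
    "polldate": "10/13/1998",
    "electiondate": "11/3/1998",
    "samplesize": "488.0",
    "cand3_pct": "",
}
parsed = parse_poll_row(example)
print(parsed)
```

To process the whole file, the same function could be applied to each dictionary yielded by csv.DictReader from the standard library.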