Election Blog
Week 8: Shocks: Game Changer or Status Quo?
As Florida residents woke up to the devastation left by the recent hurricanes, Democratic presidential candidate Kamala Harris, who is also the current vice president, stepped in to provide support and demonstrate leadership. Meanwhile, her challenger, Republican candidate Donald Trump, criticized the government’s poor management of the crisis and pledged his support to those affected.
This brings up an important question: Do shocks and unexpected events, such as hurricanes, scandals, and protests, truly influence election outcomes? Or are they just temporary disruptions that do not change the political landscape? Sides and Vavreck (2013) hypothesized that such game changers may shape short-term narratives, but that the fundamentals ultimately determine the election outcome and reinforce the status quo.
In states where the Democratic Party won in the previous election (shown in blue), the two-party vote share has decreased slightly but remains above 50%. In contrast, states that did not vote Democratic in the last election (shown in red) show a slight increase in Democratic vote share, but with more variability.
However, removing the 1996 data points from the graph tells a different story: states with more hurricanes tend to have higher Democratic vote shares, regardless of prior voting history. Even so, the relationship between hurricanes and vote share seems less dramatic, supporting the idea that shocks like hurricanes tend to maintain the status quo rather than being true game changers in the electoral landscape.
Therefore, while unexpected shocks present candidates with an opportunity to appeal to voters, they may not have a lasting impact on election outcomes. For this reason, I’ve decided not to include these variables in my final prediction model.
Building the Final Election Prediction Model
Consideration: Economic Variables
Building on my findings from Week 2, where I found that GDP growth and the unemployment rate in the second quarter of the election year are the best predictors of election outcomes, I’ve now expanded my analysis to include additional economic variables that better capture consumers’ perceptions of the economy. While GDP growth and unemployment are important, they don’t always translate into immediate effects that voters feel. There can be a time lag or disconnect between macroeconomic indicators and individual experiences.
To address this, I am incorporating the Index of Consumer Sentiment measured by the University of Michigan into my national two-party popular vote share and electoral college share predictions. The index gauges consumer sentiment through survey questions like, “Would you say that you (and your family living there) are better off or worse off financially than you were a year ago?” and “Looking ahead, which would you say is more likely: that in the country as a whole we’ll have continuous good times during the next five years, or that we will have periods of widespread unemployment or depression?” These additional variables offer a more direct reflection of how voters perceive the economy and may provide a more accurate prediction of election outcomes.
Random forest
Candidate | Predicted Two-Party Vote Share (%) |
---|---|
Kamala Harris | 47.73014 |
Donald Trump | 52.26986 |
The random forest model shows that the Index of Consumer Sentiment is indeed one of the most important predictors of the national two-party popular vote share. Based on economic fundamentals alone, I predict that Harris will receive 47.73% of the popular vote. The test error is approximately 6.76 percentage points, while the training error is 0 percentage points. A gap this large between training and test error suggests the model may be overfitting.
Consideration: Demographic Variables
As noted in Week 5, demographic variables are fairly important predictors of voting behavior, so I will use the 1% voter file data from October 2024 to capture demographic distributions in my final state-level electoral college share prediction.
Consideration: Polling Averages
To account for historical trends in polling data, I will incorporate polling data from 1992 onward, a period when polling methodologies became more sophisticated. Starting from this election cycle also factors in significant realignments in American politics, so my model can reflect current ideological divides.
Candidate | Predicted Two-Party Vote Share (%) |
---|---|
Kamala Harris | 50.71448 |
Donald Trump | 49.28552 |
My prediction of Harris receiving 50.7% of the popular vote and Trump 49.3% comes with uncertainty. The MSE of 0.984 corresponds to a root mean squared error of about 0.992 percentage points, so the typical gap between my prediction and the true vote share in this case would be roughly one percentage point. The errors in both models, especially the 6.76-point test error of the economic model, help explain why the economic variables and the polling data produce different predictions, which I will reconcile in my final prediction model.
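The step from MSE to an error in percentage points is just a square root. Here is a minimal sketch in Python, assuming the reported MSE is measured in squared percentage points. The variable names are illustrative and are not taken from the code behind the model.

```python
import math

# Reported MSE of the polling-based national model
# (assumed to be in squared percentage points).
national_mse = 0.984

# RMSE is the square root of the MSE, which puts the error
# back on the same scale as the vote share itself.
national_rmse = math.sqrt(national_mse)

print(f"RMSE: {national_rmse:.3f} percentage points")
```

Strictly speaking, RMSE weights large misses more heavily than a plain average of absolute errors would, so it is better read as a typical error size than as a literal average difference.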
Party | Year | State | Predicted Two-Party Vote Share (%) |
---|---|---|---|
democrat | 2024 | Arizona | 50.36323 |
republican | 2024 | Arizona | 49.63677 |
democrat | 2024 | California | 57.64164 |
republican | 2024 | California | 42.35836 |
democrat | 2024 | Florida | 50.27411 |
republican | 2024 | Florida | 49.72589 |
democrat | 2024 | Georgia | 50.45827 |
republican | 2024 | Georgia | 49.54173 |
democrat | 2024 | Maryland | 51.08386 |
republican | 2024 | Maryland | 48.91614 |
democrat | 2024 | Michigan | 50.10148 |
republican | 2024 | Michigan | 49.89852 |
democrat | 2024 | Minnesota | 50.47868 |
republican | 2024 | Minnesota | 49.52132 |
democrat | 2024 | Missouri | 49.54522 |
republican | 2024 | Missouri | 50.45478 |
democrat | 2024 | Montana | 49.58290 |
republican | 2024 | Montana | 50.41710 |
democrat | 2024 | Nebraska | 49.71781 |
republican | 2024 | Nebraska | 50.28219 |
democrat | 2024 | Nebraska CD 2 | 50.08388 |
republican | 2024 | Nebraska CD 2 | 49.91612 |
democrat | 2024 | Nevada | 50.56689 |
republican | 2024 | Nevada | 49.43311 |
democrat | 2024 | New Hampshire | 50.45168 |
republican | 2024 | New Hampshire | 49.54832 |
democrat | 2024 | New Mexico | 50.31918 |
republican | 2024 | New Mexico | 49.68082 |
democrat | 2024 | New York | 50.81280 |
republican | 2024 | New York | 49.18720 |
democrat | 2024 | North Carolina | 50.39013 |
republican | 2024 | North Carolina | 49.60987 |
democrat | 2024 | Ohio | 49.20338 |
republican | 2024 | Ohio | 50.79662 |
democrat | 2024 | Pennsylvania | 50.50616 |
republican | 2024 | Pennsylvania | 49.49384 |
democrat | 2024 | Texas | 49.00246 |
republican | 2024 | Texas | 50.99754 |
democrat | 2024 | Virginia | 50.04569 |
republican | 2024 | Virginia | 49.95431 |
democrat | 2024 | Wisconsin | 50.44591 |
republican | 2024 | Wisconsin | 49.55409 |
At the state level, an MSE of 69.3 corresponds to a root mean squared error of about 8.3 percentage points, meaning a typical prediction misses the actual vote share by roughly that much. An error of this size could mean miscalling which party wins a state, especially in battleground states where races are often decided by margins of 1-2 percentage points.
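To get a rough sense of what an error this large means for a close state, the sketch below treats the prediction error as normally distributed with mean zero and a standard deviation equal to the RMSE. That normality assumption and the helper `prob_below_half` are mine, added for illustration only. They are not part of the fitted model. The Arizona value is the Democratic prediction from the table above.

```python
import math
from statistics import NormalDist

state_mse = 69.3                    # reported state-level MSE
state_rmse = math.sqrt(state_mse)   # about 8.3 percentage points


def prob_below_half(predicted_share: float, rmse: float) -> float:
    """Probability that the true two-party share falls under 50%,
    ASSUMING errors are normal with mean 0 and sd equal to rmse."""
    return NormalDist(mu=predicted_share, sigma=rmse).cdf(50.0)


arizona_dem = 50.36323  # predicted Democratic share from the table
p = prob_below_half(arizona_dem, state_rmse)
print(f"RMSE = {state_rmse:.1f}, P(Dem share < 50%) = {p:.2f}")
```

Under these assumptions, a state predicted at 50.4% is close to a coin flip, which is why the state-level polling model cannot reliably call battleground states.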
To find the cause of the high state-level MSE, I plotted the residuals as a heat map. Certain years, like 2008 and 2020, show consistently high residuals across states, indicating that my model may miss critical time-specific factors. Over- and underestimation in individual states suggests that unique state-level dynamics are not fully captured by polling data, leading me to limit polling data to national predictions only.
Prediction: National Two-Party Popular Vote Share and Electoral College Seats
I will predict the national two-party popular vote share and electoral college seats by evaluating the predictive accuracy of these three models: (1) super learning using a weighted ensemble of OLS models, (2) random forest, and (3) regularized regression. To enhance the robustness of these predictions, I will also conduct 1,000 simulations to provide not only point estimates but also prediction intervals.
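One simple way to turn a point estimate into a prediction interval by simulation is sketched below. It is not the final model: it assumes normally distributed errors with a standard deviation equal to the RMSE, and it reuses the polling-based national estimate (50.71448) and its MSE (0.984) purely as example inputs. The function `simulate_interval` and its parameters are hypothetical names chosen for this illustration.

```python
import math
import random
from statistics import quantiles


def simulate_interval(point_estimate, rmse, n_sims=1000, level=0.95, seed=2024):
    """Draw n_sims vote shares around the point estimate and return the
    central `level` interval. ASSUMES normal errors with sd equal to rmse."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = [rng.gauss(point_estimate, rmse) for _ in range(n_sims)]
    # quantiles(n=1000) returns 999 cut points spaced 0.1% apart.
    cuts = quantiles(draws, n=1000)
    lower_idx = round((1 - level) / 2 * 1000) - 1  # 2.5% cut point
    upper_idx = round((1 + level) / 2 * 1000) - 1  # 97.5% cut point
    return cuts[lower_idx], cuts[upper_idx]


lower, upper = simulate_interval(50.71448, math.sqrt(0.984))
print(f"95% prediction interval: {lower:.2f} to {upper:.2f}")
```

If the resulting interval straddles 50%, the point estimate alone should not be read as a confident call for either candidate.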
Code developed with the assistance of ChatGPT.