Week 10: Model Evaluation

ShuXin Ho

2024/11/17

Election Blog

Recap of My Model and Predictions

Prior to the 2024 presidential election, I developed a weighted ensemble model using super learning. The ensemble weights were determined from the out-of-sample performance of multiple OLS models built on different predictors, including lagged vote share, economic indicators, polling data, demographic variables, incumbency, and combinations of these.

My final forecast predicted that Kamala Harris would win the national two-party popular vote with 56.76% but lose the Electoral College, securing only 226 electoral votes. The model correctly predicted the results in the battleground states of Arizona, Georgia, Michigan, Nevada, North Carolina, Pennsylvania, and Wisconsin, all of which voted for Donald Trump.

Electoral College Vote Share Evaluation

Below is the Electoral College map showing the election outcome, which matched my model’s predictions.

The following bubble map shows county-level results, with each bubble sized by the total number of votes cast in that county, to better visualize the vote distribution. Democratic votes are concentrated in densely populated urban areas, while Republican votes cover a broader geographic area, as shown by the widespread red bubbles.

My prediction error for each state is shown in the graph below. Despite these errors, the model still predicted the correct winner of the two-party popular vote in every state.

## Bias: -7.638292

Despite predicting the Electoral College outcome correctly, my prediction for the national two-party popular vote share was far off: as of November 18th, I had overestimated the Democratic Party’s popular vote share by 7.64 percentage points.
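To make the bias figure concrete, here is a minimal sketch of how such an error could be computed. It assumes bias is defined as actual minus predicted (so a negative value means overprediction), which matches the sign of the reported number; the actual vote share used here is not a separate data source but is simply implied by the 56.76% forecast and the reported bias. The function names are illustrative.

```python
from statistics import mean

def bias(actual, predicted):
    """Mean signed error, actual minus predicted (negative = overprediction)."""
    return mean(a - p for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean squared error over paired actual/predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted)) ** 0.5

# National two-party Democratic vote share, in percent.
predicted_dem = [56.76]  # forecast stated in the post
actual_dem = [49.12]     # assumption: implied by the forecast and the reported bias

national_bias = bias(actual_dem, predicted_dem)
national_rmse = rmse(actual_dem, predicted_dem)
print(round(national_bias, 2), round(national_rmse, 2))
```

The same two functions apply unchanged to state-level lists of actual and predicted vote shares.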

I will therefore focus on evaluating why my national two-party popular vote share model was inaccurate. I hypothesize a few reasons for the inaccuracy and propose a corresponding change for each.

1. Unique circumstance of incumbency

The Harris-Trump matchup presented a unique incumbency scenario: Harris was the sitting vice president, while Trump was a former president. My initial model did not fully account for this dynamic.

Change: Replace the \(IncumbentPresident\) variable with \(PrevAdmin\) to better capture the influence of prior administrations in such scenarios.

2. Polling model shortcomings

My polling model’s equation for Democratic vote share is $$D\_pv2p = D\_NetLatest538PollAverage + D\_NetMean538PollAverage(30 weeks) + NetLatestJobApproval + NetMeanJobApproval(June-Oct)$$ I fit the same specification separately for Republican vote share, then rescale the two predictions so that they sum to 100%. However, predictors like net job approval and polling averages are meaningful only when used to model the incumbent party’s vote share, which is not the case in my model, since it uses Democratic Party and Republican Party vote shares as the response variables.

Change: Re-run the polling model with the inclusion of dummy variables for \(PrevAdmin\) and \(IncumbentParty\).
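The rescaling step in the polling model, where the separately predicted Democratic and Republican shares are normalized to sum to 100%, can be sketched as follows. This is an illustration under the assumption of simple proportional rescaling; the function name and the raw input values are hypothetical, not the model’s actual output.

```python
def rescale_two_party(dem_raw, rep_raw):
    """Proportionally rescale two raw predictions so they sum to 100."""
    total = dem_raw + rep_raw
    if total <= 0:
        raise ValueError("raw predictions must sum to a positive number")
    return 100.0 * dem_raw / total, 100.0 * rep_raw / total

# Hypothetical raw predictions from the two separate regressions.
dem_share, rep_share = rescale_two_party(51.0, 50.0)
print(round(dem_share, 2), round(rep_share, 2))
```

Proportional rescaling preserves the ratio between the two raw predictions, so it never changes which party is predicted to win.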

3. Poor predictability of demographics

The demographic model exhibited high variability in out-of-sample MSE, leading to a low weight in the ensemble model.

Change: Remove demographics as a predictor variable to simplify the model and reduce variability.

Table 1: In-Sample and Out-of-Sample MSEs with Ensemble Weights by Year (National)

| Year | In-Sample Economy MSE | In-Sample Polling MSE | In-Sample Combined MSE | In-Sample Ensemble MSE | Out-of-Sample Economy MSE | Out-of-Sample Polling MSE | Out-of-Sample Combined MSE | Out-of-Sample Ensemble MSE | Economy Weight | Polling Weight | Combined Weight |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2020 | 4.837059 | 1.9122978 | 0 | 1.5612281 | 2.317541e+04 | 13.6184627 | 65477.451298 | 0.000000 | 0.1391090 | 0.7895167 | 0.0713743 |
| 2016 | 12.296768 | 2.3444192 | 0 | 3.5292405 | 8.347200e-03 | 4.7241477 | 378.947779 | 0.000000 | 0.4697574 | 0.4789688 | 0.0512738 |
| 2012 | 12.148008 | 0.5810492 | 0 | 1.4386519 | 4.862322e+00 | 39.7400609 | 23.692840 | 0.000000 | 0.3319988 | 0.2255234 | 0.4424777 |
| 2008 | 3.710408 | 1.4376414 | 0 | 0.4612257 | 2.049025e+03 | 33.7935739 | 393.869633 | 0.000000 | 0.2239442 | 0.3751625 | 0.4008934 |
| 2004 | 12.268307 | 2.0717113 | 0 | 8.1153974 | 3.388892e-01 | 5.9350622 | 9.560224 | 0.000000 | 0.8102364 | 0.1754739 | 0.0142897 |
| 2000 | 12.188676 | 2.4406611 | 0 | 3.7118069 | 1.979935e+00 | 0.0577291 | 378.793680 | 0.000000 | 0.5017074 | 0.4563862 | 0.0419064 |
| 1996 | 8.265306 | 2.4359492 | 0 | 2.1421979 | 4.938677e+01 | 0.2216226 | 27.600333 | 0.000000 | 0.0410015 | 0.9304671 | 0.0285315 |
| 1992 | 7.884249 | 2.4281347 | 0 | 2.2868915 | 6.602771e+01 | 0.4192285 | 511.775086 | 0.000000 | 0.0045835 | 0.9693198 | 0.0260966 |
| 1988 | 11.576491 | 2.1061987 | 0 | 0.0000000 | 9.176773e+00 | 5.2244662 | 2.737042 | 2.737042 | 0.0000000 | 0.0000000 | 1.0000000 |
| 1984 | 4.109527 | 2.4359492 | 0 | 0.4110531 | 1.407379e+02 | 0.2216226 | 27.600333 | 0.000000 | 0.1941586 | 0.3371914 | 0.4686500 |
| 1980 | 8.974900 | 0.1700332 | 0 | 1.2043133 | 5.151202e+02 | 87.3821839 | 4381.192756 | 0.000000 | 0.3489344 | 0.4656557 | 0.1854098 |
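As an illustration of how the ensemble weights in Table 1 are used, the sketch below combines three component predictions into one ensemble prediction as a weighted sum. The weights are the 2020 row of Table 1; the component predictions are hypothetical placeholders, and the function name is illustrative. It shows only the combination step, not how the super learner estimates the weights.

```python
# Ensemble weights from the 2020 row of Table 1.
weights_2020 = {"economy": 0.1391090, "polling": 0.7895167, "combined": 0.0713743}

def ensemble_prediction(component_preds, weights, tol=1e-6):
    """Weighted sum of component predictions; weights must sum to 1."""
    total = sum(weights.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"weights sum to {total}, expected 1")
    return sum(weights[name] * component_preds[name] for name in weights)

# Hypothetical Democratic two-party vote share predictions (percent).
hypothetical_preds = {"economy": 50.0, "polling": 52.0, "combined": 48.0}

ensemble = ensemble_prediction(hypothetical_preds, weights_2020)
print(round(ensemble, 2))
```

Because the weights are non-negative and sum to 1, the ensemble prediction always lies between the smallest and largest component prediction.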

| Year | Party | Predicted Vote Share (%) | Winner |
|---|---|---|---|
| 2024 | Democrat | 49.0203 | FALSE |
| 2024 | Republican | 50.9797 | TRUE |
## Bias: 0.1039336

Incorporating the changes above, my revised model predicted the national two-party popular vote share with an error of about 0.1 percentage points.

Beyond the changes above, I propose the following modifications for future election prediction models.

1. Accounting for voter turnout

Hypothesis: Lower turnout among traditionally Democratic voters, possibly due to dissatisfaction with the administration’s handling of difficult issues such as the Gaza conflict, reduced Harris’s vote share.

Test: Use voter file data to compare 2024 turnout rates by demographic group with those of previous elections, and assess whether historically Democratic groups (such as younger voters and minority ethnic groups) saw a decline in turnout.
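A minimal sketch of this turnout comparison is shown below. All group names and counts are made-up placeholders for illustration, not real voter file data, and the function names are hypothetical.

```python
def turnout_rate(ballots_cast, eligible_voters):
    """Turnout as a percentage of eligible voters."""
    if eligible_voters <= 0:
        raise ValueError("eligible_voters must be positive")
    return 100.0 * ballots_cast / eligible_voters

def turnout_changes(previous, current):
    """Percentage-point change in turnout for each group present in both years."""
    changes = {}
    for group, (cast_prev, elig_prev) in previous.items():
        cast_now, elig_now = current[group]
        changes[group] = turnout_rate(cast_now, elig_now) - turnout_rate(cast_prev, elig_prev)
    return changes

# Placeholder (ballots cast, eligible voters) pairs; NOT real data.
placeholder_2020 = {"age_18_29": (500, 1000), "age_65_plus": (700, 1000)}
placeholder_2024 = {"age_18_29": (450, 1000), "age_65_plus": (710, 1000)}

changes = turnout_changes(placeholder_2020, placeholder_2024)
declined = sorted(group for group, change in changes.items() if change < 0)
print(changes, declined)
```

With real data, the groups in `declined` would be the ones to examine for their historical partisan lean.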

2. Economic reality versus perception

I will incorporate economic variables that more accurately measure voters’ perception of the economy. People might still be recovering from the COVID-19 economic downturn, so unemployment, GDP, and RDPI figures may not reflect the full extent of how voters perceive the economy.

Hypothesis: My economic predictors (such as unemployment, GDP, S&P 500 volume, and real disposable personal income) failed to capture voters’ subjective perceptions of the economy. For example, while unemployment rates were low, food price inflation (which rose as high as 10.8% in April 2022) or the lingering effects of the COVID-19 pandemic might have weighed more heavily on voters’ decisions.

Test: Analyze survey data on voters’ perception of economic conditions and correlate these perceptions with vote choices.
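One simple way to start on this test is a Pearson correlation between respondents’ economic perception scores and their vote choice. The sketch below assumes a hypothetical survey coding (perception on a 1 to 5 scale, vote coded 1 for the incumbent party and 0 otherwise); the data are placeholders, not real survey responses.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Placeholder responses; NOT real survey data.
perception = [1, 2, 3, 4, 5]      # 1 = economy much worse, 5 = much better
incumbent_vote = [0, 0, 1, 1, 1]  # 1 = voted for the incumbent party

r = pearson_r(perception, incumbent_vote)
print(round(r, 3))
```

With a binary outcome, a logistic regression would be the more standard follow-up; the correlation is only a first look at the direction and strength of the relationship.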

I will also expand economic variables:

3. Adjusting for airwar according to latest trends

Hypothesis: While traditional airwar analysis focuses on campaign spending on television advertisements, modern media such as podcasts, social media, and celebrity endorsements may play a significant role in shaping public opinion, especially among younger voters who are tech-savvy or older voters who have a lot of time to spend on their devices. For example, Trump’s appearance on Joe Rogan’s podcast, Taylor Swift’s endorsement of Harris, or interactions on platforms like Facebook, Instagram, and TikTok may influence voter sentiment in ways that are not captured by traditional ad spending metrics.

Test: Collect data on the following metrics and compare them to the candidates’ vote share in specific demographic groups (such as younger voters) to evaluate their predictive power.

Metrics include: