Why can computers beat humans at chess but not predict election outcomes with great precision? In 2020, most experts forecast that Biden would win by a margin large enough to avoid the kind of quibbling and recounts we are now seeing. I don’t write this as a criticism of Nate Silver or any other high-profile forecaster. I’m thinking it through as a data scientist.
First, consider a successful application of modern data mining. How did AlphaZero “learn” to play chess? It played millions of games against itself and kept the strategies that looked successful ex post. AlphaZero has excellent data and lots of it.
Actual election outcomes are a different story: there aren’t enough observations to expect accurate forecasts. If each presidential election is one observation, then there have been fewer than 60 since the founding. No data scientist would want to work with so few data points.
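To put a rough number on why a few dozen observations is uncomfortable, here is a minimal sketch using the textbook standard error of a sample proportion. It assumes independent, identically distributed observations, which real elections certainly are not, so treat it as an optimistic bound rather than a model of forecasting. The function name and the round sample sizes (50 and 2,000) are my own choices for illustration.

```python
import math


def standard_error_of_proportion(n: int, p: float = 0.5) -> float:
    """Standard error of a sample proportion, assuming n independent draws.

    p = 0.5 is the worst case (largest standard error).
    """
    return math.sqrt(p * (1 - p) / n)


# Compare a small sample with a larger one (both sizes are illustrative).
for n in (50, 2000):
    se = standard_error_of_proportion(n)
    margin = 1.96 * se  # approximate 95% margin of error
    print(f"n={n:>5}: standard error ~ {se:.3f}, 95% margin ~ +/-{margin:.3f}")
```

Under these assumptions the margin of error shrinks from roughly 14 percentage points at 50 observations to roughly 2 at 2,000, because it scales with one over the square root of the sample size.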
You can’t say “in the years when ‘defund the police!’ was associated with Democrats, the GOP presidential candidate gained among married women.” There has only ever been one presidential election in which that occurred. Judging by what I have observed of the DNC post-mortem on Twitter in the past week, it might not happen again.
I know very little about political analysis. But from what I know about data science, I would expect computers to get better at predicting the outcomes of races for the House of Representatives.
House representatives serve 2-year terms. There are over 400 House elections every 2 years.
Think about this over one decade of American history. There are actually more than 400 representatives in the House (435 voting members), but let’s imagine a “Shelter” of Reps with 400 members for ease of calculation.
In one decade, there are usually two presidential elections. That means we get 2 observations to learn from. In the same decade, there would be 400×5 “Shelter” elections. That yields 2,000 observations, a respectable sample for applying data mining methods.
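The arithmetic above can be written out as a short sketch. The function and variable names are hypothetical, and the count treats every race as one observation, which glosses over the fact that House races in the same year are correlated with each other.

```python
def elections_per_decade(seats: int, term_years: int, decade_years: int = 10) -> int:
    """Number of individual races in a decade, one race per seat per term."""
    return seats * (decade_years // term_years)


# The simplified 400-member "Shelter" with 2-year terms.
house_obs = elections_per_decade(seats=400, term_years=2)

# The post's figure: usually two presidential elections per decade.
presidential_obs = 2

print(f"House-style observations per decade: {house_obs}")
print(f"Presidential observations per decade: {presidential_obs}")
print(f"Ratio: {house_obs // presidential_obs}x")
```

With the real 435 seats instead of 400, the same function gives 2,175 races per decade, so the rounding does not change the picture.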
One application of such a forecasting machine would be to determine which slogans are the most likely to lead to success.