The thing about predictive analytics is that the quality of a prediction is eventually exposed as clear-cut right or wrong. There are casually incorrect outcomes, like a weather report that misses the time it will start raining, and then there are total shockers, like the outcome of the 2016 presidential election.
In my opinion, the biggest losers in this election cycle are pollsters, analysts, statisticians and, most of all, so-called pundits.
I am saying this from a concerned analyst’s point of view. We are talking about a colossal and utter failure of prediction on every level here. Except for one or two publications, practically every source missed the mark by more than a mile, not just by a couple of points here and there. Even the ones who achieved “guru” status by perfectly predicting the 2012 election outcome called the wrong winner this time, boldly posting a confidence level of more than 70 percent just a few days before the election.
What Went Wrong?
The losing party, pollsters and analysts must be in the middle of some deep soul-searching now. In all fairness, let’s keep in mind that no prediction can overcome serious sampling errors and data collection problems. Especially when we deal with sparsely populated areas, where the winner was decisively determined in the end, we must be really careful with the raw numbers of respondents, as errors easily get magnified by incomplete data.
Some of us saw that type of over- or under-projection when the Census Bureau cut the sample size for budgetary reasons during the last survey cycle. For example, in a sparsely populated area, a few migrants from Asia may affect simple projections like “percent Asian” rather drastically. In large cities, conversely, the size of such errors is generally within a more manageable range, thanks to large sample sizes.
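To put rough numbers on that, here is a minimal sketch, assuming a simple random sample and the textbook normal approximation for a proportion. The function name and the sample sizes are my own illustration, not figures from any actual poll or Census survey.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    p: observed share (0.5 is the worst case)
    n: number of respondents (hypothetical values below)
    z: critical value; 1.96 corresponds to roughly 95% confidence
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical respondent counts: a sparse rural area vs. a large city.
for n in (50, 400, 2500):
    moe = margin_of_error(0.5, n)
    print(f"n={n:>5}: +/- {moe * 100:.1f} points")
```

Under these assumptions, the error shrinks only with the square root of the sample size, so cutting a small sample in half hurts far more in a thinly populated county than in a big city. Real polls also carry weighting and nonresponse effects that this sketch ignores.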
Then there are the human inconsistency elements that many pundits are talking about. Basically, everyone got so sick of all these survey calls about the election that many started to ignore them completely. I think pollsters must learn that at times, less is more. I don’t even live in a swing state, and I started to hang up on unknown callers long before Election Day. Can you imagine what the folks in swing states must have gone through?
Many are also claiming that respondents were not honest about how they were going to vote. But if that were the case, there are other techniques that surveyors and analysts could have used to project the answer based on “indirect” questions. Instead of simply asking “Whom are you voting for?”, how about asking what their major concerns were? Combined with modeling techniques, a few innocuous probing questions regarding specific issues (such as the environment, gun control, immigration, foreign policy and entitlement programs) could have led us to much more accurate predictions, reducing the shock factor.
In the middle of all this, I’ve read that artificial intelligence, without any human intervention, predicted the election outcome correctly by using the abundant data coming out of social media. If that holds up, machines are already outperforming human analysts, at least in this instance. It helps that machines have no opinions or feelings about the outcome one way or another.
Maybe machine learning will start replacing human analysts and other decision-making professionals sooner than expected. That means the disenfranchised population will grow even larger, reaching into highly educated demographics. The future, regardless of politics, doesn’t look all that bright for the human collective if that trend continues.
In the predictive business, there is a price to pay for being wrong. Maybe that is why some countries completely ban the posting of poll numbers and result projections for days, sometimes weeks, before an election. Sometimes observation and prediction change the behavior of human subjects, as anthropologists have been documenting for years.