Forecasting Elections from Partial Information Using a Bayesian Model for a Multinomial Sequence of Data
Soudeep Deb, Shubhabrata Das, Rishideep Roy
Journal: Journal of Forecasting
Abstract:
Predicting the winner of an election is of importance to multiple stakeholders. This work focuses on the context where the counting of results is publicly disclosed in batches (or rounds). Driven by public interest, this often leads to competition among media houses to call the election, i.e. to predict the outcome correctly and as early as possible. This study calls an election statistically, rather than with absolute certainty, which makes it possible to call the election much earlier, based on trends. The authors also put in place two checks to ensure that they do not call the election too early or with too little certainty. The work is motivated by and demonstrated with election data from two different settings: the legislative assembly election held in Bihar during October-November 2020 and the United States presidential election held in 2020 around the same period.
To formulate the problem, the authors consider a sequence of independent categorical observations, each with a finite number of possible outcomes. The data are assumed to be observed in batches, each based on a large number of such trials and modelled via a multinomial distribution. They postulate that the multinomial probabilities of the categories vary randomly from batch to batch. The challenge is to predict the outcome for the cumulative data accurately and as early as possible, using only the first few batches. On the theoretical front, they first derive sufficient conditions for asymptotic normality of the estimates of the multinomial cell probabilities and present corresponding suitable transformations.
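The data setting described above can be illustrated with a small simulation. This is a minimal sketch and not the authors' model: the function names, the three-candidate example, the equal batch sizes, and the log-scale Gaussian perturbation of the probabilities are all assumptions made for illustration.

```python
import numpy as np

def simulate_batches(base_probs, batch_sizes, noise_sd, rng):
    """Draw one multinomial count vector per batch, with batch-specific probabilities.

    The log-scale Gaussian perturbation is an assumed stand-in for
    "probabilities that vary randomly from batch to batch".
    """
    base_probs = np.asarray(base_probs, dtype=float)
    counts = []
    for n in batch_sizes:
        # Perturb the base probabilities on the log scale, then renormalise.
        logits = np.log(base_probs) + rng.normal(0.0, noise_sd, size=base_probs.size)
        p = np.exp(logits - logits.max())
        p /= p.sum()
        counts.append(rng.multinomial(n, p))
    return np.array(counts)

def cumulative_shares(counts):
    """Vote shares based on all batches observed so far (one row per batch)."""
    running = np.cumsum(counts, axis=0)
    return running / running.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
# Hypothetical three-candidate race counted in 10 batches of 5000 votes each.
counts = simulate_batches([0.45, 0.40, 0.15], [5000] * 10, 0.05, rng)
shares = cumulative_shares(counts)
leader_by_batch = shares.argmax(axis=1)  # index of the leading candidate after each batch
```

Row `k` of `shares` is the naive estimate available after batch `k`; the forecasting problem is to decide, from an early row, who will lead in the final row and how certain that call is.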
Then, in a Bayesian framework, they consider hierarchical priors using multivariate normal and inverse Wishart distributions and establish the posterior convergence. The desired inference is obtained from these results and the ensuing Gibbs sampling. The methodology is demonstrated with real datasets, and additional insights into its effectiveness are gained through a simulation study.
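For orientation, a generic hierarchy of this type and its Gibbs updates can be written as follows. This is a textbook semi-conjugate sketch, not the specification used in the paper: the symbols $\theta_j$ (a transformed parameter vector for batch $j$, treated here as given), $J$ (number of batches), and the hyperparameters $\mu_0, \Lambda_0, \nu_0, \Psi_0$ are all assumed for illustration.

```latex
% Assumed generic hierarchy (not taken from the paper)
\theta_j \mid \mu, \Sigma \sim \mathcal{N}(\mu, \Sigma), \quad j = 1, \dots, J,
\qquad
\mu \sim \mathcal{N}(\mu_0, \Lambda_0),
\qquad
\Sigma \sim \mathcal{IW}(\nu_0, \Psi_0).

% Full conditionals used in a Gibbs sampler, with \bar{\theta} = J^{-1} \sum_{j=1}^{J} \theta_j
\mu \mid \Sigma, \theta_{1:J} \sim \mathcal{N}(m, V),
\quad
V = \left( \Lambda_0^{-1} + J \Sigma^{-1} \right)^{-1},
\quad
m = V \left( \Lambda_0^{-1} \mu_0 + J \Sigma^{-1} \bar{\theta} \right),

\Sigma \mid \mu, \theta_{1:J} \sim \mathcal{IW}\!\left( \nu_0 + J,\;
\Psi_0 + \sum_{j=1}^{J} (\theta_j - \mu)(\theta_j - \mu)^{\top} \right).
```

A Gibbs sampler alternates draws from these two conditionals; in a full model the $\theta_j$ would themselves be updated given the observed batch counts.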