There’s an interesting problem in trying to predict the winner of the Academy Award for Best Director. The two directors who won the award most often at the other ceremonies, Ben Affleck for Argo and Kathryn Bigelow for Zero Dark Thirty, weren’t nominated this time around. Steven Spielberg is technically considered the frontrunner for Lincoln, but there’s no indication that the voters actually liked the film.
There was a similar situation at the 2009 Academy Awards, when Kate Winslet was nominated in a different category at the Oscars than elsewhere. Most organizations nominated her for Best Actress for Revolutionary Road and for Best Supporting Actress for The Reader, and she was considered the favorite in both. The Academy, however, nominated her for Best Actress for The Reader.
Nate Silver built a model to predict the outcomes of the 81st Academy Awards (2009), and it missed the Best Supporting Actress category. He reflected on the mistake afterwards.
What, if anything, did the incorrect prediction reveal to us about the model’s flaws?
Was the model wrong for the wrong reasons? Or was it wrong for the right reasons?
What, if any, improvements should we make to the model given these results?
In the miss on the Best Supporting Actress category, the model was a bit confused. If I had actually had to put money on one of the candidates, it would have been on Penelope Cruz, not the model’s choice of Taraji P. Henson. The reason the model got “confused” is an unusual circumstance surrounding the Best Supporting Actress award: three of the four major awards that I tracked in this category (the Golden Globes, the Screen Actors Guild Awards and the Critics’ Choice Awards) were won by Kate Winslet, who was not on the ballot in this category at the Oscars. (Instead, the Academy considered her performance in The Reader to be a lead role.) Since the recipients of the non-Oscar awards are the most important factors in predicting the Oscars, this deprived the model of much of the information it would ordinarily use to make its forecasts.
However, I’m not sure this is such a good “excuse”. The one major award that wasn’t won by Winslet, the BAFTAs, was instead won by Cruz. What the model probably should have done was throw out the results of the Globes, the SAGs and the Critics’ Choice in making its forecasts, treating them as missing variables. (There is a big difference between ‘missing’ and ‘zero’.) This would have placed more emphasis on the BAFTAs, the only award that gave us useful information about how Cruz performed relative to the other candidates.
If I had done this, it turns out, the model would have made Cruz the favorite, assigning her about a 60 percent chance of victory. This is something we could, and probably should, have thought about in advance. Nevertheless, failures sometimes have a way of focusing the mind and pointing the way forward.
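The “missing versus zero” distinction can be sketched with a toy precursor-award tally. This is not Silver’s actual model: the equal award weights and the scoring rule are assumptions, and the numbers it produces only show the direction of the effect, not his 60 percent figure.

```python
# Toy illustration of 'missing' vs 'zero' treatment of precursor awards.
# NOT Nate Silver's actual model; equal award weights are an assumption.

# 2009 Best Supporting Actress: precursor winners and Oscar nominees.
PRECURSOR_WINNERS = {
    "Golden Globes": "Kate Winslet",
    "SAG": "Kate Winslet",
    "Critics' Choice": "Kate Winslet",
    "BAFTA": "Penelope Cruz",
}
OSCAR_NOMINEES = [
    "Amy Adams", "Penelope Cruz", "Viola Davis",
    "Taraji P. Henson", "Marisa Tomei",
]

def forecast(nominees, precursor_winners, drop_unmatched_awards):
    """Return each nominee's share of the precursor-award signal."""
    scores = {n: 0.0 for n in nominees}
    total_weight = 0.0
    for award, winner in precursor_winners.items():
        if winner in scores:
            total_weight += 1.0
            scores[winner] += 1.0
        elif not drop_unmatched_awards:
            # 'zero' treatment: the award still counts toward the total,
            # but no nominee on the ballot gets credit for it.
            total_weight += 1.0
        # else: 'missing' treatment -- drop the award from the calculation.
    if total_weight == 0.0:
        return {n: 1.0 / len(nominees) for n in nominees}  # no signal: uniform
    return {n: s / total_weight for n, s in scores.items()}
```

Treating Winslet’s three wins as zeros leaves Cruz with only a quarter of the signal (the other three quarters point at no one on the ballot), while dropping them as missing lets the BAFTA result come through at full strength and makes Cruz the clear favorite.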