Prediction Success By Month
Posted by Joseph Wed, 30 Jul 2014 17:19:49 -0400
Filed Under: predictions science
Charles' last post (Game Prediction: Multi-Year Update), which examined the correlations between Southpaw user predictions and MLB results, rekindled an interest of mine: prediction success by month. I am testing my long-held assumption that predictions improve over the course of the season, as users become more savvy about current-season performance and as the mid-season trade deadline sorts teams into contenders and wait-til-next-years.
The Game Prediction page started in August 2012 and has hosted over 12,000 predictions made by human beings, with a 55.6% success rate. Those 12,000 picks are a significant chunk of data, and breaking them down by month shows that the assumed trend does exist: the predictions improve markedly after the season's midpoint, until October, when they turn into virtual coin flips.
Human Predictions By Month, August 2012 through July 2014:
Over that same span, the Southpaw system has picked correctly at a 54.3% clip, despite the considerable handicap of picking every game regardless of the length of the odds. Here's the monthly breakdown of the computer's picks:
The computer improves in almost linear fashion, which hopefully represents the accumulation of information incorporated into the ratings. Interestingly, September and (especially) October are the two months in which the computer outperforms the humans.
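A monthly breakdown like the ones above is straightforward to compute. The sketch below assumes each prediction is a (game date, was-it-correct) pair; the actual Southpaw schema is not described in the post, so the record format and sample data here are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical prediction records: (game date, prediction was correct?).
# These are illustrative values, not actual Southpaw data.
predictions = [
    (date(2013, 4, 5), True),
    (date(2013, 4, 9), False),
    (date(2013, 9, 2), True),
    (date(2013, 9, 3), True),
]

def success_by_month(preds):
    """Return {calendar month: success rate}, pooling all years together."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for game_date, correct in preds:
        totals[game_date.month] += 1
        hits[game_date.month] += correct  # True counts as 1
    return {month: hits[month] / totals[month] for month in totals}

print(success_by_month(predictions))  # e.g. {4: 0.5, 9: 1.0} for the sample above
```

Pooling by calendar month across seasons, as this does, matches how the post's charts are framed (all Aprils together, all Mays together, and so on).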
Game Prediction: Multi-Year Update
Posted by Charles Thu, 24 Jul 2014 21:03:53 -0400
Filed Under: predictions science teams
Since 2012, Southpaw has hosted a game prediction challenge in which registered users predict wins and losses one game at a time. Four people are actively predicting games right now; as many as seven have predicted since 2012. Through July 22 of this year, 12,374 individual predictions have been made, usually two or three per game.
At this point enough predictions have been made on each team to draw some interesting conclusions. I suspect at least two relations exist between predictions and Major League results. The next two charts illustrate these relations and the degree to which they are true, in a statistical sense.
The more intuitive relation connects the number of times teams are predicted to win to the teams' Major League results. The chart below shows data points for all 30 Major League teams. The horizontal axis shows their cumulative winning percents for the 2012, 2013, and 2014 seasons through July 22. The vertical axis marks how many times Southpaw users have predicted victory for those teams.
The Houston Astros occupy the lower left corner, with only 77 bets placed on them to win. Detroit is the most popular choice, picked to win 706 times. The team farthest to the right, with the best record, is Oakland. The Athletics have somehow slipped under the radar: a great team the Southpaw collective has yet to pick with extreme confidence. Part of this has to do with a perception that the A's play more competitive games than the Tigers do. While the Tigers get to beat up on Kansas City and Minnesota, the A's face superior intra-division foes in Texas and Anaheim.
The next chart plots the same team winning percents on the horizontal axis. The vertical axis shows the success rate of bets placed on those teams. The correlation on this chart is much lower. The R-squared value of .602 indicates that about 60 percent of the variance in bet success rate is explained by how the teams fare in their games. The first chart carries an R-squared value of .773, showing that about 77 percent of the “desire” to pick a team is based on the team's record. The drop makes sense, as a lot of uncertainty is introduced on the field.
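The R-squared values quoted here come from simple least-squares fits of one variable against another. A minimal sketch of that computation, using made-up sample points rather than the actual chart data (which the post does not publish):

```python
def linear_fit_r_squared(xs, ys):
    """Least-squares slope, intercept, and R^2 for the fit y ~ slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Made-up sample points, NOT the actual 30-team chart data:
win_pcts = [0.400, 0.450, 0.500, 0.550, 0.600]
pick_counts = [150, 280, 390, 520, 640]
slope, intercept, r2 = linear_fit_r_squared(win_pcts, pick_counts)
```

An R-squared of .773 on the first chart means the fitted line accounts for about 77 percent of the variance in pick counts, with the remaining 23 percent left to biases and noise.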
Some other factors pollute the correlation in the first chart. Southpaw users made win predictions for the Seattle Mariners 486 times, despite the Mariners' .469 record. Assuming the tendency to pick teams based on their success holds, a more rational populace would have picked them about 350 times over the same interval. Local bias is the assumed cause: most users live closer to Seattle than to any other Major League city.
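The "about 350" figure is just the first chart's fitted line evaluated at Seattle's winning percent, with the gap to 486 measuring the local-bias excess. A sketch of that arithmetic, with hypothetical slope and intercept values chosen only for illustration (the post does not publish the fitted coefficients):

```python
# Hypothetical regression coefficients from the picks-vs-winning-percent
# chart; illustrative only, not the actual fitted line.
slope, intercept = 1400.0, -300.0

seattle_win_pct = 0.469       # from the post
seattle_actual_picks = 486    # from the post

expected_picks = slope * seattle_win_pct + intercept
excess_picks = seattle_actual_picks - expected_picks
print(round(expected_picks), round(excess_picks))
```

Teams sitting well above the line, like Seattle here, inflate the scatter and pull the first chart's R-squared down.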
Other known biases, including user soft spots for the Philadelphia, Baltimore, and St. Louis teams, may also contribute to the disparity between picking frequency and team success.