How predictive is Pythag?

This is my first FanPost; I apologize if I did a lousy job.

Recently, the concept of Pythagorean records (or simply Pythag) has inspired a lot of discussion in this community, partly due to grumblings from fans and foes alike that the 2011 Tigers outperformed their Pythag. In this piece I'll attempt to discern how big a deal that is.

First, I should probably explain what exactly "Pythag" means. Pythagorean win expectation is a stat invented by Bill James, the father of sabermetrics. It is calculated by squaring the number of runs the team scored and dividing that number by the sum of runs scored (squared) and runs allowed (squared), which yields the team's expected winning percentage. Multiplying that percentage by the number of games played gives the team's Pythag wins.
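For anyone who wants to see the arithmetic, here is a minimal Python sketch of the formula as described above. The function names and the 800/700 run totals are made up for illustration; they aren't from my spreadsheet or from any real team.

```python
def pythag_win_pct(runs_scored: float, runs_allowed: float) -> float:
    """Expected winning percentage: RS^2 / (RS^2 + RA^2)."""
    rs_squared = runs_scored ** 2
    ra_squared = runs_allowed ** 2
    return rs_squared / (rs_squared + ra_squared)


def pythag_wins(runs_scored: float, runs_allowed: float, games: int = 162) -> float:
    """Expected wins over a season of `games` games."""
    return pythag_win_pct(runs_scored, runs_allowed) * games


# Hypothetical team (not real data): 800 runs scored, 700 runs allowed.
print(round(pythag_win_pct(800, 700), 3))  # 0.566
print(round(pythag_wins(800, 700), 1))     # 91.8
```

A team's "luck" in the sense used here is just its actual wins minus `pythag_wins(...)`.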

Pythag is often used to predict which teams are overperforming and underperforming. The 2011 Tigers outperformed their Pythagorean record by 6 wins, tied for the most in baseball last year with the Brewers and D-Backs (if you're curious, the "unluckiest" team was the Padres, at 8 wins under their Pythag). Before the cataclysmic news that the Tigers had signed Prince Fielder, this discrepancy led many pundits to declare that the Tigers were due for some serious regression in 2012. Here's where my study comes in.

I attempted to determine whether Pythag was a better predictor of next season's record than the actual number of wins the team had that year. To do this, I compiled all team seasons from 2000 to 2010 in the OpenOffice equivalent of Excel, with the help of Baseball Reference.

Column A had that year's actual number of wins, column B had that year's Pythag wins, and column C had the next year's actual number of wins (so for the 2010 rows, column C holds the 2011 win totals). I assumed that eleven years of data and 330 team seasons would be a large enough sample; I also got kinda tired of doing data entry.

I then correlated column C with the other two columns to determine which measure was more predictive of future performance.
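If you'd rather do this step in code than in a spreadsheet, here is a rough sketch. It assumes the spreadsheet's correlation function computes the ordinary Pearson coefficient, and the three short lists are invented placeholder numbers standing in for columns A, B, and C, not my actual data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder rows (made up), one entry per team season.
actual_wins = [95, 81, 70, 88, 76]     # column A
pythag_wins = [89, 83, 72, 90, 74]     # column B
next_year_wins = [88, 79, 75, 91, 72]  # column C

r_actual = pearson_r(actual_wins, next_year_wins)
r_pythag = pearson_r(pythag_wins, next_year_wins)
print(f"actual wins vs. next year: {r_actual:.3f}")
print(f"Pythag wins vs. next year: {r_pythag:.3f}")
```

Whichever column produces the larger coefficient is the better linear predictor of next year's wins.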


The correlation between the number of actual wins and the number of wins the next season was .591. Pythag correlated with future performance similarly; that result was .594. Neither of these is a particularly strong correlation, which is good. If baseball were much more predictable than this, there would be little reason to play the games every year!

So I suppose the assumption that Pythag is more predictive of future performance than win-loss record is technically true (.594 versus .591), but an argument that a team is a regression candidate based entirely on the difference between its Pythag and its record seems weak to me. Pythag still has a lot of value, but not much more value than how well the team actually did.

EDIT: My spreadsheet can be found here. For time/pain-in-the-arse reasons, I only put the year next to the first team alphabetically for that season (Arizona from 2005 on, Anaheim before then).

SECOND EDIT: Kurt came up with a valid criticism that free agency presents a confound in this study. So I did the exact same thing for 1950-1960. The previous season's record had a .715 correlation coefficient; Pythag had a .725 coefficient. So they're still about equally powerful at predicting future results, but both correlations now just barely qualify as strong (I can't speak for every discipline, but social psychologists usually consider .70 to be the borderline of a strong correlation).

So without free agency, you could predict the next season's results somewhat better, though still far from reliably. Squaring these correlations shows that the previous season accounts for about half of the variance in record for the next season. Presumably the other half is breakout/slump seasons from stars, injuries, and dumb luck. Thanks to Kurt for the critique! Here is the 50s spreadsheet.
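The "about half of the variance" figure comes from squaring the correlation coefficient. A quick sketch of that arithmetic, using the four coefficients reported in this post (the labels are just my own shorthand):

```python
# Correlation coefficients reported in this post.
coefficients = {
    "2000-2010 actual wins": 0.591,
    "2000-2010 Pythag": 0.594,
    "1950-1960 actual wins": 0.715,
    "1950-1960 Pythag": 0.725,
}

# r squared is the share of variance in next season's wins explained
# by a linear fit on the previous season's number.
variance_explained = {label: r ** 2 for label, r in coefficients.items()}

for label, share in variance_explained.items():
    print(f"{label}: r = {coefficients[label]:.3f}, r^2 = {share:.3f}")
```

That works out to roughly 35% of the variance for 2000-2010 and a bit over 50% for 1950-1960.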

This is a FanPost and does not necessarily reflect the views of the <em>Bless You Boys</em> writing staff.