FanPost

How predictive is Pythag?

This is my first FanPost; I apologize if I did a lousy job.

Recently, the concept of Pythagorean records (or simply Pythag) has inspired a lot of discussion in this community, partly due to grumblings from fans and foes alike that the 2011 Tigers outperformed their Pythag. In this piece I'll attempt to discern how big a deal that is.

First, I should probably explain what exactly "Pythag" means. Pythagorean win expectation is a stat invented by Bill James, the father of sabermetrics. It is calculated by squaring the number of runs the team scored and dividing that by the sum of runs scored (squared) and runs allowed (squared), which yields the team's expected winning percentage.
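For anyone who would rather see the formula than read it, here is a minimal Python sketch of the calculation described above. The function names, the 162-game default, and the 800/700 run totals are my own illustrative assumptions, not figures for any real team.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^2 / (RS^2 + RA^2)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games (162 is assumed here)."""
    return games * pythag_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team, not real data: 800 runs scored, 700 runs allowed.
example_pct = pythag_win_pct(800, 700)
example_wins = pythag_wins(800, 700)
print(f"{example_pct:.3f} expected win pct, {example_wins:.1f} expected wins")
```

A team's "luck" in the sense used in this post is then just its actual wins minus its Pythag wins.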

Pythag is often used to predict which teams are overperforming and underperforming. The 2011 Tigers outperformed their Pythagorean record by 6 wins, tied for the most in baseball last year with the Brewers and D-Backs (if you're curious, the "unluckiest" team was the Padres, at 8 wins under their Pythag). Before the cataclysmic news that the Tigers had signed Prince Fielder, this discrepancy led many pundits to declare that the Tigers were due for some serious regression in 2012. Here's where my study comes in.

I attempted to determine whether Pythag was a better predictor of next season's record than the actual number of wins the team had that year. To do this, I compiled all team seasons from 2000 to 2010 in the OpenOffice equivalent of Excel, with the help of Baseball Reference.

Column A had that year's actual number of wins, column B had that year's Pythag wins, and column C had the next year's actual number of wins (so for the 2010 rows, column C holds last year's 2011 win-loss records). I assumed that eleven years of data and 330 team seasons would be a large enough sample; I also got kinda tired of doing data entry.

I then correlated column C with the other two columns to determine which measure was more predictive of future performance.
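I did this step with the spreadsheet's built-in correlation function, but the same calculation can be sketched in Python. This is a plain Pearson correlation, which is my assumption about what the spreadsheet computes; the names `col_a`, `col_b`, and `col_c` mirror the columns above, and the five rows are made-up placeholders, not rows from my data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 rows)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either column is constant.
    return cov / sqrt(var_x * var_y)


# Placeholder rows for illustration only (NOT the real spreadsheet).
col_a = [95, 81, 70, 88, 76]  # actual wins that year
col_b = [89, 83, 73, 90, 74]  # Pythag wins that year
col_c = [88, 79, 75, 92, 71]  # actual wins the next year

r_actual = pearson_r(col_a, col_c)
r_pythag = pearson_r(col_b, col_c)
print(f"actual vs. next year: {r_actual:.3f}")
print(f"Pythag vs. next year: {r_pythag:.3f}")
```

Whichever column produces the larger coefficient is the better linear predictor of next year's wins in the sample.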

Results:

The correlation between the number of actual wins and the number of wins the next season was .591. Pythag correlated with future performance similarly; that result was .594. Neither of these is a particularly strong correlation, which is good. If baseball were much more predictable than this, there would be little reason to play the games every year!

So I suppose it is technically true that Pythag is more predictive of future performance than win-loss record, but an argument that a team is a regression candidate based entirely on the difference between its Pythag and its record seems weak to me. Pythag still has a lot of value, but not much more value than how well the team actually did.

EDIT: My spreadsheet can be found here. For time/pain-in-the-arse reasons, I only put the year next to the first team alphabetically for that season (Arizona from 2005 on, Anaheim before then).

SECOND EDIT: Kurt came up with a valid criticism: free agency presents a confound in this study. So I did the exact same thing for 1950-1960. The previous season's record had a .715 correlation coefficient; Pythag had a .725 coefficient. So they're still about equally powerful at predicting future results, but both now just barely qualify as strong correlations (I can't speak for every discipline, but social psychologists usually consider .70 to be the borderline of a strong correlation).

So without free agency, you could predict the next season's results somewhat better, though still only loosely. Squaring these correlations (roughly .72 squared is about .52) shows that the previous season accounts for about half of the variance in record for next season. Presumably the other half is breakout/slump seasons from stars, injuries, and dumb luck. Thanks to Kurt for the critique! Here is the 50s spreadsheet.
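The "about half of the variance" figure comes from squaring the correlation coefficient. Here is a quick sketch of that arithmetic using the four coefficients reported in this post; the dictionary labels and the function name are just mine.

```python
def variance_explained(r):
    """Share of variance accounted for by a linear relationship: r squared."""
    return r * r


# Correlation coefficients reported in this post.
correlations = {
    "actual wins, 2000-2010": 0.591,
    "Pythag wins, 2000-2010": 0.594,
    "actual wins, 1950-1960": 0.715,
    "Pythag wins, 1950-1960": 0.725,
}

shares = {label: variance_explained(r) for label, r in correlations.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%} of next season's variance")
```

By this arithmetic the 1950s numbers land a bit above 50%, while the 2000-2010 numbers come out closer to 35%, so "about half" applies to the pre-free-agency sample only.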

This is a FanPost and does not necessarily reflect the views of the Bless You Boys writing staff.
