Pythagorean Wins for Major League Baseball
by Dave Tufts - May 23, 2006 / 11:28pm
- flowers are blooming
- lawns are turning green
- the Red Sox have a decent lead over the Yankees
As with every spring, the Sox appear to be unstoppable. By the end of July, however, all will change in quaint New England. It will be wicked hot. Everybody's lawns, gardens, and flower beds will start dying or turning yellow. And the Sox will see their springtime lead disappear as the Yankees burn it up, like a newborn left out on a sunny Cape Cod beach.
Am I just another pessimistic Bostonian? Possibly. But I also just compiled the Pythagorean win numbers for all MLB teams. As I pointed out last fall for the NFL, Pythagorean wins represent the number of wins a team should have at this point in the season. The Pythagorean win estimate is based on (1) the number of games they've played; (2) the number of points (runs, in baseball) they've scored; and (3) the number of points they've given up.
View Pythagorean Wins for Major League Baseball
I just updated the script to include both NFL and MLB. The NFL data is screen-scraped from Yahoo!. Yahoo is easy to scrape because it uses clean, simple HTML. Unfortunately, I couldn't find runs scored and runs allowed on Yahoo's MLB standings page, so I scrape the MLB data from ESPN.
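To give a feel for what the screen scraping involves, here is a minimal sketch in Python rather than the PHP the script is written in. It is not the actual scraper: the `TableRows` class, the column order, and the sample markup are all assumptions made up for illustration, and a real standings page would need its own handling.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up markup standing in for a standings page.
# Team names and numbers are placeholders, not real standings.
SAMPLE_HTML = """
<table>
  <tr><th>Team</th><th>W</th><th>L</th><th>RS</th><th>RA</th></tr>
  <tr><td><a href="#">Team A</a></td><td>26</td><td>17</td><td>230</td><td>215</td></tr>
  <tr><td>Team B</td><td>25</td><td>19</td><td>260</td><td>200</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows

# Keep only what the Pythagorean estimate needs: games, scored, allowed.
teams = [
    {
        "team": row[0],
        "games": int(row[1]) + int(row[2]),
        "scored": int(row[3]),
        "allowed": int(row[4]),
    }
    for row in body
]
print(teams[0])
```

Clean, simple HTML matters here because the parser only has to follow the table rows and cells; the more decoration a page wraps around its numbers, the more special cases a scraper like this picks up.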
Another difference between the NFL and MLB is the factor. In the most basic terms, the Pythagorean formula for all sports looks like this:
Pythagorean Wins = Games Played × (Points Scored)^2 / ((Points Scored)^2 + (Points Allowed)^2)
According to Bill James and Baseball Prospectus, the 2 isn't the correct factor for baseball: instead of squaring points scored and points against, you should raise both numbers to the power of 1.82. For the NFL, I use a factor of 2.37.
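Here is a minimal Python sketch of the calculation, assuming the usual form of the estimate: games played, times scored^x, divided by (scored^x + allowed^x), where x is the factor for the sport. The two exponents are the ones quoted in this post; the function and variable names are my own, and this is not a copy of the PHP script.

```python
# Exponents quoted in the post; the names below are illustrative.
EXPONENTS = {"MLB": 1.82, "NFL": 2.37}


def pythagorean_win_pct(scored, allowed, exponent):
    """Expected winning percentage from points (or runs) scored and allowed."""
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


def pythagorean_wins(games, scored, allowed, league="MLB"):
    """Expected number of wins after `games` games played."""
    return games * pythagorean_win_pct(scored, allowed, EXPONENTS[league])


# Made-up example: a team that has scored exactly as much as it has allowed
# is expected to win half of its games, whatever the exponent.
print(pythagorean_wins(44, 200, 200, "MLB"))  # 22.0
```

A team with nothing scored and nothing allowed would divide by zero here, so a real script would need to handle that case before the first game is played.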
Please download the PHP script to see the source. It also comes with a simple class to cache the screen-scraped data for a period of time.
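The idea behind caching scraped data can be sketched in a few lines. This is a Python illustration under my own assumptions, not the class that ships with the PHP script: `FileCache`, its `get` method, and the one-hour default are hypothetical names and values chosen for the example.

```python
import os
import tempfile
import time


class FileCache:
    """Keep fetched text on disk and reuse it until it is older than max_age_seconds."""

    def __init__(self, path, max_age_seconds=3600):
        self.path = path
        self.max_age_seconds = max_age_seconds

    def is_fresh(self):
        """True if the cache file exists and was written recently enough."""
        if not os.path.exists(self.path):
            return False
        age = time.time() - os.path.getmtime(self.path)
        return age < self.max_age_seconds

    def get(self, fetch):
        """Return cached text, calling fetch() only when the cache is stale."""
        if self.is_fresh():
            with open(self.path, encoding="utf-8") as fh:
                return fh.read()
        text = fetch()
        with open(self.path, "w", encoding="utf-8") as fh:
            fh.write(text)
        return text


# Stand-in for the real scrape, so the example needs no network access.
calls = []


def fake_fetch():
    calls.append(1)
    return "<html>standings</html>"


cache = FileCache(os.path.join(tempfile.mkdtemp(), "standings.html"))
first = cache.get(fake_fetch)
second = cache.get(fake_fetch)  # served from disk, fake_fetch is not called again
print(len(calls))  # 1
```

The point of the cache is politeness as much as speed: the standings only change a few times a day, so there is no reason to hit the source site on every page view.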
Hopefully the Sox won't repeat every other year not called 2004. But as of May 23rd, the Sox rank 14th in Pythagorean wins, while the Yankees rank 5th.
Check the rankings throughout the season and see how it unfolds.
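For anyone who wants to follow along at home, a ranking like this can be sketched as below. The records are placeholders, not real standings, and the helper name and tuple layout are assumptions for the example; only the 1.82 exponent comes from this post.

```python
MLB_EXPONENT = 1.82  # the baseball factor quoted in the post


def expected_wins(games, scored, allowed, exponent=MLB_EXPONENT):
    """Pythagorean win estimate: games * scored^x / (scored^x + allowed^x)."""
    s, a = scored ** exponent, allowed ** exponent
    return games * s / (s + a)


# Placeholder records: (team, wins, losses, runs scored, runs allowed)
records = [
    ("Team A", 27, 16, 220, 215),
    ("Team B", 25, 18, 260, 200),
    ("Team C", 20, 23, 190, 210),
]

# Rank by expected wins rather than actual wins.
ranked = sorted(
    records,
    key=lambda r: expected_wins(r[1] + r[2], r[3], r[4]),
    reverse=True,
)

for rank, (team, wins, losses, scored, allowed) in enumerate(ranked, start=1):
    estimate = expected_wins(wins + losses, scored, allowed)
    # A positive difference means the team has won more than its runs suggest.
    print(f"{rank}. {team}: {wins} wins, {estimate:.1f} expected ({wins - estimate:+.1f})")
```

In this made-up table, Team A leads the actual standings but drops behind Team B once you rank by expected wins, which is exactly the kind of gap that makes a springtime lead look shaky.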
1 Comment
Fred recently pointed out that both ESPN and MLB.com now show Pythagoreans on their baseball standings pages.
ESPN calls it "ExWL", for Expected Win-Loss record, and it's available on their Expanded MLB Standings page.
Major League Baseball's standings page also offers the Pythagorean record as "X W-L", for expected win-loss.