Where do they get their numbers from? Well, it is a formula known as the Pythagorean Expectation, developed by legendary baseball statistician Bill James, that compares the number of runs a team scores with the number of runs it gives up to produce an “expected” winning percentage. I get it. It makes sense that a good team scores more runs than it gives up. But since I started doing a little Wiki-research into the statistic, I have been confused as to what it truly signifies. Is it an after-the-fact analytic tool or a predictive one? If it can be used to predict the results of future games, how so? Over what sort of time frame? Will it accurately predict future records, or overall season records that include games already played?
I have seen it described as a predictive tool and have seen baseball pundits use the statistic to argue that the Orioles cannot maintain their current pace and will soon “collapse”. The argument is this: on July 4, the Orioles' record was 44-37. Yet in those 81 games, they had been outscored by a total of 26 runs. This is unusual. How can it be? Answer: the Orioles have won a disproportionate share of close ballgames while losing most of the one-sided games. Conventional (or “Pythagorean”) wisdom attributes a large proportion of the close wins to “luck” and avers that this luck “cannot continue”. Eventually, this pattern will “catch up” to you. Their formula says so.
This formula is simple: take the number of runs scored and square it. Now divide that number by the sum of the squares of the number of runs scored and the number of runs allowed. This yields an expected winning percentage. In the Orioles' case, on July 4, this number was .463, meaning that given the number of runs scored and runs allowed, we would expect them to have won 46.3% of their games. Yet they had won 54% of their games.
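For anyone who wants to check the arithmetic, here is a minimal sketch of the formula as described above. The function name and the `exponent` parameter are my own additions for illustration; the version described here uses plain squares, so the exponent defaults to 2.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    The exponent parameter is an illustrative assumption; the formula
    as described in the text simply squares both numbers.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Orioles through July 4: 341 runs scored, 367 runs allowed, 44-37 record.
expected = pythagorean_expectation(341, 367)
actual = 44 / (44 + 37)
print(f"expected {expected:.3f}, actual {actual:.3f}")
# prints: expected 0.463, actual 0.543
```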
So is this Pythagorean expectation number an analytic tool or a predictive one? As an analytic tool, the statistic says “There is something unusual about this team. Let's research further”. I absolutely agree with that. But what about its predictive value? I read many sports columns and chats, and most of those pundits seem to value it as a predictor of future games. Almost all predicted that the Orioles would “come back to Earth”. They could not expect to continue winning such an inordinate number of close games. The luck involved evens out and a team's record starts to match its Pythagorean expectation. But does it?
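To make the two possible readings of “predictive” concrete, here is a rough sketch using the Orioles' July 4 numbers. It assumes a 162-game season and assumes the expected percentage holds exactly from here on, which is my simplification and not anything the pundits spelled out. The function names are made up for illustration.

```python
SEASON_GAMES = 162  # length of a regular MLB season


def expected_pct(runs_scored, runs_allowed):
    """Pythagorean expectation with plain squares."""
    return runs_scored**2 / (runs_scored**2 + runs_allowed**2)


def wins_if_future_games_match(wins, losses, runs_scored, runs_allowed):
    """Reading 1: keep the wins already banked, apply the
    expected percentage only to the games not yet played."""
    remaining = SEASON_GAMES - (wins + losses)
    return wins + remaining * expected_pct(runs_scored, runs_allowed)


def wins_if_full_season_matches(runs_scored, runs_allowed):
    """Reading 2: the final record, games already played included,
    ends up at the expected percentage."""
    return SEASON_GAMES * expected_pct(runs_scored, runs_allowed)


future_only = wins_if_future_games_match(44, 37, 341, 367)
full_season = wins_if_full_season_matches(341, 367)
print(f"future games only: {future_only:.1f} wins")
print(f"whole season:      {full_season:.1f} wins")
```

Under the first reading the Orioles finish at roughly 81.5 wins, about a .500 team. Under the second they would need to land near 75 wins, which means playing well below .463 the rest of the way to cancel out the first half. Those are very different claims.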
Let's have a look at this year's American League East. On July 4, roughly the halfway point of the season, the numbers looked like this:
Yankees: 49 wins, 32 losses; 384 runs scored, 326 runs given up; +58 run difference
Orioles: 44 wins, 37 losses; 341 runs scored, 367 runs given up; -26 run difference
Rays: 43 wins, 39 losses; 342 runs scored, 340 runs given up; +2 run difference
Red Sox: 42 wins, 40 losses; 411 runs scored, 361 runs given up; +50 run difference
Blue Jays: 42 wins, 40 losses; 411 runs scored, 384 runs given up; +27 run difference
Plugging the runs scored and allowed numbers into the “Pythagorean Expectation” formula yields the following expected win percentages for the teams (July 4 actual win percentage in parentheses): Yankees: .581 (.605); Orioles: .463 (.543); Rays: .503 (.524); Red Sox: .564 (.512); Blue Jays: .534 (.512)
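Here is a short sketch that reruns those numbers for all five teams and adds the gap between actual and expected winning percentage. The data are the July 4 figures listed above; the variable names are mine. Figures worked out by hand may differ slightly in the last decimal place.

```python
# July 4 figures: (wins, losses, runs scored, runs given up)
teams = {
    "Yankees": (49, 32, 384, 326),
    "Orioles": (44, 37, 341, 367),
    "Rays": (43, 39, 342, 340),
    "Red Sox": (42, 40, 411, 361),
    "Blue Jays": (42, 40, 411, 384),
}

gaps = {}
for name, (wins, losses, scored, allowed) in teams.items():
    expected = scored**2 / (scored**2 + allowed**2)  # Pythagorean expectation
    actual = wins / (wins + losses)                  # actual winning percentage
    gaps[name] = actual - expected
    print(f"{name:<10} expected {expected:.3f}  actual {actual:.3f}  "
          f"gap {gaps[name]:+.3f}")

# Teams furthest above and below their expectation.
print("most over: ", max(gaps, key=gaps.get))
print("most under:", min(gaps, key=gaps.get))
```

The Orioles come out furthest above their expectation and the Red Sox furthest below, which is the “something is going on” signal in its simplest form.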
Let's fast forward to late August and, as the numbers predict, the Yankees and Red Sox are once again fighting it out for the division title while the Blue Jays are trying to grab that new second wildcard playoff position. Tampa Bay is hoping to finish with a winning record while the Orioles are once again in last place.
Oh wait, that's not how it is? In fact, the Pythagorean numbers derived above are way off for 4 of the 5 teams. Boston and Toronto have collapsed, leading to overall records well below the Pythagorean Expectation. Tampa Bay and Baltimore have continued winning games and are now both fighting for the playoffs. Interestingly, the Yankees' record has declined to the point where their current winning percentage exactly matches the July 4 Pythagorean expectation of .581.
So what does it mean? In a word, nothing. At least this year, for this division. The Pythagorean Expectation points out that “something is going on” with the Orioles and Red Sox. But it certainly did not predict the second-half results accurately. There is something unusual about both teams. Why are the Orioles better than they should be? And why do the Red Sox suck (other than because of their obnoxious fans)? In the Orioles' case it probably involves shaky starting pitching combined with great relief pitching. For the Red Sox, injuries have hurt, as has a reported lack of team chemistry. Certainly, Red Sox management showed little faith in the Pythagorean numbers last week when it made the decision to blow up the roster. It boils down to that “the whole is greater than the sum of its parts” thingie. Works for the O's, not so much for the Sawks.
Oh and one other little difference: Who would you rather have manage your team? Bobby Valentine or Buck Showalter?