A blog post on the site Not In HD, about what that blogger would love to see removed from Major League Baseball, got me thinking about a statistical question. It stems from his feeling that the All-Star Game is a ridiculous way to determine who should have home-field advantage in the World Series, and I couldn’t agree with him more. Unfortunately, given the current nature of baseball, straight-up win-loss record isn’t fair either, because the schedule isn’t balanced.
Since the advent of divisional play, the idea that you play every team in the league the same number of times has gone out the window, although even then the variation was only between whole divisions, not individual teams. Interleague play has changed all of that. No longer is a league guaranteed to have a combined winning percentage of .500 across all its teams, and schedules are now more unbalanced than they ever were. There’s also the issue of the four-team AL West compared with the six-team NL Central, which I am fully aware of as a Reds fan! This stuff is just harder to figure out than it was before.
So, because of my renewed love for great statistical analysis sites, I thought I’d ask a question that’s probably been asked before: what adjustments to win-loss record can I use to better determine who the best team in baseball was, measured against its own league?
I have two possible ideas on how to do this:
- Idea #1: Combine team winning percentage and strength-of-schedule winning percentage into an “index number”. Anything above .500 in either one counts as positive, anything below as negative. So, if you had a .600 winning percentage against .495 competition, you would have a .095 index (.100 – .005). The advantage is that it directly accounts for strength of schedule; the possible disadvantage is that it would be unfairly swayed in favor of the league that won interleague play.
- Idea #2: Offset the team winning percentage by the league’s overall winning percentage. This would give an offset of maybe .005 at most in one direction for one league and the other way for the other league, to try to put both leagues back on a level playing field. It would help, but I don’t think it fully solves the problem.
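The two ideas can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the list; the function names `schedule_index` and `league_adjusted` are made up for this example, and the .505 league figure in the usage lines is a hypothetical value chosen to match the ".005 offset" mentioned in Idea #2.

```python
def schedule_index(win_pct: float, sos_pct: float) -> float:
    """Idea #1: distance from .500 for the team plus distance from .500 for its schedule."""
    return round((win_pct - 0.5) + (sos_pct - 0.5), 3)


def league_adjusted(win_pct: float, league_pct: float) -> float:
    """Idea #2: shift a team's winning percentage by how far its league sits from .500."""
    return round(win_pct - (league_pct - 0.5), 3)


# The example from Idea #1: a .600 team against .495 competition.
example_index = schedule_index(0.600, 0.495)

# A hypothetical league that played .505 ball overall costs its teams .005.
example_adjusted = league_adjusted(0.600, 0.505)

print(example_index, example_adjusted)
```

Rounding to three places keeps the results in the same form as the percentages in the tables.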
So which one is right? Let’s take a look. I was able to quickly find strength of schedule information for the 2008 season, so we will go back in time a little bit.
| TEAM | WIN | SOS | INDEX | RANK | RPI | RPI-RANK |
| --- | --- | --- | --- | --- | --- | --- |
| Los Angeles Angels | 0.617 | 0.502 | 0.119 | 1 | 0.531 | 3 |
| Tampa Bay Rays | 0.599 | 0.517 | 0.116 | 2 | 0.538 | 1 |
| Boston Red Sox | 0.586 | 0.515 | 0.101 | 3 | 0.533 | 2 |
| Chicago Cubs | 0.602 | 0.499 | 0.101 | 4 | 0.525 | 4 |
| New York Yankees | 0.549 | 0.515 | 0.064 | 5 | 0.524 | 5 |
| Philadelphia Phillies | 0.568 | 0.493 | 0.061 | 6 | 0.512 | 8 |
| Chicago White Sox | 0.546 | 0.509 | 0.055 | 7 | 0.518 | 7 |
| Milwaukee Brewers | 0.556 | 0.498 | 0.054 | 8 | 0.512 | 9 |
| Toronto Blue Jays | 0.531 | 0.516 | 0.047 | 9 | 0.52 | 6 |
| Minnesota Twins | 0.54 | 0.503 | 0.043 | 10 | 0.512 | 10 |
| New York Mets | 0.549 | 0.49 | 0.039 | 11 | 0.505 | 13 |
| Houston Astros | 0.534 | 0.502 | 0.036 | 12 | 0.51 | 11 |
| St. Louis Cardinals | 0.531 | 0.497 | 0.028 | 13 | 0.506 | 12 |
| Florida Marlins | 0.522 | 0.493 | 0.015 | 14 | 0.5 | 16 |
| Los Angeles Dodgers | 0.519 | 0.487 | 0.006 | 15 | 0.495 | 18 |
| Cleveland Indians | 0.5 | 0.502 | 0.002 | 16 | 0.501 | 14 |
| Texas Rangers | 0.488 | 0.505 | -0.007 | 17 | 0.501 | 15 |
| Arizona Diamondbacks | 0.506 | 0.485 | -0.009 | 18 | 0.49 | 23 |
| Oakland Athletics | 0.466 | 0.506 | -0.028 | 19 | 0.496 | 17 |
| Kansas City Royals | 0.463 | 0.505 | -0.032 | 20 | 0.494 | 19 |
| Detroit Tigers | 0.457 | 0.502 | -0.041 | 21 | 0.491 | 21 |
| Cincinnati Reds | 0.457 | 0.502 | -0.041 | 22 | 0.491 | 22 |
| Colorado Rockies | 0.457 | 0.485 | -0.058 | 23 | 0.477 | 26 |
| Baltimore Orioles | 0.422 | 0.515 | -0.063 | 24 | 0.492 | 20 |
| Atlanta Braves | 0.444 | 0.492 | -0.064 | 25 | 0.48 | 25 |
| San Francisco Giants | 0.444 | 0.485 | -0.071 | 26 | 0.475 | 27 |
| Pittsburgh Pirates | 0.414 | 0.503 | -0.083 | 27 | 0.481 | 24 |
| Seattle Mariners | 0.377 | 0.503 | -0.12 | 28 | 0.471 | 28 |
| San Diego Padres | 0.389 | 0.485 | -0.126 | 29 | 0.461 | 30 |
| Washington Nationals | 0.366 | 0.494 | -0.14 | 30 | 0.462 | 29 |
Info from ESPN Site: http://espn.go.com/mlb/stats/rpi/_/year/2008
So the straight-up index, and really anything based on strength of schedule, shows a bias (probably a fair one) toward teams in the league with the better interleague record, in this case the AL. Its interleague record in 2008 was 150-102. While that does not seem like a lot of games spread out over 30 teams, it was enough to give the AL an overall winning percentage of .511 and the NL .491. The two figures are not symmetric around .500 because the leagues have different numbers of teams.
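As a rough check on the .511 and .491 figures, here is a small Python sketch. It assumes every team played exactly 162 games (not quite true in any real season) and uses the 2008 league sizes of 14 AL teams and 16 NL teams; the helper name `league_win_pct` is just for illustration.

```python
def league_win_pct(teams: int, games_per_team: int,
                   interleague_wins: int, interleague_losses: int) -> float:
    """Combined winning percentage of a league, given its interleague record."""
    total_games = teams * games_per_team
    interleague_games = interleague_wins + interleague_losses
    # Every game played inside the league hands one win and one loss to the
    # same league, so those games split exactly evenly.
    intraleague_wins = (total_games - interleague_games) / 2
    return (intraleague_wins + interleague_wins) / total_games


# Assumption: 162 games per team; interleague record of 150-102 as quoted above.
al = league_win_pct(14, 162, 150, 102)
nl = league_win_pct(16, 162, 102, 150)
print(f"AL {al:.3f}, NL {nl:.3f}")  # AL 0.511, NL 0.491
```

The same 48-game interleague surplus moves the AL further from .500 than it moves the NL, simply because the AL spreads it over fewer total games.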
This is where Idea #2 comes into play! If you want to determine which team was the best relative to its own league, you have to use an offset to basically eliminate interleague play from the equation. So, for the second experiment, we’re going to just subtract .011 from each AL team’s winning percentage and give each NL team a .009 boost. To add some spice, I’m going to do the same thing to each team’s index number, so that strength of schedule is given proper emphasis as well. This might allow in-league strength (strong versus weak division) to come out, as opposed to just overall strength of schedule. Here’s what we got:
| TEAM | WIN | OFFSET | NEW WIN | NW RANK | SOS | NW IND | RANK |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Chicago Cubs | 0.602 | 0.009 | 0.611 | 1 | 0.499 | 0.11 | 1 |
| Los Angeles Angels | 0.617 | -0.011 | 0.606 | 2 | 0.502 | 0.108 | 2 |
| Tampa Bay Rays | 0.599 | -0.011 | 0.588 | 3 | 0.517 | 0.105 | 3 |
| Boston Red Sox | 0.586 | -0.011 | 0.575 | 5 | 0.515 | 0.09 | 4 |
| Philadelphia Phillies | 0.568 | 0.009 | 0.577 | 4 | 0.493 | 0.07 | 5 |
| Milwaukee Brewers | 0.556 | 0.009 | 0.565 | 6 | 0.498 | 0.063 | 6 |
| New York Yankees | 0.549 | -0.011 | 0.538 | 10 | 0.515 | 0.053 | 7 |
| New York Mets | 0.549 | 0.009 | 0.558 | 7 | 0.49 | 0.048 | 8 |
| Houston Astros | 0.534 | 0.009 | 0.543 | 8 | 0.502 | 0.045 | 9 |
| Chicago White Sox | 0.546 | -0.011 | 0.535 | 11 | 0.509 | 0.044 | 10 |
| St. Louis Cardinals | 0.531 | 0.009 | 0.54 | 9 | 0.497 | 0.037 | 11 |
| Toronto Blue Jays | 0.531 | -0.011 | 0.52 | 15 | 0.516 | 0.036 | 12 |
| Minnesota Twins | 0.54 | -0.011 | 0.529 | 13 | 0.503 | 0.032 | 13 |
| Florida Marlins | 0.522 | 0.009 | 0.531 | 12 | 0.493 | 0.024 | 14 |
| Los Angeles Dodgers | 0.519 | 0.009 | 0.528 | 14 | 0.487 | 0.015 | 15 |
| Arizona Diamondbacks | 0.506 | 0.009 | 0.515 | 16 | 0.485 | 0 | 16 |
| Cleveland Indians | 0.5 | -0.011 | 0.489 | 17 | 0.502 | -0.009 | 17 |
| Texas Rangers | 0.488 | -0.011 | 0.477 | 18 | 0.505 | -0.018 | 18 |
| Cincinnati Reds | 0.457 | 0.009 | 0.466 | 19 | 0.502 | -0.032 | 19 |
| Oakland Athletics | 0.466 | -0.011 | 0.455 | 21 | 0.506 | -0.039 | 20 |
| Kansas City Royals | 0.463 | -0.011 | 0.452 | 24 | 0.505 | -0.043 | 21 |
| Colorado Rockies | 0.457 | 0.009 | 0.466 | 20 | 0.485 | -0.049 | 22 |
| Detroit Tigers | 0.457 | -0.011 | 0.446 | 25 | 0.502 | -0.052 | 23 |
| Atlanta Braves | 0.444 | 0.009 | 0.453 | 22 | 0.492 | -0.055 | 24 |
| San Francisco Giants | 0.444 | 0.009 | 0.453 | 23 | 0.485 | -0.062 | 25 |
| Pittsburgh Pirates | 0.414 | 0.009 | 0.423 | 26 | 0.503 | -0.074 | 26 |
| Baltimore Orioles | 0.422 | -0.011 | 0.411 | 27 | 0.515 | -0.074 | 27 |
| San Diego Padres | 0.389 | 0.009 | 0.398 | 28 | 0.485 | -0.117 | 28 |
| Washington Nationals | 0.366 | 0.009 | 0.375 | 29 | 0.494 | -0.131 | 29 |
| Seattle Mariners | 0.377 | -0.011 | 0.366 | 30 | 0.503 | -0.131 | 30 |
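For anyone who wants to reproduce the table above, the offset step can be sketched roughly like this in Python. Only five teams are included to keep it short; the figures are copied from the tables, while the data layout and the `adjust` helper are my own illustration rather than anything ESPN provides.

```python
AL_OFFSET = -0.011  # AL teams lose the league's surplus over .500
NL_OFFSET = 0.009   # NL teams get back the league's deficit

# (team, league, winning percentage, strength of schedule), from the first table
teams = [
    ("Los Angeles Angels", "AL", 0.617, 0.502),
    ("Tampa Bay Rays", "AL", 0.599, 0.517),
    ("Boston Red Sox", "AL", 0.586, 0.515),
    ("Chicago Cubs", "NL", 0.602, 0.499),
    ("Philadelphia Phillies", "NL", 0.568, 0.493),
]


def adjust(league: str, win_pct: float, sos_pct: float) -> tuple[float, float]:
    """Return (offset winning percentage, offset index) for one team."""
    offset = AL_OFFSET if league == "AL" else NL_OFFSET
    new_win = round(win_pct + offset, 3)
    new_index = round((win_pct - 0.5) + (sos_pct - 0.5) + offset, 3)
    return new_win, new_index


rows = [(name, *adjust(league, win, sos)) for name, league, win, sos in teams]
rows.sort(key=lambda row: row[2], reverse=True)  # rank by the offset index

for rank, (name, new_win, new_index) in enumerate(rows, start=1):
    print(f"{rank}. {name}: new win {new_win:.3f}, new index {new_index:.3f}")
```

Sorting on `row[1]` instead ranks by the offset winning percentage alone, which is the NW RANK column.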
Once the offsets are applied, the best team in terms of how it played in-league, with or without strength of schedule factored in, was the Chicago Cubs. Using this logic, I would be willing to argue that the team most deserving of home field in the 2008 World Series was the Cubs, had they made it. As it turned out, I think luck got it right that particular year: of the two teams that did make it, the Rays rank ahead of the Phillies in both columns, and the Rays were the home team. Never mind that it didn’t help them much.
